290 likes | 399 Views
Welcome to IS C332/IS F243: Database Systems and Applications. Today’s Class. Introduction overview of DBMS. Instances and Schemas. Each level is defined by a schema, which defines the data at the corresponding level
E N D
Welcome to IS C332/IS F243: Database Systems and Applications Today’s Class Introduction overview of DBMS
Instances and Schemas • Each level is defined by a schema, which defines the data at the corresponding level • A logical schema defines the logical structure of the database (e.g., set of customers and accounts and the relationship between them) • A physical schema defines the file formats and locations • A databaseinstance refers to the actual content of the database at a particular point in time. A database instance must conform to the corresponding schema
Schema diagram for UNIVERSITY database schema construct
2-4 UNIVERSITY Database Instance
Storage Management • A storage manager is a program module that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system. • The storage manager is responsible for the following tasks: • interaction with the file manager • efficient storing, retrieving, and updating of data.
Query Processing 1. Parsing and translation 2. Optimization 3. Evaluation
Query Processing (Cont.) • Alternative ways of evaluating a given query • Equivalent expressions • Different algorithms for each operation • Cost difference between a good and a bad way of evaluating a query can be enormous • Need to estimate the cost of operations • Depends critically on statistical information about relations which the database must maintain • Need to estimate statistics for intermediate results to compute cost of complex expressions
Transaction 1 Transaction 1 Transaction 2 Conflicting read/write Transaction Management • A transaction is a collection of operations that performs a single logical • function in database application time
Transaction Management (cont.) • Transaction-management component ensures that the database remains in a consistent (correct) state despite system failures (e.g. power failures and operating system crashes) and transaction failures. • Concurrency-control manager controls the interaction among the concurrent transactions, to ensure the consistency of the database.
Database Administrator (DBA) • Coordinates all the activities of the database system; the database administrator has good understanding of the enterprise’s information resources and needs. • Database administrator’s duties include: • Schema definition • Specifying integrity constraints • Storage structure and access method definition • Schema and physical organization modification • Granting user authority to access the database • Monitoring performance and responding to changes in requirements Primary job of a database designer More system oriented
Database Users • Users are differentiated by the way they expected to interact with the system • Application programmers • Develop applications that interact with DBMS through DML calls • Sophisticated users • form requests in a database query language • mostly one-time ad hoc queries • End users • invoke one of the existing application programs (e.g., print monthly sales report) • Interact with applications through GUI
Query Optimization and Execution Relational Operators Files and Access Methods Buffer Management Disk Space Management DB Structure of a DBMS These layers must consider concurrency control and recovery • A typical DBMS has a layered architecture. • The figure does not show the concurrency control and recovery components. • This is one of several possible architectures; each system has its own variations.
User / Application DB Administrator Transaction Commands DDL Commands Query Compiler Transaction Manager DDL Compiler Query plan Execution Engine Logging & Recovery Concurrency Control Index, file and record requests Meta data Statistics Meta data Index/file/record manager Log pages Lock Table Data, metadata, indexes Page Commands Buffer Manager BUFFERS Read/writepages Storage Manager Architecture of Modern DBMS
Application Architectures • Two-tier architecture: E.g. client programs using ODBC/JDBC to communicate with a database • Three-tier architecture: E.g. web-based applications, and applications built using “middleware”
Characteristics of a Modern DBMS • Data independence and efficient access. • Abstraction - hiding lower level details • Efficient data access • Indexing - Significant for very large databases • Data integrity and security • Application independent data integrity features • Simpler Access control mechanisms - Views • Uniform data administration. • Concurrent access, recovery from crashes. • Reduced application development time • Many important tasks are handled by DBMS
Summary • DBMS used to maintain, query large datasets. • Benefits include recovery from system crashes, concurrent access, quick application development, data integrity and security. • Levels of abstraction give data independence. • A DBMS typically has a layered architecture. • DBAs hold responsible jobs and are well-paid! • DBMS R&D is one of the broadest, most exciting areas in CS.
Data Models • A collection of tools for describing: • data • data relationships • data semantics • data constraints • Object-based logical models • entity-relationship model • object-oriented model • semantic model • functional model • Record-based logical models • relational model (e.g., SQL based ORACLE, DB2) • network model • hierarchical model (e.g., IMS)
Relational Model Concepts • Relational Model of data is based on the concept of RELATION • A Relation is a Mathematical concept based on idea of SETS • The strength of the relational approach to data management comes from the formal foundation provided by the theory of relations
Relational Model Concepts The model was first proposed by Dr. E.F. Codd of IBM in 1970 in the following paper:"A Relational Model for Large Shared Data Banks," Communications of the ACM, June 1970. The above paper caused a major revolution in the field of Database management and earned Codd the coveted ACM Turing Award in 1981
Data Models • Data Model: A set of concepts to describe the structure of a database,and certain constraints that the database should obey. • Data Model Operations: Operations for specifying database retrievals and updates by referring to the concepts of the data model. Operations on the data model may include basic operations and user-defined operations.
Categories of data models • Conceptual (high-level, semantic) data models: Provide concepts that are close to the way many users perceive data. (Also called entity-based or object-based data models.) • Physical (low-level, internal) data models: Provide concepts that describe details of how data is stored in the computer. • Implementation (representational) data models: Provide concepts that fall between the above two, balancing user views with some computer storage details.
Schemas versus Instances • Database Schema: The description of a database. Includes descriptions of the database structure and the constraints that should hold on the database. • Schema Diagram: A diagrammatic display of (some aspects of) a database schema. • Schema Construct: A component of the schema or an object within the schema, e.g., STUDENT, COURSE. • Database Instance: The actual data stored in a database at a particular moment in time. Also called database state (or occurrence).
Database Schema Vs. Database State • Database State: Refers to the content of a database at a moment in time. • Initial Database State: Refers to the database when it is loaded • Valid State: A state that satisfies the structure and constraints of the database. • Distinction • The database schema changes very infrequently. The database state changes every time the database is updated. • Schema is also called intension, whereas state is called extension.
define empty state load initial state update valid state satisfy database schema state update
Importance of Data Models • Data models • Representations, usually graphical, of complex real-world data structures • Facilitate interaction among the designer, the applications programmer and the end user • End-users have different views and needs for data • Data model organizes data for various users
Data Model Basic Building Blocks • Entity • Anything about which data will be collected/stored • Attribute • Characteristic of an entity • Relationship • Describes an association among entities • One-to-one (1:1) relationship • One-to-many (1:M) relationship • Many-to-many (M:N or M:M) relationship • Constraint • A restriction placed on the data
History of Data Models • Relational Model: proposed in 1970 by E.F. Codd (IBM), first commercial system in 1981-82. Now in several commercial products (DB2, ORACLE, SQL Server, SYBASE, INFORMIX). • Network Model: the first one to be implemented by Honeywell in 1964-65 (IDS System). Adopted heavily due to the support by CODASYL (CODASYL - DBTG report of 1971). Later implemented in a large variety of systems - IDMS (Cullinet - now CA), DMS 1100 (Unisys), IMAGE (H.P.), VAX -DBMS (Digital Equipment Corp.). • Hierarchical Data Model: implemented in a joint effort by IBM and North American Rockwell around 1965. Resulted in the IMS family of systems. The most popular model. Other system based on this model: System 2k (SAS inc.)
History of Data Models • Object-oriented Data Model(s): several models have been proposed for implementing in a database system. One set comprises models of persistent O-O Programming Languages such as C++ (e.g., in OBJECTSTORE or VERSANT), and Smalltalk (e.g., in GEMSTONE). Additionally, systems like O2, ORION (at MCC - then ITASCA), IRIS (at H.P.- used in Open OODB). • Object-Relational Models: Most Recent Trend. Started with Informix Universal Server. Exemplified in the latest versions of Oracle-10i, DB2, and SQL Server etc. systems.