240 likes | 387 Views
Chapter 2. Database System Architecture. 2.1. Data Models, Schemas and Instances. Data Model : A set of concepts to describe the structure of a database, and certain constraints that the database should obey.
E N D
Chapter 2 Database System Architecture
2.1. Data Models, Schemas and Instances • Data Model: A set of concepts to describe the structure of a database,and certain constraints that the database should obey. • Database Schema: The description of a database. Includes descriptions of the database structure and the constraints that should hold on the database. • Schema Diagram: A diagrammatic display of (some aspects of) a database schema. • Schema Construct: A component of the schema or an object within the schema, e.g., STUDENT, COURSE. • Database Instance: The actual data stored in a database at a particular moment in time. Also called database state (or occurrence).
Database Schema Vs. Database State • Database State: Refers to the content of a database at a moment in time. • Initial Database State: Refers to the database when it is loaded • Valid State: A state that satisfies the structure and constraints of the database. • Distinction • The database schema changes very infrequently. The database state changes every time the database is updated. • Schema is also called intension, whereas state is called extension.
2.2. Overview of Data Models • The main purpose of Data Model is to represent the data in an understandable way
Categories of data models • Conceptual (high-level, semantic) data models: Provide concepts that are close to the way many users perceive data. (Also called entity-based or object-based data models.) • Physical (low-level, internal) data models: Provide concepts that describe details of how data is stored in the computer. • Implementation (representational) data models: Provide concepts that fall between the above two, balancing user views with some computer storage details.
Categories of data models … • Representational data models represent data by using record structures and hence called record based data models • These include: hierarchical data model, network data model and relational data model • Hierarchical Model • The simplest data model • Record type is referred to as node or segment • The top node is the root node • Nodes are arranged in a hierarchical structure as sort of upside-down tree • A parent node can have more than one child node
Categories of data models … • A child node can only have one parent node • The relationship between parent and child is one-to-many • Relation is established by creating physical link between stored records (each is stored with a predefined access path to other records) • To add new record type or relationship, the database must be redefined and then stored in a new form.
Categories of data models … • ADVANTAGES of Hierarchical Data Model: • Hierarchical Model is simple to construct and operate on • Corresponds to a number of natural hierarchically organized domains - e.g., assemblies in manufacturing, personnel organization in companies • Language is simple; uses constructs like GET, GET UNIQUE, GET NEXT, GET NEXT WITHIN PARENT etc. • DISADVANTAGES of Hierarchical Data Model: • Navigational and procedural nature of processing • Database is visualized as a linear arrangement of records • Little scope for "query optimization"
Categories of data models … • Network Model • Allows record types to have more that one parent unlike hierarchical model • A network data models sees records as set members • Each set has an owner and one or more members • Allow no many to many relationship between entities • Like hierarchical model network model is a collection of physically linked records. • Allow member records to have more than one owner
Categories of data models … • ADVANTAGES of Network Data Model: • Network Model is able to model complex relationships and represents semantics of add/delete on the relationships. • Can handle most situations for modeling using record types and relationship types. • Language is navigational; uses constructs like FIND, FIND member, FIND owner, FIND NEXT within set, GET etc. Programmers can do optimal navigation through the database. • DISADVANTAGES of Network Data Model: • Navigational and procedural nature of processing • Database contains a complex array of pointers that thread through a set of records. • Little scope for automated "query optimization”
Categories of data models … • Relational Data Model • Developed by Dr. Edgar Frank Codd in 1970 (famous paper, 'A Relational Model for Large Shared Data Banks') • Terminologies originates from the branch of mathematics called set theory and relation • Can define more flexible and complex relationship • Viewed as a collection of tables called “Relations” equivalent to collection of record types • Relation: Two dimensional table • Stores information or data in the form of tables rows and columns • A row of the table is called tuple equivalent to record • A column of a table is called attribute equivalent to fields • Data value is the value of the Attribute • Records are related by the data stored jointly in the fields of records in two tables or files. The related tables contain information that creates the relation
Categories of data models … • The tables seem to be independent but are related some how. • No physical consideration of the storage is required by the user • Many tables are merged together to come up with a new virtual view of the relationship
Categories of data models … • The rows represent records (collections of information about separate items) • The columns represent fields (particular attributes of a record) • Conducts searches by using data in specified columns of one table to find additional data in another table • In conducting searches, a relational database matches information from a field in one table with information in a corresponding field of another table to produce a third table that combines requested data from both tables
2.3. Architecture and Data Independence • Defines DBMS schemas at three levels: • Internal schema at the internal level to describe physical storage structures and access paths. Typically uses a physical data model. • Conceptual schema at the conceptual level to describe the structure and constraints for the whole database for a community of users. Uses a conceptual or an implementation data model. • External schemas at the external level to describe the various user views. Usually uses the same data model as the conceptual level.
Architecture and Data Independence … Three-level ANSI-SPARC Architecture of a Database
Architecture and Data Independence … Mappings among schema levels are needed to transform requests and data. Programs refer to an external schema, and are mapped by the DBMS to the internal schema for execution.
Architecture and Data Independence … Data Independence • Logical Data Independence: The capacity to change the conceptual schema without having to change the external schemas and their application programs. • Physical Data Independence: The capacity to change the internal schema without having to change the conceptual schema.
Architecture and Data Independence … Data Independence When a schema at a lower level is changed, only the mappings between this schema and higher-level schemas need to be changed in a DBMS that fully supports data independence. The higher-level schemas themselves are unchanged. Hence, the application programs need not be changed since they refer to the external schemas.
2.4. Database Languages and Interfaces • A variety of users are supported by DBMS • The DBMS must provide appropriate languages and interfaces for each category of users Database Languages • Data Definition Language (DDL) • Allows DBA or user to describe and name entities, attributes and relationships required for the application. • Specification notation for defining the database schema
Database Languages and Interfaces … • Data Manipulation Language (DML) • Provides basic data manipulation operations on data held in the database. • Language for accessing and manipulating the data organized by the appropriate data model • DML also known as query language • There are two main types of DMLs • Procedural DML: user specifies what data is required and how toget the data. • Non-Procedural DML: user specifies what data is required but nothow it is to be retrieved
Database Languages and Interfaces … DBMS Interfaces • User friendly interfaces provided by DBMSs include: • Menu-Based Interfaces for Web Clients or Browsing • Forms-Based Interfaces • Graphical User Interfaces • Natural Language Interfaces • Speech Input and Output • Interfaces for Parametric Users. • Interfaces for the DBA
Properties of Relational Databases • Each row of a table is uniquely identified by a PRIMARY KEY composed of one or more columns • Each tuple in a relation must be unique • ENTITY INTEGRITY RULE of the model states that no component of the primary key may contain a NULL value. • A column or combination of columns that matches the primary key of another table is called a FOREIGN KEY used to cross-reference tables • REFERENTIAL INTEGRITY RULE of the model states that, for every foreign key value in a table there must be a corresponding primary key value in another table in the database or it should be NULL 22
All tables are LOGICAL ENTITIES • A table is either a BASE TABLE (Named Relation) or VIEW (Unnamed Relation) • Only Base Tables physically store data • VIEWS are derived from BASE TABLES with SQL instructions like: [SELECT .. FROM .. WHERE .. ORDER BY] • Order of rows and columns is immaterial • Entries with repeating groups are said to be un-normalized • Entries are single-valued • Each column (field or attribute) has a distinct name • All values in a column represent the same attribute and have the same data format 23
The building blocks of the relational data model are: • Entities: real world physical or logical object • Attributes: properties used to describe each entity or real world object. • Relationships: the associations between entities • Constraints: rules that should be obeyed while manipulating the data. 24