830 likes | 838 Views
This lecture provides an overview of database management, data organization, data models, schema, and instance. It also covers data retrieval, query processing, indexing, data integrity, concurrency control, and recovery management. Includes exercises and examples.
E N D
COP5725:Principles of Database Management Systems Fall 2011Lecture 2: Introduction to Database Management and Database Design
Homework I • Exercise 1.4, 1.6 and 1.9
Last Class • File Processing and Database • An overview of Database Management • Data Organization • Data Models, relation models • Schema, Instance • Data Independence • Three level of abstraction (External View, Logical Schema and Physical Schema) • Logical Dependence / Physical Dependence • Data Retrieval • Query Language • Query Processing and Optimization • Indexing • Data Integrity • Concurrency Control, Recovery Management • Transaction • Entry Error • Integrity Constraints
Internal view ANSI/SPARC Architecture ASSIGNMENT(ENO,PNO,ENAME,PNAME) Users External Schema External view External view External view EMP(ENO: string, ENAME: string, TITLE: string) PROJ(PNO: string, PNAME: string, BUDGET: integer) WORKS(ENO: string, PNO: string, RESP: string, DUR: integer) Conceptual Schema Conceptual view Internal Schema Store all the relations as unsorted files. Build indexes on EMP(ENO), PROJ(PNO) and WORKS(ENO,PNO). DBMS
Crash Recovery: Ensuring Atomicity • DBMS ensures atomicity(all-or-nothing property) even if system crashes in the middle of an transaction. • Idea: Keep a log(history) of all actions carried out by the DBMS while executing a set of transactions: • Before a change is made to the database, the corresponding log entry is forced to a safe location. (WAL protocol; OS support for this is often inadequate.) • After a crash, the effects of partially executed transactions are undone using the log. (Thanks to WAL, if log entry wasn’t saved before the crash, corresponding change was not applied to database!)
The Log • The following actions are recorded in the log: • Ti writes an object: The old value and the new value. • Log record must go to disk beforethe changed page! • Ti commits/aborts: A log record indicating this action. • Log records chained together by transaction id, so it’s easy to undo a specific transaction (e.g., to resolve a deadlock). • Log is often duplexed and archived on “stable” storage. • All log related activities (and in fact, all CC related activities such as lock/unlock, dealing with deadlocks etc.) are handled transparently by the DBMS.
Transaction Manager Recovery Manager Integrity Constraints An Overview of Data OrganizationComponents of a DBMS DBProgrammer DBA User DDL Commands Query Code w/ embedded queries Query Processor Query Optimizer DML Precompiler DDL Interpreter Query Evaluator Storage Manager File Manager Buffer Manager Secondary Storage Metadata Indices Data Statistics Schema
1. Data Organization • Logical: Data Models • Physical: Data Storage 2. Data Retrieval • Queries = Declarative “Retrieval Programs” • Easy-to-write (compared to standard PL’s) • Efficient execution (w/ query processing) 3. Data Integrity • Managing Concurrency (Transactions) • Managing Crashes (Recovery) • Managing Semantic Inconsistencies (Integrity Constraints) SummaryWhat Gets Managed in a DBMS?
In A Sentence • How to manage large amounts of persistent, and structureddata that are shared among distributed users and processes and whose integrity must be maintained and whose security must be controlled?
DataBase Management System • Examples of Relational DBMS: • Access, Paradox, dBase, FoxPro, Clipper • SQL Server, MySQL • DB2, Oracle, Sybase, Informix
DBMS Languages • Data Definition Language (DDL) • Defines conceptual schema, external schema, and internal schema, as well as mappings between them • Language used for each level may be different • The definitions and the generated information is stored in system catalog • Data Manipulation Language (DML) • Can be • embedded query language in a host language • “stand-alone” query language • Can be • Procedural: specify where and how (navigational) • Declarative: specify what
Database Users • End user • Naïve or casual user • Accesses database either through forms or through application front-ends • More sophisticated ones generate ad hoc queries using DML • Application programmer/developer • Designs and implements applications that access the database (some may be used by end users) • Database administrator (DBA) • Defines and manages the conceptual schema • Defines application and user views • Monitors and tunes DBMS performance (defines/modifies internal schema) • Loads and reformats the database • Responsible for security and reliability Must understand how a DBMS works!
Different users has different perspectives E.g. The programmer sees SQL, which has two components: • Data Definition Language - DDL • Data Manipulation Language - DML • query language Behind the scenes the DBMS has: • Query engine • Query optimizer • Storage management • Transaction Management (concurrency, recovery)
How the Programmer Sees the DBMS • Start with DDL to create tables: • Continue with DML to populate tables: CREATE TABLE Students ( Name CHAR(30) SSN CHAR(9) PRIMARY KEY NOT NULL, Category CHAR(20) ) . . . INSERT INTO Students VALUES(‘Charles’, ‘123456789’, ‘undergraduate’) . . . .
Database Application • A database application is a collection of data and the programs that allow the manipulation of these data • A database application is usually implemented using a DataBase Management System (DBMS)
Database Application: Examples • ATM banking • University data managemen (Oracle) • Airline reservations (Amadeus, Sabre) • My address book (Microsoft Access) • The e-shop around the corner (MSQL)
Place of DBMS in a Computer System Application Programs App. development tools DBMS Operating System (?) Hardware
Summary On Introduction • DBMS used to maintain, query large datasets. • Benefits include recovery from system crashes, concurrent access, quick application development, data integrity and security. • Levels of abstraction give data independence. • A DBMS typically has a layered architecture. • DBAs hold responsible jobs and are well-paid! • Database Job Search Engine (www.databasejobs.com) • DBMS R&D is one of the broadest,most exciting areas in CS.
Database Design • Understand the real-world domain being modeled • Specify it using a database design model • Design models are especially convenient for schema design, but are not necessarily implemented by DBMS • Popular ones include • Entity-Relationship (ER) data model • Fairly mechanical ways to convert design models to the real implementations e.g., relation models
Overview of Database Design • Conceptual design: (ER Model is used at this stage.) • What are the entities and relationships in the enterprise? • What information about these entities and relationships should we store in the database? • What are the integrity constraints or business rules that hold? • A database `schema’ in the ER Model can be represented pictorially (ER diagrams). • Can map an ER diagram into a relational schema.
Outline • ER model Basics • Additional Features of ER Models • Key Constraints • Participation Constraints • Weak Entities • Class Hierarchies • Aggregation • Conceptual Design using ER Models
ER Model Basics: Entity • Entity: Real-world object distinguishable from other objects. An entity is described (in DB) using a set of attributes. • Entity Set: A collection of similar entities. E.g., all employees. • All entities in an entity set have the same set of attributes. (Until we consider ISA hierarchies, anyway!)
name ssn lot Employees ER Diagrams • In an entity-relationship diagram, each entity set is represented by a rectangle. • Each attribute of an entity set is represented by an oval, with a line to the rectangle representing its entity set.
Keys • A key is a minimal set of attributes for one entity set such that no two entities in this set agree on all the attributes of the key. • It is allowed for two entities to agree on some, but not all, of the key attributes. • We must designate a key for every entity set. • Underline the key attribute(s).
ER Model Basics: Relationships • Relationship: Association among two or more entities. E.g., Tom works in Pharmacy department. • Relationship Set: Collection of similar relationships. • It is represented by a diamond, with lines to each of the entity sets involved.
Example name dname ssn budget lot did Works_In Employees Departments
name manf Beers Sells Bars sell some beers. Drinkers like some beers. Visits Likes Drinkers visit some bars. Drinkers name addr Another Example name addr Bars
Example • For the relationship Sells, we might have a relationship set like: Bar Beer Joe’s Bar Bud Joe’s Bar Miller Sue’s Bar Bud Sue’s Bar Pete’s Ale Sue’s Bar Bud Lite
Multiway Relationships • Sometimes, we need a relationship that connects more than two entity sets. • Suppose that drinkers will only drink certain beers at certain bars. • Our three binary relationships Likes, Sells, and Frequents do not allow us to make this distinction. • But a 3-way relationship would.
Example name addr name manf Bars Beers Preferences Drinkers name addr
A Typical Relationship Set Bar Drinker Beer Joe’s Bar Ann Miller Sue’s Bar Ann Bud Sue’s Bar Ann Pete’s Ale Joe’s Bar Bob Bud Joe’s Bar Bob Miller Joe’s Bar Cal Miller Sue’s Bar Cal Bud Lite
Attributes on Relationships • Sometimes it is useful to attach an attribute to a relationship. • Descriptive Attributes • Information about the relationship rather than information about any of the entities • Think of this attribute as a property of tuples in the relationship set.
Example Sells Bars Beers price Price is a function of both the bar and the beer, not of one alone.
Example since name dname ssn budget lot did Works_In Employees Departments
Roles • Sometimes an entity set appears more than once in a relationship. • The entity sets that participate in a relationship set need not to be distinct • Label the edges between the relationship and the entity set with names called roles.
Example name ssn lot Employees super-visor subor-dinate Reports_To
Outline • ER model Basics • Additional Features of ER Models • Key Constraints • Participation Constraints • Weak Entities • Class Hierarchies • Aggregation • Conceptual Design using ER Models
Many-Many Relationships name dname ssn budget lot did Works_In Employees Departments • In a many-many relationship, an entity of either set can be connected to many entities of the other set. • Consider Works_In: An employee can work in many departments; a dept can have many employees. • a bar sells many beers; a beer is sold by many bars.
since name dname ssn lot Employees Manages Key Constraints • In contrast, each dept has at most one manager, according to the key constrainton Manages. did budget Departments
Many-One Relationships • Each entity of the first set is connected to at most one entity of the second set. • But an entity of the second set can be connected to zero, one, or many entities of the first set. • Show a many-one relationship by an arrow from the “many” side. • Given a Departments entity, we can uniquely determine the Manages relationship in which it appears
since name dname ssn lot Employees Manages One-One Relationships • In a one-one relationship, each entity of either entity set is related to at most one entity of the other set. • Example: • Each employee can manage at most one department did budget Departments
In Pictures: many-many many-one one-one
Outline • ER model Basics • Additional Features of ER Models • Key Constraints • Participation Constraints • Weak Entities • Class Hierarchies • Aggregation • Conceptual Design using ER Models
Participation Constraints • Does every department have a manager? • If so, this is a participation constraint: the participation of Departments in Manages is said to be total (vs. partial). • Every department is required to have a manager • Each employee works in at least one department • Each department has at least one employee • If the participation of an entity set in a relationship set is total, the two are connected by a thick line.
Example since since name name dname dname ssn did did budget budget lot Departments Employees Manages Works_In since
Outline • ER model Basics • Additional Features of ER Models • Key Constraints • Participation Constraints • Weak Entities • Class Hierarchies • Aggregation • Conceptual Design using ER Models
Weak Entity Sets • Occasionally, entities of an entity set need “help” to identify them uniquely. • A weak entity can be identified uniquely only by considering another (owner) entity. • Owner entity set and weak entity set must participate in a one-to-many relationship set (one owner, many weak entities). • Weak entity set must have total participation in this identifying relationship set.
Example name cost pname age ssn lot Policy Dependents Employees
Example • name is almost a key for football players, but there might be two with the same name. • number is certainly not a key, since players on two teams could have the same number. • But number, together with the Team related to the player by Plays-on should be unique.