4.15k likes | 7.98k Views
Introduction to Database Management Systems (DBMS). Database Management System (DBMS). Definitions: Data: Known facts that can be recorded and that have implicit meaning Database: Collection of related data Ex. the names, telephone numbers and addresses of all the people you know
E N D
Introduction to • Database Management Systems • (DBMS)
Database Management System (DBMS) Definitions: • Data: Known facts that can be recorded and that have implicit meaning • Database: Collection of related data • Ex. the names, telephone numbers and addresses of all the people you know • Database Management System: A computerized record-keeping system
DBMS (Contd.) • Goals of a Database Management System: • To provide an efficient as well as convenient environment for accessing data in a database • Enforce information security: database security, concurrence control, crash recovery • It is a general purpose facility for: • Defining database • Constructing database • Manipulating database
Benefits of database approach • Redundancy can be reduced • Inconsistency can be avoided • Data can be shared • Standards can be enforced • Security restrictions can be applied • Integrity can be maintained • Data independence can be provided
DBMS Functions • Data Definition • Data Manipulation • Data Security and Integrity • Data Recovery and Concurrency • Data Dictionary • Performance
Users DATABASE SYSTEM Application Programs/Queries DBMS Software Software to process queries/programs Software to access stored data Stored Data Defn. Stored Database Database System (META-DATA).
user query Q1 Database scheme Application program query Q2 Query processor DDL compiler Compiled query Q2 Database description Database manager File manager Physical database Database System
Data Model • A set of concepts used to describe the structure of a database • By structure, we mean the data types, relationships, and constraints that should holds for the data Categories of Data Models Conceptual Physical Representational
External level(individual user views) Conceptual level(community user view) Internal level(storage view) Database Database Architecture
An example of the three levels SNo FName LName Age Salary BranchNo Conceptual View Internal View struct STAFF { int staffNo; int branchNo; char fName[15]; char lName[15]; struct date dateOfBirth; float salary; struct STAFF *next; /* pointer to next Staff record */ }; index staffNo; index branchNo; /* define indexes for staff */ FName Salary SNo LName Age External View1 LName SNo BranchNo External View2
Schema • Schema: Description of data in terms of a data model • Three-level DB Architecture defines following schemas: • External Schema (or sub-schema) • Written using external DDL • Conceptual Schema (or schema) • Written using conceptual DDL • Internal Schema • Written using internal DDL or storage structure definition
Data Independence • Change the schema at one level of a database system without a need to change the schema at the next higher level • Logical data independence: Refers to the immunity of the external schemas to changes in the conceptual schema e.g., add new record or field • Physical data independence: Refers to the immunity of the conceptual schema to changes in the internal schema e.g., adding new index should not void existing ones
COLUMN TABLE ROW VALUE TYPES OF DATABASE MODELS HIERARCHICAL NETWORK RELATIONAL
DATABASE DESIGN PHASES DATA ANALYSIS Entities - Attributes - Relationships - Integrity Rules LOGICAL DESIGN Tables - Columns - Primary Keys - Foreign Keys PHYSICAL DESIGN DDL for Tablespaces, Tables, Indexes
Definition : RDBMS • It is a system in which, at a minimum : • The data is perceived by the user as tables ( and nothing but tables ); and • The operators at the user’s disposal - e.g., for data retrieval - are operators that generate new tables from old, and those include at least SELECT, PROJECT, and JOIN.
Features of an RDBMS • The ability to create multiple relations (tables) and enter data into them • An interactive query language • Retrieval of information stored in more than one table • Provides a Catalog or Dictionary, which itself consists of tables ( called system tables )
Some Important Terms • Relation : a table • Tuple : a row in a table • Attribute : a Column in a table • Degree : number of attributes • Cardinality : number of tuples • Primary Key : a unique identifier for the table • Domain : a pool of values from which specific attributes of specific relations draw their values
Properties of Relations (Tables) • There are no duplicate rows (tuples) • Tuples are unordered, top to bottom • Attributes are unordered, left to right • All attribute values are atomic ( or scalar ) • Relational databases do not allow repeating groups
Keys • Key • Super Key • Candidate Keys • Primary Key • Alternate Key • Secondary Keys
sid cid grade sid name login age gpa 53666 carnatic101 C 53666 Jones Jones@cs 18 3.4 53688 reggae203 B 53688 Smith Smith@eecs 18 3.2 53650 topology112 A 53650 Smith Smith@math 19 3.8 53666 history105 B Keys and Referential Integrity Enrolled Student Primary key Foreign key referring to sid of STUDENT relation
Relational Query Languages • Query languages: Allow manipulation and retrieval of data from a database. • Relational model supports simple, powerful QLs: • Strong formal foundation based on logic. • Allows for much optimization. • Query Languages != programming languages!
sid bid day 22 101 10/10/99 58 103 11/12/99 sid sname rating age 22 Deepa 7 45.0 31 Laxmi 8 55.5 58 Roopa 10 35.0 Example Instances R1 S1 sid sname rating age 28 Yamuna 9 35.0 S2 31 Laxmi 8 55.5 44 Geeta 5 35.0 58 Roopa 10 35.0
Relational Algebra • Basic operations: • Selection ( ) • Projection () • Cross- product ( ) • Set- difference ( –) • Union ( )
sname rating Yamuna 9 Laxmi 8 Geeta 5 Roopa 10 Projection sname, rating(S2) age 35.0 age(S2) 55.5
Selection sid sname rating age rating > 8(S2) 28 Yamuna 9 35.0 58 Roopa 10 35.0 sname rating Yamuna 9 sname, rating(S2) (rating > 8(S2)) Roopa 10
sid sname rating age 22 Deepa 7 45.0 sid sname rating age 31 Laxmi 8 55.5 58 Roopa 10 35.0 Union, Intersection, Set Difference sid sname rating age S1 S2 22 Deepa 7 45.0 31 Laxmi 8 55.5 58 Roopa 10 35.0 44 Geeta 5 35.0 28 Yamuna 9 35.0 S1 S2 S1 S2
Cross- Product (sid) sname rating age (sid) bid day 22 Deepa 7 45.0 22 101 10/10/99 22 Deepa 7 45.0 58 103 11/12/99 31 Laxmi 8 55.5 22 101 10/10/99 31 Laxmi 8 55.5 58 103 11/12/99 22 101 10/10/99 58 Roopa 10 35.0 58 Roopa 10 35.0 58 103 11/12/99
(sid) sname rating age (sid) bid day 22 Deepa 7 45.0 22 101 10/10/99 31 Laxmi 8 55.5 58 103 11/12/99 Joins Condition Join :
Equi-Join (sid) sname rating age bid day 22 Deepa 7 45.0 101 10/10/99 58 Roopa 10 35.0 103 11/12/99
sno pno pno pno pno s1 p1 p2 p2 p1 s1 p2 p4 p2 s1 p3 p4 sno s1 p4 s1 s2 p1 sno s2 s2 p2 sno s1 s3 s3 p2 s1 s4 s4 s4 p2 s4 p4 Division • Not supported as a primitive operator, but useful for expressing queries like: • Find sailors who have reserved all boats . A B1 B2 B3 A/B1 A/B2 A/B3
Query in a high level language SCANING, PARSING AND VALIDATING Intermediate form of query QUERY OPTIMIZER Execution plan QUERY CODE GENERATOR Code to execute the query RUNTIME DATABASE PROCESSOR Result of query Processing A High-level Query Typical steps when processing a high level query.
Two Main Techniques for QueryOptimization • Heuristic Rules: A heuristic is a rule that works well in most of cases, but not always. General Idea: • Many different relational algebra expressions (and thus query trees) are equivalent. • Transform the initial query tree of a query into an equivalent final query tree that is efficient to execute. • Cost based query optimization • Estimate the cost for each execution plan, and choose the one with the lowest cost. • Can we get the best execution plan?
Motivating Example select * from R1, R2, R3 where R1.r2no=R2.r2no and R2.r3no=R3.r3no and R1.a=5000 NLJ NLJ SS(R1, “a=5000”) SS(R2) SS(R3)
Alternative Plans 1(No Indexes) select * from R1, R2, R3 where R1.r2no=R2.r2no and R2.r3no=R3.r3no and R1.a=5000 NLJ NLJ SS(R3) SS(R1, “a=5000”) SS(R2)
Alternative Plans 2 (With Indexes) select * from R1, R2, R3 where R1.r2no=R2.r2no and R2.r3no=R3.r3no and R1.a=5000 NLJ NLJ SS(R3) IS(R1, “a=5000”) SS(R2)
Overview of Database Design • Conceptual design : (ER Model is used at this stage.) • Schema Refinement : (Normalization) • Physical Database Design and Tuning
Predicate Calculus Domain Calculus SQL / Tuple Based Query By Examples E R Modeling • Conceptual Schema Design • Relational Calculus - Formal Language for Relational D/B. Relational Calculus
Design Phases… Requirements Collection& Analysis Data Requirements Functional Requirements Conceptual Design User Defined Operations Data Flow DiagramsSequence Diagrams, Scenarios Entity Types, Constraints , RelationshipsNo Implementation Details. Ensures Requirements Meets the Design Logical Design Data Model Mapping – Type of Database is identified Physical Design Internal Storage Structures / Access Path / File Organizations
E-R Modeling • Entity • is anything that exists and is distinguishable • Entity Set • a group of similar entities • Attribute • properties that describe an entity • Relationship • an association between entities
Notations ENTITY TYPE ( REGULAR ) WEAK ENTITY TYPE RELATIONSHIP TYPE WEAK RELATIONSHIP TYPE
SSN NAME LOT 123- 22- 3666 Attishoo 48 231- 31- 5368 Smiley 22 131- 24- 3650 Smethurst 35 Entity Attributes ssn name lot Employee Entity Set CREATE TABLE Employees (ssn CHAR (11), name CHAR (20), lot INTEGER, PRIMARY KEY (ssn))
ER Model since ssn name lot budget did dname Works_in Employee Department supervisor Sub- ordinate Reports_To
SSN DID SINCE 123-22-3666 51 1/1/91 123-22-3666 56 3/3/93 231-31-5368 51 2/2/92 ER Model (Contd.) Works_ In CREATE TABLE Works_ In( ssn CHAR (11), did INTEGER, since DATE, PRIMARY KEY (ssn, did), FOREIGN KEY (ssn) REFERENCES Employees, FOREIGN KEY (did) REFERENCES Departments)
since ssn name lot budget did dname Employee Department Manages Key Constraints
Key Constraints for Ternary Relationships since ssn name lot did dname budget Works_in Employee Department Location capacity address