1 / 42

Data Models

Data Models. How to structure data. What is a Data Model?. Having formed a model of the enterprise, we now need to represent the data. The data model tells us the structure of the database. Historically, three data models: Hierarchical data model Network data model Relational data model.

Download Presentation

Data Models

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Data Models How to structure data

  2. What is a Data Model? • Having formed a model of the enterprise, we now need to represent the data. • The data model tells us the structure of the database. • Historically, three data models: • Hierarchical data model • Network data model • Relational data model

  3. Hierarchical and Network Data Models • Hierarchical and network data models have been superseded by the relational data model. • Reasons: • Lack of expressive power • E.g., one cannot express many-to-many relationships in the hierarchical model • More closely tied to the underlying implementation. Hence, less data independence. • Relational data model has a clean mathematical basis.

  4. The Relational Model • Due to Codd. • Everything is represented as a relation in the mathematical sense. Also called tables. • A database therefore is a collection of tables, each of which has a unique name, and each of which is described by a schema. • In addition, Codd defined a data manipulation language.

  5. Example of Schemas in the Relational Model • Example of a representation of entity sets: Student(sid,name,addr) Course(cid,title,eid) Empl(eid, ename, deptid) Dept(deptid, dname, loc) • Primary keys are underlined. • Recall that a primary key is one that uniquely identifies an entity. • An entity is a row in a table.

  6. More Example Schemas • Relationship sets between entity sets are also represented in tables. • Example of a table corresponding to a relationship: Enrol(sid, cid, grade) • Again, a relationship is represented by a row (or a tuple) in a relation.

  7. Relational Databases:Basic Concepts I • Attribute: • A column in a table • Domain • The set of values from which the values of an attribute are drawn. • Null value • A special value, meaning “not known” or “not applicable”. • Relation schema • A set of attribute names

  8. Relational Databases: Basic Concepts II • Tuple • A set of values, one for each attribute in the relation scheme over which the tuple is defined, i.e. a mapping from attributes to the appropriate domains • Relation instance • A set of tuples over the scheme of the relation

  9. Relational Databases:Basic Concepts III • Relational Database • A set of relations, each with a unique name • Normalized Relation • A relation in which every value is atomic (non-decomposable). Hence, every attribute in every tuple has a single value.

  10. Keys • Candidate Key • A minimal set of attributes that uniquely identifies a tuple • Primary Key • The candidate key chosen as the identifying key of the relation • Alternate Key • Candidate keys which are not primary keys

  11. Foreign Key • An attribute (or set of attributes) in table R1 which also occurs as the primary key of relation R2. • R2 is called the referenced relation. • Foreign keys are also called connection keys or reference attributes.

  12. Integrity Rules: Entity Constraint • Entity constraint • All attributes in a primary key must be non-null. • Motivation: If the primary key uniquely identifies an entity in an entity set, then we must ensure that we have all the relevant information

  13. Integrity Rules:Referential Integrity • Referential integrity • A database cannot contain a tuple with a value for a foreign key that does not match a primary key value in the referenced relation. • Or, a foreign key must refer to a tuple that exists. • Motivation: If referential integrity were violated, we could have relationships between entities that we do not have any information about.

  14. Data Manipulation Languages • In order for a database to be useful, it should be possible to store and retrieve information from it. This is the role of the data manipulation language. • One of the attractions of the relational data model is that it comes with a well-defined data manipulation language.

  15. Types of DML • Two types of data manipulation languages • Navigational (procedural) • The query specifies (to some extent) the strategy used to find the desired result e.g. relational algebra. • Non-navigational(non-procedural) • The query only specifies what data is wanted, not how to find it e.g. relational calculus.

  16. Relational Algebra • Codd defined a number of algebraic operations for the relational model. • Unary operations take as input a single table and produce as output another table. • Binary operations take as input two tables and produce as output another table.

  17. Unary Operations: Select • Select produces a table that only contains the tuples that satisfy a particular condition, in other words a “horizontal” subset. • Appearance: sC(R) • where C is a selection condition • and R is the relation over which the selection takes place

  18. Example of Select Student sid name addr 123 Fred 3 Oxford 345 John 6 Hope Rd. 567 Ann 5 Garden s sid > 300(Student) yields 345 John 6 Hope Rd. 567 Ann 5 Garden

  19. Unary Operations:Project • Project produces a table consisting of only some of the attributes. It creates a “vertical” subset. • Note that a project eliminates duplicates. • Appearance: ПA(R) • where A is a set of attributes of R • and R is the relation over which the project takes place.

  20. Example of Project Enrol sid cid grade 123 CS51T 76 234 CS52S 50 345 CS52S 55 Пcid(Enrol) yields CS51T CS52S

  21. Binary Operations • Two relations are (union) compatible if they have the same set of attributes. • Example, one table may represent suppliers in one country, while another table with same schema represents suppliers in another country. • For the union, intersection and set-difference operations, the relations must be compatible.

  22. Union, Intersection, Set-difference • R1 R2 • The union is the table comprised of all tuples in R1 or R2. • R1  R2 • The intersection is the table comprised of all tuples in R1 and R2 • R1 - R2 • The set-difference between R1 and R2 is the table consisting of all tuples in R1 but not in R2.

  23. Cartesian Product • R1  R2 • The Cartesian product is the table consisting of all tuples formed by concatenating each tuple in R1 with a tuple in R2, for all tuples in R2.

  24. Example of a Cartesian Product R1 A B 1 x 2 y R2 C D a s b t c u R1  R2 A B C D 1 x a s 1 x b t 1 x c u 2 y a s 2 y b t 2 y c u

  25. Natural Join • R1 R2 • Assume R1 and R2 have attributes A in common. Natural join is formed by concatenating all tuples from R1 and R2 with same values for A, and dropping the occurrences of A in R2 • R1 R2 = ПA’(sC(R1 R2)) • where C is the condition that the values for R1 and R2 are the same for all attributes in A and A’ is all attributes in R1 and R2 apart from the occurrences of A in R2. • hence, natural join is syntactic sugar

  26. Example of a Natural Join I Course cid title eid CS51T DBMS 123 CS52S OS 345 CS52T Networking 345 CS51S ES 456 Instructor eid ename 123 Rao 345 Allen 456 Mansingh

  27. Example of a Natural Join II Course Instructor cid title eid ename CS51T DBMS 123 Rao CS52S OS 345 Allen CS52T Net... 345 Allen CS51S ES 456 Mansingh

  28. Division • R1 R2 • Assume that the schema for R2 is a proper subset of the one for R1. • We form the division by • Ordering the tuples in R1 so that all the tuples with the same value for the non-common attributes are grouped together. • Each group contributes a tuple to the result if the group’s values on the common attributes form a superset of the values of these attributes in R2.

  29. Example of Division I Enrol cid sid grade CS51T 123 A CS52S 123 A CS51T 234 C CS52S 234 B CS51T 345 C CS52S 345 C Temp sid grade 123 A 234 B

  30. Example of Division II Enrol cid sid grade CS51T 123 A CS51T 234 C CS51T 345 C CS52S 123 A CS52S 234 B CS52S 345 C Enrol  Temp cid CS52S • Thus, the division gives all courses for which 123 got an A and 234 a B.

  31. Assignment • Allows the expression to be written in parts. • Assigns the part to a temporary variable. • This variable can be used in subsequent expressions. • E.g. • sid(title = ‘DBMS’ (Enrol Course) • Could be re-written as: • r Enrol Course • sid(title = ‘DBMS’(r))

  32. Rename Operation • Names the result of an expression. • x(A1,A2,…,An) (E) • returns the result of expression E under the name x with the attributes renamed as A1,A2,…,An. • E.g. S (Student) • Renames Student table to S.

  33. Database Modification • Insert • r r E • e.g. • Course Course {(‘CS51T’,’DBMS’)} • Delete • r r - E • e.g. • Student Student - sid=‘1’(Student) • Update • r F1,F2,…,Fn(r) • e.g. • Enrol sid,cid,grade grade + 2 (Enrol)  

  34. Examples • Assume the following schema: Student(sid,sname,saddr) Course(cid,title,lid) Enrol(sid, cid, grade) Lecturer(lid,lname,deptname) • Query 1: Find the name of all students that have taken the course entitled ‘Expert Systems’. • Query 2: Find the titles of all courses that student ‘Mark Smith’ has done. • Query 3: Find the id of students that have enrolled in all the courses that lecturer with id. = ‘234’ has taught. • Query 4: Find the highest grade for ‘CS51T’.

  35. Relational Calculus • A relational calculus expression defines a new relation in terms of other relations. • A tuple variable ranges over a named relation. So, its values are tuples from that relation. • Example: • Get grades for CS51T • e(Enrol) {<e.grade>: e.cid = ‘CS51T’ }

  36. Basic Syntax for Relational Calculus Expressions • r(R),…,s(S) { <target> : predicate} • where • R,..,S are tables • r,..,s are tuple variables • target specifies the attributes of the resulting relation • predicate is a formula giving a condition that tuples must satisfy to qualify for the resulting relation.

  37. The Predicate • Predicate is constructed from • attribute names • constants • comparison operators       • logical connectives      • quantified tuple variables t(R), t(R)

  38. Examples of Relational Calculus • Example 2 • Get names and grades for students enrolled in CS51T • e(Enrol), s(Student) {<s.name, e.grade>: e.cid = ‘CS51T’  s.sid = e.sid} • In relation algebra Пcid, name( s CID =‘ CS51T’(Grade Student))

  39. Example 3 • Give the names of all students who got at least one A. • s(Student) {<s.name>: e(Enrol) (e.grade = ‘A’  s.sid = e.sid)} • Tuple variables not mentioned in the target list must be bound in the predicate.

  40. Example 4 • Get the names of all students who only got A’s • s(Student) {<s.name>: e(Enrol)( s.sid = e.sid  e.grade = ‘A’)  e2(Enrol) (s.sid = e2.sid)}

  41. Example 5 • Get the names of all students who got an A and a B • s(Student) {<s.name>: e(Enrol) (e.grade = ‘B’  s.sid = e.sid)  e2(Enrol) (e2.grade = ‘A’  s.sid = e2.sid)}

  42. Example 6 • Get the course titles and names for the courses for which the student did not get an A • c(Course), s(Student) {<s.name, c.title>: g(Enrol) s.sid = g.sid  g.cid = c.cid  g.grade  ‘A’}

More Related