Today’s Class

IS C332: Database Systems and Applications Today’s Class Data Models Relational Model

Importance of Data Models • Data models • Representations, usually graphical, of complex real-world data structures • Facilitate interaction among the designer, the applications programmer and the end user • End-users have different views and needs for data • Data model organizes data for various users • Good database design uses an appropriate data model as its foundation

Data Model : Basic Building Blocks • Entity is anything about which data are to be collected and stored • Attribute is a characteristic of an entity • Relationship describes an association among (two or more) entities • One-to-many (1:M) relationship • Many-to-many (M:N or M:M) relationship • One-to-one (1:1) relationship

Business Rules • Brief, precise, and unambiguous description of a policy, procedure, or principle within a specific organization’s environment • Apply to any organization that stores and uses data to generate information • Description of operations that help to create and enforce actions within that organization’s environment

Business Rules • Must be rendered in writing • Must be kept up to date • Sometimes are external to the organization • Must be easy to understand and widely disseminated • Describe characteristics of the data as viewed by the company

Sources of Business Rules • Company managers • Policy makers • Department managers • Written documentation • Procedures • Standards • Operations manuals • Direct interviews with end users

History of Data Models • Relational Model: proposed in 1970 by E.F. Codd (IBM), first commercial system in 1981-82. Now in several commercial products (DB2, ORACLE, SQL Server, SYBASE, INFORMIX). • Network Model: the first one to be implemented by Honeywell in 1964-65 (IDS System). Adopted heavily due to the support by CODASYL (CODASYL - DBTG report of 1971). Later implemented in a large variety of systems - IDMS (Cullinet - now CA), DMS 1100 (Unisys), IMAGE (H.P.), VAX -DBMS (Digital Equipment Corp.). • Hierarchical Data Model: implemented in a joint effort by IBM and North American Rockwell around 1965. Resulted in the IMS family of systems. The most popular model of its time. Other system based on this model: System 2k (SAS inc.)

History of Data Models • Object-oriented Data Model(s): several models have been proposed for implementing in a database system. One set comprises models of persistent O-O Programming Languages such as C++ (e.g., in OBJECTSTORE or VERSANT), and Smalltalk (e.g., in GEMSTONE). Additionally, systems like O2, ORION (at MCC - then ITASCA), IRIS (at H.P.- used in Open OODB). • Object-Relational Models: Most Recent Trend. Started with Informix Universal Server. Exemplified in the latest versions of Oracle-10i, DB2, and SQL Server etc. systems.

Hierarchical Database Model • Logically represented by an upside down tree • Each parent can have many children • Each child has only one parent

Hierarchical Database Model • Advantages • Conceptual simplicity • Database security and integrity • Data independence • Efficiency • Disadvantages • Complex implementation • Difficult to manage and lack of standards • Lacks structural independence • Applications programming and use complexity • Implementation limitations

Hierarchical Data Model • ADVANTAGES: • Hierarchical Model is simple to construct and operate on • Corresponds to a number of natural hierarchically organized domains - e.g., assemblies in manufacturing, personnel organization in companies • Language is simple; uses constructs like GET, GET UNIQUE, GET NEXT, GET NEXT WITHIN PARENT etc. • DISADVANTAGES: • Navigational and procedural nature of processing • Database is visualized as a linear arrangement of records • Little scope for "query optimization"

Network Database Model • Each record can have multiple parents • Composed of sets • Each set has owner record and member record • Member may have several owners

Network Database Model • Advantages • Conceptual simplicity • Handles more relationship types • Data access flexibility • Promotes database integrity • Data independence • Conformance to standards • Disadvantages • System complexity • Lack of structural independence

Network Data Model • ADVANTAGES: • Network Model is able to model complex relationships and represents semantics of add/delete on the relationships. • Can handle most situations for modeling using record types and relationship types. • Language is navigational; uses constructs like FIND, FIND member, FIND owner, FIND NEXT within set, GET etc. Programmers can do optimal navigation through the database. • DISADVANTAGES: • Navigational and procedural nature of processing • Database contains a complex array of pointers that thread through a set of records. • Little scope for automated "query optimization”

The Relational Model

Introduction • Proposed by Edgar. F. Codd(1923-2003) in the early seventies. [ Turing Award –1981 ] • Most of the modern DBMS are relational. • Simple and elegant model with a mathematical basis. • Led to the development of a theory of data dependencies and database design. • Relational algebra operations • crucial role in query optimization and execution. • Laid the foundation for the development of • Tuple relational calculus and then • Database standard SQL

Basic Concepts • Entities and relationships are stored in tables • Relationships are captured by including key of one table into another • Languages for manipulating the tables • All popular DBMSs today are based on relational data model (or an extension of it, e.g., object-relational data model)

Why is it so good? • Simplicity, everybody knows how to manipulate tables • Tables are simple enough so that solutions to complicated problems such as concurrency controland query optimization can be obtained • It has a theoretical basis for the studying of database design problems • Tables are logical concepts; physically tables can be stored in different ways  support data independence

Terminology • Relation table; denoted by R(A1, A2, ..., An) where R is a relation name and (A1, A2, ..., An) is the relation schema of R • Attribute  column; denoted by Ai • Tuple  row • Attribute value  value stored in a table cell • Domain  legal type and range of values of an attributedenoted by dom(Ai) • Attribute: Age Domain: [0-100] • Attribute: EmpName Domain: 50 alphabetic chars • Attribute: Salary Domain: non-negative integer • Ideally, a domain can be defined in terms of another domain; e.g., the domain of EmpName is PersonName. This is NOT allowed in most basic DBMSs. • However, most recent DBMSs allows this (object-relational) extension such as Oracle 10g.

Relational Database: Definitions • Relational database:a set of relations • Relation: made up of 2 parts: • Instance : a table, with rows and columns. #Rows = cardinality, #fields = degree / arity. • Schema :specifiesname of relation, plus name and type of each column. • e.g. Students(sid: string, name: string, login: string,age: integer, gpa: real). • Can think of a relation as a setof rows or tuples (i.e., all rows are distinct).

An Example Relation Relation Name/Table Name Attributes/Columns (collectively as a schema) Tuples/Rows • Cardinality = 5, degree = 4, all rows distinct

Another Relation Example enrollment (studentName, rollNumber, courseNo, sectionNo) enrollment

Some Formal Definitions • A relation is denoted by: R(A1, A2, ..., An) • STUDENT(Name, Student-id, Age, CGA) • Degree of a relation: the number of attributes n in the relation. • Tuple t of R(A1, A2, ..., An): An ordered set of values <v1,v2,...,vn> where each vi is an element of dom(Ai). • Relation instance r(R): A set of tuples in R r(R) = {t1, t2, ..., tm}, or alternatively r(R)  dom(A1)  dom(A2)  ...  dom(An)

Relation and Cartesian Product • A relation is any subset of the Cartesian product of domains of values • Example:Let Dom(Name) = { Lee, Cheung } Dom(Grade) = { A, B, C } Then the Cartesian product of the domains is Dom(Name)  Dom(Grade) = { Lee, A, Lee, B , Lee, C, Cheung, A, Cheung, B, Cheung C } • A relation StudentGrade (Name, Grade) can be defined as any subsetof the Cartesian product Dom(Name)  Dom(Grade) r(StudentGrade) = { Lee, A, Cheung C }  Dom(Name)  Dom(Grade)

Characteristics of Relations • Tuples in a relation are not considered to be ordered, even though they appear to be in a tabular form. (Recall that a relation is a set of tuples.) • Ordering of attributes in a relation schema R are significant. • Values in a tuple: All values are considered atomic. (Recall that a domain is a set of atomic values.) A special null value is used to represent values that are unknown or inapplicable to certain tuples.

Keys • Let K R (I.e., K is a set of attributes which is a subset of the schema of R) • K is a superkey of R if K can identify a unique tuple in a given relationr(R) Customer(CusNo, Name, Address, …)where customers have unique customer numbers and unique names.Possible superkeys: CusNo {CusNo, Name} {CusNo, Name, Address} plus many others • K is a candidate key if K is minimal There are two candidate keys: CusNo and Name • Every relation is guarantee to (must) have at least one key. Why?

Key(Candidate key) • A key can not be determined from any particular instance data • it is an intrinsic property of a scheme • it can only be determined from the meaning of attributes • A relation can have more than one key. • Superkey: A set of attributes that contains any key as a subset. • A key can also be defined as a minimal superkey • Primary Key: One of the candidate keys chosen for indexing purposes ( More details later…)

Relational Model • Sets • collections of items of the same type • no order • no duplicates • Mappings domain range 1:many many:1 1:1 many:many

Exercise • What are the mapping cardinalities of the following 4 relationships? B C D A

Schema Diagram for University Database

Today’s Class

Today’s Class

Presentation Transcript

MILITARY SYMBOLOGY

HP BladeSystem Virtual Connect

Phylum Mollusca

FOUNDATIONS OF ENTREPRENEURSHIP

WHAT DAY IS IT TODAY?

COMPLEX SENTENCE

SLE

THE PURPOSE FOR THIS CLASS

Hancock

Outline for today

Good morning,class!

CS 468: Advanced UNIX Class 6

CLASS OF 2016CLASS OF 2016*** WELCOME! PARENTS AND GUARDIANS OF CURIE'S FRESHMEN

Class 31

class notes; 4.26.11. Work and Horsepower NO Quiz TODAY

Analog Electronics Class Session 1: RC and RL Circuits (Into to Tina Spice)

CLASSES

Welcome to Class!

Today’s Class

Today’s Class

Presentation Transcript

MILITARY SYMBOLOGY

HP BladeSystem Virtual Connect

Phylum Mollusca

FOUNDATIONS OF ENTREPRENEURSHIP

WHAT DAY IS IT TODAY?

COMPLEX SENTENCE

SLE

THE PURPOSE FOR THIS CLASS

Hancock

Outline for today

Good morning,class!

CS 468: Advanced UNIX Class 6

***CLASS OF 2016***CLASS OF 2016*** WELCOME! PARENTS AND GUARDIANS OF CURIE'S FRESHMEN

Class 31

class notes; 4.26.11. Work and Horsepower NO Quiz TODAY

Analog Electronics Class Session 1: RC and RL Circuits (Into to Tina Spice)

CLASSES

Welcome to Class!

CLASS OF 2016CLASS OF 2016*** WELCOME! PARENTS AND GUARDIANS OF CURIE'S FRESHMEN