2IJ60: Informatica 6 (Databases). Lecturer: dr. Natalia Sidorova (n.sidorova@tue.nl). 5 credits = 140 study-load hours (18 hours of colstructie, i.e. combined lecture and tutorial sessions, 3 hours of exam and 119 hours of self-study)
2IJ60: Informatica 6 (Databases) • Lecturer: dr. Natalia Sidorova (n.sidorova@tue.nl) • 5 credits = 140 study-load hours (18 hours of colstructie, 3 hours of exam and 119 hours of self-study) • Schedule: colstructie: Tue., hours 1-2, HG 5.95; exam: Wed. 15-03-2005, 9:00-12:00 and Mon. 08-05-2005, 9:00-12:00 • Information about the course is available at www.win.tue.nl/~sidorova/informatica6/
What will we learn? • Data modeling • How do I build a formal model of a complex system from the textual description of that system? • How do I translate the model into the table structure of a database? • Modeling tool: Entity-Relationship Diagrams • Queries • How do I write queries against the database? (i.e. how do I translate a question from Dutch into the query language) • How do I read queries written by other people? (i.e. how do I translate the query back into Dutch) • Query languages: tuple calculus, relational algebra, SQL • Relational-Database Design • decomposition into Boyce-Codd normal form and third normal form
Material to be covered • Study material: A. Silberschatz, H.F. Korth, S. Sudarshan, "Database System Concepts" (4th Edition), McGraw-Hill, 2002. • Chapters 1, 2, 3, 4 and 7 are covered during the course. PowerPoint presentations are available at www.win.tue.nl/~sidorova/informatica/ • Besides exercises from Silberschatz, additional exercises will be covered (see the website for more information). Being able to solve only exercises at the Silberschatz level is not sufficient to pass the exam.
Database Management System (DBMS) • Collection of interrelated data • Set of programs to access the data • DBMS contains information about a particular enterprise • DBMS provides an environment that is both convenient and efficient to use. • Database Applications: • Banking: all transactions • Airlines: reservations, schedules • Universities: registration, grades • Sales: customers, products, purchases • Manufacturing: production, inventory, orders, supply chain • Human resources: employee records, salaries, tax deductions • Databases touch all aspects of our lives
Database Users • Users are differentiated by the way they expect to interact with the system • Application programmers – interact with system through DML calls • Sophisticated users – form requests in a database query language • Specialized users – write specialized database applications that do not fit into the traditional data processing framework • Naïve users – invoke one of the permanent application programs that have been written previously • E.g. people accessing database over the web, bank tellers, clerical staff
Purpose of Database Systems • In the early days, database applications were built on top of file systems • Drawbacks of using file systems to store data: • Data redundancy and inconsistency • Multiple file formats, duplication of information in different files • Difficulty in accessing data • Need to write a new program to carry out each new task • Data isolation: multiple files and formats • Integrity problems • Integrity constraints (e.g. account balance > 0) become part of program code • Hard to add new constraints or change existing ones
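To make the integrity point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names and amounts are illustrative assumptions, not part of the course material. The constraint balance > 0 is declared once in the schema, so it does not have to be repeated in every program that touches the data.

```python
import sqlite3

def make_bank():
    """Create an in-memory database whose schema carries the constraint."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        " account_number TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance > 0))"  # declared once, here
    )
    return conn

def try_insert(conn, number, balance):
    """Return True if the row is accepted, False if the DBMS rejects it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute("INSERT INTO account VALUES (?, ?)", (number, balance))
        return True
    except sqlite3.IntegrityError:
        return False

conn = make_bank()
accepted = try_insert(conn, "A-101", 500)  # hypothetical amount, satisfies the check
rejected = try_insert(conn, "A-102", -50)  # hypothetical amount, violates the check
```

Changing the rule later means changing one CHECK clause rather than hunting through application code.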
Purpose of Database Systems (Cont.) • Drawbacks of using file systems (cont.) • Atomicity of updates • Failures may leave database in an inconsistent state with partial updates carried out • E.g. transfer of funds from one account to another should either complete or not happen at all • Concurrent access by multiple users • Concurrent access needed for performance • Uncontrolled concurrent accesses can lead to inconsistencies • E.g. two people reading a balance and updating it at the same time • Security problems • Database systems offer solutions to all the above problems
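The funds-transfer example can be sketched with a transaction in Python's sqlite3 module. This is only an illustration under assumed names and amounts: the account table, the balances and the rule balance >= 0 are hypothetical. The credit is applied first on purpose, so that a failing debit would leave a partial update behind if the two statements were not grouped into one transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT PRIMARY KEY,"
             " balance INTEGER NOT NULL CHECK (balance >= 0))")
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [("A-101", 500), ("A-215", 700)])  # hypothetical balances
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates happen or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("UPDATE account SET balance = balance + ? "
                         "WHERE account_number = ?", (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? "
                         "WHERE account_number = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    """Return the current balances as a dict keyed by account number."""
    return dict(conn.execute("SELECT account_number, balance FROM account"))

ok_first = transfer(conn, "A-101", "A-215", 200)    # enough funds
ok_second = transfer(conn, "A-101", "A-215", 1000)  # debit would go negative
final = balances(conn)
```

After the second call the credit to A-215 has been rolled back together with the failed debit, so the balances are the same as after the first transfer.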
Levels of Abstraction • Physical level: describes how a record (e.g., customer) is stored. • Logical level: describes data stored in database, and the relationships among the data. type customer = record name : string; street : string; city : string; end; • View level: application programs hide details of data types. Views can also hide information (e.g., salary) for security purposes.
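As a small illustration of the view level, the sketch below defines a view that omits a salary column. The employee table, its columns and its single row are assumptions made up for this example; only the idea of hiding salary comes from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Logical level: the full record, including the sensitive attribute.
conn.execute("CREATE TABLE employee "
             "(name TEXT, street TEXT, city TEXT, salary INTEGER)")
conn.execute("INSERT INTO employee VALUES ('Johnson', 'Alma', 'Palo Alto', 4000)")
# View level: the same data with the salary column left out.
conn.execute("CREATE VIEW employee_public AS "
             "SELECT name, street, city FROM employee")

cursor = conn.execute("SELECT * FROM employee_public")
visible_columns = [col[0] for col in cursor.description]
rows = cursor.fetchall()
```

SQLite itself has no user permissions, so here the view only shapes what a program sees; in a multi-user DBMS the view would be combined with access rights on the underlying table.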
Instances and Schemas • Schema – the logical structure of the database • (e.g., the database consists of information about a set of customers and accounts and the relationship between them) • Physical schema: database design at the physical level • Logical schema: database design at the logical level • Instance – the actual content of the database at a particular point in time • Analogous to the value of a variable
Data Models • A collection of tools for describing • data • data relationships • data semantics • data constraints • Entity-Relationship model • Relational model • Other models: • object-oriented model • semi-structured data models • Older models: network model and hierarchical model
Entity-Relationship Model • [Figure: example of a schema in the entity-relationship model]
Entity Relationship Model (Cont.) • E-R model of real world • Entities (objects) • E.g. customers, accounts, bank branch • Relationships between entities • E.g. Account A-101 is held by customer Johnson • Relationship set depositor associates customers with accounts • Widely used for database design • Database design in E-R model usually converted to design in the relational model (coming up next) which is used for storage and processing
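The conversion from an E-R design to relational tables can be sketched as follows: one table per entity set and one table for the relationship set depositor, which refers to both. The column names with underscores and the balances are assumptions; the customer and account identifiers follow the bank example used in these slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks references only when enabled
conn.executescript("""
CREATE TABLE customer (customer_id TEXT PRIMARY KEY, customer_name TEXT,
                       customer_street TEXT, customer_city TEXT);
CREATE TABLE account (account_number TEXT PRIMARY KEY, balance INTEGER);
-- the relationship set becomes a table of (customer, account) pairs
CREATE TABLE depositor (
    customer_id TEXT REFERENCES customer(customer_id),
    account_number TEXT REFERENCES account(account_number),
    PRIMARY KEY (customer_id, account_number));
""")
conn.executemany("INSERT INTO customer VALUES (?, ?, ?, ?)", [
    ("192-83-7465", "Johnson", "Alma", "Palo Alto"),
    ("019-28-3746", "Smith", "North", "Rye"),
    ("321-12-3123", "Jones", "Main", "Harrison"),
])
conn.executemany("INSERT INTO account VALUES (?, ?)", [  # balances are hypothetical
    ("A-101", 500), ("A-215", 700), ("A-201", 900), ("A-217", 750),
])
conn.executemany("INSERT INTO depositor VALUES (?, ?)", [
    ("192-83-7465", "A-101"), ("019-28-3746", "A-215"),
    ("192-83-7465", "A-201"), ("321-12-3123", "A-217"),
    ("019-28-3746", "A-201"),
])
conn.commit()

def accounts_of(conn, customer_name):
    """Follow the depositor relationship from a customer name to account numbers."""
    query = ("SELECT d.account_number FROM customer c "
             "JOIN depositor d ON d.customer_id = c.customer_id "
             "WHERE c.customer_name = ? ORDER BY d.account_number")
    return [row[0] for row in conn.execute(query, (customer_name,))]

johnson_accounts = accounts_of(conn, "Johnson")
```

Because depositor is its own table, an account such as A-201 can be shared by two customers without repeating any customer or account data.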
Relational Model • Example of tabular data in the relational model (the column headings are the attributes)
customer-id    customer-name    customer-street    customer-city    account-number
192-83-7465    Johnson          Alma               Palo Alto        A-101
019-28-3746    Smith            North              Rye              A-215
192-83-7465    Johnson          Alma               Palo Alto        A-201
321-12-3123    Jones            Main               Harrison         A-217
019-28-3746    Smith            North              Rye              A-201
Tuple Relational Calculus • A nonprocedural query language, where each query is of the form {t | P(t)} • It is the set of all tuples t such that predicate P is true for t • t is a tuple variable; t[A] denotes the value of tuple t on attribute A • t ∈ r denotes that tuple t is in relation r • Find the loan number for each loan of an amount greater than $1200: {t | ∃ s ∈ loan (t[loan-number] = s[loan-number] ∧ s[amount] > 1200)}
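A tuple calculus query reads much like a set comprehension, which the sketch below uses to mirror the loan query. The loan rows and the branch_name attribute are hypothetical sample data, chosen only so that some loans fall on each side of the 1200 threshold.

```python
from typing import NamedTuple

class Loan(NamedTuple):
    loan_number: str
    branch_name: str
    amount: int

# The relation "loan" as a set of tuples (hypothetical rows).
loan = {
    Loan("L-11", "Round Hill", 900),
    Loan("L-15", "Perryridge", 1500),
    Loan("L-16", "Perryridge", 1300),
}

# Result tuples t have one attribute, loan-number. A tuple t is in the answer
# if there exists some s in loan with the same loan number and amount > 1200.
result = {(s.loan_number,) for s in loan if s.amount > 1200}
```

The comprehension states which tuples belong to the answer without prescribing an evaluation order, which is the sense in which the calculus is nonprocedural.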
Relational Algebra • Procedural language • Six basic operators • select • project • union • set difference • Cartesian product • rename • Each operator takes one or two relations as input and gives a new relation as a result. • Find the loan number for each loan of an amount greater than $1200: Π_{loan-number}(σ_{amount > 1200}(loan))
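The select and project operators can be sketched as two small Python functions over a list of rows, composed in the same order as the algebra expression for the loan query. The function names and the sample rows are assumptions for illustration only.

```python
def select(predicate, relation):
    """Select: keep the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

def project(attributes, relation):
    """Project: keep only the named attributes and drop duplicate rows."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result

# Hypothetical contents of the loan relation.
loan = [
    {"loan-number": "L-11", "branch-name": "Round Hill", "amount": 900},
    {"loan-number": "L-15", "branch-name": "Perryridge", "amount": 1500},
    {"loan-number": "L-16", "branch-name": "Perryridge", "amount": 1300},
]

# First select the qualifying loans, then project onto loan-number.
answer = project(["loan-number"], select(lambda r: r["amount"] > 1200, loan))
```

Each function takes a relation and returns a relation, so the calls nest exactly like the operators, and the nesting fixes the order of evaluation. That is what makes the algebra procedural.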
SQL • SQL: widely used non-procedural language • E.g. find the name of the customer with customer-id 192-83-7465: select customer.customer-name from customer where customer.customer-id = '192-83-7465' • E.g. find the balances of all accounts held by the customer with customer-id 192-83-7465: select account.balance from depositor, account where depositor.customer-id = '192-83-7465' and depositor.account-number = account.account-number • Application programs generally access databases through one of • Language extensions to allow embedded SQL • Application program interfaces that allow SQL queries to be sent to a database
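The second access route, an application program interface, can be sketched with Python's built-in sqlite3 module, which sends the two example queries to a database as strings. Two things here are assumptions: identifiers use underscores because most SQL dialects do not accept hyphens in unquoted names, and the balances are hypothetical. The customer and depositor rows follow the bank example used in these slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id TEXT, customer_name TEXT);
CREATE TABLE account (account_number TEXT, balance INTEGER);
CREATE TABLE depositor (customer_id TEXT, account_number TEXT);
""")
conn.executemany("INSERT INTO customer VALUES (?, ?)", [
    ("192-83-7465", "Johnson"), ("019-28-3746", "Smith"), ("321-12-3123", "Jones"),
])
conn.executemany("INSERT INTO account VALUES (?, ?)", [  # hypothetical balances
    ("A-101", 500), ("A-215", 700), ("A-201", 900), ("A-217", 750),
])
conn.executemany("INSERT INTO depositor VALUES (?, ?)", [
    ("192-83-7465", "A-101"), ("019-28-3746", "A-215"),
    ("192-83-7465", "A-201"), ("321-12-3123", "A-217"),
    ("019-28-3746", "A-201"),
])

# Query 1: the name of the customer with a given customer id.
names = conn.execute(
    "SELECT customer.customer_name FROM customer "
    "WHERE customer.customer_id = ?", ("192-83-7465",)
).fetchall()

# Query 2: the balances of all accounts held by that customer.
held_balances = sorted(row[0] for row in conn.execute(
    "SELECT account.balance FROM depositor, account "
    "WHERE depositor.customer_id = ? "
    "AND depositor.account_number = account.account_number", ("192-83-7465",)
))
```

The ? placeholders pass the customer id as a parameter instead of pasting it into the query text, which is the usual way to send values through such an interface.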
Normalization • Goals: • Decide whether a particular relation R (corresponding to a table in the database) is in “good” form. • In the case that a relation R is not in “good” form, decompose it into a set of relations {R1, R2, ..., Rn} such that • each relation is in good form • no loss of information occurs because of the decomposition • Our theory is based on functional dependencies
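Both goals can be illustrated on the five-row example relation from the relational-model slide. The sketch below checks whether a functional dependency holds in that instance, then decomposes the relation into two projections and joins them back to confirm that no information was lost. The helper names (holds, project, natural_join) and the underscore attribute names are assumptions made for this example.

```python
ATTRS = ("customer_id", "customer_name", "customer_street",
         "customer_city", "account_number")
ROWS = [
    ("192-83-7465", "Johnson", "Alma", "Palo Alto", "A-101"),
    ("019-28-3746", "Smith", "North", "Rye", "A-215"),
    ("192-83-7465", "Johnson", "Alma", "Palo Alto", "A-201"),
    ("321-12-3123", "Jones", "Main", "Harrison", "A-217"),
    ("019-28-3746", "Smith", "North", "Rye", "A-201"),
]
relation = [dict(zip(ATTRS, row)) for row in ROWS]

def holds(relation, lhs, rhs):
    """True if the functional dependency lhs -> rhs holds in this instance."""
    seen = {}
    for row in relation:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False  # same lhs value, different rhs value
    return True

def project(relation, attrs):
    """Projection as a set of tuples, so duplicates disappear."""
    return {tuple(row[a] for a in attrs) for row in relation}

def natural_join(r1, attrs1, r2, attrs2):
    """Join two sets of tuples on the attributes they have in common."""
    common = [a for a in attrs1 if a in attrs2]
    out_attrs = list(attrs1) + [a for a in attrs2 if a not in attrs1]
    joined = set()
    for t1 in r1:
        d1 = dict(zip(attrs1, t1))
        for t2 in r2:
            d2 = dict(zip(attrs2, t2))
            if all(d1[a] == d2[a] for a in common):
                merged = {**d1, **d2}
                joined.add(tuple(merged[a] for a in out_attrs))
    return out_attrs, joined

# customer_id determines the customer's details, but not the account number.
fd_details = holds(relation, ["customer_id"],
                   ["customer_name", "customer_street", "customer_city"])
fd_account = holds(relation, ["customer_id"], ["account_number"])

# Decompose along that dependency and check that the join restores the original.
R1 = ("customer_id", "customer_name", "customer_street", "customer_city")
R2 = ("customer_id", "account_number")
r1, r2 = project(relation, R1), project(relation, R2)
out_attrs, rejoined = natural_join(r1, R1, r2, R2)
lossless = (tuple(out_attrs) == ATTRS and rejoined == set(ROWS))
```

A dependency that holds in one instance is only evidence, not proof, that it is a constraint of the schema; whether customer_id really determines the address is a design decision about the enterprise being modeled.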