210 likes | 220 Views
Chapter 1: Introduction. Database Management System (DBMS). DBMS contains information about a particular enterprise Collection of interrelated data Set of programs to access the data An environment that is both convenient and efficient to use Database Applications:
E N D
Database Management System (DBMS) DBMS contains information about a particular enterprise Collection of interrelated data Set of programs to access the data An environment that is both convenient and efficient to use Database Applications: Banking: transactions Airlines: reservations, schedules Universities: registration, grades Sales: customers, products, purchases Online retailers: order tracking, customized recommendations Manufacturing: production, inventory, orders, supply chain Human resources: employee records, salaries, tax deductions Databases can be very large. Databases touch all aspects of our lives
University Database Example Application program examples Add new students, instructors, and courses Register students for courses, and generate class rosters Assign grades to students, compute grade point averages (GPA) and generate transcripts In the early days, database applications were built directly on top of file systems
Drawbacks of using file systems to store data Data redundancy and inconsistency Multiple file formats, duplication of information in different files Difficulty in accessing data Need to write a new program to carry out each new task Data isolation — multiple files and formats Integrity problems Integrity constraints (e.g., account balance > 0) become “buried” in program code rather than being stated explicitly Hard to add new constraints or change existing ones
Drawbacks of using file systems to store data (Cont.) Atomicity of updates Failures may leave database in an inconsistent state with partial updates carried out Example: Transfer of funds from one account to another should either complete or not happen at all Concurrent access by multiple users Concurrent access needed for performance Uncontrolled concurrent accesses can lead to inconsistencies Example: Two people reading a balance (say 100) and updating it by withdrawing money (say 50 each) at the same time Security problems Hard to provide user access to some, but not all, data Database systems offer solutions to all the above problems
Data Models A collection of tools for describing Data Data relationships Data constraints A data model is not just a way of structuring data: it also defines a set of operations that can be performed on the data A database model is the theoretical foundation of a database fundamentally determines in which manner data can be stored Organized Manipulated It defines the infrastructure offered by a particular database system. The most popular example of a database model is the relational model
Data Models Example of database model Relational model Object-based data models (Object-oriented and Object-relational) Other older models: Network model Hierarchical model
Relational Model collection of data items organized as a set of formally-described tables data can be accessed or reassembled in many different ways without having to reorganize the database tables was invented by E. F. Codd at IBM in 1970 store data in 2 dimensional: rows & columns
A Sample Relational Database Fields / attributes Records / tuples
Object Oriented Database model • is a database management system (DBMS) that supports the modelling and creation of data as objects • Object Oriented Databases (ODBMS) store data together with the appropriate methods for accessing it i.e. encapsulation • An extension of RDBMS • Best used when there is complex data and/or complex data relationships
Relational database of a cat Object-oriented database of a cat
Object-Relational Data Models Relational model: flat, “atomic” values Object Relational Data Models Extend the relational data model by including object orientation and constructs to deal with added data types. Allow attributes of tuples to have complex types, including non-atomic values such as nested relations. Preserve relational foundations, in particular the declarative access to data, while extending modeling power. Provide upward compatibility with existing relational languages.
Network Database Model • a flexible way of representing objects and their relationships • Some data were more naturally modeled with more than one parent per child. • hence, the network model permitted the modeling of many-to-many relationships in data
Hierarchical data model • The hierarchical data model organizes data in a tree structure. • The structure allows representing information using parent/child relationships: • Each parent can have many children but each child only has one parent (also known as a 1:many ratio )
Database Design The process of designing the general structure of the database: Logical Design – Deciding on the database schema. Database design requires that we find a “good” collection of relation schemas. Business decision – What attributes should we record in the database? Computer Science decision – What relation schemas should we have and how should the attributes be distributed among the various relation schemas? Physical Design – Deciding on the physical layout of the database
Database Design? Is there any problem with this design?
Terminologies • A simple key contains a single attribute. • A composite key is a key that contains more than one attribute. • A candidate key is an attribute (or set of attributes) that uniquely identifies a row. A candidate key must possess the following properties: • Unique identification - For every row the value of the key must uniquely identify that row. • Non redundancy - No attribute in the key can be discarded without destroying the property of unique identification. • A primary key is the candidate key which is selected as the principal unique identifier. Every relation must contain a primary key. The primary key is usually the key selected to identify a row when the database is physically implemented. For example, a part number is selected instead of a part description.
Terminologies • A superkey is any set of attributes that uniquely identifies a row. A superkey differs from a candidate key in that it does not require the non redundancy property. • A foreign key is an attribute (or set of attributes) that appears (usually) as a non key attribute in one relation and as a primary key attribute in another relation. I say usuallybecause it is possible for a foreign key to also be the whole or part of a primary key: • A many-to-many relationship can only be implemented by introducing an intersection or link table which then becomes the child in two one-to-many relationships. The intersection table therefore has a foreign key for each of its parents, and its primary key is a composite of both foreign keys. • A one-to-one relationship requires that the child table has no more than one occurrence for each parent, which can only be enforced by letting the foreign key also serve as the primary key.
History of Database Systems 1950s and early 1960s: Data processing using magnetic tapes for storage Tapes provided only sequential access Punched cards for input Late 1960s and 1970s: Hard disks allowed direct access to data Network and hierarchical data models in widespread use Ted Codd defines the relational data model Would win the ACM Turing Award for this work IBM Research begins System R prototype UC Berkeley begins Ingres prototype High-performance (for the era) transaction processing
History (cont.) 1980s: Research relational prototypes evolve into commercial systems SQL becomes industrial standard Parallel and distributed database systems Object-oriented database systems 1990s: Large decision support and data-mining applications Large multi-terabyte data warehouses Emergence of Web commerce Early 2000s: XML and XQuery standards Automated database administration Later 2000s: Giant data storage systems Google BigTable, Yahoo PNuts, Amazon, ..