320 likes | 428 Views
Database Introduction Chapter 1. CS 5800. Acknowledgements. These slides were written by Richard T. Snodgrass (University of Arizona), Curtis Dyreson (Utah State University) and Christian S. Jensen (Aalborg University).
E N D
Database IntroductionChapter 1 CS 5800
Acknowledgements • These slides were written by Richard T. Snodgrass (University of Arizona), Curtis Dyreson (Utah State University) and Christian S. Jensen (Aalborg University). • Kristian Torp (Aalborg University) converted the slides from Island Presents to Powerpoint. • Sabah Currim added some slides. Introduction
Outline • File-based approach vs. Database Approach • Time Line for Database Technologies • Architecture of Database Systems Introduction
Basic Definitions • Miniworld: Some part of the real world about which information is stored. Also called the Universe of Discourse (UoD). • Data: Known facts about the miniworld • recorded • have an implicit meaning • Information: data processed to be useful in decision making • Metadata: data that describes the properties or characteristics of other data, e.g. the header of a table Introduction
Example (From Modern Database Management) Information Data Metadata Identify data, information and metadata in video store example Introduction
File-based Systems - Review rented tape file inventory master file customer file movie name, upc, copy number, … customer name, movie name, copy number, due date, … customer name, phone, address, … Introduction
Maximal File-based System In how many places do a customer name, movie name and copy number appear in the system? tape rental check-in program new tape ordering program customer mailing program rented tape file inventory master file inventory master file rented tape file customer file customer file customer name, movie name, copy number, due date, movie name, upc, copy number, customer name, phone, address,... movie name, upc, copy number,… customer name, movie name, copy number, due date, customer name, phone, address, … Introduction
Minimal File-based System tape rental check-in program new tape ordering program customer mailing program In how many different places does a customer name, movie name and copy number appear in the system? rented tape file inventory master file customer file persistent storage customer name, movie name, copy number, due date, movie name, upc, copy number, customer name, phone, address,... Introduction
Complications for File-based Approach What were the seven complications? • Queries • Integrity • Update • Multiple users • Data physically separate • Security • Evolving needs Introduction
Limitations of File-based Systems • Program must implement • Security • Concurrency control • Support for schema reorganization • Observation • Many applications need these services. • Solution • Build and sell a software system to provide services! i.e. Database Management Systems Introduction
Equivalent Database System tape rental check-in program new tape ordering program customer mailing program In how many different places does a customer name, movie name and copy number appear in the system? DBMS rented inventory customer persistent storage Introduction
Characteristics of Database Approach • Self-describing nature • System catalog stores schema • Data abstraction • In O-O design method (interface) vs. operation (implementation) • In database design, data model vs. implementation • Multiple views • A (virtual) view is • result of a query • same data, rearranged differently • Sharing of data by multiple readers/writers • OLTP • concurrent transactions - need concurrency control Introduction
Pictorial Representation Users/Programs DATABASE SYSTEM Application Programs/Queries DBMS SOFTWARE Software to Process Queries/Programs Software to Access Stored Data Stored Database Definition (Meta-Data) Stored Database Introduction
Functions of a DBMS • Provides persistent, shared storage • Objects live beyond program execution • Shared by multiple applications • Multiple copies of data leads to • inconsistency • duplicated effort • Common backup and recovery • Integrity constraints • DBMS reduces redundancy in • Development and maintenance • Provides multiple interfaces • Query language, embedded query language, APIs, GUIs • Protects against • Software/hardware failure • Security breaches Introduction
When not to use a DBMS • Main costs of using a DBMS • High initial investment • May need additional hardware • Overhead • Training • When a DBMS may be unnecessary • Application is simple, well-defined, and not expected to change. • Stringent real-time requirements • (Write) access to data by multiple users is not needed Introduction
Classes of DB Users – Workers On the Scene • Persons whose job involves daily use of a large database. • Database administrators (DBAs) • Responsible for managing the database system. • Database designers • End users • The people that use the database for querying, updating, generating reports • Interactive users: Use full DBMS capabilities directly via a DML. • Parametric (or naive) end users: They use pre-programmed canned transactions to interact continuously with the database. For example, bank tellers or reservation clerks. • Application programmers • Design and implement canned transactions for parametric users. Introduction
DBA - Duties • Chooses • Information content of the database • Storage structure and access strategy • Performance-enhancing data structures • Acts as liaison with users • Security czar • Defines authorization checks and validation procedures • Responsible for • Backups and recovery • Monitoring performance • Updates (to schema) Introduction
Database People • People who design and develop the DBMS software • DBMS designers and implementers • Tool developers • Design and implement tools that facilitate the use of DBMS software. Tools include design tools, performance tools, special interfaces, etc. • Operators and maintenance personnel • Run and maintain the hardware and software environment for the database system. Introduction
Interfaces • Menu vs. form-based • GUI • Canned interfaces for parametric users • DBA • Application • Natural language • Web search engines • Shell Introduction
Input (Using a Form) Introduction
Loans for Ages 20–30 SID NAME AMOUNT 9735 Allen $1,200 $1,000 $2,000 TOTAL $3,200 8767 Cabeen $1,900 TOTAL $1,900 2368 Jones $0 TOTAL $0 3749 Watson $2,500 $3,000 TOTAL $5,500 page 1 Output (Using a Report) Introduction
Data Models • A data definition language (DDL) describes database schemas. • Data relationships • Data semantics • Integrity constraints • Database schemas vs. instances • Similar to types and variables in programming languages • A data manipulation language (DML) is used for querying and updating database instances. • A data model is a data definition language along with a data manipulation language. • Conceptual • Representational • Physical Introduction
Schema vs. Instance • Schema - Description of how data is organized and constrained. • Instance - The data in a database (conforms to a schema). • Snapshot - Database state at a particular point in time. • Initially empty • Database is populated or loaded • DBMS ensures every state is a valid state. • Schema evolution vs. data update Introduction
Outline • File-based Approach vs. Database Approach • Time Line for Database Technologies • Architecture of Database Systems Introduction
Time Line for Database Technologies (before 1950) Papyrus, Parchment Clay tablets 6000 years ago! Paper 1890, Punch cards Introduction
Evolution of Database Technologies (From Modern Database Management) Introduction
Types of data models • Record-based conceptual models • (1960s) hierarchical model (e.g., IBM IMS) • (1970s) network model (e.g., CA-IDMS) • (1980s - current) relational model (e.g., Oracle, Microsoft Access, Microsoft SQLServer, IBM DB2) • Object-based conceptual models • (late 1980s - current) entity-relationship model • (late 1980s - current) object-oriented model • object-relational • Text-based conceptual models • (late 1990s - current) XML Introduction
ANSI Three-Schema Architecture • Supports DBMS characteristics of: • Program-data independence. • Support of multiple views of the data. • Defines DBMS schemas at three levels: • Physical • How data is stored on disk • Data storage structures • Access paths to the data • Logical • How we think the data is organized • Conceptual structure • Integrity constraints • External (view) • What a user sees of the data • View is often limited by security Introduction
View Level Logical Level Physical Level The Three-Level Architecture Introduction
First Name Last Name Salary StaffID Name Birthdate staff_no fname lname dob branch_no salary Schema and Mappings external level conceptual level physical level Struct STAFF { int staff_no; int branch_no; char fname [15]; char lname [15]; struct date dob; float salary; struct STAFF *next; }; index staff_no; index branch_no; Introduction
Data Independence • Each level is “independent” in the sense that a completely different organization can be used. • Physical data independence - Physical level can change without having to change the logical level. • Logical data independence - Logical level can change without having to change the external level. Introduction
Special Kinds of Databases • Temporal databases • Special handling for time • Few commercial temporal databases • Spatial databases • Maps, cadastral applications • Many commercial products (GIS) • Text databases • Special text search capabilities • Library collections • Statistical databases • Census data • OLAP, data warehousing • Federated, heterogeneous, distributed databases Introduction