Explore the fundamental concepts of databases, their history, characteristics, applications, and the role of Database Management Systems (DBMS) in data handling. Learn about ACID properties, database people, and the significance of using a DBMS.
CS 430 Database Theory Winter 2005 Lecture 1: Introduction
What’s a Database • “Collection” of “related” “data” • Contains data about some aspect of the “real world” • Refers to a Universe of Discourse (UoD) • A “logically coherent” collection of data • Has a “specific purpose”
Typical Characteristics of Databases (1 of 3) • “Large” • Typically bigger than a spreadsheet • May be very large • Example: IRS tax return database • About 200M returns per year, 5-year retention • About 1K-10K bytes per return (guess) • About 1 – 10 terabytes (without overhead) • Shared • More than a single user and a single application
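The size estimate on this slide can be checked with a few lines of arithmetic. The sketch below uses only the slide's figures; the decimal definition of a terabyte (10^12 bytes) and the function name `estimate_terabytes` are assumptions made for illustration.

```python
# Back-of-the-envelope check of the IRS example, using the slide's figures.
RETURNS_PER_YEAR = 200_000_000   # about 200M returns per year
RETENTION_YEARS = 5              # 5-year retention
BYTES_PER_RETURN_LOW = 1_000     # about 1K bytes per return (the slide's guess)
BYTES_PER_RETURN_HIGH = 10_000   # about 10K bytes per return (the slide's guess)
TERABYTE = 10**12                # assumption: decimal terabyte


def estimate_terabytes(bytes_per_return: int) -> float:
    """Raw data size in terabytes, ignoring indexes and other overhead."""
    total_returns = RETURNS_PER_YEAR * RETENTION_YEARS
    return total_returns * bytes_per_return / TERABYTE


low_tb = estimate_terabytes(BYTES_PER_RETURN_LOW)
high_tb = estimate_terabytes(BYTES_PER_RETURN_HIGH)
print(f"{low_tb:g} to {high_tb:g} TB")  # prints "1 to 10 TB"
```

One billion retained returns at 1K to 10K bytes each gives the 1 to 10 terabyte range quoted on the slide.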
Typical Characteristics (2 of 3) • Structured • More than a simple flat table • Self describing • Contains Metadata (data about data) describing the data contained in the database • Metadata maintained separately from applications that use and manipulate the data • Has a Catalog which is a “database” of the Metadata
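As a rough illustration of "self describing", the sketch below asks a database about its own structure. It assumes SQLite through Python's standard `sqlite3` module rather than the MySQL used in the course (MySQL exposes comparable metadata through `INFORMATION_SCHEMA`); the `student` table is hypothetical.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for a real DBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# sqlite_master is SQLite's catalog: a table that describes the other tables.
catalog_rows = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE type = 'table'"
).fetchall()

# PRAGMA table_info returns one row per column: (cid, name, type, notnull, default, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]

print(catalog_rows)  # [('table', 'student')]
print(columns)       # [('id', 'INTEGER'), ('name', 'TEXT')]
```

The application never declared the column list to itself; it read the metadata from the catalog, which is maintained by the DBMS and not by the program.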
Typical Characteristics (3 of 3) • Supports multiple views of the data • Different users and applications can view the data differently • ACID properties • Atomicity – Atomic transactions (updates are all or nothing) • Consistency – Enforces integrity constraints • Isolation – Transactions are isolated from each other • Durability – Data from completed transactions is never lost
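Atomicity and consistency can be sketched with a small transfer between two accounts. This is a minimal example under stated assumptions: SQLite via Python's `sqlite3` module stands in for a full DBMS, and the `account` table, the names, and the `transfer` helper are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK clause is an integrity constraint: balances may never go negative.
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(src: str, dst: str, amount: int) -> bool:
    """Move amount from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances() -> dict:
    return dict(conn.execute("SELECT name, balance FROM account"))


first_ok = transfer("alice", "bob", 30)    # succeeds
after_first = balances()
second_ok = transfer("alice", "bob", 500)  # would overdraw alice, so it is rejected
after_second = balances()

print(first_ok, after_first)
print(second_ok, after_second)
```

In the rejected transfer the credit to `bob` has already run when the debit violates the constraint; the rollback undoes it, so the balances are the same before and after the failed call.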
A Little History of Databases (1 of 3) • Mid to late 1960s - first databases • Applications • Maintain parts data for Lunar Lander • Airline reservations • Multiple data models • Hierarchical, Network, Inverted File System • Early to mid 1970s - Relational data model • Edgar Codd – father of the relational database • Basis for SQL (Structured Query Language)
History (2 of 3) • 1979 – Oracle Version 2 • Initial version (marketing decision) • Incomplete and slow • Late 1980s – IBM DB2 Version 1 • Used to define the SQL standard • Late 1980s – Object Oriented databases • Created to manage data for “non-traditional” applications
History (3 of 3) • 1990s – Object Relational Databases • Pioneered by Michael Stonebraker • Today • Dominant technology: Relational DBMS (RDBMS) • Oracle, MS SQL Server, IBM DB2, … • MySQL, PostgreSQL, … • OO capabilities being added to RDBMS • New: Object-Relational Mapping Software • Try to handle “impedance mismatch” between RDBMS and OO programming languages
Database Applications (1 of 2) • Traditional • Business applications • Personnel, accounting, ... • Student and Course data • Traditional data types • Numbers, strings, dates • Data warehousing • Large “historical” databases for analytic support • Manufacturing Control • Real-time issues
Database Applications (2 of 2) • Non-traditional • Image and Video • GIS (Geographic Information Systems) • Engineering • CAD (Computer Aided Drafting or Design) • Time Series • Stock market data • Full text search • Environmental and Remote Sensing
Database Management System (DBMS) • Software that manages and/or facilitates • Data definition • E.g. creating and maintaining the catalog • Data construction • E.g. loading data into the database • Data manipulation • Applications retrieving and updating the database • Data sharing • ACID properties
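The first three DBMS functions listed above (definition, construction, manipulation) might look like this in practice. The sketch assumes SQLite through Python's `sqlite3` module; the `part` table and its rows are made-up illustration data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: declare the structure; the DBMS records it in its catalog.
conn.execute(
    "CREATE TABLE part ("
    " part_no TEXT PRIMARY KEY,"
    " description TEXT NOT NULL,"
    " quantity INTEGER NOT NULL)"
)

# Data construction: load the initial data (hypothetical rows).
initial_rows = [("P-100", "valve", 5), ("P-200", "bracket", 12), ("P-300", "gasket", 2)]
conn.executemany("INSERT INTO part VALUES (?, ?, ?)", initial_rows)
conn.commit()

# Data manipulation: an application updates and then retrieves data.
conn.execute("UPDATE part SET quantity = quantity - 1 WHERE part_no = ?", ("P-100",))
conn.commit()
low_stock = conn.execute(
    "SELECT part_no, quantity FROM part WHERE quantity < 5 ORDER BY part_no"
).fetchall()

print(low_stock)  # [('P-100', 4), ('P-300', 2)]
```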
DBMS In Context • [Figure: layered diagram with the labels Users/Programmers; Application Programs; External Queries; Database System; DBMS Software; Query Processing; Application Program Interface; Access/Update Stored Data; Metadata Catalog; The Data] • Source: Elmasri and Navathe, Figure 1.1, Page 6
Database People (Actors) (1 of 2) • Data Administrator • Responsible for correctness of the data • Database Administrator • Configure DBMS, manage data storage, DBMS performance tuning • Database Designer • Design the database • All three of these may be the same person or group of people
Database People (2 of 2) • Application Analysts and Developers • Responsible for analyzing, designing, building, and maintaining database applications • End Users • Use the database to accomplish useful work
Why use a DBMS? (1 of 2) • Manage redundancy • If the same data is stored in multiple places without periodic reconciliation, the copies will eventually become inconsistent • Access Control • Not all users can view and/or update all the data • Persistent storage of program data • Rather than having to implement your own DBMS internal to your application
Why a DBMS? (2 of 2) • Efficiency • DBMS vendors have done a lot of work to make their products work efficiently • Mixed blessing (see “Why not to use a DBMS?”) • Enforce integrity constraints • Defined and enforced once • Share data • Among multiple applications, GUIs, users • ACID Properties • Difficult to implement correctly
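"Defined and enforced once" can be sketched as follows: the constraints live in the table definitions, and every application that writes to the database is subject to them without repeating the checks. The example assumes SQLite via Python's `sqlite3` module (where foreign key enforcement has to be switched on per connection); the tables and the `try_enroll` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign key checks off by default

conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE enrollment ("
    " student_id INTEGER NOT NULL REFERENCES student(id),"
    " course TEXT NOT NULL,"
    " grade INTEGER CHECK (grade BETWEEN 0 AND 100))"
)
conn.execute("INSERT INTO student VALUES (1, 'Ada')")
conn.commit()


def try_enroll(student_id: int, course: str, grade: int) -> bool:
    """Attempt an insert; the DBMS, not this function, decides whether it is valid."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO enrollment VALUES (?, ?, ?)", (student_id, course, grade)
            )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_enroll(1, "CS 430", 95),   # valid row
    try_enroll(2, "CS 430", 80),   # no student with id 2: foreign key violation
    try_enroll(1, "CS 430", 150),  # grade out of range: CHECK violation
]
print(results)  # [True, False, False]
```

Note that `try_enroll` contains no validation logic of its own; a second application written against the same database would get the same protection for free.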
Why not to use a DBMS? • Learning curve • “It takes four years to learn to be an Oracle DBA” • Overhead costs (time and space) • Generality • Concurrency and transactions • Multiple application and user access • Complex data structures • Rule of thumb: Using an RDBMS doubles the space required for the data (e.g. versus a text file)
Course Administration • Course web site • http://faculty.cs.wwu.edu/reedyc/CS_430_Winter_2005 • Email: Chris.Reedy@wwu.edu • Textbook • Elmasri, Navathe, Fundamentals of Database Systems, Fourth Edition • Assignments • Use MySQL • Most convenient form of access? • Get hands dirty: • Design a database • Create database and load the data • Write a database application
Course Outline (1 of 2) • Introduction to Databases • Chapters 1 and 2 • Introduction to Data Modeling • Chapter 3 (partial) • Relational Data Model, Algebra, and Calculus • Chapters 5, 6 • Functional Dependencies and Normalization • Chapters 10 and 11 (partial)
Course Outline (2 of 2) • SQL Database Programming • Chapters 8 and 9 • Entity-Relationship Modeling • More of chapters 3, 4, and 7 • Overview: What’s inside a DBMS? • CS530, Chapters 13-19 • Overview of additional topics • Object-Oriented and Object Relational DBMSs • XML in Databases