310 likes | 324 Views
Learn about the basics of database management systems, including data models, schemas, and query languages. Understand why DBMS is essential for efficiently managing large amounts of data.
E N D
CS3431 –Database Systems I Introduction Instructor: Mohamed Eltabakh meltabakh@cs.wpi.edu
Today’s Lecture • Overview on Database Management Systems • Course Logistics
What is a Database System? • Software platform for managing large amounts of data • Managing means: • Storing, querying, indexing, and structuring the data • Different names refer to the same thing: • Database systems • Database management systems • DBMS
What is a Database System? (Cont’d) • What’s inside a DBMS • Collection of interrelated data (E.g., for a given application) • Set of programs to secure and access the data • An environment that is both convenientand efficientto use • Usually data is too large to fit in computer memory at once • Data stored on disk • Usually many users want to access this data and do so fast • Databases touch all aspects of our lives. We use it without knowing !!!
? Database Applications Have you ever used a database application? • E-commerce: books, equipment etc. at Amazon • Banks -- your valuable $$ and ATM transactions • Airlines – manage flights to get you places • Universities – manage student enrollment • GIS (Maps) – find restaurants closest to WPI • Bio-informatics (genome data) Data is everywhere. To efficiently manage it, we need DBMS
Why use DBMS, and not files? Several drawbacks of using file systems • Data redundancy and inconsistency • Multiple file formats, duplication of information in different files • Multiple records formats within the same file • No order enforced between fields • Difficulty in accessing data • Need to write a new program to carry out each new task • No indexes, always scan the entire file • Integrity problems • Modify one file (or field in a file), and not changing the dependent fields or files • Integrity constraints (e.g., account balance > 0) become “buried” in program code rather than being stated explicitly
Why use DBMS, and not files? (Cont’d) • Concurrent access by multiple users • Many users need to access/update the data at the same time (concurrent access) • Uncontrolled concurrent access can lead to inconsistencies • Example: Two people are updating the same bank account at the same time • Security problems • Hard to provide user access to some, but not all, data • Recovery from crashes • While updating the data the system crashes • Maintenance problems • Hard to search for or update a field • Hard to add new fields
DBMS Provides Solutions • Data consistency even with multiple users • Efficient access to the data • Data integrity embedded in the DBMS • Recovery from crashes, security
Basic Terminology • Data Model • Tools used for describing the data • Data Schema • Describes structures for a particular application, using the given model • Database • Collection of actual data that conforms to given schema • Database Management System (DBMS) • Software platform that allows us to create, stores, use, and maintain a database • Data Manipulation Language (DML) • Language to manipulate, e.g., update or query, the data
Data Model • A collection of tools for describing • Data • Data relationships • Data semantics • Data constraints • Several data models: • Relational model • Entity-Relationship (ER) data model • Object-based data models (Object-oriented) • Semi-structured data model (XML) • Other older models: • Network model • Hierarchical model We will learn these two models
Example: ER Model • Graphical model for describing entities, attributes, and relationships
Data Schema • Captures the relationships between objects (“entities”) in an application • Schemas can be represented graphically or textual
Query Language (SQL) • Language for accessing and manipulating the data organized by the appropriate data model • SQL: Structured Query Language SELECTID, Name FROM Student WHEREaddress=“320FL”;
Query Language • Two classes of languages • Procedural– user specifies what data is required and how to get those data • Declarative (non-procedural) – user specifies what data is required without specifying how to get those data • DBMSs use declarative language (SQL) SELECTID, Name FROM Student WHEREaddress=“320FL”;
You will Learn • Data Model • Relational Model • Entity-Relationship (ER) Model • Data Schema • How to put pieces together to build a schema describing the application • Database • Build an actual database and manipulate data • Database Management System (DBMS) • We will use Oracle • Query Language • SQL Language
Relational Data Model: Overview • The most widely used model today • It is a tabular representation of the data • Main concepts: • Relations (Tables),basically a table with rows and columns. • Every relation has a schema, which describes the columns, or fields. Field or attribute
Example Database : Relational Tabular View of Data in Airline System Flight Passenger Travel • Tabular view of data is called “Relational Model”
Entity-Relationship Model: Overview • Models the application as a collection of entities and relationships • Represented using Entity-Relationship Diagram (ERD)
SQL: Overview • SQL: Non-procedural language to access the data inside a database • External programs, e.g., in C or Java, typically access the database using: • Language extensions to allow embedded SQL • ODBC: Open Database Connectivity • JDBC: Java Database Connectivity
Logical vs. Physical How this information is stored???
Levels of Abstraction • View Level --describes how users see the data • Logical Level – describes the logical structures used • Relational Model • ERD model • Physical Level -- describes files and indexes Usually hidden from users
Levels of Abstraction: Airline Application Example • Logical (Conceptual) Level • Flight, Passenger, Travel tables • Physical Level • Flight table stored as a sorted file on the flight number • Index on flightNo attribute for Flight relation • View Level (External Schema) • NoOfPassengers (flightNo, date, numPassengers) • Hide employees salary These levels of abstraction lead to “Data Independence”
Data Independence • DBMS has the three levels of abstractions • Ability to modify one level without affecting the other levels • Physical data independence: • Physical schema such as indexes can change, but logical schema need not change • Protection from changes in physical structure of data • Logical data independence: • Logical schema can change, but views need not change • Protection from changes in logical structure of data
Other Advanced Aspects of DBMSs • Efficient access • Query optimization • Concurrency control • Recovery control >> We will not have time to study these subjects during the course >> It is important to know their existence and what is meant by each component
Efficient Access • Indexing • Indexes gives direct access to “necessary” portion of data, as opposed to sequential access in files Directly find this customer without scanning all customers
Query Optimization • Costing: • Estimate expected execution times • Query optimization : • Generates many alternatives to answer a query • Estimates the cost of each alternative • Automatically determine and prepare optimal (or near optimal) access plans for getting the data SELECTID, Name FROM Student WHEREaddress=“320FL”; Optimizer = “The Bread and Butter of a DBMS !”
Concurrency Control • DBMS ensures data is consistent under concurrent access • E.g.: multiple airline staff trying to reserve a seat for different customers • Concepts: • Transactions – grouping multiple instructions (reads/writes) into one atomic unit • Locks – locking of resources (tables)
Recovery Control • If system crashes in middle of transaction, recovery must be provided : • Cannot afford to loose data or leave it inconsistent • Concepts: • Logging of transactions’ actions • Ability to redo or undo transactions
Who uses databases? • End users • DB application programmers • Database Administrators • Database design • Security, Authorization • Data availability, crash recovery • Database tuning (for performance)
Summary : Why study DBMS? • Need to process large amounts of data efficiently • Video, WWW, computer games, geographic information systems (GIS), genome data, digital libraries, etc. • Make use of all functionalities provided by DBMSs • DB administrators and programmers hold rewarding jobs • DB research is one of the most exciting areas in Computer Science !!