250 likes | 464 Views
ITMA 322 : Database Management Systems. Akram Jalal Ahlia University MIS College of Business and Finance. COURSE IDENTIFICATION LECTURES : M 16:30 – 18:10 - Class 6. Lab : W 16:30 – 18:10 - Lab 1. INSTRUCTOR : Dr. Akram Jalal Office : Email : akarim@ahlia.edu.bh Phone :
E N D
ITMA 322: Database Management Systems Akram Jalal Ahlia University MIS College of Business and Finance
COURSE IDENTIFICATION • LECTURES: M 16:30 – 18:10 - Class 6. • Lab: W 16:30 – 18:10 - Lab 1. • INSTRUCTOR: Dr. Akram Jalal • Office: • Email: akarim@ahlia.edu.bh • Phone: • Office Hours: Tuesdays, 10am to 11:30am Walk-ins are welcome any time based on availability Dr. Akram Jalal
COURSE IDENTIFICATION (cont) ASSIGNMENTS: • Submissions will be in class. Late submissions will be penalized by at the rate of 10% per day. Assignments later than three days will not be accepted. • Assignment should be written using computer tool. Hand writing home works are not accepted. • All students should achieve their own work according to the Ahlia University rules, which means you must write up your favourite solution independently. Exams: • The midterm exam will be given in-class on ----. This exam will covers material and chapters (1, 2 and 5). • The date of final exam is based on the final exam timetable.
GRADING: Attendance: 10% Assignment: 20%, Midterm: 30%, Final: 40%. Consistent class attendance is required absences will seriously affect your total grade.
Textbook:Database Management Systems • by RaghuRamakrishnan & Johannes Gehrke • 3rd edition • Available: Next week in the library • The readings of this book are required, even if material is not covered in the lectures. All other readings are optional.
What Is a Database? • A large, integrated collection of data, typically describing the activities of one or more organizations. • Models a real-world enterprise, e.g. a university database might contain information about: • Entities such as: • students, courses, instructors, • Relationships • One to One • One to Many • Many to Many • A Database Management System (DBMS)is a software designed to assist in maintaining and utilizing large collection of data. • DBMS also needed to store the data in files and write application-specific code to manage it.
Why Use a DBMS? • Data independence. DBMS provides an abstract view of the data that hides data representation and storage. When the DBMS hides certain details of how data is stored and maintained, it provides what is called as the abstract view of data. Complexity (of data and data structure) is hidden from users through several levels of abstraction. • Efficient data access. DBMS utilizes a variety of sophisticated techniques to store and retrievedata efficiently. • Data integrity and security. If data is always accessed through DBMS, the DBMS can enforce integrity constrains. E.g.1. Before inserting salary, DBMS should check that department budget is not exceeded. E.g.2 access control. • Uniform data administration. When several users share the data, centralizing the administration of data can offer the significant improvement. • Concurrent access, recovery from crashes. DBMS schedules concurrent access to the data in such a manner that users can think of the data as being accessed by only one user at a time. • Reduced application development time: DBMS supports important functions that are common to many applications accessing data in DBMS
Problems without a DBMS... 1. Large data sets (100s of GBs, or TBs, …) • No indices • Finding “George” in huge flat file is expensive • Modifications intractable without better data structures • “George” “Georgie” is very expensive • Deletions are very expensive
Problems without an DBMS... 4. Security? • File system may lack security features • File system security may be rough 5. Application programming interface (API)? • Interfaces, interoperability 6. How to query the data?
Data Models • A data modelis a collection of high level data description constructs. Most of DBMS today are based on relational data model. • Semantic data model (such as E-R diagram) is more abstract, highlevel data model and served as starting point and subsequently translated into a database design. E-R diagram allows us to periodically denote entities and the relationships among them. • Aschemais a description of a particular collection of data, using the a given data model. e.g. Student (Sid: string, Name: String, age: Integer) • The relationaldata modelis the most widely used model today (which is used in numerous systems such as Oracle, DB2, MS Access) is based on relation (which described as a set of records) , basically a table with rows and columns. • Every relation has a schema, which describes the fields (columns, or attributes). • Other data models are: Hierarchical model, the network model, object-oriented model and object-relational model.
Each external schema consists of one or more relations and view Levels of Abstraction External schema –View1 External schema – view 2 External schema - view3 User 1 User 2 User 3 User 4 Conceptual Schema (one per each DB) • Data in DBMS is described at three levels of abstractions. • Many External echemas, single conceptual (logical) schemaand physical schema. • Views describe how users see the data. • Conceptual schema defines logical structure. In RDBMS CS describes all relations that are stored in the database. Student(sid: string, Name: String, age: Integer) Faculty ( ….. • Physical schema describes the files and indexes used. It summarizes how the relations described in the conceptual schema are actually stored on secondary storage devices such as disks and tapes. Physical Schema (one per each DB) External schema allow data access to be customized (and authorized). DDL (data definition Language) used to define external and conceptual schemas). Concerned with the way individual users see the data. Information about those schemas are stored in system catalogs.
How the programmer sees the DBMS CREATE TABLE Students ( Name CHAR(30) SSN CHAR(9) PRIMARY KEY NOT NULL, Category CHAR(20) ); • Start with SQL DDL to create tables: • Continue with SQL to populate tables: INSERT INTO Students VALUES('Hillary', '123456789', 'undergraduate');
How the programmer sees the DBMS Takes: Students: • Ultimately files, but complex Courses:
Querying: Structured Query Language SELECT SSN FROM Takes WHERE CID='C20.0046'; • Find all the students who have taken C20.0046: • Find all the students who have taken C20.0046 previously: • Find the students’ names: SELECT SSN FROM Takes WHERE CID='C20.0046' AND Semester='Fall, 2005'; SELECT Name FROM Students, Takes WHERE Students.SSN=Takes.SSN AND CID='C20.0046' AND Semester='Fall, 2005';
Schema e.g.: University registrar • Conceptual Schema: • Students(ssn: string, name: string, login: string, age: int, gpa: real) • Courses(cid: string, cname: string, credits: int) • Enrolled(sid:string, cid:string, grade: string) • Physical schema: • Relations stored as unordered text files. • Indices on first column of each relation • Views: • My_courses(cname: string, grade: string, credits: int) • Course_info(ssn: string, name: string, status: string)
system must support • failover/rovery • concurrent use • deal with large datasets? • security • interop? • querying in what? • DBMS as application • Q: How does a DBMS solve these problems? • A: See third part of course, but for now…
Transaction processing • Consider a database that holds information about airline reservations. At any time, it is possible that several travel agents are looking up information about available seats on various flights and making new seat reservations. • When several users access a database concurrently, the DBMS must order their requests to avoid conflicts. • ACID test properties • Atomicity • all or nothing • Consistency • constraints on relationships • Isolation • concurrency control • simulated solipsism • Durability • Crash recovery
Transaction: An Execution of a DB Program • Key concept is transaction, which is an atomicsequence of database actions (reads/writes). • Each transaction, executed completely, must leave the DB in a consistent stateif DB is consistent when the transaction begins.
Atomicity & Durability • Avoiding inconsistent state • A DBMS prevents this outcome • Xactsare all or nothing • One simple idea: log progress of and plans for each xact • Durability: changes stay made (with log…) • Atomicity: entire xact is committed at once
Isolation • Many users concurrent execution • Disk access is slow (compared to CPU) • don’t waste CPU – keep running • Interweaving actions of different user programs • but can lead to inconsistency: • e.g., two programs simultaneously withdraw from the same account • For each user, should look like a single-user system
Consistency • Each xact (on a consistent DB) must leave it in a consistent state
Concurrency Control • Concurrent execution of user programs is essential for good DBMS performance. • Because disk accesses are frequent, and relatively slow, it is important to keep the cpu humming by working on several user programs concurrently. • Interleaving actions of different user programs can lead to inconsistency: e.g., check is cleared while account balance is being computed. • DBMS ensures such problems don’t arise: users can pretend they are using a single-user system.
Course outline • Database design: • Entity/Relationship models • Modeling constraints • The relational model: • Relational algebra • Transforming E/R models to relational schemas • SQL • DDL & query language
End chapter exercises • Why would you choose a database system instead of simply storing data in operating system files? When would it make sense not to use a database system? • Explain the difference between external, internal, and conceptual schemas. How are these different schema layers related to the concepts of logical and physical data independence? • Which of the following plays an important role in representing information about the real world in a database? Explain each of briefly. 1. The data definition language (DDL). 2. The data manipulation language (DML). 3. The buffer manager. 4. The data model (DM).