1 / 50

IT420: Database Management and Organization

IT420: Database Management and Organization. Adina Cr ă iniceanu adina@usna.edu. Instructor. Adina Crainiceanu M.S. and Ph.D. Cornell University Area of Specialization: Databases Research: search in peer-to-peer systems. Database Management and Organization.

Download Presentation

IT420: Database Management and Organization

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. IT420: Database Management and Organization Adina Crăiniceanu adina@usna.edu

  2. Instructor • Adina Crainiceanu • M.S. and Ph.D. Cornell University • Area of Specialization: Databases • Research: search in peer-to-peer systems

  3. Database Management and Organization • How does Wal-Mart manage its 200 TB data warehouse? • What is the database technology behind ebay’s website? • How do you build an Oracle 9i, IBM DB2 or Microsoft SQL Server database?

  4. Course Goals • Understand the functionality of modern database systems • Understand where database systems fit into an enterprise data management infrastructure • Design and build data-driven applications websites • Learn several important technologies: • SQL, PHP, XML, XQuery, web services

  5. Course Workload • Labs + Lectures • Grade: • 25%: Final Exam • 30%: 6-Week and 12-Week Exams • 20%: Homeworks, Labs, Quizes • 20%: Projects • 5%: Class Participation

  6. Evaluation Policies • Assignments: • No late submissions • Exams: comprehensive, closed book/ closed notes • Re-grade requests: up to 7 days after grade

  7. Academic Integrity - Honor • Honor Concept of the Brigade of Midshipmen • Policies Concerning Graded Academic Work • USNA • CS • http://www.cs.usna.edu/academics/honor.htm • Collaboration on homeworks is possible, but submitted work should be your own. • Cite any assistance, from any sources • Collaboration on projects, exams, quizzes is prohibited

  8. Resources • Textbook: Database Processing by David Kroenke • Database Management Systems by R. Ramakrishnan and J. Gehrke • MySQL/PHP Database Applications by B. Bulger • Microsoft Access reference book • Lecture slides • Course website: www.cs.usna.edu/~adina/teaching/it420spring2006

  9. Classroom • No food permitted in classroom • No use of computer equipment for any purpose other than as outlined in the class activity

  10. Course Topics • Database design • Relational model • SQL • Normalization • Database administration • PHP, MySQL • XML • Three-tier concepts

  11. Database Management Systems (DBMS) • Information is one of the most valuable resources in this information age • How do we effectively and efficiently manage this information? • Relational database management systems • Dominant data management paradigm today • 6 billion dollars a year industry!

  12. Why not Files?

  13. double charge string JobName class Equipment class Date class Contractor Classes

  14. ‘Query Processing’ class Rental { public: string job; Contractor Con_data; Equipment Equip_data; Date rent_data; double charge; }; Q: All jobs with Charge > x? A: Rental allRentals[10]; changesGreaterThan(double x){ for(i…){ if (allRentals[i].charge > x) cout << … } }

  15. Problems • Changes to Data • Data inconsistencies • Access Control • Security of information (views) • Loss of info due to deletion • “on the fly” Queries?

  16. Why Database Management Systems? • Benefits • Transactions (concurrent data access, recovery from system crashes) • High-level abstractions for data access, manipulation, and administration • Data integrity and security • Performance and scalability

  17. What is a Transaction? The execution of a program that performs a function by accessing a database. • Examples: • Reserve an airline seat. Buy an airline ticket. • Withdraw money from an ATM. • Verify a credit card sale. • Order an item from an Internet retailer.

  18. Transactions • A transaction is an atomic sequence of actions • Each transaction must leave the system in a consistent state (if system is consistent when the transaction starts). • The ACID Properties: • Atomicity • Consistency • Isolation • Durability

  19. Example Transaction: Online Store Your purchase transaction: • Atomicity: Either the complete purchase happens, or nothing • Consistency: The inventory and internal accounts are updated correctly • Isolation: It does not matter whether other customers are also currently making a purchase • Durability: Once you have received the order confirmation number, your order information is permanent, even if the site crashes

  20. Transactions (cont.) • A transaction will commitafter completing all its actions, or it could abort(or be aborted by the DBMS) after executing some actions.

  21. Example Transactions: ATM • You withdraw money from the ATM machine • Atomicity • Consistency • Isolation • Durability • Commit versus Abort?

  22. What Makes Transaction Processing Hard? • Reliability - system should rarely fail • Availability - system must be up all the time • Response time - within a few seconds • Throughput - thousands of transactions/second • Scalability - start small, ramp up to Internet-scale • Security – for confidentiality and high finance • Configurability - for above requirements + low cost • Atomicity - no partial results • Durability - a transaction is a legal contract • Distribution - of users and data

  23. What Makes TP Important? • It is at the core of electronic commerce • Most medium-to-large businesses use TP for their production systems. • It is a huge slice of the computer system market

  24. Why Database Management Systems? • Benefits • Transactions (concurrent data access, recovery from system crashes) • High-level abstractions for data access, manipulation, and administration • Data integrity and security • Performance and scalability

  25. Data Model • A data model is a collection of concepts for describing data. • Examples: • ER model (used for conceptual modeling) • Relational model, object-oriented model, object-relational model (actually implemented in current DBMS)

  26. The Relational Data Model • A relational database is a set of relations. • Turing Award (“Nobel Prize” in CS) for Codd in 1980 • Example relation: • Student(cid: integer, name: string, byear: integer, state: string)

  27. The Relational Model: Terminology • Relation instance and schema (table) • Field (column) • Record or tuple (row) • Primary key • Foreign key

  28. The Object-Oriented Data Model • Richer data model. Goal: Bridge mismatch between programming languages and the database system. • Example components of the data model: • Relationships between objects directly as pointers. • Result: Can store abstract data types directly in the DBMS • Pictures • Geographic coordinates • Movies • CAD objects

  29. Object-Oriented DBMS • Advantages: • Engineering applications (CAD and CAM and CASE computer aided software engineering), multimedia applications. • Disadvantages: • Querying is much harder

  30. Object-Relational DBMS • Mixture between the object-oriented and the object-relational data model • Combines ease of querying with ability to store abstract data types • Conceptually, the relational model, but every field • All major relational vendors are currently extending their relational DBMS to the object-relational model

  31. Query Languages • We need a high-level language to describe and manipulate the data • Requirements: • Precise semantics • Easy integration into applications written in C++/Java/Visual Basic/etc. • Easy to learn • DBMS needs to be able to efficiently evaluate queries written in the language

  32. SQL: Structured Query Language • Developed by IBM (System R) in the 1970s • ANSI standard since 1986: • SQL-86 • SQL-89 (minor revision) • SQL-92 (major revision, current standard) • SQL-99 (major extensions) • More about SQL in later lectures

  33. Example Query SELECT Customers.cid, Customers.name, Customers.byear, Customers.state FROM Customers WHERE Customers.cid = 3

  34. Why Database Management Systems? • Benefits • Transactions (concurrent data access, recovery from system crashes) • High-level abstractions for data access, manipulation, and administration • Data integrity and security • Performance and scalability

  35. Integrity Constraints • Integrity Constraints (ICs): Condition that must be true for any instance of the database. • ICs are specified when schema is defined. • ICs are checked when relations are modified. • A legal instance of a relation is one that satisfies all specified ICs. • DBMS should only allow legal instances. • Example: Domain constraints.

  36. Security • Secrecy: Users should not be able to see things they are not supposed to. • E.g., A student can’t see other students’ grades. • Integrity: Users should not be able to modify things they are not supposed to. • E.g., Only instructors can assign grades. • Availability: Users should be able to see and modify things they are allowed to.

  37. Why Database Management Systems? • Benefits • Transactions (concurrent data access, recovery from system crashes) • High-level abstractions for data access, manipulation, and administration • Data integrity and security • Performance and scalability

  38. DBMS and Performance • Efficient implementation of all database operations • Indexes • Query optimization • Automatic high-performance concurrent query execution, query parallelization

  39. Summary Of DBMS Benefits • Transactions • ACID properties, concurrency control, recovery • High-level abstractions for data access • Data models • Data integrity and security • Key constraints, foreign key constraints, access control • Performance and scalability • Parallel DBMS, distributed DBMS, performance tuning

  40. The Three-Tier Architecture Client Program (Web Browser) Presentation tier Middle tier Application Server Data management tier Database Management System

  41. Presentation Tier • Primary interface to the user • Needs to adapt to different display devices (PC, PDA, cell phone, voice access?)

  42. Middle Tier • Application Programs: • Create and process forms • Create and transmit queries • Create and process reports • Execute application logic: implement complex actions, maintain state between different steps of a workflow • Access different data management systems

  43. Database Management Tier • One or more standard database management systems: Oracle, DB2, SQL Server, MySQL

  44. Example 1: Airline reservations Build a system for making airline reservations • Database System • Application Server • Client Program

  45. Example 1: Airline reservations Build a system for making airline reservations • Database System • Airline info, available seats, customer info, etc. • Application Server • Logic to make reservations, cancel reservations, add new airlines, etc. • Client Program • Log in different users, display forms and human readable output

  46. Three-Tier Architecture: Advantages • Heterogeneous systems • Tiers can be independently maintained, modified, and replaced • Thin clients • Only presentation layer at clients (web browsers) • Integrated data access • Several database systems can be handled transparently at the middle tier • Central management of connections • Scalability • Replication at middle tier permits scalability of business logic • Software development • Code for business logic is centralized • Interaction between tiers through well-defined APIs: Can reuse standard components at each tier

  47. Technologies Client Program (Web Browser) HTML, Javascript, XSLT XML, C#, Cookies, XPath, web services Application Server SQL, Stored Procedures Database Management System

  48. Next: Microsoft Access • DBMS + Application Server

  49. Relational DB => “relate tables” Tables are related by “keys” which uniquely identify a record in a table

More Related