CS145 Introduction Robert Ikeda Aditya Parameswaran
Overview of CS145 • Introduction to Databases • Design of databases • Use of database management systems • Topics include… Relational model, relational algebra, SQL, transactions, views, XML • Not DBMS implementation (CS245, CS346)
Class Info • Instructors • Robert Ikeda Office Hours: Mon and Tues 3:15-4:00, Gates 437 • Aditya Parameswaran Office Hours: Wed and Thurs 3:15-4:00, Gates 424 • Class Website • cs145.stanford.edu • Lectures • Tuesday and Thursday 1:15-3:05pm, Nvidia Aud. • Questions • cs145-sum1112-staff@lists.stanford.edu
Prerequisites • Recommended • CS103 (Mathematical Foundations of Computing) • CS107 (Computer Organization and Systems) • Assume students are already proficient in Unix and Java
Textbook • No textbook is absolutely required • The closest text for the course is A First Course in Database Systems, Third Edition, by Ullman and Widom • You may prefer Database Systems: The Complete Book, Second Edition (also used in CS245)
Exams • All enrolled students must attend exams on campus at the scheduled time and place (Nvidia Auditorium) • Midterm Exam • Thursday, July 19 in class (1:15pm-3:05pm) • Final Exam • Saturday, August 18, 12:15-2:15pm
Grading • Assignments: 30% • Midterm: 35% • Final: 35%
Assigned Work • Automated quizzes and exercises • Challenge problems • Programming project • Online auction system called AuctionBase • Modeled roughly after eBay
Late Policy • 11:59pm on the due date is always the deadline • Automated quizzes and exercises • No credit for late work • Highest score before the deadline counts • Other assigned work • Submitted electronically • Late penalty • Less than 24 hours late: 10% • Less than 48 hours late: 30% • No credit for work more than 48 hours late • Only the last submission before the late deadline is used
Late Policy: Emergencies • Total of 4 unpenalized late days • Cannot be used for automated quizzes and exercises • Applied automatically in best way • Reminder: No assignment may be turned in more than two days late
Honor Code • All work to be done independently • Assistance (human or otherwise) • Must be indicated on all submitted work • Any assistance without proper citation violates Honor Code
Introduction to Databases
Database Management Systems (DBMS) • Used by websites, corporations, and scientists • Provide efficient, reliable, convenient, and safe multi-user storage of and access to massive amounts of persistent data
Desirable DBMS Properties • Massive – terabytes • Persistent – data outlives application • Safe – hardware, software failures • Multi-user – concurrency control • Convenient – high-level query languages • Efficient – thousands of queries/updates per sec. • Reliable – high uptime
Data Model • Data model – description of how the data is structured • Relational: Set of records • XML: Hierarchical tree
Key Concepts • Schema vs. instance • Schema - structural description of data • Instance – actual data contents • Data definition language (DDL) • Sets up the schema • Data manipulation or query language (DML) • Querying and modifying data
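The DDL/DML split above can be sketched with Python's built-in sqlite3 module. This is only an illustration, assuming SQLite as the DBMS (the slides do not name one) and borrowing sample rows from the Student relation used in these slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# DDL: sets up the schema (structure only, no data yet).
conn.execute("CREATE TABLE Student (ID INTEGER, name TEXT, GPA REAL)")

# DML: modifies and queries the instance (the actual contents).
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [(123, "Amy", 3.9), (234, "Bob", 3.4), (345, "Craig", 3.6)],
)

# SQLite keeps the schema in its catalog table, separate from the rows.
schema = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'Student'"
).fetchone()[0]
instance = conn.execute("SELECT * FROM Student ORDER BY ID").fetchall()

print(schema)    # the structural description
print(instance)  # the contents at this point in time
```

Inserting or deleting rows changes `instance` but leaves `schema` untouched, which is the schema vs. instance distinction in miniature.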
Key People • DBMS implementer • Builds system • Database designer • Establishes schema • Database application developer • Writes programs that operate on database • Database administrator • Loads data, optimizes performance
The Relational Model • Used by major commercial database systems • Very simple model • Query with high-level languages: simple yet expressive • Efficient implementations
A Relation is a Table • Relation name: Student • Attributes (column headers): ID, name, GPA • Tuples (rows): (123, Amy, 3.9), (234, Bob, 3.4), (345, Craig, 3.6)
Schemas • Relation schema = relation name and attribute list. • Optionally: types of attributes. • Example: Student(ID, name, GPA) or Student(ID: int, name: string, GPA: float) • Database = collection of relations. • Database schema = set of all relation schemas in the database.
Why Relations? • Very simple model. • Often matches how we think about data. • Abstract model that underlies SQL, the most important database language today.
Example • Schema – structural description of relations in database • Instance – actual contents at given point in time • (Figure: Student and College tables)
NULL • NULL – special value for “unknown” or “undefined” • GPA > 3.5 OR GPA <= 3.5 is not true for every tuple: when GPA is NULL, neither comparison holds • (Figure: Student and College tables)
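A small sketch of the NULL behavior, assuming SQLite through Python's sqlite3 module. The student with a missing GPA (Doris, ID 456) is a made-up row added only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (ID INTEGER, name TEXT, GPA REAL)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    # Python None is stored as SQL NULL; the Doris row is hypothetical.
    [(123, "Amy", 3.9), (234, "Bob", 3.4), (456, "Doris", None)],
)

all_ids = [r[0] for r in conn.execute("SELECT ID FROM Student ORDER BY ID")]

# Looks like a tautology, but a comparison with NULL is neither true nor false,
# so the row with a NULL GPA does not satisfy the condition.
either_side = [
    r[0]
    for r in conn.execute(
        "SELECT ID FROM Student WHERE GPA > 3.5 OR GPA <= 3.5 ORDER BY ID"
    )
]

# NULL has to be tested for explicitly.
null_ids = [r[0] for r in conn.execute("SELECT ID FROM Student WHERE GPA IS NULL")]

print(all_ids, either_side, null_ids)
```

The second query returns fewer students than the table holds, which is why conditions over attributes that may be NULL need an explicit `IS NULL` case.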
Key • Key – attribute whose value is unique in each tuple • Or set of attributes whose combined values are unique • (Figure: Student and College tables)
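The sketch below shows how a DBMS can enforce both kinds of key, assuming SQLite via Python's sqlite3 module. Treating ID as the key of Student and (name, state) as the key of College is an assumption made for the example, and the College rows are invented placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Single-attribute key: ID must be unique across Student tuples.
conn.execute("CREATE TABLE Student (ID INTEGER PRIMARY KEY, name TEXT, GPA REAL)")
# Multi-attribute key: the (name, state) combination must be unique.
conn.execute(
    "CREATE TABLE College (name TEXT, state TEXT, enr INTEGER, "
    "PRIMARY KEY (name, state))"
)

conn.execute("INSERT INTO Student VALUES (123, 'Amy', 3.9)")
conn.execute("INSERT INTO College VALUES ('Example U', 'CA', 1000)")  # placeholder row


def try_insert(sql):
    """Return True if the insert is accepted, False if it violates a key."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


dup_id = try_insert("INSERT INTO Student VALUES (123, 'Bob', 3.4)")
same_name_other_state = try_insert("INSERT INTO College VALUES ('Example U', 'NY', 2000)")
dup_pair = try_insert("INSERT INTO College VALUES ('Example U', 'CA', 3000)")

print(dup_id, same_name_other_state, dup_pair)
```

Only the combined values matter for the composite key: a second college with the same name is accepted as long as its state differs.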
Creating Relations (Tables) in SQL • Create Table Student(ID, name, GPA) • Create Table College (name string, state char(2), enr integer)
Queries Ad-hoc queries in high-level language • All students with GPA > 3.7 applying to Stanford and MIT only • All engineering departments in CA with < 500 applicants • College with highest average accept rate over last 5 years • Some easy to pose; some a bit harder • Some easy for DBMS to execute efficiently; some harder
Query Language Properties • Compositional = Queries can be applied on query results • Closed = Query results are again elements of the data model
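Both properties can be seen in one small sketch, assuming SQLite via Python's sqlite3 module. The `HighGPA` alias and the GPA thresholds are chosen only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (ID INTEGER, name TEXT, GPA REAL)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [(123, "Amy", 3.9), (234, "Bob", 3.4), (345, "Craig", 3.6)],
)

inner = "SELECT ID, name, GPA FROM Student WHERE GPA > 3.5"

# Closed: the result is again a relation (rows over named attributes).
inner_rows = conn.execute(inner).fetchall()

# Compositional: that result is used as the input table of another query.
outer_rows = conn.execute(
    f"SELECT name FROM ({inner}) AS HighGPA WHERE GPA < 3.8 ORDER BY name"
).fetchall()

print(sorted(inner_rows))
print(outer_rows)
```

Because the inner result is a relation, the outer query does not need to know whether it reads from a stored table or from another query.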
Query Languages • Example: IDs of students with GPA > 3.7 applying to Stanford • Relational Algebra – formal: π_{ID}(σ_{GPA>3.7 ∧ college='Stanford'}(Student ⋈ Apply)) • SQL – actual/implemented: Select Student.ID From Student, Apply Where Student.ID = Apply.ID And GPA > 3.7 And college = 'Stanford'
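To make the relational algebra expression concrete, here is a minimal sketch of selection, projection, and natural join over relations held as lists of dictionaries. The helper names `select`, `project`, and `natural_join` are hypothetical, the Apply(ID, college) schema is inferred from the SQL version of the query, and the Apply rows are invented sample data.

```python
def select(relation, predicate):
    """Selection (sigma): keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]


def project(relation, attrs):
    """Projection (pi): keep only the listed attributes and drop duplicates."""
    result = []
    for t in relation:
        row = {a: t[a] for a in attrs}
        if row not in result:
            result.append(row)
    return result


def natural_join(r, s):
    """Natural join: pair tuples that agree on all attributes they share."""
    joined = []
    for t in r:
        for u in s:
            shared = set(t) & set(u)
            if all(t[a] == u[a] for a in shared):
                joined.append({**t, **u})
    return joined


student = [
    {"ID": 123, "name": "Amy", "GPA": 3.9},
    {"ID": 234, "name": "Bob", "GPA": 3.4},
    {"ID": 345, "name": "Craig", "GPA": 3.6},
]
# Hypothetical applications, for illustration only.
apply_rel = [
    {"ID": 123, "college": "Stanford"},
    {"ID": 123, "college": "MIT"},
    {"ID": 234, "college": "Stanford"},
    {"ID": 345, "college": "Cornell"},
]

# Project ID from the selection over (Student natural-join Apply).
result = project(
    select(
        natural_join(student, apply_rel),
        lambda t: t["GPA"] > 3.7 and t["college"] == "Stanford",
    ),
    ["ID"],
)
print(result)
```

Each operator takes relations and returns a relation, so the nesting in the Python call mirrors the nesting in the algebra expression.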
XML Extensible Markup Language (XML) • Standard for data representation and exchange • Document format similar to HTML • Tags describe content instead of formatting • Also streaming format
Example: an XML Document Basic constructs • Tagged elements (nested) • Attributes • Text <?xml version="1.0" ?> <Bookstore> <Book Price="85"> <Title>Intro to Databases</Title> <Remark>Buy now!</Remark> <Authors> <Author> <First_Name>Jeffrey</First_Name> <Last_Name>Ullman</Last_Name> </Author> </Authors> </Book> <Book> … </Bookstore>
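The three basic constructs can be read back with Python's standard xml.etree.ElementTree module. In this sketch the second Book, which the slide elides, is simply left out so that the document parses.

```python
import xml.etree.ElementTree as ET

# The slide's document, with the elided second <Book> omitted.
doc = """<?xml version="1.0" ?>
<Bookstore>
  <Book Price="85">
    <Title>Intro to Databases</Title>
    <Remark>Buy now!</Remark>
    <Authors>
      <Author>
        <First_Name>Jeffrey</First_Name>
        <Last_Name>Ullman</Last_Name>
      </Author>
    </Authors>
  </Book>
</Bookstore>"""

root = ET.fromstring(doc)          # root element: Bookstore
book = root.find("Book")           # nested tagged element
price = book.get("Price")          # attribute value (always a string)
title = book.findtext("Title")     # text content of a child element
last_names = [a.findtext("Last_Name") for a in book.iter("Author")]

print(root.tag, price, title, last_names)
```

Note that the attribute comes back as the string "85"; plain XML carries no type information for it.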
“Well-Formed” XML Adheres to basic structural requirements • Single root element • Matched tags, proper nesting • Unique attributes within elements
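A rough way to check these requirements is to hand the text to an XML parser and see whether it is rejected. The helper `is_well_formed` below is a hypothetical name, and the sketch relies on Python's xml.etree.ElementTree raising ParseError on malformed input.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


ok = is_well_formed("<Bookstore><Book Price='85'><Title>T</Title></Book></Bookstore>")
two_roots = is_well_formed("<Book></Book><Book></Book>")          # no single root
bad_nesting = is_well_formed("<Book><Title>T</Book></Title>")     # improper nesting
dup_attr = is_well_formed("<Book Price='85' Price='90'></Book>")  # repeated attribute

print(ok, two_roots, bad_nesting, dup_attr)
```

Each failing string breaks exactly one of the three listed requirements, so the helper returns False for all of them and True only for the first document.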