620 likes | 865 Views
OCL3 Oracle 10g: SQL & PL/SQL Session #1. Matthew P. Johnson CISDD, CUNY June, 2005. Personnel. Instructor: Matthew P. Johnson mpjohnson-at-gmail.com TA: Ratna Priya Moganti rmoganti7-at-yahoo.com Admin: Dawn Kleinberger dkleinberger-at-gc.cuny.edu. Communications. Web page:
E N D
OCL3 Oracle 10g:SQL & PL/SQLSession #1 Matthew P. Johnson CISDD, CUNY June, 2005 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Personnel • Instructor: Matthew P. Johnson • mpjohnson-at-gmail.com • TA: Ratna Priya Moganti • rmoganti7-at-yahoo.com • Admin: Dawn Kleinberger • dkleinberger-at-gc.cuny.edu Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Communications • Web page: • http://pages.stern.nyu.edu/~mjohnson/oracle/ • syllabus • course policies • reading assignments • etc. Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Acknowledgements • Thanks to Ramesh at NYU, Ullman, et al., Raghu and Johannes, Dan Suciu, Arthur Keller, David Kuijt for course materials • See classpage for other related, antecedent DBMS courses Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
What is a Database? • A very large, integrated collection of data. • Models real-world enterprise. • Entities • students, courses, instructors, TAs • Relationships • George is currently taking OCL • Dick is currently teaching OCL • Condi is currently TA-ing OCL but took it last semester • Database Management System (DBMS): large software package designed to store and manage databases Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Databases are everywhere: ordering a pizza • Databases involved? • Pizza Hut’s DB • stores previous orders by customer • stores previous credit cards used • Credit card records • huge databases of (attempted) purchases • location, date, amount, parties • Got approved by credit-report companies • phone company’s records • Local Usage Details (“Pull his LUDs, Lenny.”) • Caller ID • ensures reported address matches destination Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Your wallet is full of DB records • Driver’s license • Credit cards • Medical insurance card • Social security card • Gym membership • Individual checks • Dollar bills (w/serial numbers) • Maybe even photos (ids on back) Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Databases are everywhere • Q: Websites backed by DBMSs? • retail: Amazon, etc. • data-mining: “Page You Made” • search engines: Google, etc. • directories: Internic, etc. • searchable DBs: IMDB, tvguide.com, etc. • Q: Non-web examples of DBMSs? • airline bookings • criminal/terrorist: TIA • NYPD’s CompStat • all serious crime stats by precinct • Retailers: Wal-Mart, etc. • when to re-order, purchase patterns, data-mining • Genomics! Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Example of a Traditional DB App Suppose we are building a system to store the information about: • checking accounts • savings accounts • account holders • state of each of each person’s accounts Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Can we do it without a DBMS? Sure we can! Start by storing the data in files: checking.txt savings.txt customers.txt Now write C or Java programs to implement specific tasks Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Doing it without a DBMS... • Transfer $100 from George’s savings to checking: Write a C program to do the following: • Read savings.txt • Find&update the record “George” • balance -= 100 • Write savings.txt • Read checking.txt • Find&update the record “George” • balance += 100 • Write checking.txt Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Problems without an DBMS... 1. System crashes: • Q: What is the problem ? • A: George lost his $100 • Same problem even if reordered 2. Simultaneous access by many users • George and Dick visit ATMs at same • Lock checking.txt before each use–what is the problem? • Read savings.txt • Find&update the rec “George.” • Write savings.txt • Read checking.txt • Find&update the rec “George” • Write checking.txt CRASH ! Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Problems without an DBMS... 3. Large data sets (say 100s of GB or TBs) • Why is this a problem? • No indices • Finding “George” in huge flatfile is expensive • Modifications intractable without better data structures • “George” “Georgie” is very expensive • Deletions are very expensive Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Problems without a DBMS... 5. Security? • File system may be insecure • File system security may be coarse 6. Application programming interface (API)? • suppose need other apps to access DB 7. How to interact with other DBMSs? Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
General problems to solve • In building our own system, many Qs arise: • how do we store the data? (file organization, etc.) • how do we query the data? (write programs…) • make sure that updates don’t mess things up? • leave the DB “consistent” • provide different views on the data? • e.g., ATM user’s view v. bank teller’s view • how do we deal with crashes? • Too hard! Go buy Oracle! • Q: How does a DBMS solve these problems? • A: Long story; see other courses/books Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Big issue: Transaction processing • Grouping of several queries (or other database actions) into one transaction • ACID properties • Atomicity • all or nothing • Consistency • constraints on relationships • Isolation • concurrency control • Simulated solipsism • Durability • Crash recovery Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Atomicity & Durability • Saw how George lost $100 with makeshift software • A DBMS prevents this outcome • xacts are all or nothing • One idea: Keep a log (history) of all actions in set of xacts • Durability: Use log to redo or undo certain ops in crash recovery • Atomicity: don’t really commit changes until end • Then, all at once Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Isolation • Concurrent execution is essential for performance. • Frequent, slow disk accesses • don’t waste CPU – keep running • Interleaving actions of different user programs • can lead to inconsistency: • e.g., two programs simultaneously withdraw from the same account • DBMS ensures such problems don’t arise: • users can pretend they are using a single-user system Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Isolation • Contrast with a file in two Notepads • Strategy: ignore multiple users • whichever saves last wins • first save is overwritten • Contrast with a file in two Words • Strategy: blunt isolation • One can edit • To the other it’s read-only Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Consistency • Each xant (on a consistent DB) must leave it in a consistent state • can define integrity constraints • checks the defined claims about the data remain true Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Data Models • Every DBMS uses some data model: collection of concepts for describing data • Schema: description of partic set of data, using some data model • Relational data model: most widely used (by far) data model • Oracle, DB2, SQLServer, other SQL DBMSs • main concept: relation ~ table of rows & columns • a rel’s schema defines its fields Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Example: university database • Conceptual schema: • Students(ssn: string, name: string, login: string, age: int, gpa: real) • Courses(cid: string, cname: string, credits: int) • Enrolled(sid:string, cid:string, grade: string) • Physical schema: • Relations stored as unordered text files. • Indices on first column of each rel • External Schema (View): • Course_info(ssn: string, name: string) • My_courses(cname: string, grade: string) Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
How the programmer sees the DBMS • Start with DDL to create tables: • Continue with DML to populate tables: CREATE TABLE Students ( Name CHAR(30), SSN CHAR(9) PRIMARY KEY NOT NULL, Category CHAR(20) ); INSERT INTO Students VALUES('Howard', '123456789', 'undergraduate'); Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
How the programmer sees the DBMS • Tables: • Still implemented as files, but behind the scenes can be quite complex Takes: Students: Courses: “data independence” = separate logical view from physical implementation Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Querying: Structured Query Language • Find all the students who have taken OCL2: • Find all the students who OCL2 last fall: • Find the students’ names: • Query processor does this efficiently SELECT SSN FROM Takes WHERE CID='OCL2'; SELECT SSN FROM Takes WHERE CID='OCL2' AND Semester='Fall, 2003' SELECT Name FROM Students, Takes WHERE Students.SSN=Takes.SSN AND CID='OCL2' AND Semester='Fall, 2003'; Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Database Industry • Relational databases are a great success of theoretical ideas • based on most “theoretical” type of math there is: set theory • DBMS companies are among the largest software companies in the world • Oracle, IBM (with DB2), Microsoft (SQL Server, Microsoft Access), Sybase • Also opensource: MySQL, PostgreSQL, etc. • $20B+ industry • XML (“semi-structured data”) also important • New lingua franca for exchanging data Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Databases are used by • DB app programmers • desktop app programmers • web developers • Database administrators (DBAs) • design schemas • security/authorization • crash recovery • tuning • better paid than programmers! • Everyone else (perhaps indirectly) • “You may not be interested in databases, but databases are interested in you.” - Trotsky Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
The Study of DBMS • Several aspects: • Modeling and design of databases • DBMS programming: querying and update • DBMS implementation • This course covers the first two • Also will look at some more advanced areas • XML, data-warehousing, regexps Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Course outline • Two biggest topics: • SQL • PL/SQL • But also: • Database design: • Entity/Relationship models • Modeling constraints • The relational model: • Relational algebra • Transforming E/R models to relational schemas Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Outline (Continued) • SQL views and triggers • Connecting to a Oracle from programming languages • Web apps • Data warehousing • XML • May change as course progresses • partly in response to audience Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Textbook • Oracle Database 10g PL/SQL 101 • by Christopher Allen • Hardcover: 416 pages • Publisher: McGraw-Hill/ Osborne Media • ISBN: 0072255404 • 1st edition (August 10, 2004) • Distributed in class Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
SQL Readings • Optional reference: Oracle PL/SQL Programming • Online (free) SQL tutorials include: • A Gentle Introduction to SQL (http://sqlzoo.net/) • SQL for Web Nerds (http://philip.greenspun.com/sql/) Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
On-going Feedback • Don’t wait until the class is over to give feedback on improving it • too late for you then! • Send mail if you have questions or concerns • “We’re in touch, so you be in touch.” Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
So what is this course about, really? A bit of everything! • Languages: SQL, XPath, XQuery • Data modeling • Some theory! • Functional dependencies, normal forms • e.g., how to find most efficient schema for data • Writing lots of SQL queries • Lots of coding in PL/SQL • Business DBMS examples/cases • Most importantly: how to meet real-world needs Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
For right now: additional written survey • Email to mpjohnson-at-gmail.com: • name • email • previous cs/is/math/logic courses/background • previous programming experience • Perl? • PHP? • HTML? • Job: programmer, DBA, etc. • why taking class Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Agenda • Last time: intro, RDBMS, ACID test • This time: E/R model • Identify entity sets, relations and attributes • One-one, one-many, many-many relations • Simple ER diagrams to model a situation • 3-way relationships; Converting to binary • Entities with multiple roles • Subclasses • Design issues • Principles of faithfulness & simplicity in ER diagrams • Redundancy • Whether an element should be an attribute or entity set • Replacing a relationships with entity sets Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Relational schema Relational DB E/R design DB development path the World Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Entity/Relationship (E/R) Model • A popular data model – useful to database designers • Graphical representation of miniworld • Helps design the database, not implement it • E/R design is translated to a relational design • relational design then implemented in an RDBMS • Elements of model • Entities • Entity Sets • Attributes • Relationships (!= relations!) Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Elements of E/R Model: Entity Sets • Entity: like an object • e.g. President Bush • Particular instance of a concept • Entity set: set of one sort of entities or a concept • e.g. World leaders • Generally, same set of attributes • Represented by a rectangle • A “good” entity set – you decide • Common properties • Correspond to class of phys. or bus. objects • (People, products, accounts, grades, etc.) World Leader Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Elements of E/R Model: Attributes • Properties of entities in entity set • Like fields in a struct • Like columns in a table/spreadsheet • Like data members in an object • Values in some domain (e.g., ints, strings) • Represented by ovals: • Assumed atomic • But could have limited structure • Ints, strings, etc. ID Name Student Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Elements of E/R Model: Relationships • Connect two or more entity sets • e.g. students enroll in courses • Binary relationships: connect two entity sets – most common • Multiway relationships: connect several entity sets • Represented by diamonds Students Enroll Courses Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Elms of E/R Model: Rel’ships (cont’d) • Students Enroll in courses • Courses are Held in rooms • The E/R data model: Students Enroll Courses ID Name Held Rooms Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
A little set theory • A mathematical set =a collection of members • A set is defined by its members • “Are you in or are you out?” • No other structure, no order, no duplicates allowed • Sets can be specified by listing: • {1, 2, 3, …} = N • {1, 2, George Bush} (few applications, but valid) • Or by “set-builder” notation: • { x in N: 2 divides x} = ? • { x in Presidents | reelected(x)} = ? • {2x: x in N} = ? Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
A little set theory • One set can be a subset of another (which is then a superset of it) • ReelectedPresidents is a subset of Presidents • Also, RP is a proper subset of Pres – some lost reelection • Given two sets X and Y, the cross product or Cartesian product is X x Y = {(x,y): x in X, y in Y} = the set of all ordered pairs in which the first comes from X and the second comes from Y • Important: (x,y) != {x,y} • In an order pair or tuple • Order matters • Duplicates are allowed Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
A little set theory • Mathematically, a relation(ship) between X and Y is just a subset of X x Y = all those pairs (x,y) s.t. x is related to y • Example: owner-of O on People, Cats • O(MPJ, Gödel) holds • The equals relation E on N, N: • E(3,3) holds because 3 = 3 • E(3,4) does not hold • E is still a set: E = {(1,1), (2,2), (3,3), …} • Father of relation F on People, People: • F(GHWB, GWB) holds • F(GWB, GHWB) does not hold • Relations aren’t necessarily symmetric Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Many-many Many-one One-one Multiplicity of Relation(ship)s Representation of relationships • No arrow: many-to-many • Sharp arrow: many-to-one • Rounded arrow: “exactly one” • “key constraint” • One-one: Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Students Enrolls Courses Student Live Residence hall Student Live Residence hall Multiplicity of Relation(ship)s Many to one: a student lives in <= 1 residence hall Many-to-many: Many to exactly one: a student must live in a residence hall Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
Multiplicity, set-theoretically • Assume no vars below are equal • Many-one means: • if (x1,y1) in R then (x1,y2) cannot be in R • One-many means: • if (x1, y1) in R then (x2,y1) cannot be in R • One-one means: • if (x1,y1) in R, then neither (x2,y1) nor (x1,y2) can be in R • Notice: one-one is stronger than many-one • One-one implies both many-one and one-many Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
E/R Diagram e.g. Name Name Enrolls Students Courses ID ID Assisting TA Name ID Matthew P. Johnson, OCL3, CISDD CUNY, June 2005
E/R Diagrams • Works if each TA is a TA of all students • Student and TA connected only through Course • But what if students were divided among multiple TAs? • Then a student in OCL3 would be related to only one of the TA's for OCL3—which one? • Schema doesn’t store enough info • 3-way relationship is helpful here Matthew P. Johnson, OCL3, CISDD CUNY, June 2005