1 / 39

Chapter 2: Relational Algebra

Chapter 2: Relational Algebra. -- Introduction to database principles Maoying Wu ( bi203.sjtu@gmail.com ) March 6, 2013. Relation terminology. Learning strategies. Know how => know why 适当的不求甚解. After-class assignment 1. Install MySQL on your own computer

claude
Download Presentation

Chapter 2: Relational Algebra

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Chapter 2: Relational Algebra -- Introduction to database principles Maoying Wu (bi203.sjtu@gmail.com) March 6, 2013

  2. Relation terminology

  3. Learning strategies • Know how => know why • 适当的不求甚解

  4. After-class assignment 1 • Install MySQL on your own computer • Linux: from source package or LAMP • Windows: from rebuilt package or XAMP • Learn some basic knowledge on HTML/PHP and B/S development • Review of C programming skills

  5. DML (数据操作语言)Data Manipulation Language • Language for accessing and manipulating the data organized by appropriate data model • aka, query language (查询语言) • Two classes of languages • procedural (过程性): specifies what data is required and how to get these data • declarative (声明性): specifies what data is required without knowing how to get these data • SQL: most widely used query language

  6. DDL (数据定义语言)Data-defintion Language • Specified notation for defining the database schema • Example: create table account( account_no char(10), balance integer) • DDL compiler generates a set of tables stored in a data dictionary (数据字典) • Data dictionarycontains metadata (元数据) • Database schema • Data storage and definition language • specifies the storage structure and access methods (存储结构和访问方法) • Integrity constraints • domain constraints (类型约束) • referential integrity (参考完整性) • assertions (声明约束) • Authorization (权限)

  7. SQL (结构化查询语言)Structural Query Language • Principal language to describe and manipulate relational databases • SQL-99 standard • Two aspects: • Data-Definition language (DDL) for declaring database schemas • Data-Manipulation language (DML) for querying and modifying

  8. SQL: 3 kinds of relations • tables (表):relations stored in the database allowing modification as well queries • views (视图): relations defined by a computation, constructed when needed • temporary tables (临时表): constructed by SQL processor when performing queries.

  9. Relational Algebra -- Operations on relations, construct new relations from old ones

  10. Four classes of operations • set operations: union, intersection, and difference • removing: “selection” eliminates some rows (tuples), and “projection” eliminates some columns • relation combination: Cartesian product, join • renaming: changing the relation schema, i.e., the names of the attributes and/or the name of relation itself.

  11. Outlines • Procedural language (过程性语言) • Six basic operators (基本操作) • Select: σ (选择) • Project: Π (投影) • Union: ᴜ (并) • Set difference: - (差) • Cartesian product: × (笛卡尔积) • rename: ρ (重命名) • The operators take one or two relations as inputs and produce a new relation as a result

  12. Select Operator (选择) • Relation r • σ (A=B)Ʌ(D>5) (r)

  13. select operator • Notation: σ p(r) • p is called the selection predicate (选择谓词) • Defined as • p is a formula in propositional calculus (命题演算) consisting of terms connected by: ˄(and), ˅(or), ¬(not) • Each term can be one of • <attribute> op (<attribute> or <constant>) • where op can be: =, ≠, ≤, ≥, >, < • Example of selection: • σ name=“Charlie” (account)

  14. Project operator: example • Relation r: • ΠA,C(r):

  15. Project operator (投影) • Notation: ΠA1,A2,…, Ak(r) • A1, …, Ak are the attribute names and r is a relation name • The result is defined as the relation of k columns obtained by erasing the columns that are not selected • Duplicate rows are removed from result, since relation are sets • Example: To eliminate the sname attribute of relation exam: • Πsid, score(exam)

  16. union operator: example • relation r, s: • r ᴜ s:

  17. union operator • Notation: r ᴜ s • Defined as: • For union operation to be valid, • r, s must have the same arity (same number of attributes) • the attribute domains must be compatible (each column of the two operands should have the same data type) • Example: find all students with A or Not pass

  18. set difference operator • Relation r, s • r – s

  19. set difference operator • Notation: r – s • Defined as: • Set differences must be taken between compatible relations • r and s must have the same arity • attribute domains of r and s must be compatible

  20. Cartesian product - example • Relation r, s: • r × s:

  21. Cartesian product • Notation: r ×s • Defined as: • Assume that attributes of r(R) and s(S) are disjoint. That is, R Ʌ S = ø • If attributes of r(R) and s(S) are not disjoint, then rename must be used.

  22. combination of operations • a single expression can contain multiple operations • Example: • r × s • σA=C(r × s)

  23. Renaming operator • Renaming operator allows us to name, and therefore to refer to the results of relational-algebra expression (关系代数表达式) • Allows us to refer to a relation by more than one name. • Example: • ρX(E) returns the expression E under the name X • If a relational algebra expression E has arity n, then ρX(A1,A2,…, An)(E) returns the result of expression E under the name X, and with the attributes renamed to A1, A2, …, An

  24. Banking example • branch(branch_name, branch_city, assets) • customer(customer_name, customer_street, customer_city) • account(account_number, branch_name, balance) • loan(loan_number, branch_name, amount) • depositor(customer_name, account_number) • borrower(customer_name, loan_number)

  25. Examples • Find loans over $1200: • σamount>1200(loan) • Find the loan numbers for loans with amount over 1200: • Πloan_number(σamount>1200(loan)) • Find the customer names for all who have a loan, an account, or both, from the bank: • Πcustomer_name(borrower) ᴜ Πcustomer_name(depositor)

  26. Example queries • Find the names of all customers who have a loan at the Shanghai branch: Πcustomer_name(σbranch_name=“Shanghai”(σborrower.loan_number=loan.loan_number(borrower×loan))) • Find the names of all customers who have a loan at the Shanghai branch but do not have an account at any branch of the bank: Πcustomer_name(σbranch_name=“Shanghai”(σborrower.loan_number=loan.loan_number(borrower×loan)))- Πcustomer_name(depositor)

  27. Two different strategies • find the names of all customers who have a loan at the Shanghai branch (1) Πcustomer_name(σbranch_name=“Shanghai”(σborrower.loan_number=loan.loan_number(borrower×loan))) (2) Πcustomer_name(σborrower.loan_number=loan.loan_number(σbranch_name=“Shanghai”(loan))×borrower))

  28. Example queries • find the largest account balance • Strategy: • Find those balances that are not the largest (a) rename account as d so that we can compare each account balance with all others • use set difference to find those account balances that were not found in the previous step Πbalance(account) – Πaccount.balance(σaccount.balance < d.balance(account ×ρd(account)))

  29. Formal definition • A basic expression in the relational algebra consists of either one of the following: • A relation in the database • A constant relation • Let E1 and E2 be relational-algebra expressions; the following are all relational-algebra expressions: • E1∪E2 (并) • E1 – E2 (差) • E1 x E2 (笛卡尔积) • σp(E1), p is a predicate on attributes in E1 (选择) • Πs(E1), s is a list of some of the attributes in E1 (投影) • ρx(E1), X is the new name for the result of E1 (更名)

  30. Additional operations • Set intersection (交) • Natural join (自然连接) • Division (除) • Assignment (赋值) • We must keep in mind that additional operations do NOT add anything to relational algebra, but that simplify some common queries

  31. set intersection - example • Relation r, s • r∩s:

  32. Set intersection • Notation: r∩s • Defined as: • Assume: • r, s have the same arity • attributes of r and s are compatible • Note: r∩s = r – (r-s)

  33. Natural join • Notation: • Let r and s be relations on schemas R and S, respectively. Then is a relation on schema R ∪S obtained as follows: • Consider each pair of tuplestr from r and ts from s. • If tr and ts have the same value on each of the attributes in R∩S, add a tuple t to the result, where t has the same value as tr on r and t has the same value as ts on s • Example: • R= (A, B, C, D) • S= (E, B, D) • Result schema = (A, B, C, D, E)

  34. Natural join - example • Relation r, s: • :

  35. Division operator (除法) • Notation: r ÷s • Suitable for queries that include the phrase “for all” • Let r and s be relations on schemas R and S, where • R = (A1, …, Am, B1, …, Bn) • S = (B1, …, Bn) • The result of r ÷s is a relation on schema R – S = (A1, A2, …, Am) • where tu means the concatenation of tuples t and u to produce a single tuple

  36. Division example 1 • r ÷s:

  37. Division example 2 • r ÷s:

  38. Assignment operator • The assignment operator (←) provides a convenient way to expression complex queries • write query as a sequential program consisting of a series of assignments • followed by an expression whose value is displayed as a result of the query • Assignment must always be made to a temporary relation variable • Example: write r÷s as • temp1 ← ΠR-S(r) • temp2 ← ΠR-S((temp1×s) - ΠR-S,S(r) • result = temp1 – temp2

  39. Ends!

More Related