Understanding Relational Databases: Structure and Query Languages

Chapter 3: Relational Model • Structure of Relational Databases • Query Languages • Tuple Relational Calculus • Domain Relational Calculus

Basic Structure • Formally, given sets D1, D2, …. Dn a relation r is a subset of D1 x D2 x … x DnThus a relation is a set of n-tuples (a1, a2, …, an) where ai Di • Example: if customer-name = {Jones, Smith, Curry, Lindsay}customer-street = {Main, North, Park}customer-city = {Harrison, Rye, Pittsfield}Then r = { (Jones, Main, Harrison), (Smith, North, Rye), (Curry, North, Rye), (Lindsay, Park, Pittsfield)} is a relation over customer-name x customer-street x customer-city

Attribute Types • Each attribute of a relation has a name • The set of allowed values for each attribute is called the domain of the attribute • Attribute values are (normally) required to be atomic, that is, indivisible • E.g. multivalued attribute values are not atomic • E.g. composite attribute values are not atomic • The special value null is a member of every domain • The null value causes complications in the definition of many operations • we shall ignore the effect of null values in our main presentation and consider their effect later

Relation Schema • A1, A2, …, Anare attributes • R = (A1, A2, …, An ) is a relation schema E.g. Customer-schema = (customer-name, customer-street, customer-city) • r(R) is a relation on the relation schema R E.g. customer (Customer-schema)

Relation Instance • The current values (relation instance) of a relation are specified by a table • An element t of r is a tuple, represented by a row in a table attributes customer-name customer-street customer-city Jones Smith Curry Lindsay Main North North Park Harrison Rye Rye Pittsfield tuples customer

Relations are Unordered • Order of tuples is irrelevant (tuples may be stored in an arbitrary order) • E.g. account relation with unordered tuples

Database • A database consists of multiple relations • Information about an enterprise is broken up into parts, with each relation storing one part of the information E.g.: account : stores information about accountsdepositor : stores information about which customer owns which account customer : stores information about customers • Storing all information as a single relation such as bank(account-number, balance, customer-name, ..)results in • repetition of information (e.g. two customers own an account) • the need for null values (e.g. represent a customer without an account) • Normalization theory (Chapter 7) deals with how to design relational schemas

The customer Relation

The depositor Relation

Keys • Let K  R • K is a superkeyof R if values for K are sufficient to identify a unique tuple of each possible instance of relation r(R) • Example: {customer-name, customer-street} and {customer-name} are both superkeys of Customer, if no two customers can possibly have the same name. • K is a candidate key if K is minimalExample: {customer-name} is a candidate key for Customer, since it is a superkey {assuming no two customers can possibly have the same name), and no subset of it is a superkey.

Banking Example (keys underlined) branch (branch-name, branch-city, assets) customer (customer-name, customer-street, customer-city) account (account-number, branch-name, balance) loan (loan-number, branch-name, amount) depositor (customer-name, account-number) borrower (customer-name, loan-number)

Query Languages • Relational Algebra • Domain Relational Calculus • Tuple Relational Calculus • SQL • QBE • 1,2,3 are equivalent---same expressive power. • A query language is said to be Relationally Complete if it is at least as powerful as those • SQL and QBE are relationally complete • Relational completeness is much less than Turing completeness

Domain Relational Calculus • A nonprocedural query language equivalent in power to the tuple relational calculus • Each query is an expression of the form: {  x1, x2, …, xn  | P(x1, x2, …, xn)} • x1, x2, …, xn represent domain variables • P represents a formula similar to that of the predicate calculus

Some Queries branch (branch-name, branch-city, assets) customer (customer-name, customer-street, customer-city) account (account-number, branch-name, balance) loan (loan-number, branch-name, amount) depositor (customer-name, account-number) borrower (customer-name, loan-number) • Find the branch-name, loan-number, and amount for loans of over $1200 • Find the names of all customers who have a loan of over $1200 • Find the names of all customers who have a loan from the Perryridge branch and the loan amount • Find the names of all customers having a loan, an account, or both at the Perryridge branch • Find the names of all customers who have an account at all branches located in Brooklyn.

Example Queries • Find the branch-name, loan-number, and amount for loans of over $1200 { l, b, a  |  l, b, a   loan  a > 1200} • Find the names of all customers who have a loan of over $1200 { c  |  l, b, a ( c, l   borrower   l, b, a   loan  a > 1200)} • Find the names of all customers who have a loan from the Perryridge branch and the loan amount: { c, a  |  l ( c, l   borrower  b( l, b, a   loan  b = “Perryridge”))} or { c, a  |  l ( c, l   borrower   l, “Perryridge”, a   loan)}

Example Queries • Find the names of all customers having a loan, an account, or both at the Perryridge branch: { c  |  l ({ c, l   borrower  b,a( l, b, a   loan  b = “Perryridge”))   a( c, a   depositor  b,n( a, b, n   account  b = “Perryridge”))} • Find the names of all customers who have an account at all branches located in Brooklyn: { c  |  n ( c, s, n   customer)   x,y,z( x, y, z   branch  y = “Brooklyn”)   a,b ( a, x, b   account   c,a   depositor)}

x p(x)  x (p (x) ) • Find the names of all customers who have an account at all branches located in Brooklyn is equivalent to: • Find the names of all customers s.t. there is no branch in Brooklyn they do not have an account there: { c  |  n ( c, s, n   customer)   x,y,z( x, y, z   branch  y = “Brooklyn”)    a,b ( a, x, b   account   c,a   depositor))}

Safety of Expressions {  x1, x2, …, xn  | P(x1, x2, …, xn)} is safe if all of the following hold: 1. All values that appear in tuples of the expression are values from dom(P) (that is, the values appear either in P or in a tuple of a relation mentioned in P). 2. For every “there exists” subformula of the form  x (P1(x)), the subformula is true if an only if P1(x) is true for all values x from dom(P1). 3. For every “for all” subformula of the form x (P1 (x)), the subformula is true if and only if P1(x) is true for all values x from dom (P1).

Tuple Relational Calculus • A nonprocedural query language, where each query is of the form {t | P (t) } • It is the set of all tuples t such that predicate P is true for t • t is a tuple variable, t[A] denotes the value of tuple t on attribute A • t r denotes that tuple t is in relation r • P is a formula similar to that of the predicate calculus

Banking Example • branch (branch-name, branch-city, assets) • customer (customer-name, customer-street, customer-city) • account (account-number, branch-name, balance) • loan (loan-number, branch-name, amount) • depositor (customer-name, account-number) • borrower (customer-name, loan-number) Queries: Find • account-number, branch-name, balance where balance <100 • Cities where these people live • Names of people who have accounts at Perryridge • Names of people who accounts in all branches • People who do not have accounts at Perryridge

Example Queries • Find the names of all customers having a loan from the Perryridge branch, and the cities they live in {t | s  loan(s[branch-name] = “Perryridge”  u  borrower (u[loan-number] = s[loan-number]  t [customer-name] = u[customer-name])   v  customer (u[customer-name] = v[customer-name]  t[customer-city] = v[customer-city])))}

Example Queries • Find the names of all customers who have an account at all branches located in Brooklyn: {t |  c  customer (t[customer.name] = c[customer-name])   s  branch(s[branch-city] = “Brooklyn”   u  account ( s[branch-name] = u[branch-name]   s  depositor ( t[customer-name] = s[customer-name]  s[account-number] = u[account-number] )) )}

Safety of Expressions • It is possible to write tuple calculus expressions that generate infinite relations. • For example, {t |  t r} results in an infinite relation if the domain of any attribute of relation r is infinite • To guard against the problem, we restrict the set of allowable expressions to safe expressions. • An expression {t | P(t)}in the tuple relational calculus is safe if every component of t appears in one of the relations, tuples, or constants that appear in P

Understanding Relational Databases: Structure and Query Languages

Understanding Relational Databases: Structure and Query Languages

Presentation Transcript

Relational Model

Chapter 3: Relational Model

Chapter 5: Relational Model and Normalization

Chapter 3: Relational Model

The Relational Data Model

The Relational Model

The Relational Model

Unit 3 The Relational Model

Chapter 5:

Chapter 5

Chapter 3: Relational Model

Chapter 3: Relational Model

Chapter 5 Relational Model Concepts

CHAPTER 3 THE RELATIONAL MODEL

Chapter 3 The Relational Database Model

Chapter 3: Relational Model

Chapter 2: Relational Model

Chapter 3

Chapter 2

Chapter 3

The Relational Model