240 likes | 339 Views
The rest of Ch. 5 – Relational Algebra Ch. 10 – Logical query languages. Combining Operations to Form Queries. Construct more complex expressions by applying operators to sub-expressions Use parentheses to indicate operands grouping Multiple ways to write equivalent queries
E N D
The rest of Ch. 5 – Relational Algebra • Ch. 10 – Logical query languages
Combining Operations to Form Queries • Construct more complex expressions by applying operators to sub-expressions • Use parentheses to indicate operands grouping • Multiple ways to write equivalent queries • Expression tree for visualizing complex expression • Query optimizer • Example (ref. Fig. 5.8): title,year(length≥100(Movie) studioName=‘Fox’(Movie)) title,year(length≥100 ANDstudioName=‘Fox’(Movie)) or
Algebraic Laws • Associative: e.g. R (S T) = (R S) T • Commutative: e.g. R S = S R
DependentandIndependent Operations • Set operations:union, difference,intersection • Remove parts of a relation: • Selection – eliminates rows (tuples) • Projection – eliminates columns • Combine tuples of two relations :cartesian productandjoins • Renaming: changes relation schema (i.e. relation name, and/or attribute names)
Constraints on Relations Two ways to express constraints in relational algebra: • R = Ø // R Ø • R S // R - S = Ø where R and S are relational algebra expressions
Referential Integrity Constraints • Assertion: a value appearing in one context also appears in another, related context. • Example: Movie(title, year, length, inColor, studioName, producerC#) MovieExec(name, address, cert#, networth) Constraint: the producer of every movie is a certified movie executive, i.e. appear in the MovieExec relation producerC#(Movie) cert#(MovieExec) producerC#(Movie) - cert#(MovieExec) = Ø or
Other Constraints • Domain constraints example: MovieStars(name, address, gender, birthdate) Constraint: acceptable values for the “gender” attribute are ‘F’ or ‘M’ gender’F’ AND gender‘M’(MovieStar) = Ø • Other constraints example: MovieExec(name, address, cert#, networth) Studio(name, address, presC#) Constraint: president of a movie studio must have a net worth of at least $10,000,000 networth<10000000(Studio presC#=cert# MovieExec) = Ø • Functional dependency constraints
Relational Operations on Bags • Bag • a “set” that is allowed to have more than one occurrence of an element • => duplicate tuples in a relation • Constraint representations work with bags • Reason: • For implementation efficiency when duplication is acceptable • When actual no. of tuples is needed for aggregate • Example:
Relational Operations of Bags (continue) Given: R and S are bags, and tuple t appears in R n times, and in S m times • R S : contains n + m tuple t • R – S : contains max(0, n-m) tuple t • R S : contains min(n, m) tuple t • A,B(R) : each tuple is processed independently, resulting duplicate tuples are not eliminated • c(R) : apply the selection condition to each tuple independently, resulting duplicate tuples are not eliminated
Product and Joins of Bags Given: R and S are bags, and tuple r appears in R m times, and tuple s appears in S n times • R x S : the resulting tuple rs will appear mn times. • R S : each tuple of R is compared to each tuple of S to decide if the pair tuples joins successfully, do not eliminate duplicates • R CS : each tuple of R is compared to each tuple of S to decide if the condition C is met, do not eliminate duplicates
Extended Operation to Relational Algebra • Duplicate elimination : to convert a bag to a set • Aggregation: count, sum, max, min, average • Grouping • Extended Projection • Sorting • Outerjoins
OUTERJOINS • Dangling tuples: tuples that failed to match any tuple of the other relation in the common attributes. • An operator to augment the result of a join by the dangling tuples, padded with null values. • R S : Full outerjoin of R1 and R2 is a join that includes all rows from R1 and R2 matched or not. Unmatched rows are padded with special null symbols . • LEFT outerjoin of R1 and R2 is a join that includes all rows from R1, matched or not, plus the matching values from R2. Unmatched rows are padded with . • RIGHT outerjoin of R1 and R2 is a join that includes all rows from R2, matched or not, plus the matching values from R1. Unmatched rows are padded with . • The joining may be NATURAL or theta join
Outer Join Example R S Full Outer Join Natural join
Extensions to the Relational Model • Modifications : insert, delete, update • Views : relational expression with a name to be applied real relations to produce the relation defined by the expression. Views can used as arguments to other expressions. • Null values : common interpretations: • Value unknown • Value inapplicable • Value withheld
Ch. 10 Logical Query Languages • Motivation • Datalog • Relational Algebra to Datalog • Recursion in Datalog • Negation in Recursive Rules
Motivation • Logical rules is more natural in representing recursive queries • Logical rules form the basis of many information-integration applications
A Datalog rule example • Relation: Movie(title, year, length, inColor, studioName, producerC#) LongMovie(t, y) Movie(t, y, l, c, s, p) AND l 100 head subgoals body LongMovie = title, year(length≥100(Movie))
Datalog rule • Relational Atoms : predicate followed by arguments • Arithmetic Atoms : comparison between two arithmetic expressions (e.g. x ≠ Y) • Predicate = relation name or arithmetic comparison predicates (e.g. =, <, ≠, etc) • Head – a relational atom • Body – one or more atoms (subgoals) connected by AND • Subgoals (not head) may be optionally negated by NOT • Local variables – variables in body, not in head
Datalog • A logic based data model • The underlying mathematical model of data is essentially that of the relational model • Predicate symbols denote relations • Relational algebra operations are described by rules • Query : a collection of one or more rules. • The relation in the rule head is the answer to the query
Extensional and Intensional Predicates • Extensional Predicates (EDB) The set of relations which ARE defined as part of the actual database (i.e. physically stored). e.g. R = {1} • Intensional Predicates (IDB) The set of relations which are NOT defined as part of the actual database but are instead abstracted from logical rules. e.g. P (x) Q (x) Q (x) R (x) • A predicate must be IDB or EDB but not both. • IDB predicate can appear in the body or head of a rule • EDB predicate can appear in the rule body only
3 Different Interpretations of Logical Rules • Proof-Theoretic Interpretaton • Model-Theoretic Interpretation • Computational Interpretation The following discussions will use the EDB: R = {1}
Proof-Theoretic Interpretation As axioms to be used in a proof. • From the facts in the database, see what other facts can be proved using the rules in all possible ways. • All facts derivable using the rules are derivable by applying the rules in the forward direction only • Example: P = {1}, Q = {1}
Model-Theoretic Interpretation As definition of possible worlds or models. • To be a model, an interpretation must make the rules true, no matter what assignment of values is made for the variables in each rule. • Multiple models are possible • With no negations, a unique minimal model exists that gives the same result as the proof-theoretic interpretation. • Minimal model : cannot make any true fact false and still have a model consistent with the EDB • Example: • P = {1, 2, 3}, Q = {1, 2} • P = {1}, Q={1}
Computational Interpretation By providing an algorithm for “executing” the rules to determine whether a potential fact is true or false. • E.g. Prolog – uses a particular algorithm that involves searching for proofs of the potential fact. • Drawback: • the set of facts Prolog finds a proof is not necessarily the same as the set of all facts for which a proof exists • The set of facts Prolog finds true is not necessarily a model.