140 likes | 253 Views
CS240A: Databases and Knowledge Bases Recursive Queries in SQL 2003. Notes From Chapter 9 of Advanced Database Systems by Zaniolo, Ceri, Faloutsos, Snodgrass, Subrahmanian and Zicari Morgan Kaufmann, 1997. Carlo Zaniolo Department of Computer Science University of California, Los Angeles.
E N D
CS240A: Databases and Knowledge BasesRecursive Queries in SQL 2003 Notes From Chapter 9 of Advanced Database Systems by Zaniolo, Ceri, Faloutsos, Snodgrass, Subrahmanian and Zicari Morgan Kaufmann, 1997 Carlo Zaniolo Department of Computer Science University of California, Los Angeles
Datalog Patterns assemby(PART, SUBPART) .. %left-linear all(Major, Minor) <- all (Major, Mid), assembly(Mid, Minor). %right-linear all(Major, Minor) <- assembly(Major, Mid), all(Mid, Minor).
The Part/Subparts Exampleassembly(PART, SUBPART, QTY) CREATE RECURSIVE VIEW all_subparts(Major, Minor) AS SELECT PART SUBPART FROM assembly UNION SELECT all.Major assb.SUBPART FROM all_subparts AS all, assembly AS assb WHERE all.Minor= assb.PART • This is called a recursive union. • We have the union of an Exit Select and a Recursive Select • This definition corresponds to left-linear recursive Datalog rules • Materialization of the recursive view from the previous Example SELECT * FROM all_subparts
The WITH Construct • The WITH construct provides another way, and a more direct one to express recursion in SQL:1999 • Query: Find the parts using top tube WITH RECURSIVE all_super(Major, Minor) AS (SELECT PART, SUBPART FROM assembly UNION ALL SELECT assb.PART, all.Minor FROM assembly AS assb, all_super AS all WHERE assb.SUBPART = all.Major ) SELECT * FROM all_subpartsWHERE Minor = 'top_tube‘ • this corresponds to a right-linear Datalog rule
Implementation CREATE RECURSIVE VIEW all_subparts(Major, Minor) AS SELECT PART SUBPART FROM assembly UNION ALL SELECT all.Major assb.SUBPART FROM all_subparts AS all, assembly AS assb WHERE all.Minor= assb.PART • To implement the differential fixpoint improvement we replace the recursive relation all_subparts in the FROM clause by all_subparts, where all_subparts contains the new tuples generated in the previous iteration of differential fixpoint Algorithm. (But also eliminate duplicates as they are generated)
Find the parts using top tube(on RightLinear Recursion) WITH RECURSIVE all_super(Major, Minor) AS (SELECT PART, SUBPART FROM assembly UNION ALL SELECT assb.PART, all.Minor FROM assembly AS assb, all_super AS all WHERE assb.SUBPART = all.Major) SELECT * FROM all_super WHERE Minor = 'top tube‘ Add Minor= 'top tube', to the WHERE clauses in exit select and the recursive select: WITH RECURSIVE all_super(Major, Minor) AS (SELECT PART, SUBPART FROM assembly WHERE SUBPART = 'top tube' UNION ALL SELECT assb.PART, all.Minor FROM assembly AS assb, all_super AS all WHERE assb.SUBPART = all.Major AND all.Minor = 'top tube‘) SELECT * FROM all_super
Implementation left-recursion and right-recursion SELECT * FROM_all subparts WHERE Minor = 'top tube' Expressed against the virtual view of CREATE RECURSIVE view all_subparts(Major, Minor) AS SELECT PART SUBPART FROM assembly UNION SELECT all.Major AS assb.SUBPART FROM all_subparts AS all, assembly AS assb WHERE all.Minor= assb.PART The addition of the condition Minor = 'top tube' to the recursive select would not produce an equivalent query. Thus, the compiler transforms the recursive select into its rightlinear equivalent , then adds Minor ='top tube' to the WHERE clause.
Example from Chamberlin’s book WITH trips(destination, route, nsegs, totalcost) AS ((SELECT destination, CAST(destination AS Varchar(20)), 1, cost FROM flights WHERE origin = 'SFO') UNION ALL (SELECT f.destination, CAST(t.route || ', ' || f.destination AS Varchar(20)), t.nsegs + 1, t.totalcost + f.cost FROM trips t, flights f WHERE t.destination = f.origin AND f.destination <> 'SFO' AND f.origin <> 'JFK' AND t.nsegs < 3 )) SELECT route, totalcost FROM trips WHERE destination = 'JFK' AND totalcost = (SELECT min(totalcost) FROM trips WHERE destination = 'JFK');
Limitations of Stratification • Aggregates are nonmonotonic • They cannot be used inside recursion—stratification condition for aggregates • Without aggregates we cannot express optimal graph-traversal algorithm • Much research work in nonmonotonic reasoning. • Naïve extensions do not work: e.g. Russell’s paradox—who shaves the barber?
Recursive Rule Processing in SQL • In general the compilation techniques usable for such transformations are basically those previously described for Datalog. Including the magic set transformation. • Also stratification w.r.t. negation and aggregates* is required in SQL:1999 • Mutual recursion is supported and this where WITH becomes indispensable. ---------* Because aggregates are nonmonotonic!
Recursive Rule Processing in SQL • In general, the compilation techniques usable for such transformations are basically those previously described for Datalog---Including the magic set transformation • Also stratification w.r.t. `negation’ and aggregates is required by SQL:1999. • By negation, we mean NOT EXIST and EXCEPT because conditions such as NOT(A>B) become A<=B. • That means that you cannot express several graph algorithms in SQL:1999
Discussion • Datalog contributions • Power (beyond SQL-2) • Rule-based programming paradigm very effective in many applications • Simple Formal semantics • Novel Technology • Significant impact on SQL:1999 (SQL:2003 a clean up version of SQL:1999) • Recursive queries now supported in commercial DBMSs • Limited used in applications, because hard to write and also restrictions imposed by stratification.
Discussion (cont.) Recursion add expressive power to SQL but • Complex objects not supported: (Deductive Object-Oriented DBs—DOODs)... As opposed to OO-DB research that instead wants to unify DBs and object-oriented PL. • The Non-Monotonic conundrum: • Not a new problem: naïve set theory (Russell’s) paradox, AI Circumscription (J. Carthy) • Stratification is not sufficient, in many cases to: • ensure termination, and • Express classical graph algorithms (using extrema aggregates). • Significant progress achieved in this area not part of SQL standards.
Discussion (cont.) • Datalog remains a great framework for formal database research: E.g. • Integration Information, • Data Mining (Inductive Databases) • Network routing algorithms • Static program analysis.