Explore the algebraic approach in updating materialized views for efficient data retrieval and query optimization in data warehouses and information integration systems. Learn about XPROP and XAT for XML update propagation.
An Algebraic Approach For Incremental Maintenance of Materialized XQuery Views
Maged EL-Sayed, Ling Wang, Luping Ding, and Elke A. Rundensteiner
DSRG Lab, Computer Science Dept.
Worcester Polytechnic Institute
WIDM2002

Materialized Views
Why Views?
• Data warehouses
• Information Integration
• Information Inter-portability
Why Materialized?
• Speeding up data retrieval
• Query optimization
Focus on materialized views built using XQuery
[Figure: a view defined by a query over RDB, XML, and other sources]

Updating Materialized Views
• Updates to sources are common
• Views need to be maintained to be consistent with sources
• How to update views:
  • Re-computation
  • Incremental update

Related Work
• Relational model: [GM95], [ZGMHW95], [AASY98], [ZR00], [KR02]
  Data updates, schema changes, relevant updates, algebraic-based propagation, concurrent updates, extended SQL, batch updates
• XML model: ARGOS [QCR01]: XQL, local cache indexes, relevant updates
• Semi-structured data: [AMRVW98]: Lorel query language

Goal
Efficient update propagation for updates on XML views
• To handle different update types:
  • Element: insert, delete, and change
  • Attribute: insert, delete, and change
• To handle any XML view defined by XQuery

Our Approach: XPROP
• Uses XAT algebra [Rainbow]
• Uses update primitives to update XML
• Algebraic rules for updates:
  • For each algebraic update
  • For each update primitive
• Propagate only relevant updates
[Figure: an update on one of the XML sources passes through the algebra tree, built from the XQuery definition, to the XML view]

Background: XML Algebra
XAT Algebraic Operators: [Rainbow]
• XML Operators: ex. Navigate
• SQL Operators: ex. Select
• Special Operators: ex. Function
XAT Data Model: [Rainbow]
• XAT table can hold:
  • Atomic values
  • Elements/attributes
  • Collection of elements
[Figure: S “reviews.xml” $s2, then $s2, entry $b]

Example

reviews.xml:
<reviews>
  <entry>
    <title>Data on the Web</title>
    <review>A very good discussion of semi-structured database systems and XML</review>
  </entry>
  <entry>
    <title>Advanced Programming in the Unix environment</title>
    <review>A clear and detailed discussion of UNIX programming</review>
  </entry>
  <entry>
    <title>TCP/IP Illustrated</title>
    <review>One of the best books on TCP/IP</review>
  </entry>
</reviews>

bib.xml:
<bib>
  <book year="1992">
    <title>Advanced Programming in the Unix environment</title>
    <author>Darcy Gerbarg</author>
    <publisher>Addison-Wesley</publisher>
  </book>
  <book year="2000">
    <title>Data on the Web</title>
    <author>Serge Abiteboul</author>
    <publisher>Morgan Kaufmann Publishers</publisher>
  </book>
  <book year="1994">
    <title>TCP/IP Illustrated</title>
    <author>W. Stevens</author>
    <publisher>Addison-Wesley</publisher>
  </book>
</bib>

XQuery Example
Retrieve book title for all books published by “Morgan Kaufmann Publishers” and their respective reviews

View Definition:
<Result>
FOR $a IN document("bib.xml")/book,
    $b IN document("reviews.xml")/entry
WHERE $a/title = $b/title
  and $a/publisher = "Morgan Kaufmann Publishers"
RETURN
  <Book_Review>
    $a/title, $b/review
  </Book_Review>
</Result>

Materialized View:
<Result>
  <Book_Review>
    <title>Data on the Web</title>
    <review>A very good discussion of semi-structured database systems and XML</review>
  </Book_Review>
</Result>

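The view definition above can be mimicked in a few lines of Python to make its semantics concrete. This is only a sketch using the standard `xml.etree.ElementTree` module, not the Rainbow/XPROP implementation; the names `compute_view` and `view_titles` are hypothetical. It also shows full re-computation after a source update, the baseline that incremental maintenance is meant to avoid.

```python
import xml.etree.ElementTree as ET

# Sample sources, copied from the Example slide.
BIB_XML = """<bib>
  <book year="1992"><title>Advanced Programming in the Unix environment</title>
    <author>Darcy Gerbarg</author><publisher>Addison-Wesley</publisher></book>
  <book year="2000"><title>Data on the Web</title>
    <author>Serge Abiteboul</author><publisher>Morgan Kaufmann Publishers</publisher></book>
  <book year="1994"><title>TCP/IP Illustrated</title>
    <author>W. Stevens</author><publisher>Addison-Wesley</publisher></book>
</bib>"""

REVIEWS_XML = """<reviews>
  <entry><title>Data on the Web</title>
    <review>A very good discussion of semi-structured database systems and XML</review></entry>
  <entry><title>Advanced Programming in the Unix environment</title>
    <review>A clear and detailed discussion of UNIX programming</review></entry>
  <entry><title>TCP/IP Illustrated</title>
    <review>One of the best books on TCP/IP</review></entry>
</reviews>"""


def compute_view(bib, reviews, publisher="Morgan Kaufmann Publishers"):
    """Hypothetical helper: evaluate the view query by nested loops."""
    result = ET.Element("Result")
    for a in bib.findall("book"):                      # FOR $a IN .../book
        if a.findtext("publisher") != publisher:       # $a/publisher = "..."
            continue
        for b in reviews.findall("entry"):             # $b IN .../entry
            if a.findtext("title") == b.findtext("title"):
                book_review = ET.SubElement(result, "Book_Review")
                book_review.append(a.find("title"))    # $a/title
                book_review.append(b.find("review"))   # $b/review
    return result


def view_titles(view):
    """Hypothetical helper: titles of all Book_Review elements, in order."""
    return [br.findtext("title") for br in view.findall("Book_Review")]


bib = ET.fromstring(BIB_XML)
reviews = ET.fromstring(REVIEWS_XML)
titles_before = view_titles(compute_view(bib, reviews))

# A source update followed by full re-computation of the view.
bib.find("book").find("publisher").text = "Morgan Kaufmann Publishers"
titles_after = view_titles(compute_view(bib, reviews))
```

Re-computation touches every book and every entry again, even though only one book changed; that cost is what motivates propagating the update instead.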
From XQuery to Algebra Tree
Operators: S: Source, Navigate, J: Join, T: Tagger, Agg: Aggregate
Algebra tree, from the root down:
T<Result>$col13</Result> $col14
Agg $col13
T<Book_Review>$col11$col12</Book_Review> $col13
J (($col11 == $col5) AND ($col7 == "Morgan Kaufmann Publishers")) $col1, $col12
Navigate operators over bib.xml:
  $a, title $col11
  $a, publisher $col7
  $s1, book $a
  S “bib.xml” $s1
Navigate operators over reviews.xml:
  $b, title $col5
  $b, review $col12
  $s2, entry $b
  S “reviews.xml” $s2

Sample XAT Execution
Operators, top to bottom:
  $b, title $col5
  $b, review $col12
  $s2, entry $b
  S “reviews.xml” $s2

reviews.xml:
<reviews>
  <entry>
    <title>Data on the Web</title>
    <review>A very good discussion of semi-structured database systems and XML</review>
  </entry>
  <entry>
    <title>Advanced Programming in the Unix environment</title>
    <review>A clear and detailed discussion of UNIX programming</review>
  </entry>
  <entry>
    <title>TCP/IP Illustrated</title>
    <review>One of the best books on TCP/IP</review>
  </entry>
</reviews>

XPROP Update Primitives
XML update primitives (xmlup): apply to XML documents
• AddAtt(att, valu, pos)
• DeleteAtt(pos)
• ChangeAtt(valu, pos)
• AddEle(el, pos)
• DeleteEle(pos)
• ChangeEle(el, pos)
XAT update primitives (xatup): apply to XAT tables
• InsertTuple (tup, ord)
• DeleteTuple (id)
• ChangeTuple (xmlup, ucol, id)

Update Propagation in XAT Table
XAT Update Primitives:
• InsertTuple (tup, ord)
• DeleteTuple (id)
• ChangeTuple (xmlup, ucol, id)
Example: ChangeTuple (DeleteEle(Author[2]:book), $col1, 2)
• xmlup: DeleteEle(Author[2]:book)
• pos: Author[2]:book
• ucol: $col1
• id: 2

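The three XAT update primitives can be sketched as small Python data types applied to an in-memory table. Everything here is an assumption made for illustration: a XAT table is modeled as a list of dicts with an `"id"` key, an xmlup is modeled as a plain callable on the cell value, and `delete_ele` and `apply_xatup` are hypothetical names, not part of XPROP.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class InsertTuple:
    tup: Dict[str, Any]   # the new tuple; assumed to carry its own "id"
    ord: int              # 1-based position in the table


@dataclass
class DeleteTuple:
    id: int


@dataclass
class ChangeTuple:
    xmlup: Callable[[Any], Any]   # assumption: xmlup modeled as a function
    ucol: str                     # column holding the value to update
    id: int                       # tuple to update


def delete_ele(tag: str, order: int) -> Callable[[ET.Element], ET.Element]:
    """Hypothetical stand-in for DeleteEle(tag[order]:...) on one element."""
    def apply(elem: ET.Element) -> ET.Element:
        elem.remove(elem.findall(tag)[order - 1])
        return elem
    return apply


def apply_xatup(table: List[Dict[str, Any]], up: Any) -> None:
    """Apply one XAT update primitive to the table in place."""
    if isinstance(up, InsertTuple):
        table.insert(up.ord - 1, up.tup)
    elif isinstance(up, DeleteTuple):
        table[:] = [row for row in table if row["id"] != up.id]
    elif isinstance(up, ChangeTuple):
        for row in table:
            if row["id"] == up.id:
                row[up.ucol] = up.xmlup(row[up.ucol])
    else:
        raise TypeError(f"unknown update primitive: {up!r}")


table = [
    {"id": 1, "$col1": ET.fromstring("<book><author>A</author></book>")},
    {"id": 2, "$col1": ET.fromstring(
        "<book><author>B</author><author>C</author></book>")},
]

# Analogue of ChangeTuple(DeleteEle(Author[2]:book), $col1, 2),
# with a lower-case <author> tag in this toy data.
apply_xatup(table, ChangeTuple(delete_ele("author", 2), "$col1", 2))
authors_after = [a.text for a in table[1]["$col1"].findall("author")]

apply_xatup(table, InsertTuple({"id": 3, "$col1": ET.Element("book")}, 1))
apply_xatup(table, DeleteTuple(1))
ids_after = [row["id"] for row in table]
```

Modeling xmlup as a callable keeps the sketch short; a closer model would carry the primitive name, the element, and the position as data.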
XPROP Update Primitives
[Figure labels: Update, xmlup, xatup, XQuery, XML View, XAT, XQuery Definition, XML Source (three)]

Update Propagation Example
“Change publisher element of first book element to "Morgan Kaufmann Publishers"”

Update to the source (bib.xml):
ChangeEle(<publisher>Morgan Kaufmann Publishers</publisher>, book[1].publisher[1] : bib)

Effect on View:
<Result>
  <Book_Review>
    <title>Advanced Programming in the Unix environment</title>
    <review>A clear and detailed discussion of UNIX programming</review>
  </Book_Review>
  <Book_Review>
    <title>Data on the Web</title>
    <review>A very good discussion of semi-structured database systems and XML</review>
  </Book_Review>
</Result>
[Figure: XAT over bib.xml and reviews.xml]

Update Propagation in XAT (I)
1 ChangeEle (<publisher>Morgan Kaufmann Publishers</publisher>, book[1].publisher[1]:bib)
2 ChangeTuple(ChangeEle (<publisher>Morgan Kaufmann Publishers</publisher>, book[1].publisher[1]:bib), $s1, 1)
3 ChangeTuple(ChangeEle (<publisher>Morgan Kaufmann Publishers</publisher>, publisher[1]:book), $a, 1)
4 ChangeTuple(ChangeEle (<publisher>Morgan Kaufmann Publishers</publisher>, publisher[1]:book), $a, 1)
Operators shown:
  $a, publisher $col7
  $b, title $col5
  $a, title $col11
  $b, review $col12
  $s1, book $a
  $s2, entry $b
  S “bib.xml” $s1
  S “reviews.xml” $s2

Update Propagation in XAT (II)
Agg $col13
7 InsertTuple({2 ,2, <Book_Review><title>t1</title><review>r2</review></Book_Review>}, 1)
T<Book_Review>$col11$col12</Book_Review> $col13
6 InsertTuple({2 ,{1, 2}, <title>t1</title>, <review>r2</review>}, 1)
J (($col11 == $col5) AND ($col7 == "Morgan Kaufmann Publishers")) $col1, $col12
5 ChangeTuple(ChangeEle (<publisher>Morgan Kaufmann Publishers</publisher>, publisher), $col7, 1)
$a, publisher $col7
$b, title $col5

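The step from update 5 to update 6 above, where a change to a join column turns into an inserted join tuple, can be illustrated with a generic delta rule. The slides do not spell out XPROP's actual join rule, so the following is only a sketch under assumptions: tuples are dicts, the function name `propagate_change_through_join` is hypothetical, and each emitted update is reduced to a (kind, left id, right id) triple.

```python
def join_pred(left, right):
    """Join condition from the algebra tree, over dict-shaped tuples."""
    return (left["$col11"] == right["$col5"]
            and left["$col7"] == "Morgan Kaufmann Publishers")


def propagate_change_through_join(old_left, new_left, right_rows, pred):
    """Compare join matches before and after a change to one left tuple.

    Pairs that start matching yield an insert, pairs that stop matching
    yield a delete. Pairs that match both before and after would need a
    ChangeTuple, which this sketch leaves out.
    """
    updates = []
    for right in right_rows:
        matched_before = pred(old_left, right)
        matches_now = pred(new_left, right)
        if matches_now and not matched_before:
            updates.append(("InsertTuple", new_left["id"], right["id"]))
        elif matched_before and not matches_now:
            updates.append(("DeleteTuple", old_left["id"], right["id"]))
    return updates


# Left input: the first book, before and after its publisher changes.
old_left = {"id": 1,
            "$col11": "Advanced Programming in the Unix environment",
            "$col7": "Addison-Wesley"}
new_left = dict(old_left)
new_left["$col7"] = "Morgan Kaufmann Publishers"

# Right input: the three review entries, keyed by title.
right_rows = [
    {"id": 1, "$col5": "Data on the Web"},
    {"id": 2, "$col5": "Advanced Programming in the Unix environment"},
    {"id": 3, "$col5": "TCP/IP Illustrated"},
]

updates = propagate_change_through_join(old_left, new_left, right_rows, join_pred)
```

Only the changed left tuple is re-joined, so the work is proportional to the size of the other input rather than to the full join.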
Update Propagation in XAT (III)
XML View
10 AddEle (<Result><Book_Review>..</Book_Review></Result>, 1: Result)
9 <Book_Review><title>t1</title><review>r2</review></Book_Review>
  ChangeTuple(AddEle (<Result><Book_Review>..</Book_Review></Result>, 1: Result), $col14, 1)
T<Result>$col13</Result> $col14
8 <Book_Review><title>t1</title><review>r2</review></Book_Review>
  ChangeTuple(AddEle (<Book_Review><title>t1</title><review>r2</review></Book_Review>, 1), $col13, 1)
Agg $col13

System Architecture
[Figure labels: XML View; XPROP* (Propagation Rules, XAT Maintainer, Intermediate Materialization, Update Handle); RAINBOW (XML Query Manager); XML Source (three); Update. Legend: System Module, Storage, Data]
*Our system is built as an extension to the Rainbow system developed at WPI

Experimental Result

Summary
• One of the first algebraic incremental view maintenance approaches for XML
• Proposed a set of update primitives for updating XML documents
• Handles general updates on XML
• Proposed propagation rules to handle updates incrementally

Future Work
• Use references in XAT tables
• Batch updates
• XAT tree rewrite
• Extensive experiments

Rainbow Project: http://davis.wpi.edu/dsrg/rainbow/index.html
XPROP Extension: http://davis.wpi.edu/dsrg/xprop/index.html

Extension to XAT Table
We need extra information:
• Tuple ID (ID)
• Tuple parent ID (P)
[Figure: S “bib.xml” $s1, then $s1, book $col1, then $cols, Author $col2, evaluated over bib.xml]

Update Position
Update position (pos), written RF : EP: [Niagara]
• Entry point part (EP)
• Relative forward part (RF)
Example: book[2].author[1] : bib
EP can be:
• Element name: ex. book[2].author[1] : bib
• XML fragment order: ex. author[2] : 1
• Null: ex. Null : Null
RF can be:
• Set of elements/attributes with order: ex. book[2].author[1] : bib
• Null: ex. Null : Book

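To make the position notation concrete, here is a small Python sketch that splits a position string into its two parts and follows the relative forward part through a document. It assumes, based only on the examples above, that the text before the colon is RF and the text after it is EP, and that each RF step has the form `name[order]` with 1-based order. The names `parse_pos` and `resolve` are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# One RF step such as "book[2]": an element name and a 1-based order.
STEP = re.compile(r"^(\w+)\[(\d+)\]$")


def parse_pos(pos):
    """Split 'RF : EP' into (ep, rf_steps); 'Null' becomes None or []."""
    rf_text, ep_text = (part.strip() for part in pos.split(":", 1))
    ep = None if ep_text == "Null" else ep_text
    rf = []
    if rf_text != "Null":
        for step in rf_text.split("."):
            match = STEP.match(step)
            if match is None:
                raise ValueError(f"malformed step: {step!r}")
            rf.append((match.group(1), int(match.group(2))))
    return ep, rf


def resolve(entry_point, rf):
    """Follow the RF steps downward from the entry point element."""
    node = entry_point
    for name, order in rf:
        node = node.findall(name)[order - 1]
    return node


bib = ET.fromstring(
    "<bib>"
    "<book><author>Darcy Gerbarg</author></book>"
    "<book><author>Serge Abiteboul</author></book>"
    "</bib>"
)

ep, rf = parse_pos("book[2].author[1] : bib")
target = resolve(bib, rf)
```

With an empty RF, `resolve` returns the entry point itself, which matches the reading of `Null : Book` as "the Book element as a whole".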
Update Position and XAT
Data type in XAT      Update position (example)
Atomic value          Null : Null
Element/attribute     Author[1].last[1] : Book
Collection            last[1] : 2