Explore strategies for updating XML views efficiently, balancing correctness, translation strategies, and schema-driven approaches for a comprehensive solution. Experiment results show promising performance gains over traditional methods.
HUX: Handling Updates in XML
Ling Wang, Elke A. Rundensteiner, Murali Mani and Ming Jiang
DataBase Systems Research Group, Department of Computer Science, Worcester Polytechnic Institute, Worcester, MA 01609, USA
{lingw|rundenst|mmani|jiangm}@cs.wpi.edu
Updating XML Views for Data Integration
[Figure: virtual XML views over RDB, XML, and OODB sources. Source labels: Citi Bank, GoogleMap, Amazon, American Airline, Protein Sequence Database (PSD), Time Magazine. Application labels: online billing, Google map, e-ticket, shopping, biologist, news.]
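Examples
[Figure: two example views over Region and Nation. One has repeated Region and Regionnew elements under View; the other has repeated Region and repeated Nation elements joined on R.regionkey=N.regionkey.]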
Challenges
• Update Translatability Checking
  • Does at least one correct translation exist?
  • If yes, what are the candidate translations?
  • If not, where could the view side effect occur?
• Update Translation Strategy
  • What are the correct translations?
  • How to find the correct translations?
  • Which one is the best translation?
Criteria of correct translations
[Figure: diagram with steps (1)-(4) relating the database D, the view V produced by the view query, the view update u producing u(V), and the translated update U producing U(D).]
Accept the update if there exists a correct translation:
• View side-effect free
• No “extra” updates
Otherwise:
Option 1: Face unexpected side effects
Option 2: Expensive rollback to fix problems
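The side-effect-free criterion can be illustrated with a small sketch. It assumes a toy Region/Nation join view, and it treats a translation U as side-effect free when re-running the view query over U(D) yields exactly u(V). The data values and all function names are hypothetical and are not taken from HUX; the "no extra updates" criterion is not checked here.

```python
# Toy base data (hypothetical values, not from the slides).
DB = {
    "Region": {(1, "ASIA"), (2, "EUROPE")},
    "Nation": {(10, "CHINA", 1), (11, "JAPAN", 1), (20, "FRANCE", 2)},
}

def view(db):
    """View query: one (region name, nation name) element per joined pair."""
    return {
        (r_name, n_name)
        for (r_key, r_name) in db["Region"]
        for (n_key, n_name, n_rkey) in db["Nation"]
        if r_key == n_rkey
    }

def delete_element(v):
    """View update u: delete the element (ASIA, CHINA) from the view."""
    return v - {("ASIA", "CHINA")}

def delete_nation(db):
    """Candidate translation U1: delete the Nation tuple for CHINA."""
    return {**db, "Nation": db["Nation"] - {(10, "CHINA", 1)}}

def delete_region(db):
    """Candidate translation U2: delete the Region tuple for ASIA."""
    return {**db, "Region": db["Region"] - {(1, "ASIA")}}

def is_side_effect_free(db, view_update, db_update):
    """True when the view over U(D) equals u applied to the view over D."""
    return view(db_update(db)) == view_update(view(db))

print(is_side_effect_free(DB, delete_element, delete_nation))  # True
print(is_side_effect_free(DB, delete_element, delete_region))  # False
```

Deleting the Region tuple also removes the (ASIA, JAPAN) element, which is the kind of view side effect the criterion rules out.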
Naive Approach -- Pure Data-driven Check
• Pure data-driven check
  • Guarantee “safe”
  • Guarantee “complete”
• But:
  • Very inefficient for XML view updating
  • Data examination for all view nodes
Core Idea: When updating a view element, base tuples that contribute to other view elements should remain untouched.
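A minimal sketch of the core idea, assuming a toy Region/Nation join view: for a deletion, collect the base tuples that contribute to the target view element and keep only those that contribute to no other element. The data and the function names `contributions` and `safe_deletions` are hypothetical. Note that the check walks every view element, which is the cost the slide points out.

```python
# Toy base data (hypothetical values, not from the slides).
DB = {
    "Region": {(1, "ASIA"), (2, "EUROPE")},
    "Nation": {(10, "CHINA", 1), (11, "JAPAN", 1), (20, "FRANCE", 2)},
}

def contributions(db):
    """Map each view element to the set of base tuples that produce it."""
    out = {}
    for region in db["Region"]:
        for nation in db["Nation"]:
            if region[0] == nation[2]:  # R.regionkey = N.regionkey
                element = (region[1], nation[1])
                out[element] = {("Region", region), ("Nation", nation)}
    return out

def safe_deletions(db, target):
    """Base tuples of `target` that no other view element depends on."""
    contrib = contributions(db)
    used_elsewhere = set()
    for element, tuples in contrib.items():  # examines all view elements
        if element != target:
            used_elsewhere |= tuples
    return contrib[target] - used_elsewhere

print(safe_deletions(DB, ("ASIA", "CHINA")))
# {('Nation', (10, 'CHINA', 1))}
```

Here the Region tuple for ASIA is excluded because it also contributes to the (ASIA, JAPAN) element.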
HUX I: Exploiting Schema Knowledge
[Figure: classification tree. A view update is valid or invalid; the schema check classifies a valid update as untranslatable, translatable, or uncertain; the data check classifies an uncertain update as untranslatable or translatable.]
• Schema-driven translatable
  For every update on any element of the schema node: there is at least one correct translation
• Schema-driven untranslatable
  For every update on any element of the schema node: there does not exist any correct translation
View Elements Classification
[Figure: schema tree with nodes labeled S, A, D, and O.]
• Classify schema nodes into Self, Ancestor, Descendant, Others (SADO)
• No side effects on any element of SADO schema node
• View side-effects Classification
  • When updating a relation r to update a view element ve_i:
    (1) r may contribute to the existence of ve_j, ve_j ≠ ve_i
    (2) r_j may get deleted, where r_j is a relation that refers to r and contributes to ve_j, ve_j ≠ ve_i
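The Self / Ancestor / Descendant / Others classification can be sketched over a schema tree stored as a child-to-parent map. The tree shape below (in particular, Nation nested under Region) is an assumption made for illustration, and `classify` is a hypothetical helper, not the HUX implementation.

```python
# Child -> parent map for an assumed view schema tree.
PARENT = {
    "Region": "View",
    "Regionnew": "View",
    "Nation": "Region",
    "Customer": "Nation",
    "Orders": "Customer",
    "LineItem": "Orders",
}

def ancestors(node):
    """All schema nodes on the path from `node` up to the root."""
    result = []
    while node in PARENT:
        node = PARENT[node]
        result.append(node)
    return result

def classify(updated, other):
    """Classify `other` relative to the schema node being updated."""
    if other == updated:
        return "Self"
    if other in ancestors(updated):
        return "Ancestor"
    if updated in ancestors(other):
        return "Descendant"
    return "Others"

for node in ["Nation", "Region", "LineItem", "Regionnew"]:
    print(node, classify("Nation", node))
# Nation Self
# Region Ancestor
# LineItem Descendant
# Regionnew Others
```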
Exploiting Schema Knowledge (Cont'd)
[Figure: view schema with repeated Region and Regionnew elements under View and a nested chain of repeated Nation, Customer, Orders, and LineItem elements. Join conditions: R.regionkey=N.regionkey, N.nationkey=C.nationkey, C.customerkey=O.customerkey, O.orderkey=LI.orderkey.]
Core Idea: When updating a view element, relations that contribute to other view elements should remain untouched.
Can we delete a view element whose schema node is LineItem? Nation? Regionnew?
Pros: Efficiency: uses schema knowledge only
Cons: Conservative: always assumes the worst case
HUX II: Schema-directed Data-driven Check
[Figure: classification tree. A view update is valid or invalid; the schema check classifies a valid update as untranslatable, translatable, or uncertain; the data check classifies an uncertain update as untranslatable or translatable.]
• Data-driven translatable
  For the given update on an element of the schema node: there is at least one correct translation
• Data-driven untranslatable
  For the given update on an element of the schema node: there does not exist any correct translation
Schema-directed Data-driven Check (Cont'd)
[Figure: view schema with Region, Regionnew, Nation, Customer, Orders, and LineItem elements.]
Core Idea: When updating a view element, base tuples that contribute to other view elements should remain untouched.
Can we delete a view element whose schema node is Regionnew?
Not a pure data-driven check! Only check the part that the schema check cannot decide.
HUX: Handling Updates in XML
[Figure: system architecture. Inputs: view query, user update query, XML/RDB schema. Components: Annotated Schema Graph (ASG) Generator; STAR: Schema-driven Translatability Reasoning; SDC: Schema-directed Data Checking; SQL Update Generator. An untranslatable update produces an error message; an update that STAR leaves uncertain is passed to SDC; a translatable update is passed to the SQL Update Generator, which sends SQL updates to the data storage (Oracle, DB2, SQL-Server, Sybase).]
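The two-stage flow, schema-level reasoning first and data checking only for the uncertain cases, can be sketched as a small dispatcher. Everything here is hypothetical: the node names, the stub verdicts, and the function names stand in for STAR, SDC, and the SQL Update Generator and do not reproduce their logic.

```python
from enum import Enum

class Verdict(Enum):
    TRANSLATABLE = "translatable"
    UNTRANSLATABLE = "untranslatable"
    UNCERTAIN = "uncertain"

# Stub verdicts for made-up schema nodes; not results from the slides.
STUB_SCHEMA_VERDICTS = {
    "node_a": Verdict.TRANSLATABLE,
    "node_b": Verdict.UNTRANSLATABLE,
    "node_c": Verdict.UNCERTAIN,
}
data_check_calls = []

def schema_check(node):
    """Stand-in for schema-driven reasoning: uses schema knowledge only."""
    return STUB_SCHEMA_VERDICTS[node]

def data_check(node):
    """Stand-in for the data check; records when it is invoked."""
    data_check_calls.append(node)
    return Verdict.TRANSLATABLE

def generate_sql(node):
    """Stand-in for SQL generation; returns a placeholder string."""
    return f"-- SQL updates for deleting a {node} element"

def handle_update(node):
    verdict = schema_check(node)
    if verdict is Verdict.UNCERTAIN:  # only undecided cases touch the data
        verdict = data_check(node)
    if verdict is Verdict.TRANSLATABLE:
        return generate_sql(node)
    raise ValueError(f"untranslatable update on {node}")

print(handle_update("node_a"))
print(handle_update("node_c"))
print(data_check_calls)  # ['node_c']
```

The design point is that the data check runs only when the schema check cannot decide, so updates settled at the schema level never examine base data.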
Experiments: HUX vs. Data-based
[Figure: view schema with Region, Regionnew, Nation, Customer, Orders, and LineItem elements; chart series: data check, schema untranslatable, schema translatable.]
• TPCH Benchmark
• Data-based XML view updating (DXVU): extends the relational view update system [CWW2000] to perform the XML view check
• Observation
  (1) Schema cases: HUX performs much better than DXVU, a large performance gain.
  (2) Data cases: the two systems are comparable.
Contribution • Proposed theoretical foundation for XML view updates. • Proposed schema-centric view updating algorithm. • Proved the correctness and completeness. • Implemented as an extension of Rainbow query engine. • Demonstrated efficiency through experiments.
Rainbow Project
http://davis.wpi.edu/~dsrg/raindbow/
• Recent Publications
L. Wang, E. A. Rundensteiner, M. Mani, and M. Jiang. HUX: A Schema-centric Approach for Updating XML Views. In CIKM, 2006, to appear.
L. Wang, E. A. Rundensteiner, and M. Mani. UFilter: A Lightweight XML View Update Checker. In ICDE, 2006.
L. Wang, E. A. Rundensteiner, and M. Mani. Updating XML Views Published Over Relational Databases: Towards the Existence of a Correct Update Mapping. In DKE Journal, 2005.
L. Wang, S. Wang, B. Murphy, and E. A. Rundensteiner. Order Sensitive XQuery Processing over Relational Sources: An Algebraic Approach. In IDEAS, 2005.
L. Wang and E. A. Rundensteiner. On the Updatability of XQuery Views Published over Relational Data. In ER, pages 795–809, 2004.
HUX vs. Relational View Update System
[Figure: view schema with Region, Nation, Customer, Orders, and LineItem elements; chart labels: L, LO, LOC, LOCN, LOCNR.]
• Data-driven relational view update (RVU) system [CWW2000]
  Best case: finds the translation at the first probe.
  Worst case: finds it at the last probe.
• HUX: only schema-driven check
• Observation: HUX is better than RVU even for the best case.
Experiments
• HUX vs. non-guaranteed (unsafe) system
• HUX vs. relational view update system
• HUX vs. pure data-driven XML view update system
• Performance of HUX
• Usefulness of HUX (user study)
• Experimental setup
  • TPCH benchmark
  • Stop criteria: First Correct Translation (FCT)
  • Exhaustive search criteria: Find All Correct Translations (ACT)
• Parameters:
  • Database size
  • Keys and foreign keys
  • Element to be deleted in the view
  • View size
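The difference between the two search criteria can be sketched as follows, assuming candidate translations are probed one at a time against some correctness test. The candidate names and the `is_correct` predicate are hypothetical placeholders; the probe counter shows why a probing system's cost depends on where the first correct candidate sits in the order.

```python
def first_correct_translation(candidates, is_correct):
    """FCT: stop at the first correct candidate; also report probes used."""
    probes = 0
    for candidate in candidates:
        probes += 1
        if is_correct(candidate):
            return candidate, probes
    return None, probes

def all_correct_translations(candidates, is_correct):
    """ACT: exhaustive search that probes every candidate."""
    return [c for c in candidates if is_correct(c)]

# Hypothetical candidates; only "delete_nation" is treated as correct.
candidates = ["delete_region", "delete_customer", "delete_nation"]
is_correct = lambda c: c == "delete_nation"

print(first_correct_translation(candidates, is_correct))
# ('delete_nation', 3): correct candidate found at the last probe
print(first_correct_translation(candidates[::-1], is_correct))
# ('delete_nation', 1): correct candidate found at the first probe
print(all_correct_translations(candidates, is_correct))
# ['delete_nation']
```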