Containment of Partially Specified Tree-Pattern Queries

Containment of Partially Specified Tree-Pattern Queries. Dimitri Theodoratos (NJIT, USA), Theodore Dalamagas (NTUA, Greece), Pawel Placek (NJIT, USA), Stefanos Souldatos (NTUA, Greece), Timos Sellis (NTUA, Greece).


Presentation Transcript


  1. Containment of Partially Specified Tree-Pattern Queries. Dimitri Theodoratos (NJIT, USA), Theodore Dalamagas (NTUA, Greece), Pawel Placek (NJIT, USA), Stefanos Souldatos (NTUA, Greece), Timos Sellis (NTUA, Greece)

  2. Outline • Introduction • Data Model • Additional Concepts • Query Containment • Experiments • Conclusion

  3. Motivating Example • Tree structure (e.g. XML) with motorbike spare parts. • We search for spare parts. • BUT… [Figure: example tree rooted at r; node labels include GREECE, USA, ATHENS, NJ, YAMAHA, BMW, HONDA, ON-OFF, TRAVEL, 125cc, 200cc, 650cc, 1000cc, SERROW, F650GS, F650, VARADERO.] Stefanos Souldatos - HDMS 2006

  4. Motivating Example • Dimitri Theodoratos lives in NJ. • He has a Yamaha Serrow motorbike in Greece. • He searches for spare parts in Greece or the USA. → structural difference [Figure: the example tree of slide 3, with a "?" marker.]

  5. Motivating Example • Theodore Dalamagas has a BMW motorbike. • He looks for spare parts worldwide. → structural inconsistency: ../650cc/F650GS vs. ../F650GS/650cc [Figure: the example tree of slide 3.]

  6. Motivating Example • Stefanos Souldatos has a Honda Varadero. • But he is not fully aware of the tree structure. → unknown structure [Figure: the example tree of slide 3.]

  7. Motivating Example • Pawel Placek wants to buy a motorbike that he can easily find spare parts for. • He searches in many different tree structures. → source integration [Figure: three copies of the example tree of slide 3.]

  8. Motivation Querying tree-structured data BUT structure is not always strictly defined  user does not always deal with structure:  Find Honda spare parts in Greece. Stefanos Souldatos - HDMS 2006

  9. Outline • Introduction • Data Model • Additional Concepts • Query Containment • Experiments • Conclusion

  10. Dimension Graph • dimension graph = summary of the tree structure • Dimensions: R(oot), C(ountry), L(ocation), B(rand), T(ype), M(odel), E(ngine) [Figure: the example tree of slide 3 next to its dimension graph over R, C, L, B, T, M, E.]
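
As a rough illustration of how a dimension graph can summarize a tree, the sketch below assumes every tree node carries a dimension label and records an edge between two dimensions whenever they occur as parent and child somewhere in the tree. The tuple encoding, the small tree fragment, and the function name `dimension_graph` are assumptions made for this example; the slides do not show how the graph is actually built.

```python
from collections import defaultdict

# Hypothetical fragment of the motorbike tree: (dimension, value, children).
# The exact shape of the slide's tree is not recoverable, so this is illustrative only.
tree = ("R", "r", [
    ("C", "GREECE", [
        ("L", "ATHENS", [
            ("B", "BMW", [
                ("T", "TRAVEL", [
                    ("E", "650cc", [("M", "F650GS", [])]),
                ]),
            ]),
        ]),
    ]),
    ("C", "USA", [
        ("B", "BMW", [
            ("T", "TRAVEL", [
                ("M", "F650GS", [("E", "650cc", [])]),
            ]),
        ]),
    ]),
])


def dimension_graph(node, edges=None):
    """Collect parent-dimension -> child-dimension edges over the whole tree."""
    if edges is None:
        edges = defaultdict(set)
    dim, _value, children = node
    for child in children:
        edges[dim].add(child[0])      # child[0] is the child's dimension
        dimension_graph(child, edges)
    return edges


graph = dimension_graph(tree)
for parent in sorted(graph):
    print(parent, "->", sorted(graph[parent]))
```

In this fragment the two branches order engine and model differently (../650cc/F650GS and ../F650GS/650cc), so the resulting graph contains both an E-to-M and an M-to-E edge.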

  11. Partially Specified Tree-pattern Query • Query: Find shops with spare parts for all models and all engines of BMW motorbikes in Greece. (+ structural info) • Dimensions: R(oot), C(ountry), L(ocation), B(rand), T(ype), M(odel), E(ngine) [Figure: dimension graph and the query, with nodes C = {Greece}, B = {BMW}, B = {BMW}, M = ?, E = ?.]

  12. Partially Specified Tree-pattern Query • Query: Find shops with spare parts for all models and all engines of BMW motorbikes in Greece. (+ structural info) • Labels: partially specified paths (PSP) • Dimensions: R(oot), C(ountry), L(ocation), B(rand), T(ype), M(odel), E(ngine) [Figure: dimension graph and the query, with nodes C = {Greece}, B = {BMW}, B = {BMW}, M = ?, E = ? on paths PSP p1 and PSP *p2.]

  13. Partially Specified Tree-pattern Query • Query: Find shops with spare parts for all models and all engines of BMW motorbikes in Greece. (+ structural info) • Labels: partially specified paths (PSP); output path (*) • Dimensions: R(oot), C(ountry), L(ocation), B(rand), T(ype), M(odel), E(ngine) [Figure: dimension graph and the query, with nodes C = {Greece}, B = {BMW}, B = {BMW}, M = ?, E = ? on paths PSP p1 and PSP *p2.]

  14. Partially Specified Tree-pattern Query • Query: Find shops with spare parts for all models and all engines of BMW motorbikes in Greece. (+ structural info) • Labels: partially specified paths (PSP); output path (*); parent-child edge; ancestor-descendant edge • Dimensions: R(oot), C(ountry), L(ocation), B(rand), T(ype), M(odel), E(ngine) [Figure: dimension graph and the query, with nodes C = {Greece}, B = {BMW}, B = {BMW}, M = ?, E = ? on paths PSP p1 and PSP *p2.]

  15. Partially Specified Tree-pattern Query • Query: Find shops with spare parts for all models and all engines of BMW motorbikes in Greece. (+ structural info) • Labels: partially specified paths (PSP); output path (*); parent-child edge; ancestor-descendant edge; node sharing expression (NSE) • Dimensions: R(oot), C(ountry), L(ocation), B(rand), T(ype), M(odel), E(ngine) [Figure: dimension graph and the query, with nodes C = {Greece}, B = {BMW}, B = {BMW}, M = ?, E = ? on paths PSP p1 and PSP *p2.]
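
One possible in-memory representation of such a query is sketched below. The class names `PSP` and `Query`, the field layout, and above all the concrete edges and the node sharing expression in the example are assumptions: the slide's figure did not survive extraction, so only the annotations (C = {Greece}, B = {BMW}, M = ?, E = ?) and the path names p1 and *p2 come from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class PSP:
    """A partially specified path: annotated dimensions plus structural edges."""
    name: str
    output: bool = False                               # True for the starred output path
    annotations: dict = field(default_factory=dict)    # dimension -> set of values, None for "?"
    child: set = field(default_factory=set)            # (parent_dim, child_dim) pairs
    descendant: set = field(default_factory=set)       # (ancestor_dim, descendant_dim) pairs


@dataclass
class Query:
    paths: dict                                        # path name -> PSP
    node_sharing: set = field(default_factory=set)     # (dimension, path_a, path_b)

    def dimensions(self):
        """All dimensions that are annotated on at least one path."""
        return {d for p in self.paths.values() for d in p.annotations}


# Edges and the node sharing expression below are illustrative guesses.
p1 = PSP("p1",
         annotations={"C": {"Greece"}, "B": {"BMW"}, "M": None},
         descendant={("C", "B"), ("B", "M")})
p2 = PSP("p2", output=True,
         annotations={"C": {"Greece"}, "B": {"BMW"}, "E": None},
         descendant={("C", "B"), ("B", "E")})
query = Query({"p1": p1, "p2": p2}, node_sharing={("B", "p1", "p2")})

print(sorted(query.dimensions()))
```

Keeping child and descendant edges per path, and node sharing at the query level, mirrors the three kinds of structural information named on the slide.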

  16. Outline • Introduction • Data Model • Additional Concepts • Query Containment • Experiments • Conclusion

  17. Additional Concepts • Full Form Query [Figure: query with nodes C = {Greece} (twice), B = {BMW} (twice), M = ?, E = ? on paths PSP p1 and PSP *p2.]

  18. Additional Concepts • Full Form Query • Dimension Trees: DIMENSION TREES = QUERY + GRAPH [Figure: the full form query (paths PSP p1, PSP *p2), the dimension graph, and a dimension tree over R, C = {Greece}, B = {BMW}, T, E, M.]

  19. Outline • Introduction • Data Model • Additional Concepts • Query Containment • Experiments • Conclusion

  20. Absolute Containment • Each result of Q1 is a result of Q2: Q1 ⊆ Q2

  21. Absolute Containment • Each result of Q1 is a result of Q2: Q1 ⊆ Q2 • homomorphism from Q2 to Q1

  22. Absolute Containment • Each result of Q1 is a result of Q2: Q1 ⊆ Q2 • homomorphism from Q2 to Q1 [Figure: example queries Q1 and Q2 over dimensions C, B, M, E, with paths PSP *p1, PSP p2, PSP *p3, PSP p4.]
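
The homomorphism idea can be sketched in a heavily simplified setting. The code below assumes a query is just a dictionary from path names to sets of `(kind, A, B)` expressions, with `kind` being `"child"` or `"desc"`; it ignores annotations, output paths and node sharing, so it is not the presentation's actual procedure. The names `closure` and `contained_in` are hypothetical.

```python
from itertools import product


def closure(exprs):
    """Add the descendant facts implied by child facts and by transitivity."""
    facts = set(exprs) | {("desc", a, b) for kind, a, b in exprs if kind == "child"}
    changed = True
    while changed:
        changed = False
        desc = [(a, b) for kind, a, b in facts if kind == "desc"]
        for (a, b), (c, d) in product(desc, desc):
            if b == c and ("desc", a, d) not in facts:
                facts.add(("desc", a, d))
                changed = True
    return facts


def contained_in(q1, q2):
    """True if some mapping of q2's paths onto q1's paths preserves every
    expression of q2, i.e. a homomorphism from q2 to q1 in this toy model."""
    closed = {name: closure(exprs) for name, exprs in q1.items()}
    names2 = list(q2)
    # Brute force: try every assignment of a q1 path to each q2 path.
    for images in product(closed, repeat=len(names2)):
        if all(q2[p] <= closed[img] for p, img in zip(names2, images)):
            return True
    return False


q1 = {"p1": {("child", "C", "B"), ("child", "B", "M")}}
q2 = {"p3": {("desc", "C", "M")}}

print(contained_in(q1, q2))  # True: C => M follows from C -> B -> M
print(contained_in(q2, q1))  # False: q2 says nothing about a child edge C -> B
```

The direction matters: the more constrained query Q1 is contained in the looser Q2, and the mapping runs from Q2's expressions into what Q1 implies.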

  23. Relative Containment (w.r.t. G) • Each result of Q1 in G is a result of Q2 in G: Q1 ⊆_G Q2

  24. Relative Containment (w.r.t. G) • Each result of Q1 in G is a result of Q2 in G: Q1 ⊆_G Q2 • homomorphism from the Dimension Trees of Q2 to the Dimension Trees of Q1

  25. Relative Containment (w.r.t. G) • Each result of Q1 in G is a result of Q2 in G: Q1 ⊆_G Q2 • homomorphism from the Dimension Trees of Q2 to the Dimension Trees of Q1 [Figure: a dimension tree of Q1 and a dimension tree of Q2, over dimensions R, C, B, T, M, E.]

  26. Relative Containment Heuristic • Relative Containment (RC): 100 msec • Absolute Containment (AC): 1 msec

  27. Relative Containment Heuristic • sound but not complete • extract structural information from the Dimension Graph • insert it in the query Q1 • check Q1 ⊆ Q2 instead of Q1 ⊆_G Q2 [Figure: time scale showing the Relative Containment Heuristic (RCH) alongside Relative Containment (RC, 100 msec) and Absolute Containment (AC, 1 msec).]

  28. Relative Containment Heuristic • Example [Figure: dimension graph over R, C, L, E, B, M, T and example queries Q1 and Q2, with nodes C = ?, B = ?, B = ?, T = ? on paths PSP *p1 and PSP *p2.]

  29. Relative Containment Heuristic • Example • B=>T : R->C, C=>B [Figure: dimension graph over R, C, L, E, B, M, T and example queries Q1 and Q2, with nodes C = ?, B = ?, B = ?, T = ? on paths PSP *p1 and PSP *p2.]

  30. Relative Containment Heuristic • Example • B=>T : R->C, C=>B • Q1 ⊆_G Q2 [Figure: dimension graph over R, C, L, E, B, M, T and example queries Q1 and Q2, with nodes R = ?, C = ?, C = ?, B = ?, B = ?, T = ? on paths PSP *p1 and PSP *p2.]
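
The heuristic's three steps (extract, insert, check) can be approximated as follows. This is a sketch under several assumptions: the edge set of `GRAPH` is hypothetical, "structural information" is narrowed to dimensions that lie on every simple root-to-node path of the graph, only descendant expressions are derived, and the final containment check is reduced to a subset test on a single path. The helper names are invented for the example.

```python
# Hypothetical dimension graph; the edges on the slide are not recoverable.
GRAPH = {
    "R": {"C"}, "C": {"L", "B"}, "L": {"B"}, "B": {"T"},
    "T": {"E", "M"}, "E": {"M"}, "M": {"E"},
}


def simple_paths(graph, source, target, seen=()):
    """Yield every cycle-free path from source to target as a tuple of dimensions."""
    if source == target:
        yield seen + (source,)
        return
    for nxt in graph.get(source, ()):
        if nxt not in seen and nxt != source:
            yield from simple_paths(graph, nxt, target, seen + (source,))


def forced_ancestors(graph, root, dim):
    """Dimensions that appear on every simple root-to-dim path (excluding dim)."""
    paths = [set(p[:-1]) for p in simple_paths(graph, root, dim)]
    return set.intersection(*paths) if paths else set()


def augment(query, graph, root="R"):
    """Insert 'ancestor => dim' for each forced ancestor of each query dimension."""
    dims = {d for _kind, a, b in query for d in (a, b)}
    extra = {("desc", anc, d) for d in dims
             for anc in forced_ancestors(graph, root, d)}
    return set(query) | extra


Q1 = {("desc", "B", "T")}          # B => T
Q2 = {("desc", "C", "B")}          # C => B
augmented = augment(Q1, GRAPH)

print(Q2 <= Q1)                    # False: Q1 alone does not mention C
print(Q2 <= augmented)             # True: the graph forces C above B
```

Because only facts forced by the graph are added, a positive answer on the augmented query is safe, while a negative answer may still hide a containment that holds relative to the graph. That matches the "sound but not complete" remark on slide 27.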

  31. Outline • Introduction • Data Model • Additional Concepts • Query Containment • Experiments • Conclusion

  32. Experiments • We measured execution time for Absolute Containment (AC), Relative Containment (RC) and the Relative Containment Heuristic (RCH), and accuracy for RCH • …for various graph sizes • …for various query sizes

  33. Time [Figure: execution time (msec) of RC, RCH and AC. Panels by graph size: 20, 30 and 40 graph dimensions, with graph paths ranging over 10-80, 15-120 and 20-160. Panels by query size: 1 and 2 query PSPs, with 3-6 nodes per PSP.]

  34. Accuracy of RCH • 80% for graphs of common sizes, based on XML benchmarks (XMach, XMark, etc.) • 50% for graphs of higher density

  35. Outline • Introduction • Data Model • Additional Concepts • Query Containment • Experiments • Conclusion

  36. Conclusion • Query Containment for Partially Specified Tree-Pattern Queries (PSTPQs). • Sound technique for checking Relative Query Containment • Time: one order of magnitude • Accuracy: over 80%

  37. Future Work • Heuristics for checking Relative Containment: precomputed and on-the-fly; trade-off between time and accuracy • Special forms of queries, e.g. swings [Figure: a swing query over dimensions A, B, C with paths PSP p1, PSP p2, PSP *p3.]

  38. Questions?

  39. Links • Introduction (2-9) • Data Model (10-17) • Additional Concepts (18-20) • Query Containment (21-32) • Experiments (33-36) • Conclusion (37-41) • Appendix (42-46)

  40. Appendix

  41. Who defines the dimensions? • Automatic: XML tags (dimension graph = "path summary", "path index", "structural summary") • Semi-automatic: graph administrator + XML tags (dimension = group of XML tags); graph administrator + ontology • Manual: graph administrator

  42. Inference Rules (1. Full Form Query) [Figure: dimension graph and the query with nodes C = {Greece}, B = {BMW}, M = ?, E = ? on paths PSP p1 and PSP *p2.] Notation: -> child, => descendant, ≈ node sharing.
      (IR1) |- R[p1] ≈ R[p2]
      (IR2) A[p1] ≈ A[p2], A[p2] ≈ A[p3] |- A[p1] ≈ A[p3]
      (IR3) a structural expression that involves A[p] |- R[p] => A[p]
      (IR4) A[p] -> B[p] |- A[p] => B[p]
      (IR5) A[p] => B[p], B[p] => C[p] |- A[p] => C[p]
      (IR6) A[p] -> B[p], A[p] => C[p] |- B[p] => C[p]
      (IR7) A[p] -> B[p], C[p] => B[p] |- C[p] => A[p]
      (IR8) A[p1] -> B[p1], B[p1] ≈ B[p2] |- A[p2] -> B[p2]
      (IR9) A[p1] => B[p1], B[p1] ≈ B[p2] |- A[p2] => B[p2]
      (IR10) A[p1] => B[p1], A[p1] ≈ A[p2], R[p2] => B[p2] |- A[p2] => B[p2]
      (IR11) A[p1] => B[p1], B[p1] ≈ B[p2] |- A[p1] ≈ A[p2]
      (IR12) A[p1] -> B[p1], C[p2] -> B[p2], D[p1] ≈ D[p2] |- D[p1] => A[p1]
      (IR13) A[p1] -> B[p1], A[p2] -> C[p2], D[p1] ≈ D[p2] |- D[p1] => A[p1]
      (IR14) A[p1] => B[p1], B[p2] => A[p2], C[p1] ≈ C[p2] |- C[p1] => A[p1]
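
Applying such rules until nothing new can be derived is a standard fixpoint computation. The sketch below covers only three of the rules, under stated readings: IR3 (every dimension mentioned on a path is a descendant of the root), IR4 (read here as "a child is also a descendant") and IR5 (descendant is transitive within a path). The tuple encoding `(kind, A, B, path)` and the function name `full_form` are assumptions, and the remaining rules are left out.

```python
from itertools import product


def full_form(exprs, root="R"):
    """Close a set of (kind, a, b, path) expressions under IR3, IR4 and IR5 only."""
    facts = set(exprs)
    while True:
        new = set()
        for kind, a, b, p in facts:
            # IR3: any non-root dimension mentioned on path p descends from the root.
            # (Skipping the root itself is a choice made for this sketch.)
            for dim in (a, b):
                if dim != root:
                    new.add(("desc", root, dim, p))
            # IR4: a child is also a descendant.
            if kind == "child":
                new.add(("desc", a, b, p))
        # IR5: descendant is transitive within one path.
        desc = [(a, b, p) for kind, a, b, p in facts if kind == "desc"]
        for (a, b, p), (c, d, q) in product(desc, desc):
            if p == q and b == c:
                new.add(("desc", a, d, p))
        if new <= facts:          # fixpoint reached: nothing new was derived
            return facts
        facts |= new


start = {("child", "C", "B", "p1"), ("desc", "B", "M", "p1")}
result = full_form(start)
for fact in sorted(result):
    print(fact)
```

The loop terminates because the set of possible facts over a finite set of dimensions and paths is finite and the set of derived facts only grows.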

  43. Dimension Trees [Figure: the full form query (paths PSP p1, PSP *p2), the dimension graph, and the resulting dimension trees over R, C = {Greece}, B = {BMW}, T, E, M.] Path expressions: r/Greece/BMW/*T[*E]/*M • r/Greece/BMW/*T/*M[*E] • r/Greece/BMW/*T/*E/*M • r/Greece/BMW/*T[*M/*E]/*E*M

  44. Previous Approaches • Keyword-based search approach: absence of structure • Naive approach: all possible query patterns are generated (Honda=>Greece, Greece=>Honda) • Approximation techniques: relax the query → more answers • Traditional integration approach: global structure and mapping rules
