1 / 27

Relational-Style XML Query

Relational-Style XML Query. Taro L. Saito, Shinichi Morishita University of Tokyo June 10 th , SIGMOD 2008 Vancouver, Canada Presented by Sangkeun-Lee Reference slides: www.xerial.org/pub/paper/2008/leo-sigmod2008-slides.pdf. Intelligent Database Systems Lab

knoton
Download Presentation

Relational-Style XML Query

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Relational-Style XML Query Taro L. Saito, Shinichi Morishita University of Tokyo June 10th, SIGMOD 2008 Vancouver, Canada Presented by Sangkeun-Lee Reference slides: www.xerial.org/pub/paper/2008/leo-sigmod2008-slides.pdf Intelligent Database Systems Lab School of Computer Science & Engineering Seoul National University, Seoul, Korea

  2. If your Manager Says… It’s a tragedy… Center for E-Business Technology

  3. Migration to XML Database • Benefits of using XML • XML is a portable text-data format • Tree-structured XML can reduce redundancy of relational data Co Emp Emp e1 e2 Relational Data Office Office NY NY XML Data Center for E-Business Technology

  4. Problem • Querying relational data translated into XML • Q: Retrieve a node tuple (Co, Emp, Office) from the XML data • E.g. Xpath, a path expression query/Co/Emp/Office Co Emp Emp e1 e2 Relational Data Office Office NY NY XML Data Center for E-Business Technology

  5. Structure Variations • Tree-representation of relational data is not unique Co Emp Emp Office Co e1 e2 NY Office Office Office Co Emp Emp NY NY NY e1 e2 Emp Emp e1 e2 Center for E-Business Technology

  6. Inconvenience of Xpath Query • User must know the entire XML structures to produce correct path queries Co Emp Emp Office Co e1 e2 NY Office Office Office Co Emp Emp NY NY NY e1 e2 Emp Emp e2 /Co/Office/Emp //Office[Co]/Emp //Co/Emp[Office] e1 Center for E-Business Technology

  7. Relational-Style XML Query • Query relations in XML • With an SQL-like syntax • SELECT Co, Emp, Office from (XML Data) Co Emp Emp Office Input XML Co e1 e2 NY Office Office Office Co Emp Emp e2 NY NY NY Result e1 e2 Emp Emp e1 Center for E-Business Technology

  8. To Retrieve Relations in XML Center for E-Business Technology

  9. Problem Definition • Convert an SQL query, SELECT A,B,C, into an XML structure query • There can be many structural variations of (A,B,C) • For N nodes, there exists N^(N-1) structural variations • 3^2 = 9 … Center for E-Business Technology

  10. An Example This example involves various tree structures that denote data in the same relation. Center for E-Business Technology

  11. Amoeba • A node tuple (A,B,C) is an amoeba iff one of the A,B and C is a common ancestor of the others • Amoeba join retrieves all amoeba structures in the XML data … Center for E-Business Technology

  12. Relation in XML • A Key observation • Relation is simply embedded in XML Co Co Office Office NY Emp Emp NY Co Emp Emp e1 e2 Emp Emp Office Office e1 e2 e1 e2 NY NY /Co/Office/Emp //Office[Co]/Emp //Co/Emp[Office] Center for E-Business Technology

  13. Hidden Semantics in XML • Some amoeba structure may not form a relation • Why this structure is not allowed? • Because there are functional dependencies (FD) implied in the XML structure Company 1 Company M Office Office Office 1 N Employee Emp Emp Emp Emp Center for E-Business Technology

  14. Functional Dependencies (FD) • FD: X->Y (From a given X,Y is uniquely determined) • Employee -> office (Each employee belongs to an office) • Office -> company (Each office belongs to a company) • Relation in XML must have an amoeba structure corresponding to each FD • Relations and FDs are sufficient to describe a schema of XML Company 1 M Office Company Invalid structure!! 1 N Office Office Employee Emp Emp Emp Emp Center for E-Business Technology

  15. Examples of FDs Center for E-Business Technology

  16. If FDs are ignored… • The company has M offices, and each office has N employees: • # of (company, office, employee) tuples: • When M = 100, N=5 100x(100x5) • While, # of correct answers is only M*N = 500 Company Company 1 M Office Office Office Office 1 N Emp Emp Emp Emp Emp Emp Employee Center for E-Business Technology

  17. Detecting FDs • A type of FDs required to determine XML structures to query is one-to-many(or one-to-one) relationships: • FD: Emp -> Office • Each employee belongs to an office • An office may have several employees (one-to-many) • We can observe these relationships by counting node occurrences or directory from the ER-diagram Company 1 Company M Office Office Office 1 Emp Emp Emp Emp N Employee Center for E-Business Technology

  18. Avoiding Invalid Relations • However, an amoeba structure itself is a connected component of tree nodes, and thus invalid nodes may be connected, as illustrated in Figure 9. • To avoid these irrelevant node connections, while allowing various tree structures in describing XML data, we introduce a restricted class of XML structures, called a tree relation Center for E-Business Technology

  19. Query Translation using FDs • Relational query into XML query • SELECT Co, Office, Emp • (with FDs: Emp-> Office, Office -> Co) • XML structures of interests are automatically determined from a relation and functional dependencies Center for E-Business Technology

  20. FD-Aware Amoeba Join • FDs: Emp-> Office, Office -> Company • Bottom-up construction of query results • Amoeba Join (Employee, Office) • Amoeba Join (Office, Company) • FD-aware amoeba join avoids invalid XML structures Company Company 1 Office Office M Office Office 1 Emp Emp Emp Emp Emp Emp N Employee Center for E-Business Technology

  21. Query Performance • FD-aware amoeba join scales well • For various sizes of XML data Center for E-Business Technology

  22. Experiments – Query Set Center for E-Business Technology

  23. Think in Relational-Style • First, consider • XML := Relations + their annotations • Steps • Detect relational part from XML data • Detect one-to-many(one) relationships (FDs) • Write relational queries • SELECT Co, Emp, Office • Things more in the paper • XML Algebra • Data Integration • Pushing Structural Constraints for better performance • Query Incomplete Relations • Other experiments Center for E-Business Technology

  24. Contributions & Conclusions • Relation in XML • Defined using amoeba structure and FDs • XML Algebra • Relational-Style XML Query • Retrieves relations in XML with a SQL-like query syntax (SQL over XML) • Allows structural variations of XML data • Departure from path expression queries • Target XML structures are automatically determined • Applications • Database integration • Managing relational data enhanced with XML syntax • “It’s Just SQL” • A large number of XML data and queries are still relational Center for E-Business Technology

  25. Contributions & Conclusions It’s not tragedy anymore… Center for E-Business Technology

  26. Really? My thought on Center for E-Business Technology

  27. Thoughts on The Paper • Good Points • Generally, I think it’s a very good paper • The paper shows us very interesting motivation and clearly define the problem • The approach in the paper is intuitive and well-explained • The authors performed well-designed experiments • The paper proposes many future works • Maybe, one of us can work on them • However, • I doubt if it is really useful – Can we use this for an important business project? • Although the author insists that their algorithm scales well, some queries caused out of memory and didn’t work • It’s a crucial problem to be used in real-life business • Isn’t Xpath good enough? • People who use XML docs usually know the structure of XML docs Center for E-Business Technology

More Related