
Hwan-Seung Yong (hsyong@ewha.ac.kr) (with Wol Young Lee, Won Kim) Database Lab., Ewha Womans Univ.

No Path Queries for XML

Presentation Transcript


  1. No Path Queries for XML Hwan-Seung Yong (hsyong@ewha.ac.kr) (with Wol Young Lee, Won Kim) Database Lab., Ewha Womans Univ.

  2. Background
     [Figure: movie data for "Leon" with its data names: country (America, France), title (Leon), actor (Natalie Portman, Jean Reno), year (1994), genre (action, drama), director (Luc Besson)]
     • The XML data model lets users represent data in various structures
     • Therefore the same data can be written as more distinct documents than in other data models
     • For instance, suppose that we make XML documents from movie data such as…

  3. Background
     [Figure: tree diagrams of the two documents below]

     <movie>
       <year>1994</year>
       <country>America</country>
       <country>France</country>
       <genre>drama</genre>
       <genre>action</genre>
       <title>Leon</title>
       <director>Luc Besson</director>
       <actor>Jean Reno</actor>
       <actor>Natalie Portman</actor>
     </movie>

     <genre type="drama">
       <country name="America">
         <year yyyy="1994">
           <movie>
             <title>Leon</title>
             <director>Luc Besson</director>
             <actor>Jean Reno</actor>
           </movie>
           …
       </country>
       …
     </genre>…

  4. (a) <movie>
           <year>1994</year>
           <country>America</country>
           <country>France</country>
           <genre>drama</genre>
           <genre>action</genre>
           <title>Leon</title>
           <director>Luc Besson</director>
           <actor>Jean Reno</actor>
           <actor>Natalie Portman</actor>
         </movie>

     (b) <movie>
           <year yyyy="1994"></year>
           <country name="America France"></country>
           <genre type="drama action"></genre>
           <title name="Leon"></title>
           <director name="Luc Besson"></director>
           <actor name="Jean Reno Natalie Portman"></actor>
         </movie>

     (c) <movie>
           <year>1994</year>
           <country><name>America</name> <name>France</name></country>
           <genre><type>drama</type> <type>action</type></genre>
           <title>Leon</title>
           <director><name>Luc Besson</name></director>
           <actor><name>Jean Reno</name> <name>Natalie Portman</name></actor>
         </movie>

     (d) <movie>
           <general_info>
             <year>1994</year>
             <country>America</country>
             <country>France</country>
             <genre>drama</genre>
             <genre>action</genre>
           </general_info>
           <detail_info>
             <title>Leon</title>
             <director>Luc Besson</director>
             <actor>Jean Reno</actor>
             <actor>Natalie Portman</actor>
           </detail_info>
         </movie>…

     (e) <year yyyy="1994">
           <country name="America">
             <genre type="drama">
               <movie>
                 <title>Leon</title>
                 <director>Luc Besson</director>
                 <actor>Jean Reno</actor>
               </movie>
               …
           </country>
           …
         </year>

  5. Background
     • This makes it inconvenient for users to query XML documents
     • For example…
     • Q1: Find the titles of movies whose genre is 'action', year is '1994', and actor is 'Jean Reno' in documents representing movie information.
     • With existing XML query languages, we have to know the paths among the names, in addition to the given data names (genre, year, actor, title) and their values:
       1. genre="action" AND year="1993" AND actor="Tommy Lee Jones"
       2. genre="action"//@year="1993"//actor="Tommy Lee Jones"
       3. genre/type="action" AND year/yyyy="1993" AND actor/name="Tommy Lee Jones"
       4. … ??……
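To make the path dependence concrete, the following is a minimal sketch using Python's standard `xml.etree.ElementTree`. The function name `titles_by_fixed_path` and the trimmed documents are illustrative assumptions, not part of the slides. The query is written against structure (a) and silently returns nothing for structure (c), although both documents hold the same data.

```python
import xml.etree.ElementTree as ET

# Trimmed version of structure (a): values stored directly as element text.
DOC_A = (
    "<movie><year>1994</year><genre>action</genre>"
    "<title>Leon</title><actor>Jean Reno</actor></movie>"
)

# Trimmed version of structure (c): values wrapped in an extra child element.
DOC_C = (
    "<movie><year>1994</year><genre><type>action</type></genre>"
    "<title>Leon</title><actor><name>Jean Reno</name></actor></movie>"
)


def titles_by_fixed_path(xml_text):
    """Answer Q1 with paths that assume structure (a) only (hypothetical helper)."""
    movie = ET.fromstring(xml_text)
    genres = [g.text for g in movie.findall("genre")]
    years = [y.text for y in movie.findall("year")]
    actors = [a.text for a in movie.findall("actor")]
    if "action" in genres and "1994" in years and "Jean Reno" in actors:
        return [t.text for t in movie.findall("title")]
    return []


print(titles_by_fixed_path(DOC_A))  # the path fits this structure
print(titles_by_fixed_path(DOC_C))  # same data, but the path no longer fits
```

Supporting structure (c) would require a second query with the paths `genre/type` and `actor/name`, which is the burden the slides describe.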

  6. Objective
     • Users who know the data names and their values should be able to query documents with different structures, like those in the previous figures, without regard to structure, and get the movie title (Leon) from all of the documents.
     • It is also better if users can query many documents at the same time with one expression and get the wanted results in one step.
     • We address how to support queries that do not depend on the structures of these many documents.

  7. Approach
     • 1. Design a semi-natural query expression called X-NPE (No Path Expression for XML)
     • Example) Search for the movie title whose genre is "action", year is "1993", and actor is "Tommy Lee Jones":
       genre : "action" AND year : "1993" AND actor : "Tommy Lee Jones"
     • Annotations on the expression: data names (element/attribute names), data values, structure??
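As an illustration of how little such an expression carries, here is a minimal parsing sketch. The grammar is an assumption inferred from the single example on this slide (`name : "value"` terms joined by `AND`); the slides do not define the full X-NPE syntax, and `parse_xnpe` is a hypothetical name.

```python
import re

# Assumed term shape, inferred from the slide's example:  name : "value"
_TERM = re.compile(r'\s*(\w+)\s*:\s*"([^"]*)"\s*')


def parse_xnpe(expression):
    """Split an AND-joined expression into (data name, data value) pairs.

    Limitation of this sketch: a value that itself contains " AND "
    would be split incorrectly.
    """
    pairs = []
    for term in re.split(r"\s+AND\s+", expression.strip()):
        match = _TERM.fullmatch(term)
        if match is None:
            raise ValueError(f"unrecognized term: {term!r}")
        pairs.append((match.group(1), match.group(2)))
    return pairs


conditions = parse_xnpe(
    'genre : "action" AND year : "1993" AND actor : "Tommy Lee Jones"'
)
print(conditions)
```

The result contains only names and values. Nothing in it says whether `genre` is an element or an attribute, or where it sits relative to `actor`, so the query processor has to work that out.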

  8. Approach
     • 2. A query processor has to derive the paths among the names because X-NPE expresses only data names and values.
     • For this work, we classify all possible paths among the names by type and extract the factors we have to resolve to process X-NPE
       genre : "action" AND year : "1993" AND actor : "Tommy Lee Jones"
     [Figure: all possible paths]
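The idea of matching names and values without a user-supplied path can be sketched by brute force, as below. This is not the authors' path classification. It assumes one movie per document, treats a data name as either an element name or an attribute name, looks for values in the element's own attributes, its own text, and the text of its direct children, and uses substring matching so that packed attributes such as `type="drama action"` still match. All function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def values_of(root, name):
    """Collect values found under a data name, whatever the structure."""
    found = []
    for elem in root.iter():
        if elem.tag == name:
            # Name is an element: its attributes, its text, its children's text.
            found.extend(elem.attrib.values())
            for node in (elem, *elem):
                if node.text and node.text.strip():
                    found.append(node.text.strip())
        if name in elem.attrib:
            # Name is an attribute of some element.
            found.append(elem.attrib[name])
    return found


def matches(root, conditions):
    """True if every (name, value) pair occurs somewhere in the document."""
    return all(
        any(value in candidate for candidate in values_of(root, name))
        for name, value in conditions
    )


def query_titles(documents, conditions, target="title"):
    """Return the target values from every document that satisfies the query."""
    results = []
    for xml_text in documents:
        root = ET.fromstring(xml_text)
        if matches(root, conditions):
            results.extend(values_of(root, target))
    return results


# Trimmed versions of structures (a), (b), and (c) from the earlier slide.
DOCS = [
    "<movie><year>1994</year><genre>action</genre>"
    "<title>Leon</title><actor>Jean Reno</actor></movie>",
    '<movie><year yyyy="1994"></year><genre type="drama action"></genre>'
    '<title name="Leon"></title>'
    '<actor name="Jean Reno Natalie Portman"></actor></movie>',
    "<movie><year>1994</year><genre><type>action</type></genre>"
    "<title>Leon</title><actor><name>Jean Reno</name></actor></movie>",
]
QUERY = [("genre", "action"), ("year", "1994"), ("actor", "Jean Reno")]
print(query_titles(DOCS, QUERY))
```

A scan like this visits every node for every condition and can over-match on substrings, which is one reason a real processor would want classified path types and an index rather than brute force.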

  9. [Figure: tree diagrams of the classified path types, with nodes labeled C1E … CnE, C1A … CnA, CkE, CmA, iE, iA and values V1 … Vn. Path types shown: m-HEA, d-FE-FA, d-FA-FE, m-FE, m-FEA, m-HE, d-lHE-FE, d-ulHE-FE, d-uHE-FE, d-uHE-HE, d-lHE-HE, d-FE-FEA, d-FEA-FEA, d-ulHE-HE, d-HE-HA, d-HA-HE, d-HE-HEA, d-HEA-FE, d-HEA-HE, d-HEA-HEA]

  10. Approach: Apply Region Numbering Scheme
      • 3. In order to quickly search all possible paths, we assign unique identifiers to each node in the XML documents and develop an algorithm capable of processing them.

      <year yyyy="1994">
        <country name="America">
          <movie>
            <title>Leon</title>
            <genre type="drama" type="action"></genre>
            <people>
              <director>Luc Besson</director>
              <actor>
                <name>Jean Reno</name>
                <name>Natalie Portman</name>
              </actor>
            </people>
          </movie>
          ….
        <country name="France">
          ….
      </year>
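A region numbering pass can be sketched as a single depth-first traversal that hands each node a (start, end) pair, so that a nested element always receives an interval nested inside its parent's. The sketch below is an assumption about the general technique, not the authors' algorithm: it numbers elements only, uses a hypothetical fixed step of 10, and the name `number_regions` is illustrative.

```python
import xml.etree.ElementTree as ET


def number_regions(root, step=10):
    """Return {element: (start, end)} assigned in one depth-first pass."""
    regions = {}
    counter = 0

    def visit(elem):
        nonlocal counter
        counter += step          # number handed out when the element opens
        start = counter
        for child in elem:
            visit(child)
        counter += step          # number handed out when the element closes
        regions[elem] = (start, counter)

    visit(root)
    return regions


doc = ET.fromstring(
    "<movie><title>Leon</title><actor>Jean Reno</actor></movie>"
)
by_tag = {elem.tag: region for elem, region in number_regions(doc).items()}
print(by_tag)
```

Leaving gaps between consecutive numbers, as the step does here, is a common way to leave room for later insertions without renumbering the whole document.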

  11. Region Numbered Tree and Table
      [Figure: region-numbered tree; each node is labeled with a start,end pair such as 10,1000 at the root, then 20,25, 30,490, 500,990, 40,45, 50,170, 510,515, 70,75, 80,110, 120,180, 90,95, 100,105, 130,135, 140,170, 150,155, 160,165]

      Doc-ID   Start-Region   End-Region   name
      1        10             220          movie
      1        20             30           year
      1        40             110          Basic-info
      1        120            210          people
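With region numbers stored in a table like the one on this slide, an ancestor test reduces to an interval containment check, with no tree traversal needed. The sketch below copies the four table rows from the slide; the `Region` type and the function names are hypothetical, and the strict-containment rule is the usual one for region numbering rather than something the slides spell out.

```python
from typing import NamedTuple


class Region(NamedTuple):
    doc_id: int
    start: int
    end: int
    name: str


# Rows copied from the table on this slide.
TABLE = [
    Region(1, 10, 220, "movie"),
    Region(1, 20, 30, "year"),
    Region(1, 40, 110, "Basic-info"),
    Region(1, 120, 210, "people"),
]


def is_ancestor(outer, inner):
    """True if inner's region lies strictly inside outer's, in the same document."""
    return (
        outer.doc_id == inner.doc_id
        and outer.start < inner.start
        and inner.end < outer.end
    )


def ancestors_of(node, table):
    """Names of all rows whose region encloses the given node."""
    return [row.name for row in table if is_ancestor(row, node)]


year, basic_info, people = TABLE[1], TABLE[2], TABLE[3]
print(ancestors_of(year, TABLE))        # movie encloses year
print(is_ancestor(basic_info, people))  # siblings do not enclose each other
```

Two data names that share an enclosing row (here `movie`) can be treated as belonging to the same record, which is the kind of relationship a no-path query needs to establish between its conditions.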

  12. Prototype System Diagram
      • 4. Implement an XML server to evaluate the performance of X-NPE queries

  13. Example Data Movie data

  14. Compare X-NPE with path-based queries

  15. Compare X-NPE with path-based queries

  16. Example Queries

  17. Performance evaluation
      (a) Non-hierarchical structure / one given data
      (b) Non-hierarchical structure / two given data
      (c) Hierarchical structure / two given data
      (d) Non-hierarchical & hierarchical structure / two given data

  18. Conclusion
      • Propose ad-hoc XML queries with no path
      • Similar to the relational model following the network/hierarchical models
      • Enumerate all kinds of XML structures representing the same information
      • Evaluate our approach through a prototype implementation
      • Future work is to apply it to an XML Web search engine
