Using Semantics in XML Data Management
Tok Wang Ling, Department of Computer Science, National University of Singapore
Gillian Dobbie, Department of Computer Science, University of Auckland
SWIIS, Bangkok
Roadmap
• XML documents and current XML schema languages
• ORA-SS (Object-Relationship-Attribute model for Semi-Structured data) [6]
• The applications of ORA-SS
• Semantic query optimization in XML
• Conclusion
[6] T. W. Ling, M. L. Lee, G. Dobbie. Semistructured Database Design. Springer Science+Business Media, Inc., 2005.
1. XML – Brief introduction
• XML (eXtensible Markup Language) is
  • Released by W3C
  • A subset of SGML
  • A promising standard for publishing, integrating and exchanging data on the web
• XML schemas
  • DTD (Document Type Definition) [4]
  • XSD (XML Schema Definition), the W3C recommended standard [8, 9, 10]
[4] Extensible Markup Language (XML) 1.0 (3rd Edition). W3C Recommendation 04 February 2004. http://www.w3.org/TR/2004/REC-xml-20040204/
[8] XML Schema Part 0: Primer Second Edition. W3C Recommendation 28 October 2004. http://www.w3.org/TR/2004/REC-xmlschema-0-20041028/
[9] XML Schema Part 1: Structures Second Edition. W3C Recommendation 28 October 2004. http://www.w3.org/TR/2004/REC-xmlschema-1-20041028/
[10] XML Schema Part 2: Datatypes Second Edition. W3C Recommendation 28 October 2004. http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/
1. XML – A motivating example
• Suppose we have an XML document “psj.xml” about different parts, suppliers and projects, where
  • The document has a root element psj;
  • Under psj, there is a sequence of part elements;
  • Under part, there is a sequence of supplier elements;
  • Under supplier, there is a sequence of project elements.
Example 1. psj.xml

<?xml version="1.0" encoding="UTF-8"?>
<psj xmlns:xsi="…" xsi:noNamespaceSchemaLocation="…">
  <part>
    <pno>P001</pno> <pname>Nut</pname> <color>Silver</color>
    <supplier>
      <sno>S001</sno> <sname>Alfa</sname> <city>Atlanta</city> <price>5</price>
      <project>
        <jno>J001</jno> <jname>Rocket boots</jname> <budget>20000</budget> <qty>60</qty>
      </project>
      <project>
        <jno>J003</jno> <jname>Firework launcher</jname> <budget>250000</budget> <qty>650</qty>
      </project>
    </supplier>
    <supplier>
      <sno>S002</sno> <sname>Beta</sname> <city>Atlanta</city> <city>New York</city> <price>5.5</price>
      <project>
        <jno>J002</jno> <jname>Diving helm</jname> <budget>18000</budget> <qty>70</qty>
      </project>
      <project>
        <jno>J003</jno> <jname>Firework launcher</jname> <budget>250000</budget> <qty>50</qty>
      </project>
    </supplier>
  </part>
  … …
  <part>
    <pno>P002</pno> <pname>Nut</pname> <color>Copper</color>
    <supplier>
      <sno>S001</sno> <sname>Alfa</sname> <city>Atlanta</city> <price>4.6</price>
      <project>
        <jno>J002</jno> <jname>Diving helm</jname> <budget>18000</budget> <qty>60</qty>
      </project>
    </supplier>
    <supplier>
      <sno>S003</sno> <sname>Beta</sname> <city>New York</city> <price>5</price>
      <project>
        <jno>J001</jno> <jname>Rocket boots</jname> <budget>20000</budget> <qty>20</qty>
      </project>
      <project>
        <jno>J004</jno> <jname>Blue fireworks</jname> <budget>20000</budget> <qty>50</qty>
      </project>
    </supplier>
  </part>
</psj>

Figure 1. Example XML document
1. XML – the DTD of “psj.xml”

(a) “psj.dtd”, the DTD of “psj.xml”

<?xml version="1.0" encoding="UTF-8"?>
<!--DTD generated by XXX-->
<!ELEMENT psj (part+)>
<!ELEMENT part (pno, pname, color, supplier+)>
<!ELEMENT pno (#PCDATA)>
<!ELEMENT pname (#PCDATA)>
<!ELEMENT color (#PCDATA)>
<!ELEMENT supplier (sno, sname, city+, price, project+)>
<!ELEMENT sno (#PCDATA)>
<!ELEMENT sname (#PCDATA)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ELEMENT project (jno, jname, budget, qty)>
<!ELEMENT jno (#PCDATA)>
<!ELEMENT jname (#PCDATA)>
<!ELEMENT budget (#PCDATA)>
<!ELEMENT qty (#PCDATA)>

(b) psj.dtd as a DataGuide

▼♦ psj
  ▼♦ part
    ♦ pno
    ♦ pname
    ♦ color
    ▼♦ supplier
      ♦ sno
      ♦ sname
      ♦ city
      ♦ price
      ▼♦ project
        ♦ jno
        ♦ jname
        ♦ budget
        ♦ qty

Figure 2. DTD and DataGuide of Example XML document
1. XML – what the DTD says
• A DTD is a simple definition of an XML document, in which users can define
  • Element/attribute types
  • Occurrence constraints (e.g. ?, +, *)
  • Containment among different element types (the structure)
• A DTD cannot express
  • Occurrence constraints given as numbers (e.g. 2 to 8)
  • Uniqueness/key constraints on a combination of attributes/elements (in a DTD, an ID can be declared on only one attribute at a time)
  • Relationship types among elements and their degrees
  • The difference between an attribute (or simple element) of an element type and an attribute (or simple element) of a relationship type. Simple elements are element types that contain PCDATA only and have no attributes.
1. XML – XSD
“psj.xsd”, the XSD schema of the motivating example data.

<xs:schema xmlns:xs="…">
  <xs:element name="psj">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="part" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="pno" type="xs:string"/>
              <xs:element name="pname" type="xs:string"/>
              <xs:element name="color" type="xs:string"/>
              <xs:element name="supplier" maxOccurs="unbounded">
                <xs:complexType>
                  <xs:sequence>
                    <xs:element name="sno" type="xs:string"/>
                    <xs:element name="sname" type="xs:string"/>
                    <xs:element name="city" type="xs:string" maxOccurs="unbounded"/>
                    <xs:element name="price" type="xs:string"/>
                    <xs:element name="project" maxOccurs="unbounded">
                      <xs:complexType>
                        <xs:sequence>
                          <xs:element name="jno" type="xs:string"/>
                          <xs:element name="jname" type="xs:string"/>
                          <xs:element name="budget" type="xs:string"/>
                          <xs:element name="qty" type="xs:string"/>
                        </xs:sequence>
                      </xs:complexType>
                    </xs:element>
                  </xs:sequence>
                </xs:complexType>
              </xs:element>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
    <xs:key name="PK">
      <xs:selector xpath="part"/>
      <xs:field xpath="pno"/>
    </xs:key>
  </xs:element>
</xs:schema>

• maxOccurs="unbounded": XSD definition of an element occurrence constraint.
• xs:key: XSD definition of a key constraint, which requires that every part element has a non-nil pno element and that the values of all pno elements in the document are unique.

Figure 3. XML Schema of Example XML document
1. XML – what XSD can tell
• XSD is the standard for XML schema definition, recommended by W3C and supported by most vendors. It
  • has an extensible XML syntax,
  • supports more data types (user-defined types and 44 built-in types: 19 primitive and 25 derived),
  • can represent uniqueness/key constraints for both attribute types and element types,
  • and has many other improvements over DTD.
1. XML – XSD still has flaws
XSD is not sufficient for expressing the relational semantics in XML data. For example:
• A key constraint is specified by a key element. The key constraint in XSD is an extension of ID in DTD, and it is quite different from the key constraint in relational databases.
  • E.g. in the previous XSD, the values of the key attribute pno of part must be unique within the set of part elements in the whole document.
  • Therefore, when an element type is located at a lower level, such as supplier and project, XSD cannot declare sno and jno as their key attributes (OIDs): the same supplier or project may legitimately occur more than once in the document.
1. XML – XSD still has flaws (cont.)
• The key element must contain the following (in order):
  • One and only one selector element, which contains an XPath expression that specifies the set of elements across which the values specified by the field elements must be unique
  • One or more field elements, each containing an XPath expression that specifies the values that must be unique for the set of elements specified by the selector element
• The key constraint is similar to the unique constraint, except that the fields of a unique constraint may be missing (like a column that allows null values), whereas every field of a key must be present.
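The scoping problem described above can be checked mechanically. The following is a minimal sketch, not part of the original slides: it uses an abbreviated fragment shaped like psj.xml and a hypothetical helper, duplicate_values, that mimics only the uniqueness part of an XSD key (selector plus one field). It suggests why pno can serve as a document-wide key while sno cannot.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Abbreviated fragment shaped like psj.xml (Example 1); only the
# identifiers needed for the uniqueness check are kept.
PSJ = """
<psj>
  <part><pno>P001</pno>
    <supplier><sno>S001</sno></supplier>
    <supplier><sno>S002</sno></supplier>
  </part>
  <part><pno>P002</pno>
    <supplier><sno>S001</sno></supplier>
    <supplier><sno>S003</sno></supplier>
  </part>
</psj>
"""

def duplicate_values(root, selector, field):
    """Return the field values that occur more than once among the selected elements.

    `selector` picks the elements (like xs:selector) and `field` names the
    child element whose text must be unique among them (like xs:field).
    """
    counts = Counter(el.findtext(field) for el in root.findall(selector))
    return sorted(value for value, n in counts.items() if n > 1)

root = ET.fromstring(PSJ)
pno_duplicates = duplicate_values(root, "part", "pno")
sno_duplicates = duplicate_values(root, "part/supplier", "sno")
print(pno_duplicates)  # pno values repeated across part elements
print(sno_duplicates)  # sno values repeated across all supplier elements
```

Supplier S001 is nested under both parts, so a key whose selector covers every supplier element would reject a document that is semantically fine.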
1. XML – XSD still has flaws (cont.)
• XSD does not support relationship types and other relational semantic constraints.
  • E.g. the ternary relationship type psj among part, supplier and project in the original data is lost in the XSD.
• XSD cannot distinguish attributes (or simple elements) of relationship types from attributes (or simple elements) of element types.
  • E.g. price is an attribute of the binary relationship type ps between part and supplier. However, it looks the same as sname, an attribute (simple element) of the element supplier.
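The difference between sname and price can be observed in the data of Example 1, even though the schema does not record it. The sketch below is illustrative only; determined_by_sno is a hypothetical helper, and the fragment keeps just the two occurrences of supplier S001.

```python
import xml.etree.ElementTree as ET

# Two occurrences of supplier S001 taken from Example 1 (other content omitted).
PSJ = """
<psj>
  <part><pno>P001</pno>
    <supplier><sno>S001</sno><sname>Alfa</sname><price>5</price></supplier>
  </part>
  <part><pno>P002</pno>
    <supplier><sno>S001</sno><sname>Alfa</sname><price>4.6</price></supplier>
  </part>
</psj>
"""

def determined_by_sno(root, field):
    """Return True if all supplier occurrences with the same sno agree on `field`."""
    seen = {}
    for supplier in root.iter("supplier"):
        sno = supplier.findtext("sno")
        value = supplier.findtext(field)
        # setdefault stores the first value seen for this sno and returns it afterwards.
        if seen.setdefault(sno, value) != value:
            return False
    return True

root = ET.fromstring(PSJ)
sname_by_sno = determined_by_sno(root, "sname")
price_by_sno = determined_by_sno(root, "price")
print(sname_by_sno, price_by_sno)
```

Here sname depends on sno alone, so it behaves like an attribute of supplier, while price changes with the enclosing part and so behaves like an attribute of the relationship between part and supplier.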
2. ORA-SS in a nutshell
• ORA-SS is a semantically rich data model for semi-structured data.
• It can easily represent the relational semantics and constraints in XML data.
• The ORA-SS model is also a bridge that connects the tree structure of XML with the semantics of relational and object-relational databases.
• Unlike a traditional ER diagram, an ORA-SS schema diagram represents the hierarchical structure of XML data.
2. ORA-SS in a nutshell (cont.)
• A complete ORA-SS model has 4 diagrams
  • Schema diagram: represents the structure of, and the constraints (business rules) on, XML documents
  • Instance diagram: visually represents the graphical structure of XML data
  • Functional dependency diagram: represents FDs in relationship types
  • Inheritance diagram: represents the specialization/generalization relationships among different object classes in ORA-SS
2. ORA-SS data model
• Object class
  • attributes of object class
  • ordering on object class
• Relationship type
  • degree of relationship type
  • participating object classes in relationship type
  • attributes of relationship type
  • disjunctive relationship type
  • recursive relationship type
  • ID dependent relationship type
2. ORA-SS data model (cont.)
• Attribute
  • attributes of object class or relationship type
  • key attribute (OID)
  • foreign key / referential constraint (IDREF/IDREFS)
  • composite attribute
  • disjunctive attribute
  • attribute with unknown structure
  • ordering on attributes
  • fixed or default value of attribute
  • derived attribute
The ORA-SS schema diagram of Example 1.
[Diagram: part, linked by the edge (PS, 2, +, +) to supplier, which is linked by the edge (PSJ, 3, +, +) to project. Attributes of part: pno, pname, color. Attributes under supplier: sno, sname, city (+), price (edge labeled PS). Attributes under project: jno, jname, budget, qty (edge labeled PSJ).]
• part, supplier and project are modeled as object classes.
• PS is a binary relationship type between part and supplier.
• PSJ is a ternary relationship type defined among part, supplier and project.
• pno, sno and jno are declared as the object IDs of part, supplier and project respectively.
• price is an attribute of the relationship type PS, and qty is an attribute of PSJ.
Figure 4. ORA-SS schema diagram of Example XML document
ORA-SS – Semantic advantages
• ORA-SS can represent the following semantics that DTD and XML Schema cannot:
  • Attribute vs. object class
  • Multi-valued attribute vs. object class
  • Identifier (ID)
  • IDREF or foreign key
  • n-ary relationship type
  • Attribute of object class vs. attribute of relationship type
  • View of XML document
3. ORA-SS applications
• Because of the rich semantics in ORA-SS, the model can be used widely in
  • Normal form XML schema
  • Relational/object-relational storage of XML data
  • XML schema/data integration
  • XML query optimization [12]
  • XML aggregates evaluation
  • XML view creation and validation [2]
  • XML graphical query language and output [7]
  • XML keyword search [13]
  • etc.
We illustrate several of these in detail.
[2] Y. B. Chen, T. W. Ling, M. L. Lee. Designing Valid XML Views. ER 2002, Tampere, Finland, Oct 7-11, 2002.
[7] W. Ni, T. W. Ling. GLASS: A Graphical Query Language for Semi-Structured Data. DASFAA 2003.
[12] H. Wu, T. W. Ling, B. Chen. VERT: a semantic approach for content search and content extraction in XML query processing. Submitted to ER’07.
[13] B. Chen, J. Lu, T. W. Ling. ICRA: effective semantics for ranked XML keyword search. Submitted to VLDB’07.
3. ORA-SS applications: Semantic query optimization
• The semantic information represented in ORA-SS is helpful in optimizing XML queries.
• Many algorithms have been proposed for XML query optimization, e.g. TwigStack [1] and its variants.
• When the ORA-SS semantics of the data are known, they can be taken into account for query optimization.
[1] Nicolas Bruno, Nick Koudas, and Divesh Srivastava. Holistic Twig Joins: Optimal XML Pattern Matching. SIGMOD Conference, 2002.
3. ORA-SS applications: Semantic query optimization
Example: consider the following simple query.
(Query 1) Display the budget of project “J001”.
//project[jno = "J001"]/budget
• Traditional processing scans the whole XML document, checks every project element for jno = "J001", and returns all the corresponding budget values.
• In ORA-SS, however, jno is the object ID of project, so the functional dependency jno → budget holds. The optimized processing therefore only needs to find the first project instance with jno = "J001" and return the corresponding budget value.
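The two evaluation strategies for Query 1 can be contrasted in a few lines. This is a minimal sketch over an abbreviated fragment of Example 1, not the evaluation strategy of any particular XML engine; the function names and the visited counter are introduced here only for illustration.

```python
import xml.etree.ElementTree as ET

# Abbreviated fragment of Example 1: project J001 occurs under two suppliers.
PSJ = """
<psj>
  <part><pno>P001</pno><supplier><sno>S001</sno>
    <project><jno>J001</jno><budget>20000</budget></project>
    <project><jno>J003</jno><budget>250000</budget></project>
  </supplier></part>
  <part><pno>P002</pno><supplier><sno>S003</sno>
    <project><jno>J001</jno><budget>20000</budget></project>
    <project><jno>J004</jno><budget>20000</budget></project>
  </supplier></part>
</psj>
"""

def budgets_full_scan(root, jno):
    """Visit every project element and collect the budget of each match."""
    visited, budgets = 0, []
    for project in root.iter("project"):
        visited += 1
        if project.findtext("jno") == jno:
            budgets.append(project.findtext("budget"))
    return budgets, visited

def budget_first_match(root, jno):
    """Stop at the first match; only valid if the FD jno -> budget holds."""
    visited = 0
    for project in root.iter("project"):
        visited += 1
        if project.findtext("jno") == jno:
            return project.findtext("budget"), visited
    return None, visited

root = ET.fromstring(PSJ)
all_budgets, full_visits = budgets_full_scan(root, "J001")
first_budget, short_visits = budget_first_match(root, "J001")
print(all_budgets, full_visits)
print(first_budget, short_visits)
```

The early exit is safe only because the schema states that jno determines budget; without that knowledge the second occurrence could carry a different value.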
3. ORA-SS applications: Semantic query optimization – Content Search
• Most existing algorithms focus on the structural search of twig pattern queries.
• Few of them pay much attention to content search over the values of elements.
  • They treat content nodes (or values) the same as element nodes.
  • Disadvantages:
    • Too many label streams for contents
    • Difficult to find the actual values of labels as output solutions
• We propose VERT (Value Extraction with Relational Table).
3. ORA-SS applications: Semantic query optimization – Content Search
• Idea of VERT:
  1. Introduce relational tables to store document values instead of treating them as nodes and labeling them.
  2. Rewrite and optimize XML twig queries based on the underlying relational tables.
  3. Further optimize the relational tables for query processing if more semantic information is available (i.e. more semantics, better optimization).
3. ORA-SS applications: Semantic query optimization – Content Search
1. Introduce relational tables to store document values instead of treating them as nodes and labeling them.
E.g. the values of price (title, etc.) in the XML tree of Figure 5 can be stored together with the labels of the price (title, etc.) elements, as in Figure 6.
Figure 5. Example XML document 2
Figure 6. Example VERT tables
3. ORA-SS applications: Semantic query optimization – Content Search
2. Rewrite and optimize XML twig queries based on the underlying relational tables, e.g.
• Rewrite the twig query in Figure 7(a) to the twig in Figure 7(b).
• Execute SQL on table Rprice of Figure 6 to get the labels of all price elements with a value greater than 15 and form the stream Tprice>15.
• Perform structural joins based on these labels for price elements (i.e. Tprice>15) with the book and ISBN elements.
Benefits:
• Save the stream merging of all price elements with values > 15
• Save the structural join between price elements and their values
Figure 7. Example twig query: (a) twig query, (b) rewritten query
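The value-table lookup in the steps above can be sketched with an in-memory SQLite table. Figures 5 to 7 are not reproduced in this text, so everything concrete here is an assumption: the table layout r_price, the region-encoding labels (start and end positions), the sample rows and the helper names are all hypothetical and are not the authors' implementation.

```python
import sqlite3

# Hypothetical labels: each element is labeled by a (start, end) region,
# and an ancestor's region encloses the regions of its descendants.
BOOK_LABELS = [(1, 10), (11, 20), (21, 30)]
PRICE_ROWS = [(6, 7, 12.5), (16, 17, 30.0), (26, 27, 18.0)]

def build_price_table():
    """Create a value table holding the label and value of each price element."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE r_price (start_pos INTEGER, end_pos INTEGER, value REAL)"
    )
    conn.executemany("INSERT INTO r_price VALUES (?, ?, ?)", PRICE_ROWS)
    return conn

def price_stream(conn, threshold):
    """Return the labels of price elements whose value exceeds threshold, in document order."""
    cursor = conn.execute(
        "SELECT start_pos, end_pos FROM r_price WHERE value > ? ORDER BY start_pos",
        (threshold,),
    )
    return cursor.fetchall()

def books_containing(stream):
    """Naive structural join: keep the books whose region encloses a label in the stream."""
    return [
        (b_start, b_end)
        for (b_start, b_end) in BOOK_LABELS
        if any(b_start < p_start and p_end < b_end for (p_start, p_end) in stream)
    ]

conn = build_price_table()
t_price = price_stream(conn, 15)
matching_books = books_containing(t_price)
print(t_price)
print(matching_books)
```

The SQL selection replaces the scan over every price value, and only the surviving labels take part in the structural join with book.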
3. ORA-SS applications: Semantic query optimization – Content Search
3. Further optimize the relational tables for query processing if more semantic information is available (i.e. more semantics, better optimization).
Optimization 1 (VERT-1): store the values of price (title, etc.) with the labels of book objects, since price (title) is a property of the book object class according to the semantics captured in ORA-SS (shown in Figure 8).
Benefit: further saves the structural joins between price and book and between ISBN and book for the query in Figure 7.
Figure 8. VERT tables with optimization 1
3. ORA-SS applications: Semantic query optimization – Content Search
3. Further optimize the relational tables for query processing if more semantic information is available (cont.).
Optimization 2 (VERT-2): pre-merge the tables of title, price, etc. in Figure 8 if we further know that they are single-valued attributes of the book object class according to the semantics in ORA-SS (shown in Figure 9). (Note: a multi-valued attribute such as author should not be merged.)
Benefit: saves expensive structural joins by using an efficient selection on the merged table for the query in Figure 7.
Figure 9. VERT tables with optimization 2
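With the single-valued attributes pre-merged into one table per object class, a query of the kind described for Figure 7 reduces to a plain selection. The sketch below makes several assumptions, since Figures 7 to 9 are not available here: the table r_book, its columns, the sample rows and the query shape (ISBN of books with price above a threshold) are all hypothetical.

```python
import sqlite3

# Hypothetical merged table: one row per book object, keyed by the book's
# (start, end) label, with its single-valued attributes as columns.
BOOK_ROWS = [
    (1, 10, "isbn-0001", "XML Basics", 12.5),
    (11, 20, "isbn-0002", "Twig Joins", 30.0),
    (21, 30, "isbn-0003", "Data Models", 18.0),
]

def build_book_table():
    """Create the merged table; a multi-valued attribute such as author stays separate."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE r_book ("
        "start_pos INTEGER, end_pos INTEGER, isbn TEXT, title TEXT, price REAL)"
    )
    conn.executemany("INSERT INTO r_book VALUES (?, ?, ?, ?, ?)", BOOK_ROWS)
    return conn

def isbns_with_price_over(conn, threshold):
    """Answer the query with one selection and no structural join."""
    cursor = conn.execute(
        "SELECT isbn FROM r_book WHERE price > ? ORDER BY start_pos",
        (threshold,),
    )
    return [isbn for (isbn,) in cursor]

conn = build_book_table()
expensive_isbns = isbns_with_price_over(conn, 15)
print(expensive_isbns)
```

Merging is only sound for single-valued attributes: a book with several authors would need several rows, which is why the slide excludes author.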
3. ORA-SS applications: Semantic query optimization – Content Search
Experimental results on three datasets, i.e. NASA, DBLP and XMark (Figure 10):
• VERT outperforms TwigStack in query processing time.
• VERT-2 is superior to VERT-1, which is in turn better than the original VERT.
Figure 10. Experimental results of VERT
3. ORA-SS applications: XML query with aggregates
• The XML semantics captured in ORA-SS are crucial for writing queries with aggregates correctly.
Example. Consider the query:
(Query 3) Find the average budget of all the projects.
Two potential XQuery expressions are:

XQ.3a
let $bgts := for $pid in distinct-values(//project/jno)
             return (//project[jno = $pid]/budget)[1]
return <avg_bgt>{avg($bgts)}</avg_bgt>

XQ.3b
let $bgts := //project/budget
return <avg_bgt>{avg($bgts)}</avg_bgt>
3. ORA-SS applications: XML query with aggregates
Example (cont.)
• If we know from ORA-SS that jno is the OID or key of the project object class, i.e. jno → budget, then we can easily judge that XQ.3a is a correct XQuery expression while XQ.3b is incorrect, because some projects may appear more often than others in the XML document.
• If we do not know these semantics, it is difficult to say which XQuery expression is correct.
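The difference between the two expressions shows up on the data of Example 1. The sketch below lists the (jno, budget) pair of every project element in Figure 1, in document order, and computes both averages; the function names are introduced here for illustration.

```python
# (jno, budget) for every project element in Example 1, in document order.
OCCURRENCES = [
    ("J001", 20000), ("J003", 250000),   # part P001, supplier S001
    ("J002", 18000), ("J003", 250000),   # part P001, supplier S002
    ("J002", 18000),                     # part P002, supplier S001
    ("J001", 20000), ("J004", 20000),    # part P002, supplier S003
]

def avg_over_occurrences(occurrences):
    """Like XQ.3b: average over every project element, duplicates included."""
    return sum(budget for _, budget in occurrences) / len(occurrences)

def avg_over_projects(occurrences):
    """Like XQ.3a: one budget per distinct jno, which relies on jno -> budget."""
    budget_by_jno = dict(occurrences)
    return sum(budget_by_jno.values()) / len(budget_by_jno)

naive_avg = avg_over_occurrences(OCCURRENCES)
correct_avg = avg_over_projects(OCCURRENCES)
print(round(naive_avg, 2), correct_avg)
```

Projects J001, J002 and J003 each occur twice in the document, so the per-occurrence average over-weights them, and the expensive J003 in particular.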
3. ORA-SS applications: Define and validate XML views
• Valid XML views in ORA-SS
• View definition operators: select, project/drop, swap, join
For example, consider the following swap operation, which exchanges the positions of supplier and part in the hierarchy:
Because price is a relationship attribute, it cannot be moved up with the supplier elements; doing so would be semantically meaningless in the resulting view.
Figure 11. Example view definition 1 (a valid view and an invalid view)
3. ORA-SS applications: Define and validate XML views
As another example, consider the following projection operation, which drops supplier from the structure, leaving part, project, price and qty:
Dropping supplier makes price and qty multi-valued attributes, so we should apply aggregation functions to get a meaningful view.
Figure 12. Example view definition 2 (an invalid view and a valid view)
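To make the aggregation step concrete, the sketch below drops supplier from the rows of Example 1 and groups the remaining values by part and project. The choice of aggregates (minimum price, total quantity) and the function name are assumptions made for illustration; the slides do not say which aggregation functions the valid view uses.

```python
from collections import defaultdict

# (pno, sno, jno, price, qty) for every project element in Example 1.
ROWS = [
    ("P001", "S001", "J001", 5.0, 60),
    ("P001", "S001", "J003", 5.0, 650),
    ("P001", "S002", "J002", 5.5, 70),
    ("P001", "S002", "J003", 5.5, 50),
    ("P002", "S001", "J002", 4.6, 60),
    ("P002", "S003", "J001", 5.0, 20),
    ("P002", "S003", "J004", 5.0, 50),
]

def drop_supplier(rows):
    """Group by (pno, jno) and aggregate the values that become multi-valued."""
    groups = defaultdict(list)
    for pno, _sno, jno, price, qty in rows:
        groups[(pno, jno)].append((price, qty))
    return {
        key: {
            "min_price": min(price for price, _ in values),
            "total_qty": sum(qty for _, qty in values),
        }
        for key, values in groups.items()
    }

view = drop_supplier(ROWS)
print(view[("P001", "J003")])
```

Part P001 is supplied to project J003 by two suppliers, so without aggregation the view would show two prices and two quantities for the same part and project pair.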
3. ORA-SS applications: Graphical XML query based on ORA-SS
A graphical XML query language has been designed on the basis of ORA-SS.
Query 1: Select and display the projects that do not have any suppliers located in Atlanta.
• The schema panel loads the ORA-SS schema diagram.
• A graphical query can be posed either by dragging components from the diagram in the schema panel or by using the construction buttons at the top of the window.
• Complex query logic such as quantification, negation and IF-THEN constructions can be specified in the Condition Logic Window.
Figure 13. Screenshot of the user interface of our graphical query language
3. ORA-SS applications: XML keyword search with semantics
• Keyword search is a user-friendly way to query XML documents.
• Most existing algorithms are based on either the tree data model or the graph (digraph) data model of XML, without the semantics.
3. ORA-SS applications: XML keyword search with semantics
• Tree data model (LCA [11])
  • Lowest Common Ancestor (LCA)
    • Contains all the keywords
    • Has no descendant node that contains all the keywords
• Graph (digraph) data model (BANKS [5])
  • Reduced sub-tree
    • A tree T in the graph (digraph) containing all the keywords
    • No proper sub-tree of T contains all the keywords
• Limitations of keyword search without semantics
  • May have difficulty in representing results
  • May return many irrelevant results
[5] V. Kacholia, S. Pandit, S. Chakrabarti, S. Sudarshan, R. Desai, and H. Karambelkar. Bidirectional expansion for keyword search on graph databases. In Proc. of VLDB Conference, pages 505-516, 2005.
[11] Y. Xu and Y. Papakonstantinou. Efficient keyword search for smallest LCAs in XML databases. In Proc. of SIGMOD Conference, pages 537-538, 2005.
3. ORA-SS applications: XML keyword search with semantics
Example:
• Q1 = {Widom}
  • LCA and reduced sub-tree both give node 1.1.1
  • Not enough information
• Q2 = {semistructured, query processing}
  • LCA(Q2) = dblp (i.e. the whole XML database): overwhelming information
  • The reduced sub-tree results include all papers with either “semistructured” or “query processing”. However, not all “query processing” papers are about “semistructured”.
Figure 14. Example XML document 3
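When nodes carry Dewey-style labels such as the 1.1.1 mentioned above, the lowest common ancestor of a set of matches is their longest common label prefix. The sketch below shows only that basic step for one match per keyword; it is not the smallest-LCA algorithm of [11], and the labels used for Q2 are hypothetical because Figure 14 is not reproduced here.

```python
def lca(labels):
    """Return the Dewey label of the lowest common ancestor of the given nodes.

    Each label is a dot-separated path from the root, e.g. "1.1.1", so the
    lowest common ancestor is the longest common prefix of the paths.
    """
    paths = [label.split(".") for label in labels]
    prefix = []
    for components in zip(*paths):
        if len(set(components)) != 1:
            break
        prefix.append(components[0])
    return ".".join(prefix)

# Q1: a single match is its own LCA.
q1_result = lca(["1.1.1"])
# Q2: hypothetical matches in the titles of two different papers under the root.
q2_result = lca(["1.1.2", "1.4.2"])
print(q1_result, q2_result)
```

In the second call the keywords match in different papers, so the only common ancestor is the root, which corresponds to the "whole XML database" answer on the slide.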
3. ORA-SS applications: XML keyword search with semantics
• Therefore, we propose ICA (Interested Common Ancestor) and IRA (Interested Related Ancestors) to exploit the semantics for ranked keyword search.
• Ideas:
  1. The DBA defines the set of interested object classes and the conceptual connections between objects.
     E.g. in DBLP, publication and author can be the interested object classes, and references/citations can be one type of conceptual connection between publications.
     Note: we can group all the publications of each author object.
3. ORA-SS applications: XML keyword search with semantics
• Ideas:
  2. The results of a keyword query include interested objects based on the ICA and IRA semantics.
    • The results of ICA (Interested Common Ancestor) include all objects that each contain all the query keywords.
    • The results of IRA (Interested Related Ancestors) include all object pairs (o, o’) such that
      • the pair together contains all the keywords, AND
      • o and o’ are conceptually connected.
    Note: we output a list of IRA objects instead of IRA pairs.
Intuitive meaning of IRA: for the query “semistructured query processing”, if a paper P with title “query processing” cites or is cited by a paper with title “semistructured”, then P is considered related to the query; at least it is a better result than “query processing” papers that neither cite nor are cited by “semistructured” papers.
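The two result definitions above can be written down directly for a toy collection. This is a minimal sketch under stated assumptions, not the authors' algorithm: the papers, titles and citation edge are invented, keyword matching is a plain substring test on the title, citation is the only conceptual connection, and whether IRA results should exclude objects already returned by ICA is left open.

```python
# Hypothetical interested objects (papers) and one citation edge.
PAPERS = {
    "p1": "semistructured data model",
    "p2": "query processing in relational systems",
    "p3": "semistructured query processing",
    "p4": "query processing on streams",
}
CITES = {("p2", "p1")}  # p2 cites p1

def ica(papers, keywords):
    """Objects that contain all the query keywords by themselves."""
    return sorted(
        pid for pid, title in papers.items()
        if all(keyword in title for keyword in keywords)
    )

def ira(papers, cites, keywords):
    """Objects from connected pairs that together contain all the keywords."""
    related = set()
    for a, b in cites:
        if all(k in papers[a] or k in papers[b] for k in keywords):
            related.update((a, b))
    return sorted(related)  # a list of objects rather than pairs

query = ["semistructured", "query processing"]
ica_result = ica(PAPERS, query)
ira_result = ira(PAPERS, CITES, query)
print(ica_result, ira_result)
```

Paper p4 matches one keyword but has no connection to a "semistructured" paper, so it appears in neither result, which mirrors the intuition given on the slide.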
3. ORA-SS applications: XML keyword search with semantics
• Ideas:
  3. The system automatically ranks the result objects for output, based on the following metrics.
    • RelevanceRank. Intuitive meaning:
      • for the query “semistructured query processing”,
      • given two papers P1 and P2 containing “query processing”,
      • if P1 cites or is cited by many “semistructured” papers whereas P2 cites or is cited by few “semistructured” papers, then P1 is considered more relevant to the query.
    • Keyword proximity rank (ProxRank). Intuition: the fewer the elements in an object that together directly contain all the keywords, the better the object is as a result.
3. ORA-SS applications: XML keyword search with semantics
Experimental evaluation based on DBLP
• Our approach outperforms most existing academic demos in both execution time and result quality.
Figure 15. Execution time
Figure 16. Comparison of relevant results in the top 10, 20 and 30 answers among academic demos
3. ORA-SS applications: XML keyword search with semantics
Experimental evaluation based on DBLP
• Our approach is comparable or superior to the commercial systems Google Scholar and Microsoft Libra in terms of result quality, even though they can search much more web data.
Figure 17. Comparison of relevant results in the top 10, 20 and 30 answers with commercial systems
3. ORA-SS applications: XML keyword search with semantics
A demo prototype of our keyword search system on DBLP data is available at http://xmldb.ddns.comp.nus.edu.sg
Figure 18. User interface of the demo system
4. Conclusion
1. We presented a data-centric XML document and showed the limitations of the current XML schema standards in representing relational semantics and constraints.
4. Conclusion
2. We have shown that semantics in XML data are crucial in many applications, such as
  • XML query optimization
  • XML query optimization for content search
  • XML aggregate computation
  • XML view creation and validation
  • XML graphical query language and output
  • XML keyword search
  • etc.
4. Conclusion
3. Much of the semantic information of XML data can be expressed in ORA-SS, which is a semantically rich data model, but not in DTD or XML Schema.
References:
[1] Nicolas Bruno, Nick Koudas, and Divesh Srivastava. Holistic Twig Joins: Optimal XML Pattern Matching. SIGMOD Conference, 2002.
[2] Y. B. Chen, T. W. Ling, M. L. Lee. Designing Valid XML Views. ER 2002, Tampere, Finland, Oct 7-11, 2002.
[3] C. J. Date. An Introduction to Database Systems. 3rd edition, Addison-Wesley Publishing Company (1981).
[4] Extensible Markup Language (XML) 1.0 (3rd Edition). W3C Recommendation 04 February 2004. http://www.w3.org/TR/2004/REC-xml-20040204/
[5] V. Kacholia, S. Pandit, S. Chakrabarti, S. Sudarshan, R. Desai, and H. Karambelkar. Bidirectional expansion for keyword search on graph databases. In Proc. of VLDB Conference, pages 505-516, 2005.
[6] T. W. Ling, M. L. Lee, G. Dobbie. Semistructured Database Design. Springer Science+Business Media, Inc., 2005.
[7] W. Ni, T. W. Ling. GLASS: A Graphical Query Language for Semi-Structured Data. DASFAA 2003.
[8] XML Schema Part 0: Primer Second Edition. W3C Recommendation 28 October 2004. http://www.w3.org/TR/2004/REC-xmlschema-0-20041028/
[9] XML Schema Part 1: Structures Second Edition. W3C Recommendation 28 October 2004. http://www.w3.org/TR/2004/REC-xmlschema-1-20041028/
[10] XML Schema Part 2: Datatypes Second Edition. W3C Recommendation 28 October 2004. http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/
[11] Y. Xu and Y. Papakonstantinou. Efficient keyword search for smallest LCAs in XML databases. In Proc. of SIGMOD Conference, pages 537-538, 2005.
[12] H. Wu, T. W. Ling, B. Chen. VERT: a semantic approach for content search and content extraction in XML query processing. Submitted to ER’07.
[13] B. Chen, J. Lu, T. W. Ling. ICRA: effective semantics for ranked XML keyword search. Submitted to VLDB’07.