METAXPath
Curtis Dyreson, E.E. and Computer Science, Washington State University, USA
Michael Böhlen and Christian S. Jensen, Computer Science, Aalborg University, Denmark
Nykredit Center for Database Research, Aalborg University, Denmark
DC2001 Conference - Tokyo
Outline
• Data
• Data model
• XML
• Query language
• XPath
• Metadata
• METAXPath
• Future work
An XML Database Architecture
[Figure labels: Database, Client (HTTP browser), HTTP server, XML data and metadata]
Database Data Model Evolution
60s - Hierarchical data model
70s - Network data model
80s - Relational data model
90s - Object-oriented data model
00s - Unstructured/semistructured/XML
• Innovators
  • Unstructured data models (UPenn)
  • UnQL/Strudel (AT&T)
  • OEM and Lore (Stanford)
  • XML (W3C)
Object Exchange Model (OEM)
• Heterogeneous OODBs
• Exchange objects
• Text description
[Figure labels: object 1, object 1, object 2, text (XML), your database, my database]
Object Representation in XML
• Use names and values
• Ignore types
• &X denotes object X

// A person class
class Person { String name; int age; }
// A person object
Person joe = new Person("Joe Doe", 25);

<!ATTLIST person id ID #REQUIRED>
<!ELEMENT person (name, age)>

Attribute form:
<person id="&1" name="Joe Doe" age="25" />

Element form:
<person id="&1">
  <name>Joe Doe</name>
  <age>25</age>
</person>
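The mapping from an object to the two XML forms can be sketched in Python. This is only an illustration under stated assumptions: the `Person` dataclass mirrors the Java class on the slide, and the helper names `to_attribute_form` and `to_element_form` are hypothetical. Note that a standard serializer escapes the `&` of the OEM-style identifier as `&amp;`.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class Person:
    name: str
    age: int


def to_attribute_form(obj_id: str, p: Person) -> ET.Element:
    # Names and values become attributes; the int type of age is dropped.
    return ET.Element("person", {"id": obj_id, "name": p.name, "age": str(p.age)})


def to_element_form(obj_id: str, p: Person) -> ET.Element:
    # Names become child elements; values become their text content.
    elem = ET.Element("person", {"id": obj_id})
    ET.SubElement(elem, "name").text = p.name
    ET.SubElement(elem, "age").text = str(p.age)
    return elem


joe = Person("Joe Doe", 25)
attr_xml = ET.tostring(to_attribute_form("&1", joe), encoding="unicode")
elem_xml = ET.tostring(to_element_form("&1", joe), encoding="unicode")
print(attr_xml)
print(elem_xml)
```

Both forms carry the same names and values; only the choice between attributes and child elements differs, which is one of the distinctions the data models on the following slides treat differently.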
XML (XPath) Data Model
• Each element or attribute is a node
• Edges indicate nesting
• Nodes contain information
• Tree is ordered

XML:
<person id="&1" name="Joe">
<age>25</age>
</person>

XPath tree:
root
  person (element)
    id="&1" (attribute)
    name="Joe" (attribute)
    \n (text)
    age (element)
      25 (text)
    \n (text)
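A short Python sketch shows that a DOM parser exposes the same kinds of nodes, including the whitespace text nodes. It uses the standard `xml.dom.minidom` module; the identifier `o1` stands in for `&1` (an assumption, since a bare `&` is not well-formed in an attribute value), and `describe` is a hypothetical helper.

```python
from xml.dom import minidom

# Same document as on the slide, with "o1" standing in for the "&1" id.
doc = minidom.parseString('<person id="o1" name="Joe">\n<age>25</age>\n</person>')
person = doc.documentElement


def describe(node):
    # Report the node kind and its name or text, as in the tree diagram.
    if node.nodeType == node.TEXT_NODE:
        return ("text", node.data)
    return ("element", node.tagName)


children = [describe(n) for n in person.childNodes]
attributes = sorted(person.attributes.items())

print(children)
print(attributes)
```

The line breaks before and after the age element show up as separate text nodes, and the children are returned in document order.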
Semistructured Data Model
• Each element or attribute is a node
• Edges indicate nesting
• Edges are labeled

XML:
<person id="&1" name="Joe">
<age>25</age>
</person>

Semistructured graph:
person -> &1
  name -> Joe
  age -> 25
Data Models Compared
Semistructured
• Insensitive to text order, whitespace, and attributes vs. elements
• Directed graph (many roots, can contain cycles)
XPath
• Captures text order, whitespace, attributes, and elements
• A tree (single root, no cycles)
[Figure: the semistructured graph and the XPath tree for the person example, side by side]
Outline
• Data
• Data model
• XML
• Query language
• XPath
• Metadata
• XML - METAXPath
• Future work
XPath
• W3C Recommendation - 1999
• Used in XQuery, XSLT, and XPointer
• Language for selecting locations in an XML document
• Query
  • Sequence of location steps separated by '/'
  • Location step: axis::node_test[predicate1]...[predicateN]
  • Evaluated with respect to a context node
  • Results in a node-set (actually a list of nodes!)
  • Step continues from nodes reached in previous step
Descendant Axis Example
[Figure: document tree. Nodes: root; person element with SSN="99…" attribute; dateOfBirth element; "This…" comment; name element; first, last, year, and month elements; initial="S" attribute; text nodes 1981, January, Douglas, and Susan]
Axes that Partition a Tree
• Ancestor, descendant, following, preceding, and self partition a tree.
[Figure: a tree divided into ancestor, preceding, self, following, and descendant regions]
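The partition property can be checked with a small Python sketch. It covers element nodes only (attribute nodes are not on these axes), the sample document is made up for illustration, and `partition` is a hypothetical helper rather than part of any XPath library.

```python
import xml.etree.ElementTree as ET


def partition(root, node):
    """Split the elements of a tree into the five axes relative to node."""
    order = list(root.iter())  # all elements in document order
    parent = {child: p for p in order for child in p}

    ancestors = []
    cur = node
    while cur in parent:
        cur = parent[cur]
        ancestors.append(cur)

    descendants = list(node.iter())[1:]  # iter() yields node itself first
    related = {id(n) for n in ancestors + descendants} | {id(node)}
    pos = next(i for i, n in enumerate(order) if n is node)

    return {
        "ancestor": ancestors,
        "descendant": descendants,
        "preceding": [n for n in order[:pos] if id(n) not in related],
        "following": [n for n in order[pos + 1:] if id(n) not in related],
        "self": [node],
    }


root = ET.fromstring("<a><b><c/></b><d><e/><f/></d><g/></a>")
parts = partition(root, root.find("d"))
tags = {axis: [n.tag for n in nodes] for axis, nodes in parts.items()}
print(tags)
```

For the context node d, every element of the sample document lands on exactly one of the five axes.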
XPath Node Test and Predicates
• Each node in result-set must pass node test
  • Is this an element node named person? person
  • Is this an element node? *
• Predicates are further tests (about other nodes)
  • Does node have a ssn attribute? [attribute::ssn]
Example
/child::person/child::*/child::last
[Figure: evaluation of the query on the person document tree]
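Step-by-step evaluation of this query can be sketched in Python. The evaluator below is a toy that handles only the child and descendant axes with name tests, not an XPath implementation. The sample document is an assumption modeled on the figure: which text belongs to first and last, and where the initial attribute sits, are guesses, and the comment node is omitted. The synthetic `root` element stands in for the XPath root node.

```python
import xml.etree.ElementTree as ET

DOC = """<person SSN="99">
  <dateOfBirth><year>1981</year><month>January</month></dateOfBirth>
  <name initial="S"><first>Susan</first><last>Douglas</last></name>
</person>"""


def step(context, axis, name_test):
    """Apply one location step to every node reached by the previous step."""
    result = []
    for node in context:
        if axis == "child":
            candidates = list(node)
        elif axis == "descendant":
            candidates = list(node.iter())[1:]  # skip the node itself
        else:
            raise ValueError(f"unsupported axis: {axis}")
        for c in candidates:
            # Node test: '*' accepts any element, otherwise match the name.
            if (name_test == "*" or c.tag == name_test) and c not in result:
                result.append(c)
    return result


def evaluate(root, steps):
    context = [root]  # an absolute path starts at the root
    for axis, name_test in steps:
        context = step(context, axis, name_test)
    return context


root = ET.Element("root")
root.append(ET.fromstring(DOC))

# /child::person/child::*/child::last
result = evaluate(root, [("child", "person"), ("child", "*"), ("child", "last")])
print([e.text for e in result])
```

Each step starts from the nodes the previous step produced: the first step reaches person, the second reaches its element children dateOfBirth and name, and the third keeps only the last child of name.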
XPath Examples
• The dateOfBirth children of person nodes
  /descendant::person/child::dateOfBirth
• The last text node
  /descendant::text()[position()=last()]
Abbreviated Syntax
• Think of file path specifications in Unix
• Year child of dateOfBirth
  child::dateOfBirth/child::year
  dateOfBirth/year
• name siblings
  parent::node()/child::name
  ../name
• All year nodes
  /descendant-or-self::node()/child::year
  //year
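The abbreviated forms can be tried with Python's standard `xml.etree.ElementTree`, which supports a subset of the abbreviated syntax. This is a sketch on an assumed sample document. Because ElementTree cannot step above the element a search starts from, the sibling query is written from the person element as `dateOfBirth/../name`.

```python
import xml.etree.ElementTree as ET

person = ET.fromstring(
    "<person>"
    "<dateOfBirth><year>1981</year><month>January</month></dateOfBirth>"
    "<name><first>Susan</first><last>Douglas</last></name>"
    "</person>"
)

# dateOfBirth/year: the year child of the dateOfBirth child
years_by_path = person.findall("dateOfBirth/year")

# ../name, reached from dateOfBirth: its name sibling
name_siblings = person.findall("dateOfBirth/../name")

# //year: all year nodes below the context node
all_years = person.findall(".//year")

print([e.text for e in years_by_path])
print([e.tag for e in name_siblings])
print([e.text for e in all_years])
```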
Outline
• Data
• Data model
• XML
• Query language
• XPath
• Metadata
• XML - METAXPath
• Future work
Metadata
• Database metadata
  • Schema, security, transaction time (versions)
• Web metadata
  • Author, language, subject, privacy
• Web metadata recommendations
  • RDF, RDD, P3P
• Features
  • Descriptive, but also exclusionary
  • Irregular
  • Multiple
  • Ad-hoc
A Movie Database
• Movie data
  • Bruce Willis stars in Colour of Night.
  • Colour of Night premiered 1/Jul/1995.
• Publication meta-data
  language: English
  URL: http://www.auc.dk
  publication date: 2/Apr/1997
  privacy/security: 'over 18'
  publication history: v1.2, modified 31/Jul/1998
  subject: Film, Suspense, Thriller
  namespace: http://www.auc.dk/movieDataDTD.xml
Movie Database Queries
• Metadata only
  • Retrieve information published at Danish web sites.
• Metadata compared to data
  • Find reviews published in the first week of the movie's release.
• Metadata and data, but independent
  • Get suspense films starring Bruce Willis.
Properties of a Metadata Data Model
• Goal: Same query language for data and metadata
  • User learns "one" language
  • Compiler/optimization reuse
• Challenges
  • Data and metadata in different dataspaces
  • Query on data should not accidentally query metadata
  • Meta-metadata
  • Metadata for metadata
  • Metadata has semantics
  • Data with/without metadata
METAXPath Data Model
• Data model
  • Reuse XPath data model
  • Meta attribute points to metadata tree
  • "Right angle" data model
• Features
  • Minimal extension of XPath
  • Backwards-compatible
Example
• Data
  <?xml version="1.0"?>
  <person ssn="234">
    <name>Ichiro</name>
  </person>
• URL metadata
  <source URL="www.wsu.edu/p.htm"/>
• Language metadata of person element
  <language>English</language>
• Author meta-metadata - language metadata author
  <author name="Suzuki"/>
[Figure: XPath data model tree for the person data]
Type: root
  Type: element, Value: person, Attributes: {(ssn, 234)}
    Type: text, Value: \n\t
    Type: element, Value: name, Attributes: {}
      Type: text, Value: Ichiro
    Type: text, Value: \n
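[Figure: the data tree with the URL metadata, <source URL="www.wsu.edu/p.htm"/>, attached to the data root]
Type: root, Meta: points to the URL metadata tree
  Type: element, Value: person, Attributes: {(ssn, 234)}
    Type: text, Value: \n\t
    Type: element, Value: name, Attributes: {}
      Type: text, Value: Ichiro
    Type: text, Value: \n
URL metadata tree
Type: root
  Type: element, Value: source, Attributes: {(URL, www.wsu.edu/p.htm)}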
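[Figure: the data tree with the language metadata, <language>English</language>, attached to the person element]
Type: root, Meta: points to the URL metadata tree
  Type: element, Value: person, Attributes: {(ssn, 234)}, Meta: points to the language metadata tree
    Type: text, Value: \n\t
    Type: element, Value: name, Attributes: {}
      Type: text, Value: Ichiro
    Type: text, Value: \n
URL metadata tree
Type: root
  Type: element, Value: source, Attributes: {(URL, www.wsu.edu/p.htm)}
Language metadata tree
Type: root
  Type: element, Value: language, Attributes: {}
    Type: text, Value: English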
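[Figure: the author meta-metadata, <author name="Suzuki"/>, attached to the language metadata]
Type: root, Meta: points to the URL metadata tree
  Type: element, Value: person, Attributes: {(ssn, 234)}, Meta: points to the language metadata tree
    Type: text, Value: \n\t
    Type: element, Value: name, Attributes: {}
      Type: text, Value: Ichiro
    Type: text, Value: \n
URL metadata tree
Type: root
  Type: element, Value: source, Attributes: {(URL, www.wsu.edu/p.htm)}
Language metadata tree
Type: root, Meta: points to the author metadata tree
  Type: element, Value: language, Attributes: {}
    Type: text, Value: English
Author metadata tree
Type: root
  Type: element, Value: author, Attributes: {(name, Suzuki)}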
Sharing and Excluding Metadata
• Meta property points to metadata for a node
• Shared pointers ==> shared metadata
• To share with child
  • Copy pointer
• To exclude from child
  • Duplicate excluded portion
  • Copy remaining shared pointers
Share metadata with descendants
[Figure: the data, URL, language, and author trees from the example, with a Meta pointer on the name element, on the language element, and on each text node]
Ichiro text not authored by Suzuki
[Figure: the same trees, except that the Meta pointer of the text node Ichiro leads to a separate metadata root]
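The sharing and exclusion rules can be sketched with a small Python structure. This is an interpretation, not the authors' implementation: the `Node` class and its field names are hypothetical, and the exclusion step assumes that "duplicate excluded portion" means creating a new metadata root that reuses the language element but carries no pointer to the author tree.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Node:
    type: str                      # "root", "element", or "text"
    value: str = ""
    attributes: dict = field(default_factory=dict)
    children: list = field(default_factory=list)
    meta: Optional["Node"] = None  # root of this node's metadata tree


# Author meta-metadata and the language metadata it describes.
author_root = Node("root", children=[Node("element", "author", {"name": "Suzuki"})])
language = Node("element", "language", children=[Node("text", "English")])
language_root = Node("root", children=[language], meta=author_root)

# Data tree.
ichiro = Node("text", "Ichiro")
name = Node("element", "name", children=[ichiro])
person = Node("element", "person", {"ssn": "234"}, [name], meta=language_root)

# Share with children: copy the pointer.
name.meta = person.meta
ichiro.meta = name.meta

# Exclude the author meta-metadata for the Ichiro text: duplicate the
# metadata root without its Meta pointer, and copy the remaining pointers.
ichiro.meta = Node("root", children=list(language_root.children), meta=None)

print(name.meta is person.meta)                  # shared metadata
print(ichiro.meta.children[0] is language)       # language is still shared
print(ichiro.meta.meta is None)                  # author is excluded
```

Only the duplicated root is new; the language element itself is still a single shared node, so a change to it is visible from person, name, and the Ichiro text alike.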
METAXPath Queries
• XPath plus level shift operation
  • meta axis
  • ^ in abbreviated syntax
• Example - Locate data nodes with URL metadata of p.htm
  /descendant-or-self::*[meta::*/child::source[attribute::URL="p.htm"]]
• In abbreviated syntax
  //*[^source[@URL="p.htm"]]
• Example - Locate the URL metadata
  //*^source/@URL
• Example - Locate data that has metadata authored by Suzuki (meta-metadata)
  //*[^//*^author[@name="Suzuki"]]
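The effect of the level shift can be emulated in Python over a toy node structure. This is a sketch of what the three example queries select, not a METAXPath parser. The `Node` class, the helper names, and the way metadata is attached in the sample data (source on name, language on person, author on language) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Node:
    type: str
    value: str = ""
    attributes: dict = field(default_factory=dict)
    children: list = field(default_factory=list)
    meta: Optional["Node"] = None  # root of the metadata tree, if any


def elements(root):
    """All element nodes at or below root, in document order (like //*)."""
    found = [root] if root.type == "element" else []
    for child in root.children:
        found.extend(elements(child))
    return found


def meta_children(node, name):
    """Emulates ^name: shift to the metadata tree, then child::name."""
    if node.meta is None:
        return []
    return [c for c in node.meta.children
            if c.type == "element" and c.value == name]


def with_source_url(data_root, url):
    # //*[^source[@URL=url]]
    return [n for n in elements(data_root)
            if any(s.attributes.get("URL") == url
                   for s in meta_children(n, "source"))]


def source_urls(data_root):
    # //*^source/@URL
    urls = []
    for n in elements(data_root):
        for s in meta_children(n, "source"):
            if "URL" in s.attributes and s.attributes["URL"] not in urls:
                urls.append(s.attributes["URL"])
    return urls


def authored_by(data_root, author):
    # //*[^//*^author[@name=author]]
    return [n for n in elements(data_root)
            if n.meta is not None
            and any(a.attributes.get("name") == author
                    for m in elements(n.meta)
                    for a in meta_children(m, "author"))]


# Hypothetical arrangement of the example data and metadata.
author_root = Node("root", children=[Node("element", "author", {"name": "Suzuki"})])
language = Node("element", "language", children=[Node("text", "English")],
                meta=author_root)
language_root = Node("root", children=[language])
source_root = Node("root", children=[
    Node("element", "source", {"URL": "www.wsu.edu/p.htm"})])
name = Node("element", "name", children=[Node("text", "Ichiro")], meta=source_root)
person = Node("element", "person", {"ssn": "234"}, [name], meta=language_root)
data_root = Node("root", children=[person])

print([n.value for n in with_source_url(data_root, "www.wsu.edu/p.htm")])
print(source_urls(data_root))
print([n.value for n in authored_by(data_root, "Suzuki")])
```

A plain data query such as `elements(data_root)` never enters a metadata tree; metadata is reached only through the explicit shift in `meta_children`, which is the separation of dataspaces the data model aims for.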
Outline
• Data
• Data model
• XML
• Query language
• XPath
• Metadata
• XML - METAXPath
• Future work
Metadata Semantics
• Transaction time example
[Figure: graph with nodes &1, &2, and &3 and these labeled edges]
  name: reviewed, trans. time: [1/Sep/1999 - uc]
  name: movie
  name: title, trans. time: [1/Aug/1998 - uc], value: Colour of Night
  name: title, trans. time: [2/Apr/1997 - 31/Jul/1998], value: Color of Night
  Annotation: Not a path!
AUCQL Collapse Example
• PropertyCollapse for name is concatenation, for trans. time it is temporal intersection.
[Figure: the graph from the transaction time example, with the collapsed result for each path]
  Edges:
    name: reviewed, trans. time: [1/Sep/1999 - uc]
    name: movie
    name: title, trans. time: [1/Aug/1998 - uc], value: Colour of Night
    name: title, trans. time: [2/Apr/1997 - 31/Jul/1998], value: Color of Night
  Collapsed paths:
    name: reviewed.movie.title, trans. time: [1/Sep/1999 - uc] (Colour of Night)
    name: reviewed.movie.title, trans. time: undefined (Color of Night)
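The two collapse rules can be reproduced with a short Python sketch. It is not AUCQL code: the edge dictionaries and the `collapse` and `intersect` helpers are hypothetical, `date.max` stands in for "uc", and an edge without a trans. time property is assumed to place no constraint on the result.

```python
from datetime import date

UC = date.max              # stands in for "uc"
ALWAYS = (date.min, UC)    # assumed default for edges with no trans. time


def intersect(a, b):
    """Temporal intersection of two closed intervals; None means undefined."""
    if a is None or b is None:
        return None
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


def collapse(path):
    """Collapse a path of edges: concatenate names, intersect trans. times."""
    name = ".".join(edge["name"] for edge in path)
    time = ALWAYS
    for edge in path:
        time = intersect(time, edge.get("time", ALWAYS))
    return {"name": name, "time": time}


reviewed = {"name": "reviewed", "time": (date(1999, 9, 1), UC)}
movie = {"name": "movie"}
title_colour = {"name": "title", "time": (date(1998, 8, 1), UC)}
title_color = {"name": "title", "time": (date(1997, 4, 2), date(1998, 7, 31))}

colour = collapse([reviewed, movie, title_colour])
color = collapse([reviewed, movie, title_color])
print(colour)
print(color)
```

The path to Colour of Night collapses to the interval starting 1/Sep/1999, while the path to Color of Night has an empty intersection, which matches the "undefined" result on the slide and explains why that sequence of edges is not a path.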
AUCQL Additional Operations
• Coalesce - compute a distributed property value
[Figure: nodes &1 and &2 with these labeled edges]
  name: review, security! subscriber, trans. time: [16/Jul/1999 - uc]
  name: review, security! developer, trans. time: [1/Jul/1999 - 15/Jul/1999]
  trans. time: [1/Jul/1999 - uc]
Thin Layer Implementation
[Figure labels: METAXPath query, Metadata encoding, METAXPath Compiler, XPath query, XPath Compiler, DB, result]
Prototype Implementation
[Figure labels: XML, RDF, XML Parser, Indexing, METAXPath query, METAXPath Compiler, Perl Evaluation Tree, Query Evaluation Engine, Perl DBM Database API, result]
Summary
• METAXPath website
• http://www.eecs.wsu.edu/~cdyreson/pub/MetaXPath
• AUCQL website
• VLDB '99
• Implemented research prototype
• Free, downloadable, Unix environment
• http://www.eecs.wsu.edu/~cdyreson/pub/AUCQL
• Interactive query engine
• Tutorials