Chapter 3: Information Management
Gong Bin (龚斌)
School of Computer Science and Technology, Shandong University
Shandong High Performance Computing Center
Querying Information in XML Format
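What is XSL?
• XSL stands for eXtensible Stylesheet Language.
• The World Wide Web Consortium (W3C) started to develop XSL because there was a need for an XML-based stylesheet language.
• XSL is a markup language defined for XML documents. It provides capabilities far beyond those of CSS (Cascading Style Sheets), such as reordering elements. Simple XML can already be rendered with CSS, but complex, highly structured XML data or documents can only be presented to users by relying on the much stronger formatting capabilities of XSL.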
CSS - HTML Style Sheets
• CSS stands for Cascading Style Sheets.
• HTML uses predefined tags, and the meanings of these tags are well understood.
• The <table> element defines a table, and a browser knows how to display it.
XSL - XML Style Sheets
• XML does not use predefined tags, so the meanings of its tags are not well understood.
• A <table> element could mean an HTML table or a piece of furniture, so a browser does not know how to display it.
XSL - More Than a Style Sheet Language
• XSL consists of three parts:
  • XSLT is a language for transforming XML documents.
  • XPath is a language for defining parts (or patterns) of an XML document.
  • XSL-FO is a language for formatting XML documents, that is, for defining how XML is displayed.
What is XPath?
• XPath is a language for defining parts of an XML document.
• XPath is a set of syntax rules for locating specific information in an XML document.
• XPath uses paths to define XML elements.
• XPath defines a library of standard functions.
• XPath is a major element in XSLT.
• XPath is not written in XML.
• XPath is a W3C Standard.
Like Traditional File Paths
• XPath uses path expressions to identify nodes in an XML document.
• These path expressions look very much like paths in a computer file system, for example: w3schools/xpath/default.asp
XPath Example
• /catalog selects the root element catalog.
• /catalog/cd selects all the cd elements of the catalog element.
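The two path expressions above can be tried out with Python's standard xml.etree.ElementTree module. This is only a rough sketch: the catalog content is invented for illustration, and ElementTree implements a limited subset of XPath that evaluates paths relative to the element they are called on, so /catalog/cd is written as "cd" from the root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog, invented for illustration only.
CATALOG_XML = """
<catalog>
  <cd><title>Album A</title><price>10.90</price></cd>
  <cd><title>Album B</title><price>9.90</price></cd>
  <dvd><title>Film C</title><price>14.50</price></dvd>
</catalog>
"""

root = ET.fromstring(CATALOG_XML)

# /catalog : the root element itself.
root_name = root.tag

# /catalog/cd : every cd child of catalog. ElementTree paths are relative
# to the element they are called on, so from the root this is just "cd".
cd_titles = [cd.findtext("title") for cd in root.findall("cd")]

# //title : title elements at any depth below the root.
all_titles = [t.text for t in root.findall(".//title")]

print(root_name)
print(cd_titles)
print(all_titles)
```

The dvd element is skipped by the "cd" path but its title is still reached by ".//title", which is the difference between stepping through named children and searching all descendants.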
XPath
• XPath Syntax
  • XPath uses path expressions to locate nodes within XML documents.
  • http://xml.ee.ncku.edu.tw/~chip/lectures/web.2004/my/xpath_syntax.htm
• XPath Location Paths
  • A location path expression results in a node-set.
  • http://www.w3schools.com/xpath/xpath_location.asp
Location Path Expression
• An absolute location path starts with a slash: /step/step/...
• A relative location path does not: step/step/...
XPath
• XPath Expressions
  • XPath supports numerical, equality, relational, and Boolean expressions.
  • http://www.w3schools.com/xpath/xpath_expressions.asp
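As a small illustration of expressions used as predicates, the sketch below runs equality and positional tests through Python's xml.etree.ElementTree. The data is invented, and ElementTree supports only a subset of XPath: equality and position predicates work, while relational and Boolean operators are not available there.

```python
import xml.etree.ElementTree as ET

# Hypothetical data, invented for illustration only.
root = ET.fromstring(
    "<catalog>"
    "<cd country='UK'><title>Album A</title><artist>Artist X</artist></cd>"
    "<cd country='USA'><title>Album B</title><artist>Artist Y</artist></cd>"
    "<cd country='UK'><title>Album C</title><artist>Artist X</artist></cd>"
    "</catalog>"
)


def titles(path):
    """Return the title text of every cd matched by a path relative to the root."""
    return [cd.findtext("title") for cd in root.findall(path)]


by_attribute = titles("cd[@country='UK']")   # equality test on an attribute
by_child = titles("cd[artist='Artist Y']")   # equality test on a child element's text
first = titles("cd[1]")                      # numeric position, counted from 1
last = titles("cd[last()]")                  # last cd child of the root

print(by_attribute, by_child, first, last)
```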
XPath Functions
• XPath contains a function library for converting data.
• http://www.w3schools.com/xpath/xpath_functions.asp
XPath Functions
• Accessor
• AnyURI
• Node
• Error and Trace
• Boolean
• Sequence
• Numeric
• Duration/Date/Time
• Context
• String
• QName
XPath Examples
• Take XPath for a test drive
  • http://www.w3schools.com/xpath/xpath_examples.asp
• XPath Examples
  • http://www.zvon.org/xxl/XPathTutorial/General/examples.html
XPath Examples
• XPath Explorer
  • XPath Explorer (XPE) is a GUI application that lets you experiment with XPath interactively. Given an XPath expression and a URL (of an HTML or XML document), it displays the matching nodes and their values. This makes it easy to try out and debug your XPath expressions.
  • http://sourceforge.net/projects/xpe/
  • http://xml.ee.ncku.edu.tw/~chip/lectures/web.2004/xpe.jar
• Download xpe.jar
• Execute
  • java -jar xpe.jar
Conclusions
• XSL consists of three parts: XSLT, XPath, and XSL-FO.
• XPath uses path expressions to identify nodes in an XML document.
Reference
• Learn XSL: http://www.w3schools.com/xsl/xsl_languages.asp
• Learn XPath: http://www.w3schools.com/xpath/default.asp
What is XQuery?
• XQuery is about extracting information from XML documents.
• XQuery is a language for querying XML data.
• XQuery is built on XPath expressions.
• XQuery is to XML what SQL is to databases.
XQuery is About Querying XML
• XQuery is a language for finding and extracting (querying) data from XML documents.
• An example of a question XQuery can answer:
  • "Select all CD records with a price less than $10 from the CD collection stored in the XML document called cd_catalog.xml"
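In XQuery, the question above could be written roughly as doc("cd_catalog.xml")/catalog/cd[price < 10], assuming the file has a catalog root with cd children that each hold a price element. The Python sketch below mimics that query on an invented stand-in for cd_catalog.xml. Because ElementTree's XPath subset has no relational operators, the price comparison is done in ordinary Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for cd_catalog.xml; the titles and prices are invented.
CD_CATALOG = """
<catalog>
  <cd><title>Album A</title><price>10.90</price></cd>
  <cd><title>Album B</title><price>9.90</price></cd>
  <cd><title>Album C</title><price>7.20</price></cd>
</catalog>
"""


def cheap_cds(xml_text, limit):
    """Return the titles of cd elements whose price is below limit."""
    root = ET.fromstring(xml_text)
    return [
        cd.findtext("title")
        for cd in root.findall("cd")
        if float(cd.findtext("price")) < limit
    ]


print(cheap_cds(CD_CATALOG, 10))
```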
XQuery vs. XPath
• XPath
  • Common language for navigation, selection, extraction
  • Used in XSLT, XQuery, XPointer (a language for locating data within XML documents), XML Schema, XForms (XML forms), and others
• XQuery 1.0 and XPath 2.0 share the same data model, the same functions, and the same syntax.
XQuery vs. XSLT
• XSLT 1.0: XML → XML, HTML, Text
  • Loosely-typed scripting language
  • Format XML in HTML for display in browser
  • Must be highly tolerant of variability/errors in data
• XQuery 1.0: XML → XML
  • Strongly-typed query language
  • Large-scale database access
  • Must guarantee safety/correctness of operations on data
XQuery vs. XSLT
• Over time, XSLT and XQuery may both serve the needs of many application domains.
• XQuery will become a hidden, commodity language.
XQuery 1.0
• XQuery 1.0 = XPath 2.0 for navigation, selection, extraction
• + A few more expressions
  • For-Let-Where-Order By-Return (FLWOR)
  • XML construction
  • Operators on types
• + User-defined functions & modules
  • Modularize large queries
  • Process recursive data
• + Strong typing
  • Checks values of required type (operator, function)
  • Guarantees result value is an instance of the output type
  • Enforced statically or dynamically
Some XQuery Examples • http://xml.ee.ncku.edu.tw/~chip/lectures/web.2004/my/xquery-example.htm
XQuery FLWOR
• FLWOR stands for For-Let-Where-Order by-Return.
• For and Let: bind values to variables
• Where: filters these bindings
• Order by: sorts the surviving bindings
• Return: puts results together
Example
<?xml version="1.0" encoding="ISO-8859-1"?>
<bib>
  <book year="1994">
    <title>TCP/IP Illustrated</title>
    <author><last>Stevens</last><first>W.</first></author>
    <publisher>Addison-Wesley</publisher>
    <price>65.95</price>
  </book>
  <book year="1992">
    <title>Advanced Programming in the Unix environment</title>
    <author><last>Stevens</last><first>W.</first></author>
    <publisher>Addison-Wesley</publisher>
    <price>65.95</price>
  </book>
  <book year="2000">
    <title>Data on the Web</title>
    <author><last>Abiteboul</last><first>Serge</first></author>
    <author><last>Buneman</last><first>Peter</first></author>
    <author><last>Suciu</last><first>Dan</first></author>
    <publisher>Morgan Kaufmann Publishers</publisher>
    <price>39.95</price>
  </book>
  <book year="1999">
    <title>The Technology and Content for Digital TV</title>
    <editor>
      <last>Gerbarg</last><first>Darcy</first>
      <affiliation>CITI</affiliation>
    </editor>
    <publisher>Kluwer Academic Publishers</publisher>
    <price>129.95</price>
  </book>
</bib>
Extracting Nodes With FLWOR The following FLWOR expression:
Extracting Nodes With FLWOR
• The for clause selects all book nodes into a variable called $x.
• The where clause selects only the $x nodes with price elements with a value greater than 50.
• The order by clause orders the $x nodes by the value of the title elements.
• The return clause returns the title nodes.
Extracting Nodes With FLWOR
• Result:
<title>Advanced Programming in the Unix environment</title>
<title>TCP/IP Illustrated</title>
<title>The Technology and Content for Digital TV</title>
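A query matching the clause-by-clause description above might read: for $x in doc("bib.xml")/bib/book where $x/price > 50 order by $x/title return $x/title. The file name bib.xml is an assumption. The Python sketch below mirrors each clause on an abridged copy of the bib document (authors and publishers omitted). It returns the title text rather than title nodes, and it emulates FLWOR rather than running a real XQuery processor.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the bib document from the example slide.
BIB_XML = """
<bib>
  <book year="1994"><title>TCP/IP Illustrated</title><price>65.95</price></book>
  <book year="1992"><title>Advanced Programming in the Unix environment</title><price>65.95</price></book>
  <book year="2000"><title>Data on the Web</title><price>39.95</price></book>
  <book year="1999"><title>The Technology and Content for Digital TV</title><price>129.95</price></book>
</bib>
"""


def expensive_titles(xml_text, min_price):
    """Emulate for / where / order by / return over the book elements."""
    # for: bind each book element in turn.
    books = ET.fromstring(xml_text).findall("book")
    # where: keep only books whose price exceeds min_price.
    kept = [b for b in books if float(b.findtext("price")) > min_price]
    # order by: sort the surviving books by title.
    kept.sort(key=lambda b: b.findtext("title"))
    # return: hand back the title of each surviving book.
    return [b.findtext("title") for b in kept]


for title in expensive_titles(BIB_XML, 50):
    print(title)
```

With a threshold of 50, "Data on the Web" (39.95) is filtered out and the remaining three titles come back in alphabetical order, matching the result shown on the slide.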
Another Example
• XML document describing a catalog of books
<?xml version="1.0" encoding="ISO-8859-1" ?>
<catalog>
  <book isbn="ISBN 1565114302">
    <title>No Such Thing as a Bad Day</title>
    <author>Hamilton Jordan</author>
    <publisher>Longstreet Press, Inc.</publisher>
    <price currency="USD">17.60</price>
    <review>
      <reviewer>Publisher</reviewer>: This book is the moving account of one man's successful battles against three cancers ...<title>No Such Thing as a Bad Day</title> is warmly recommended.
    </review>
  </book>
  <!-- more books and specifications -->
</catalog>
Concepts inside FLWOR
• For each author, return the number of books and the receipts for books published in the past 2 years, ordered by name:
    let $cat := fn:doc("www.bn.com/catalog.xml"),
        $sales := fn:doc("www.publishersweekly.com/sales.xml")
    for $a in distinct-values($cat//author)
    let $books := $cat//book[@year >= 2000 and author = $a],
        $receipts := $sales/book[@isbn = $books/@isbn]/receipts
    order by $a
    return
      <sales>
        { $a }
        <count> { fn:count($books) } </count>
        <total> { fn:sum($receipts) } </total>
      </sales>
• Selection: the predicate [@year >= 2000 and author = $a]
• Join: the predicate [@isbn = $books/@isbn], which matches sales records to the selected books
• Ordering: order by $a
• XML construction: the <sales> element built in the return clause
• Aggregation: fn:count and fn:sum
XQuery functions
• fn:doc: returns the document node for a given URI
• fn:distinct-values: eliminates duplicates (by value) from a sequence
• fn:count: returns the number of items in a sequence
• fn:sum: returns the sum of the values in a sequence
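To make the roles of these functions concrete, the sketch below re-creates the per-author sales query from the preceding slide in plain Python. The two documents are invented stand-ins for catalog.xml and sales.xml, and the result is a list of tuples rather than constructed <sales> elements. A sorted set plays the part of distinct-values plus order by, while len and sum stand in for fn:count and fn:sum.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog and sales documents; all values are invented.
CATALOG = ET.fromstring("""
<catalog>
  <book isbn="1" year="2001"><author>Author B</author></book>
  <book isbn="2" year="2002"><author>Author A</author></book>
  <book isbn="3" year="1998"><author>Author A</author></book>
  <book isbn="4" year="2003"><author>Author A</author></book>
</catalog>
""")
SALES = ET.fromstring("""
<sales>
  <book isbn="1"><receipts>100</receipts></book>
  <book isbn="2"><receipts>250</receipts></book>
  <book isbn="3"><receipts>400</receipts></book>
  <book isbn="4"><receipts>50</receipts></book>
</sales>
""")


def sales_per_author(catalog, sales, since=2000):
    """Return (author, book count, total receipts) tuples, sorted by author."""
    # distinct-values + order by: unique author names in sorted order.
    authors = sorted({a.text for a in catalog.iter("author")})
    report = []
    for name in authors:
        # Selection: ISBNs of this author's books published in or after `since`.
        isbns = {
            b.get("isbn")
            for b in catalog.iter("book")
            if int(b.get("year")) >= since
            and name in [a.text for a in b.findall("author")]
        }
        # Join: receipts of sales records whose isbn matches a selected book.
        receipts = [
            float(s.findtext("receipts"))
            for s in sales.findall("book")
            if s.get("isbn") in isbns
        ]
        # Aggregation: count of selected books and sum of their receipts.
        report.append((name, len(isbns), sum(receipts)))
    return report


print(sales_per_author(CATALOG, SALES))
```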
XQuery Online Demo
• XQuery Use Cases: http://www.w3.org/TR/xquery-use-cases/
• X-Hive/DB: http://support.x-hive.com/xquery/index.html
• Galax: http://www.galaxquery.org/
XQuery Implementation
• Microsoft SQL Server 2005 (Yukon)
  • http://www.microsoft.com/sql/2005/default.asp
• SAXON: The XSLT and XQuery Processor
  • http://saxon.sourceforge.net/
• Qexo: The GNU Kawa implementation of XQuery
  • http://www.gnu.org/software/qexo/
• X-Hive/DB: The fastest and most scalable XML database powered by open standards
  • http://www.x-hive.com/products/db/index.html
Running an XQuery Processor
• Saxon
  • http://saxon.sourceforge.net/
• Run a query from the command line:
  • java -classpath saxon8.jar net.sf.saxon.Query q1.xq
Conclusions • XQuery is a language for finding and extracting (querying) data from XML documents.
Reference
• Learn XQuery: http://www.w3schools.com/xquery/default.asp
• Galax XQuery Implementation: http://www.galaxquery.org/doc.html#presentations
• Essential XQuery - The XML Query Language: http://www.yukonxml.com/articles/xquery/