
Spring 2011 Instructor: Hassan Khosravi

The Semistructured-Data Model Programming Languages for XML. Spring 2011 Instructor: Hassan Khosravi. Semistructured Data. Another data model, based on trees. Self-describing: The data implicitly carries information about what its schema is.


Presentation Transcript


  1. The Semistructured-Data Model Programming Languages for XML Spring 2011 Instructor: Hassan Khosravi

  2. Semistructured Data
     • Another data model, based on trees.
     • Self-describing:
        • The data implicitly carries information about what its schema is.
        • May only carry the names of attributes (so possibly untyped), and has a lower degree of organization than the data in a relational database.
        • May have no associated schema (i.e. may be schema-less).
     • Motivation:
        • Flexible representation of data.
        • Sharing of documents among systems and databases.
        • Information integration
           • E.g. we want to “merge” or query two databases.
        • Data exchange
           • E.g. two enterprises (such as buyers and sellers) may want to exchange data.
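
The "self-describing" and "information integration" points above can be sketched in a few lines of Python. This is only an illustration: the two sources, the field names (`title`, `ISBN`, `price`, `authors`, `remark`) and the helper `field_names` are assumptions made up for the example, not part of the slides.

```python
# Two hypothetical sources describing books. Each record carries its own
# field names, and the records do not share one fixed set of fields.
source_a = [
    {"title": "First Title", "ISBN": "b1", "price": 45,
     "authors": [{"name": "Author A"}, {"name": "Author B"}]},
]
source_b = [
    {"title": "Second Title", "remark": "no price or ISBN recorded"},
]


def field_names(node):
    """Collect the attribute names carried by the data itself."""
    names = set()
    if isinstance(node, dict):
        for key, value in node.items():
            names.add(key)
            names |= field_names(value)
    elif isinstance(node, list):
        for item in node:
            names |= field_names(item)
    return names


# "Merging" the two sources needs no schema change: the trees are simply
# placed side by side, and the implicit schema is recovered from the data.
merged = source_a + source_b
implicit_schema = field_names(merged)
```

A relational table would need either a common set of columns or NULL values to hold both records; here each record just omits what it does not have.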

  3. Semistructured Data representation

  4. Comparison with Relational Data
     • Inefficient: tags, which in effect represent schema information, are repeated.
     • Access: data is structured hierarchically.
     • Better than relational tuples as a data-exchange format:
        • Unlike relational tuples, semistructured data is self-documenting due to the presence of tags.
        • Flexible, non-rigid format: tags can be added.
        • Allows nested structures.
        • Wide acceptance, not only in database systems, but also in browsers, tools, and applications.

  5. Flexibility in Schema

  6. XML
     • XML: Extensible Markup Language
     • A W3C standard adopted in 1998
     • While HTML uses tags for formatting (e.g., “italic”), XML uses tags for semantics (e.g., indicating “this is an address” or “this is a title”).
     • Key idea: create tag sets for a domain (e.g., genomics), and translate all data into properly tagged XML documents.
     • There are two different modes of use of XML:
        • Well-formed XML allows you to invent your own tags.
           • No predefined schema
        • Valid XML conforms to a certain Document Type Definition (DTD).
           • The DTD describes the allowable tags and their nesting.
           • But still reasonably flexible, e.g. it may allow optional or missing fields.

  7. Well-Formed XML
     • Begins with a declaration that it is XML.
     • It has a single root element whose opening and closing tags enclose the entire body of the text.
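
The two conditions above can be checked with any XML parser. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the tag names (`Bookstore`, `Book`, `Title`, `Author`), the attribute values, and the helper `is_well_formed` are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Starts with an XML declaration and has exactly one root element.
WELL_FORMED = """<?xml version="1.0"?>
<Bookstore>
  <Book ISBN="b1" price="45">
    <Title>First Title</Title>
    <Author>Author A</Author>
  </Book>
</Bookstore>"""

# Two top-level elements, so no single root encloses the whole body.
NOT_WELL_FORMED = """<?xml version="1.0"?>
<Book/>
<Book/>"""


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


root = ET.fromstring(WELL_FORMED)
```

Note that the tags were freely invented: well-formedness says nothing about which tags are allowed, only that the document parses as a single properly nested tree.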

  8. Well-Formed XML Valid XML

  9. Valid XML
     • Document Type Definition (DTD)
     • Grammar-like language for specifying elements, attributes, nesting, ordering, and number of occurrences
     • Special attribute types ID and IDREF
     • Example
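
As a hedged illustration of what such a DTD might look like, the sketch below embeds a small DTD in a document and parses it with Python's standard library. All element and attribute names (`Bookstore`, `Book`, `Author`, `ident`, `author`) are hypothetical. The standard-library parser does not validate against the DTD, so the ID/IDREF consistency check that a validating parser would perform is approximated by hand in `dangling_refs`.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE Bookstore [
  <!ELEMENT Bookstore (Book*, Author*)>
  <!ELEMENT Book (Title, Remark?)>
  <!ATTLIST Book ISBN   ID    #REQUIRED
                 price  CDATA #REQUIRED
                 author IDREF #REQUIRED>
  <!ELEMENT Title (#PCDATA)>
  <!ELEMENT Remark (#PCDATA)>
  <!ELEMENT Author (#PCDATA)>
  <!ATTLIST Author ident ID #REQUIRED>
]>
<Bookstore>
  <Book ISBN="b1" price="45" author="a1">
    <Title>First Title</Title>
  </Book>
  <Book ISBN="b2" price="80" author="a2">
    <Title>Second Title</Title>
    <Remark>Optional element, allowed by the ? in the DTD</Remark>
  </Book>
  <Author ident="a1">Author A</Author>
  <Author ident="a2">Author B</Author>
</Bookstore>"""


def dangling_refs(root):
    """Return IDREF values on Book elements that match no Author ID."""
    author_ids = {a.get("ident") for a in root.iter("Author")}
    return [b.get("author") for b in root.iter("Book")
            if b.get("author") not in author_ids]


root = ET.fromstring(DOCUMENT)  # parses the document, but does not validate it
unresolved = dangling_refs(root)
```

The `*` and `?` markers express the number of occurrences, the comma expresses ordering, and the ID/IDREF attribute types let a `Book` point at an `Author` elsewhere in the tree.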

  10. Querying semistructured data

  11. Querying XML
     • Not nearly as mature as querying relational data
        • Newer
        • No underlying theory comparable to that of the relational model
     • Sequence of development
        • XPath: path expressions + conditions
        • XQuery: XPath + a full-featured query language

  12. XPath
     • Think of XML as a tree
     • Path expressions + conditions

  13. XPath Syntax
     • Axes (to navigate around the tree)
        • parent::
        • following-sibling::
        • descendant::
        • self::
     • / : the root element
     • Name of an element, e.g. book
     • Use * in place of a name to match any element
     • @ISBN : an attribute
     • // : matches all descendants
     • Conditions in square brackets, e.g. [@price < 50]
     • [N] : the Nth child, e.g. author[2]
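
Several of these constructs can be tried with Python's standard `xml.etree.ElementTree`, which implements only a limited subset of XPath. The sketch below is an illustration under that restriction: the sample document is invented, the named axes (`parent::`, `following-sibling::`, and so on) are not available in this module, and numeric conditions such as `[@price < 50]` are not supported either, so that filter is done in plain Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the root element is the context for all paths.
root = ET.fromstring("""<Bookstore>
  <Book ISBN="b1" price="45">
    <Title>First Title</Title>
    <Author>Author A</Author>
    <Author>Author B</Author>
  </Book>
  <Book ISBN="b2" price="80">
    <Title>Second Title</Title>
    <Author>Author C</Author>
  </Book>
</Bookstore>""")

# Element names separated by "/": the titles of all books.
titles = [t.text for t in root.findall("Book/Title")]

# "*" matches any element name: every child of every Book.
child_tags = [e.tag for e in root.findall("Book/*")]

# "//" matches descendants at any depth.
authors = [a.text for a in root.findall(".//Author")]

# [N] selects the Nth matching child (1-based): the second author of a book.
second_authors = [a.text for a in root.findall("Book/Author[2]")]

# A condition on an attribute; ElementTree supports equality tests only.
b2_title = root.find("Book[@ISBN='b2']/Title").text

# [@price < 50] is not supported here, so the comparison is done in Python.
cheap = [b.get("ISBN") for b in root.findall("Book")
         if float(b.get("price")) < 50]
```

A full XPath engine would accept the axes and the numeric condition directly; the subset above is enough to see path expressions and conditions working together on the tree.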

  14. XPath Demo • Example

  15. XQuery

  16. XQuery Demo • Example
