Maziar Sanaii Ashtiani – 105405 SCT – EMU, Fall 2011/12. Chapter 23 XML.
Introduction and Motivation
• HTML
• Standard Generalized Markup Language (SGML)
• eXtensible Markup Language (XML)
• Useful as a data format for exchange between applications
• Markup: anything in a document that is not intended to be part of the printed output
• Tags are enclosed in angle brackets
• <title>Database Systems Concepts</title>
Freedom
<university>
  <department>
    <dept_name> Comp. Sci. </dept_name>
    <building> Taylor </building>
    <budget> 100000 </budget>
  </department>
  <course>
    <course_id> CS-101 </course_id>
    <title> Intro. to Computer Science </title>
    <dept_name> Comp. Sci. </dept_name>
    <credits> 4 </credits>
  </course>
  <instructor>
    <IID> 10101 </IID>
    <name> Srinivasan </name>
    <dept_name> Comp. Sci. </dept_name>
    <salary> 65000 </salary>
  </instructor>
  <teaches>
    <IID> 10101 </IID>
    <course_id> CS-101 </course_id>
  </teaches>
</university>
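To illustrate how an application can read such a document, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The shortened sample data and the underscored tag names (dept_name, course_id) are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Shortened sample document; underscored tag names are an assumption.
UNIVERSITY_XML = """
<university>
  <department>
    <dept_name>Comp. Sci.</dept_name>
    <building>Taylor</building>
    <budget>100000</budget>
  </department>
  <course>
    <course_id>CS-101</course_id>
    <title>Intro. to Computer Science</title>
    <dept_name>Comp. Sci.</dept_name>
    <credits>4</credits>
  </course>
</university>
"""

root = ET.fromstring(UNIVERSITY_XML)

# Each child of the root becomes (tag, {subelement tag: text}).
# The tag names alone say what every value means.
records = [
    (child.tag, {field.tag: field.text for field in child})
    for child in root
]

for tag, fields in records:
    print(tag, fields)
```

No schema was needed to read the data: the tags carry the field names, which is the "freedom" this slide refers to.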
Advantages
• Tags are self-documenting
• No rigid format
• Can evolve over time
• Nested structures
• Widely accepted
• Lots of tools
XML has become THE dominant format for data exchange
Structure
• Elements
• Single root
• Proper nesting
• Properly nested: <course> . . . <title> . . . </title> . . . </course>
• Not properly nested: <course> . . . <title> . . . </course> . . . </title>
• Text appears in the context of an element
• Text may be mixed with subelements
• Nesting to avoid joins (fig. 23.5, 23.6)
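The single-root and proper-nesting rules can be checked with any XML parser. The sketch below uses Python's xml.etree.ElementTree. The helper name is_well_formed and the sample strings are hypothetical, chosen only for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if text parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


proper = "<course><title>Intro</title></course>"
improper = "<course><title>Intro</course></title>"  # tags overlap
two_roots = "<course/><course/>"                     # more than one root

print(is_well_formed(proper))
print(is_well_formed(improper))
print(is_well_formed(two_roots))
```

Only the first string parses. The other two break the nesting rule and the single-root rule respectively.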
Structure (Cont’d)
• Attributes
• name = value
• Strings
• Useful as identifiers
• Namespace
• <university xmlns:yale="http://www.yale.edu">
• Literal values
• <![CDATA[<course> · · ·</course>]]>
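A short Python sketch of how attributes, namespace prefixes, and CDATA sections look to a parser (xml.etree.ElementTree). The course_id attribute, the note element, and the sample text are assumptions added for illustration only.

```python
import xml.etree.ElementTree as ET

doc = """
<university xmlns:yale="http://www.yale.edu">
  <yale:course course_id="CS-101">
    <yale:title>Intro. to Computer Science</yale:title>
    <note><![CDATA[<course> is a tag name, not markup here]]></note>
  </yale:course>
</university>
"""

root = ET.fromstring(doc)
ns = {"yale": "http://www.yale.edu"}  # prefix-to-URI map used for lookups

course = root.find("yale:course", ns)
print(course.tag)                # expanded name: {namespace URI}local-name
print(course.get("course_id"))   # attribute values are always strings
print(course.find("note").text)  # CDATA content arrives as plain text
```

The parser replaces the prefix with the full namespace URI, so two documents using different prefixes for the same URI refer to the same element names.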
XML Document Schema
• Databases have schemas
• XML
• Document Type Definition (DTD)
• XML Schema
• Relax NG
DTD
<!DOCTYPE university [
  <!ELEMENT university ( (department|course|instructor|teaches)+)>
  <!ELEMENT department (dept_name, building, budget)>
  <!ELEMENT course (course_id, title, dept_name, credits)>
  <!ELEMENT instructor (IID, name, dept_name, salary)>
  <!ELEMENT teaches (IID, course_id)>
  <!ELEMENT dept_name (#PCDATA)>
  <!ELEMENT building (#PCDATA)>
  <!ELEMENT budget (#PCDATA)>
  <!ELEMENT course_id (#PCDATA)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT credits (#PCDATA)>
  <!ELEMENT IID (#PCDATA)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT salary (#PCDATA)>
] >
DTD (Cont’d)
<!DOCTYPE university-3 [
  <!ELEMENT university ( (department|course|instructor)+)>
  <!ELEMENT department (building, budget)>
  <!ATTLIST department
      dept_name ID #REQUIRED >
  <!ELEMENT course (title, credits)>
  <!ATTLIST course
      course_id ID #REQUIRED
      dept_name IDREF #REQUIRED
      instructors IDREFS #IMPLIED >
  <!ELEMENT instructor (name, salary)>
  <!ATTLIST instructor
      IID ID #REQUIRED
      dept_name IDREF #REQUIRED >
  · · · declarations for title, credits, building, budget, name and salary · · ·
] >
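DTD Limitations
• No type constraints on text elements and attributes
• Applications must verify the data themselves
• No numeric limits on the number of occurrences
• ID and IDREF are untyped: an IDREF may refer to any ID in the document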
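The last limitation can be made concrete with a small Python sketch. The standard library does not validate against DTDs, so the ID and IDREF roles are copied by hand from the ATTLIST declarations above (with underscored attribute names assumed). The sample data and the helper dangling_refs are hypothetical.

```python
import xml.etree.ElementTree as ET

# Attribute roles taken from the ATTLIST declarations (an assumption of this sketch).
ID_ATTRS = {"department": "dept_name", "course": "course_id", "instructor": "IID"}
IDREF_ATTRS = {"course": ["dept_name"], "instructor": ["dept_name"]}

# Note: the course below "belongs" to I10101, which is an instructor ID.
doc = """
<university>
  <department dept_name="CompSci"/>
  <instructor IID="I10101" dept_name="CompSci"/>
  <course course_id="CS-101" dept_name="I10101"/>
</university>
"""

root = ET.fromstring(doc)

# All ID values share one document-wide pool, whatever element declares them.
all_ids = {el.get(ID_ATTRS[el.tag]) for el in root if el.tag in ID_ATTRS}


def dangling_refs(root):
    """Return (tag, attribute, value) for every IDREF that matches no ID."""
    missing = []
    for el in root:
        for attr in IDREF_ATTRS.get(el.tag, []):
            value = el.get(attr)
            if value not in all_ids:
                missing.append((el.tag, attr, value))
    return missing


print(dangling_refs(root))
```

The check reports no dangling references even though the course names an instructor as its department. An IDREF only has to match some ID, not an ID of the intended kind.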
XML Schema
• Developed to address the deficiencies of DTDs
• Built-in types: string, integer, decimal, …
• User-defined types
XML Schema (Cont’d)
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="university" type="UniversityType"/>
  <xs:element name="department">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="dept_name" type="xs:string"/>
        <xs:element name="building" type="xs:string"/>
        <xs:element name="budget" type="xs:decimal"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="course">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="course_id" type="xs:string"/>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="dept_name" type="xs:string"/>
        <xs:element name="credits" type="xs:decimal"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  · · · declarations for instructor and teaches · · ·
  <xs:complexType name="UniversityType">
    <xs:sequence>
      <xs:element ref="department" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="course" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="instructor" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="teaches" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
• Attributes are declared with xs:attribute: <xs:attribute name="dept_name"/>
XML Schema (Cont’d)
• Primary key (PK)
  <xs:key name="deptKey">
    <xs:selector xpath="/university/department"/>
    <xs:field xpath="dept_name"/>
  </xs:key>
• Foreign key (FK)
  <xs:keyref name="courseDeptFKey" refer="deptKey">
    <xs:selector xpath="/university/course"/>
    <xs:field xpath="dept_name"/>
  </xs:keyref>
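What the key and keyref constraints require can be sketched in Python. The standard library has no XML Schema validator, so this is a hand-written check of the same two rules, not a use of a validator. The helper field_values and the sample data (including the unknown department "Physics") are hypothetical.

```python
import xml.etree.ElementTree as ET

doc = """
<university>
  <department><dept_name>Comp. Sci.</dept_name></department>
  <department><dept_name>Biology</dept_name></department>
  <course><course_id>CS-101</course_id><dept_name>Comp. Sci.</dept_name></course>
  <course><course_id>PHY-101</course_id><dept_name>Physics</dept_name></course>
</university>
"""
root = ET.fromstring(doc)


def field_values(root, selector, field):
    """Pick the elements matched by selector, then read one field from each."""
    return [el.findtext(field) for el in root.findall(selector)]


# Key: selector /university/department, field dept_name; values must be unique.
dept_keys = field_values(root, "department", "dept_name")
key_is_unique = len(dept_keys) == len(set(dept_keys))

# Keyref: every course's dept_name must equal some department key.
course_refs = field_values(root, "course", "dept_name")
violations = [v for v in course_refs if v not in set(dept_keys)]

print(key_is_unique, violations)
```

Unlike ID/IDREF, the selector scopes each constraint to one kind of element, so a course can only reference a department.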
XML Schema Benefits
• Constraints
• User-defined types
• PK and FK
• Integrated namespaces
• Min and Max values
• Type extension by inheritance
Query and Transformation
• XPath
• Language for path expressions
• XQuery
• Standard language for querying XML
• Modeled after SQL, but different
• Designed to deal with nested XML data
Tree Model of XML and XPath
• Trees and nodes
• Elements and attributes
• XPath 2.0
• /university-3/instructor/name returns:
• <name>Srinivasan</name>
• <name>Brandt</name>
XPath features
• Selection
• /university-3/course[credits >= 4]/@course_id
• Functions
• count()
• /university-2/instructor[count(./teaches/course) > 2]
• id("foo")
• Union "|"
• …
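Path expressions like these can be tried from Python. The standard library's xml.etree.ElementTree implements only a subset of XPath (no numeric comparisons, no count()), so the sketch below does the missing parts in plain Python. The sample data is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

doc = """
<university-3>
  <course course_id="CS-101"><title>Intro. to Computer Science</title><credits>4</credits></course>
  <course course_id="CS-315"><title>Robotics</title><credits>3</credits></course>
  <instructor IID="I10101"><name>Srinivasan</name></instructor>
  <instructor IID="I83821"><name>Brandt</name></instructor>
</university-3>
"""
root = ET.fromstring(doc)

# /university-3/instructor/name : a plain path expression.
names = [n.text for n in root.findall("./instructor/name")]

# /university-3/course[credits >= 4]/@course_id : ElementTree predicates only
# test equality, so the numeric comparison is done in Python instead.
big_courses = [
    c.get("course_id")
    for c in root.findall("./course")
    if float(c.findtext("credits")) >= 4
]

# count(...) : the analogue here is len() over the matched node list.
num_instructors = len(root.findall("./instructor"))

print(names, big_courses, num_instructors)
```

A full XPath engine would evaluate the predicates and functions inside the expression itself. The split between findall and Python code here is a limitation of the chosen library, not of XPath.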
XQuery
• W3C
• XQuery 1.0
• for
• let
• where
• order by
• return
XQuery (Cont’d)
  for $x in /university-3/course
  let $courseId := $x/@course_id
  where $x/credits > 3
  return <course_id> { $courseId } </course_id>
is equivalent to
  for $x in /university-3/course[credits > 3]
  return <course_id> { $x/@course_id } </course_id>
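For readers more familiar with procedural code, the two equivalent queries can be mirrored in Python. This is only an analogy under assumed sample data: it returns the course identifiers as plain strings instead of constructing course_id elements.

```python
import xml.etree.ElementTree as ET

doc = """
<university-3>
  <course course_id="CS-101"><credits>4</credits></course>
  <course course_id="CS-315"><credits>3</credits></course>
  <course course_id="BIO-301"><credits>4</credits></course>
</university-3>
"""
root = ET.fromstring(doc)

# First form: one Python statement per FLWOR clause.
flwor_form = []
for x in root.findall("./course"):            # for
    course_id = x.get("course_id")            # let
    if float(x.findtext("credits")) > 3:      # where
        flwor_form.append(course_id)          # return

# Second form: the condition moves into the iteration source,
# like the predicate [credits > 3] in the path expression.
path_form = [
    x.get("course_id")
    for x in root.findall("./course")
    if float(x.findtext("credits")) > 3
]

print(flwor_form, path_form)
```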
XQuery Joins
  for $c in /university/course,
      $i in /university/instructor,
      $t in /university/teaches
  where $c/course_id = $t/course_id and $t/IID = $i/IID
  return <course_instructor> { $c, $i } </course_instructor>
which is equivalent to
  for $c in /university/course,
      $i in /university/instructor,
      $t in /university/teaches[ $c/course_id = $t/course_id and $t/IID = $i/IID]
  return <course_instructor> { $c, $i } </course_instructor>
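The join has the shape of a nested loop with a filter, which the following Python sketch spells out. The sample data is assumed, and the result is reduced to (title, name) pairs instead of course_instructor elements to keep the example short.

```python
import xml.etree.ElementTree as ET

doc = """
<university>
  <course><course_id>CS-101</course_id><title>Intro. to Computer Science</title></course>
  <instructor><IID>10101</IID><name>Srinivasan</name></instructor>
  <instructor><IID>83821</IID><name>Brandt</name></instructor>
  <teaches><IID>10101</IID><course_id>CS-101</course_id></teaches>
</university>
"""
root = ET.fromstring(doc)

pairs = []
for c in root.findall("course"):              # $c in /university/course
    for i in root.findall("instructor"):      # $i in /university/instructor
        for t in root.findall("teaches"):     # $t in /university/teaches
            # where clause: both join conditions must hold
            if (c.findtext("course_id") == t.findtext("course_id")
                    and t.findtext("IID") == i.findtext("IID")):
                pairs.append((c.findtext("title"), i.findtext("name")))

print(pairs)
```

Brandt does not appear in the result because no teaches element links that instructor to a course.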
Functions and Types
  declare function local:dept_courses($iid as xs:string) as element(course)* {
    for $i in /university/instructor[IID = $iid],
        $c in /university/course[dept_name = $i/dept_name]
    return $c
  };
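A rough Python counterpart of this function, assuming the element layout of the earlier university example. The type annotations play the role of the XQuery parameter and return types, and the sample data is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List

doc = """
<university>
  <instructor><IID>10101</IID><name>Srinivasan</name><dept_name>Comp. Sci.</dept_name></instructor>
  <course><course_id>CS-101</course_id><dept_name>Comp. Sci.</dept_name></course>
  <course><course_id>BIO-301</course_id><dept_name>Biology</dept_name></course>
  <course><course_id>CS-315</course_id><dept_name>Comp. Sci.</dept_name></course>
</university>
"""
root = ET.fromstring(doc)


def dept_courses(root: ET.Element, iid: str) -> List[ET.Element]:
    """Courses offered by the department of the instructor with the given IID."""
    return [
        c
        for i in root.findall("instructor")
        if i.findtext("IID") == iid
        for c in root.findall("course")
        if c.findtext("dept_name") == i.findtext("dept_name")
    ]


print([c.findtext("course_id") for c in dept_courses(root, "10101")])
```

As with element(course)*, the result may be empty: an unknown IID simply yields an empty list.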
API to XML
• Document Object Model (DOM)
• Java DOM API
• Simple API for XML (SAX)
• Event model
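The slide names the Java DOM API. The same two styles exist in Python's standard library (xml.dom.minidom and xml.sax), which this sketch uses to contrast them. The TitleHandler class and the sample document are hypothetical.

```python
import xml.sax
from xml.dom import minidom

DOC = (
    "<university>"
    "<course><title>Intro. to Computer Science</title></course>"
    "<course><title>Genetics</title></course>"
    "</university>"
)

# DOM: the whole document becomes an in-memory tree that can be navigated.
dom = minidom.parseString(DOC)
dom_titles = [t.firstChild.data for t in dom.getElementsByTagName("title")]


# SAX: the parser fires events; the handler keeps only the state it needs.
class TitleHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self.titles.append("")

    def characters(self, content):
        # Text may arrive in several chunks, so append instead of assigning.
        if self._in_title:
            self.titles[-1] += content

    def endElement(self, name):
        if name == "title":
            self._in_title = False


handler = TitleHandler()
xml.sax.parseString(DOC.encode("utf-8"), handler)

print(dom_titles, handler.titles)
```

DOM is convenient when the application needs random access to the tree. The event model suits large documents because nothing has to be held in memory beyond what the handler stores.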
Storage of XML Data
• Non-relational Data Stores
• Flat files (no ACID guarantees)
• XML Database
• Early systems: DOM implemented on a C++-based object-oriented database
Storage of XML Data (Cont’d)
• Relational Databases
• Store as string
• clob
• Tree Representation
• Map to Relations
• Publishing and Shredding XML Data
• Native Storage within Relational Database
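Shredding (mapping XML elements to rows of a relation) can be sketched with Python's sqlite3 and xml.etree.ElementTree. The table layout and the sample data are assumptions for this example. A real system would derive the mapping from the document schema.

```python
import sqlite3
import xml.etree.ElementTree as ET

doc = """
<university>
  <course><course_id>CS-101</course_id><title>Intro. to Computer Science</title>
          <dept_name>Comp. Sci.</dept_name><credits>4</credits></course>
  <course><course_id>BIO-301</course_id><title>Genetics</title>
          <dept_name>Biology</dept_name><credits>4</credits></course>
</university>
"""
root = ET.fromstring(doc)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE course ("
    "course_id TEXT PRIMARY KEY, title TEXT, dept_name TEXT, credits INTEGER)"
)

# Shred: each <course> element becomes one tuple, each subelement one column.
rows = [
    (
        c.findtext("course_id"),
        c.findtext("title"),
        c.findtext("dept_name"),
        int(c.findtext("credits")),
    )
    for c in root.findall("course")
]
conn.executemany("INSERT INTO course VALUES (?, ?, ?, ?)", rows)

stored = conn.execute(
    "SELECT course_id, credits FROM course ORDER BY course_id"
).fetchall()
print(stored)
```

Once shredded, the data can be queried with ordinary SQL, at the price of losing the original document order and nesting unless extra columns record them.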
SQL/XML
  select xmlelement(name "course",
           xmlattributes(course_id as course_id, dept_name as dept_name),
           xmlelement(name "title", title),
           xmlelement(name "credits", credits))
  from course
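SQLite does not provide the SQL/XML functions, so the following Python sketch only imitates what the query above produces: one course element per row, with two columns published as attributes and two as subelements. The helper course_to_xml and the sample row are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE course (course_id TEXT, title TEXT, dept_name TEXT, credits INTEGER)"
)
conn.execute(
    "INSERT INTO course VALUES "
    "('CS-101', 'Intro. to Computer Science', 'Comp. Sci.', 4)"
)


def course_to_xml(row):
    """Publish one course tuple as a <course> element (serialized to text)."""
    course_id, title, dept_name, credits = row
    # xmlattributes(...) corresponds to the attribute dictionary.
    course = ET.Element("course", {"course_id": course_id, "dept_name": dept_name})
    # Each nested xmlelement(...) corresponds to one subelement.
    ET.SubElement(course, "title").text = title
    ET.SubElement(course, "credits").text = str(credits)
    return ET.tostring(course, encoding="unicode")


published = [
    course_to_xml(r)
    for r in conn.execute("SELECT course_id, title, dept_name, credits FROM course")
]
print(published[0])
```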
XML Applications
• Storing Data With Complex Structure
• Open Document Format (ODF)
• Office Open XML (OOXML)
• Standardized Data Exchange Format
• B2B
• Web Services – HTTP
• SOAP
• WSDL
• Data Mediation – Wrappers