200 likes | 218 Views
Explore the necessity of conceptual modeling for XML data representation, essential semantic information, object classes, and relationships. Learn about attributes, constraints, and how to capture complex data structures effectively.
E N D
Conceptual Modeling for XML Data Tok Wang Ling National University of Singapore DASFAA’2003 Panel Discussion
Outlines • Why do we need conceptual modeling? • What are the important semantic information to be captured? • Uses of conceptual model for some XML research topics
Motivation: Why do we need to have a conceptual model to represent XML Data? <department number =“cs”> <name> computer science</name> <course number = “cs4221”> <name> Database </name> <student number = “1234” > <name> B.Y.Smith</name> <grade> 70</grade> </student> <student number=“1235”> <name> C.U.Brown </name> <grade> 60</grade> </student> </course> </department> <! ELEMENT department (name,course+)> <! ATTLIST department number ID #REQUIRED> <! ELEMENT course (name, student*)> <! ATTLIST course number ID #REQUIRED> <! ELEMENT student (name, grade?)> <! ATTLIST student number CDATA #REQUIRED> <! ELEMENT name (#PCDATA)> <! ELEMENT grade (#PCDATA)> (b) An XML DTD for (a) (a) XML document
department &1 number name course course cs computer science &3 &2 name student number number name student student &7 &21 &22 &20 &4 &5 &6 ♦number Database ♦name ▼♦department cs4221 ▼ ♦course number ♦number name grade ♦name grade name number ▼♦student ♦number ♦name ♦grade &27 &26 &28 &25 &23 &24 C.U. Brown 1235 60 1234 B.Y.Smith 70 Motivation (cont.) (B) Dataguide (a) OEM Diagram Figure 1: Sample instance demonstrating OEM and dataguide
Motivation (continue) Q: What are the important semantic information and constraints cannot be captured by the DTD and Dataguide? • What are the object classes? department, course, student? • Attributes of object classes? • Identifiers of object classes? • What are the relationship types defined among object classes? e.g. Relationship types among department, course, student? • What is “grade”? Object class? Attribute of student? • Are there redundancies?
Semantic Information to be captured by an XML conceptual model • Object class • attributes of object class • ordering on object class • Relationship Type • Represent hierarchical structure • degree of n-ary relationship type • participation constraints of object classes in relationship type • attributes of relationship type • disjunctive relationship type • recursive relationship type • Reference
Semantic Information to be captured by an XML conceptual model (cont.) • Attribute • key attribute / identifier • composite attribute • disjunctive attribute • attributes with unknown structure • fixed and default values of attribute • derived attribute • Functional dependencies and other constraints • Inheritance hierarchy (class hierarchy) • Semi-structured data instance representation
Q: What are the semantic information cannot be represented by Dataguide, DTD, XML Schema? • Attribute or object class • Degree of relationship type • Attibute of object class or relationship type • Class hierarchy • Functional dependency • …
department 2, 1:n, 1:1 number name course cs student number name cs number name grade A solution: ORA-SS, an object-relationship attribute model for semi-structured data. department Number: CS Name: Computer Science course course student student Number student Name Number: CS4221 Name: Database number: 1235 Name: C.U. Brown Grade: 60 number: 1234 Name: B.Y. Smith Grade: 70 Figure 2: ORA-SS instance diagram Figure 3: ORA-SS schema diagram
project member publication p1 pub1 m1 pub2 p2 m2 p3 pub3 project ▼♦project 2, +, + ♦ id ♦ name ▼♦ member member ♦ name id name ♦ job title 2, *, + ▼♦ publication ♦ number name job title ♦title publication number title The data model of ORA-SS - Relationship Type • attributes of relationship type • degree of n-ary relationship type • participation constraints of objects in relationship type • disjunctive relationship type • recursive relationship type (c) Dataguide (a) ORA-SS Schema Diagram (b) Instance Relationship Figure 5: Representing binary relationship type
project member publication p1 pub1 m1 pub2 p2 m2 p3 pub3 project ▼♦project 2, +, + ♦ id ♦ name ▼♦ member member ♦ name id name ♦ job title 3, *, + ▼♦ publication ♦ number name job title ♦title publication number title The data model of ORA-SS - Relationship Type (cont.) (b) Instance Relationship (a) ORA-SS Schema Diagram (c) Dataguide Figure 6: Representing ternary relationship type
course cs 2, 4:n, 3:8 code * student title ANY cs cs dept prefix D: comp number * number first name last name mark grade The data model of ORA-SS - Attribute • key attribute • composite attribute • disjunctive attribute • attribute with unknown structure • fixed and default values of attribute • derived attribute hobby Figure 7: Object classes with relationship type and attributes in an ORA-SS schema diagram
Uses of the Conceptual model for XML research • Normal form XML schema • remove redundant data • resolve multiple inheritance conflicts • Storage structure for XML databases • use Object Relational Model • XML Views • derived information from references and class hierarchy • defining views • materialized view maintenance • view updates • Integration of XML documents • Evaluating XML queries on XML databases
professor 2 staff# name course 2 C.Code title textbook + title author ISBN Research Topics using ORA-SS ModelNormal Form XML Schema • Schema may have a lot of redundant data • Update anomalies • Normal Form schema is needed
professor course textbook C.Code name title staff# + courseR textbookR ISBN author title professor course textbook title name C.Code + ISBN title author professorR textbookR Research Topics using ORA-SS ModelNF XML Schema (cont.) • Some better solutions: • Redundancies are removed, in normal form textbook-Ref course-Ref textbook-Ref professor-Ref staff#
Research Topics using ORA-SS ModelStorage Structure for XML Databases • Main Rules • Each object class together with its attributes form a nested relation (object relation) • Each relationship type together with its attributes form a nested relation (relationship relation) • Nested relations can be handled by Object Relational model, e.g. ORACLE 8i.
Supplier SP 2 + Name City S# SP Price Part SPJ 3 P# Name Color Project SPJ J# Name Loc Qty Research Topics using ORA-SS ModelStorage Structure for XML Databases Object Relations Supplier (S#, Name, (City)) Part (P#, Name, color) Project(J#, Name, Loc) Relationship relations SP (S#, P#, price) SPJ (S#, P#, J#, Qty) Constraint: SPJ[S#, P#] SP[S#, P#]
course faculty course cs, 2,4:n,3:8 cs fs, 2,1:n,1:n faculty name title code title code student student studentR cs cs fs | grade faculty grade history | student number * grade student number hostel home hostel home grade course code street number street name number name Research Topics using ORA-SS ModelXML Views • What information can be directly derived from references and class hierarchy Referencing an object class in an ORA-SS schema diagram
Research Topics using ORA-SS ModelXML Views (cont.) • Valid views of an ORA-SS schema • Operations: selection, projection, join, up/down Project Project Part Supplier 2 2 2 Part Part Supplier Part 2 2 2 3 3 2 3 price price Supplier Project Project total_qty 2 3 3 3 price Qty Qty Qty View 1 View 2 View 3
Conclusion • A good conceptual model is needed for XML database applications: * normal form schema * storage structure * view design and view updates * ….