This PhD dissertation defense supported by NSF focuses on developing a simple conceptual model for XML and tools to manage XML data storage effectively. It explores transformations, schema development, and observations.
Conceptual XML for Systems Analysis Reema Al-Kamha PhD Dissertation Defense Supported by NSF
Motivation Since XML is now a standard for data representation, there is a need for • A simple conceptual model for XML • Tools to • Develop schemas for XML data storage • Reverse-engineer XML storage structures to a conceptual model for further development
Dissertation Contributions • Conceptual-XML (C-XML) • Transformations • C-XML to XML Schema (to develop schemas for XML data storage) • XML Schema to C-XML (to reverse-engineer XML storage structures to a conceptual model for further development) • Observations and recommendations
Conceptual XML (C-XML) • C-XML has good conceptual-modeling characteristics • Satisfies conceptual modeling requirements [Nec06, SW06, Wild05] • Graphical notation • Formal foundation • Structural independence • Reflection of the mental model • n-ary relationship sets • Cardinality for all participants • Ordering • Allowance for irregular and heterogeneous structure …
C-XML to XML Schema
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="Root">
    <xs:complexType>
      <xs:all>
        <xs:element ref="Students"/>
        <xs:element ref="Courses"/>
        <xs:element ref="GradStudents"/>
        <xs:element ref="UndergradStudents"/>
      </xs:all>
    </xs:complexType>
    <xs:keyref name="UndergradStudentOID-Keyref" refer="StudentOID-Key">
      <xs:selector xpath="./UndergradStudents/UndergradStudent"/>
      <xs:field xpath="@UndergradStudentOID"/>
    </xs:keyref>
    <xs:keyref name="GradStudentOID-Keyref" refer="StudentOID-Key">
      <xs:selector xpath="./GradStudents/GradStudent"/>
      <xs:field xpath="@GradStudentOID"/>
    </xs:keyref>
  </xs:element>
  <xs:element name="Students">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Student" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:choice minOccurs="1" maxOccurs="1">
                <xs:element name="StudentName" type="xs:string"/>
                <xs:sequence>
                  <xs:element name="FirstName" type="xs:string"/>
                  <xs:element name="MiddleNames">
                    <xs:complexType>
                      <xs:sequence>
                        <xs:element name="MiddleName" minOccurs="0" maxOccurs="2">
                          <xs:complexType>
                            <xs:attribute name="MiddleName" type="xs:string" use="required"/>
                          </xs:complexType>
                        </xs:element>
                      </xs:sequence>
                    </xs:complexType>
                    <xs:key name="MiddleName-Key">
                      <xs:selector xpath="./MiddleName"/>
                      <xs:field xpath="@MiddleName"/>
                    </xs:key>
                  </xs:element>
                  <xs:element name="LastName" type="xs:string"/>
                </xs:sequence>
              </xs:choice>
              <xs:element name="Semester-Course-Grades">
                <xs:complexType>
                  <xs:sequence>
                    <xs:element name="Semester-Course-Grade" minOccurs="0" maxOccurs="unbounded">
                      <xs:complexType>
                        <xs:attribute name="Semester" use="required"/>
                        <xs:attribute ref="Course" use="required"/>
                        <!-- C-XML: forall x (Course(x)=>exists [0:*] <x1, x2, x3> (Course(x) Student(x1) Semester(x2) Grade(x3) )) -->
                        <xs:attribute name="Grade" type="xs:string" use="required"/>
                      </xs:complexType>
                    </xs:element>
                  </xs:sequence>
                </xs:complexType>
                <xs:key name="Semester-Course-Grade-Key">
                  <xs:selector xpath="./Semester-Course-Grade"/>
                  <xs:field xpath="@Semester"/>
                  <xs:field xpath="@Course"/>
                  <xs:field xpath="@Grade"/>
                </xs:key>
              </xs:element>
            </xs:sequence>
            <xs:attribute name="StudentOID" type="xs:string" use="required"/>
            <xs:attribute name="StudentID" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
    <xs:key name="StudentOID-Key">
      <xs:selector xpath="./Student"/>
      <xs:field xpath="@StudentOID"/>
    </xs:key>
    <xs:key name="StudentID-Key">
      <xs:selector xpath="./Student"/>
      <xs:field xpath="@StudentID"/>
    </xs:key>
  </xs:element>
  <xs:element name="Courses">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Course" maxOccurs="unbounded">
          <xs:complexType>
            <xs:attribute ref="Course" use="required"/>
            <xs:attribute name="Department" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
    <xs:key name="Course-Key">
      <xs:selector xpath="./Course"/>
      <xs:field xpath="@Course"/>
    </xs:key>
  </xs:element>
  <xs:element name="GradStudents">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="GradStudent" maxOccurs="unbounded">
          <xs:complexType>
            <xs:attribute name="GradStudentOID" type="xs:string" use="required"/>
            <xs:attribute name="Advisor" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
    <xs:key name="GradStudentOID-Key">
      <xs:selector xpath="./GradStudent"/>
      <xs:field xpath="@GradStudentOID"/>
    </xs:key>
  </xs:element>
  <xs:element name="UndergradStudents">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="UndergradStudent" maxOccurs="unbounded">
          <xs:complexType>
            <xs:attribute name="UndergradStudentOID" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
    <xs:key name="UndergradStudentOID-Key">
      <xs:selector xpath="./UndergradStudent"/>
      <xs:field xpath="@UndergradStudentOID"/>
    </xs:key>
  </xs:element>
  <xs:attribute name="Course" type="xs:string"/>
</xs:schema>
Algorithm Overview • Generate a forest of scheme trees • Translate an individual object set • Translate an individual node • Create a root node • Add global uniqueness constraints • Translate generalization/specialization hierarchies
Generate Scheme Trees (Student, StudentID, StudentName, FirstName, LastName, (MiddleName)*, (Course, Semester, Grade)*)*
Generate Scheme Trees (Course, Department)*
Generate Scheme Trees (UndergradStudent)* (GradStudent, Advisor)*
Generate Scheme Trees (Student, StudentID, StudentName, FirstName, LastName, (MiddleName)*, (Course, Semester, Grade)*)* (Course, Department)* (GradStudent, Advisor)* (UndergradStudent)*
Generate Scheme Trees [Figure: forest of scheme trees. Tree 1: root node Student, StudentID, StudentName, FirstName, LastName with child nodes MiddleName and Course, Semester, Grade. Tree 2: Course, Department. Tree 3: GradStudent, Advisor. Tree 4: UndergradStudent.] (Student, StudentID, StudentName, FirstName, LastName, (MiddleName)*, (Course, Semester, Grade)*)* (Course, Department)* (GradStudent, Advisor)* (UndergradStudent)*
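The nested notation for scheme trees can be reproduced with a small data structure. The following is a minimal sketch, not the dissertation's implementation: the class name `SchemeNode` and the function `render` are hypothetical names chosen for illustration, and the forest is entered by hand rather than derived from a C-XML model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SchemeNode:
    """One scheme-tree node: its object-set names plus nested child nodes.

    Hypothetical helper type, introduced only for this illustration.
    """
    names: List[str]
    children: List["SchemeNode"] = field(default_factory=list)


def render(node: SchemeNode) -> str:
    """Render a node in the nested notation of the slides, e.g. (A, B, (C)*)*."""
    parts = list(node.names) + [render(child) for child in node.children]
    return "(" + ", ".join(parts) + ")*"


# The forest from the slides, written out by hand.
student = SchemeNode(
    ["Student", "StudentID", "StudentName", "FirstName", "LastName"],
    [SchemeNode(["MiddleName"]), SchemeNode(["Course", "Semester", "Grade"])],
)
forest = [
    student,
    SchemeNode(["Course", "Department"]),
    SchemeNode(["GradStudent", "Advisor"]),
    SchemeNode(["UndergradStudent"]),
]

for tree in forest:
    print(render(tree))
```

Each printed line corresponds to one tree of the forest in the same notation used on the slides.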
Individual Object Sets
<xs:attribute name="Department" type="xs:string"/>
<xs:attribute name="Course" type="xs:string"/>
<xs:attribute ref="Course"/>
<xs:element name="FirstName" type="xs:string"/>
<xs:element name="Student">
  <xs:complexType>
    ...
    <xs:attribute name="StudentOID" type="xs:string" use="required"/>
  </xs:complexType>
</xs:element>
Nodes [Figure: each scheme-tree node maps to a plural container element holding a repeating element: Students / Student, MiddleNames / MiddleName, Course-Semester-Grades / Course-Semester-Grade, Courses / Course, GradStudents / GradStudent, UndergradStudents / UndergradStudent.]
Nodes
<xs:element name="Semester-Course-Grades">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="Semester-Course-Grade" minOccurs="0" maxOccurs="unbounded">
        <xs:complexType>
          ...
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
  ...
</xs:element>
<xs:element name="Students">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="Student" maxOccurs="unbounded">
        <xs:complexType>
          ...
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
</xs:element>
Nodes
<xs:element name="Semester-Course-Grade" minOccurs="0" maxOccurs="unbounded">
  <xs:complexType>
    <xs:attribute name="Semester" use="required"/>
    <xs:attribute ref="Course" use="required"/>
    <!-- C-XML: forall x (Course(x)=> exists [0:*] <x1, x2, x3> (Course(x) Student(x1) Semester(x2) Grade(x3) )) -->
    <xs:attribute name="Grade" type="xs:string" use="required"/>
  </xs:complexType>
</xs:element>
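The node pattern in these excerpts (a plural container element wrapping a repeating element that carries attributes) can be generated programmatically. The sketch below uses Python's standard `xml.etree.ElementTree`; the function `node_to_element` and the helper `xs` are hypothetical names, and the sketch simplifies the slides by declaring every attribute locally as a required `xs:string` instead of reproducing the `ref="Course"` reference or the untyped `Semester` attribute.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)  # serialize with the familiar xs: prefix


def xs(tag):
    """Qualify a tag name with the XML Schema namespace."""
    return "{%s}%s" % (XS, tag)


def node_to_element(plural, singular, attributes, min_occurs="0"):
    """Build <plural> containing a repeating <singular> with string attributes.

    Hypothetical helper: a simplification of the pattern shown on the slides.
    """
    wrapper = ET.Element(xs("element"), {"name": plural})
    wrapper_type = ET.SubElement(wrapper, xs("complexType"))
    sequence = ET.SubElement(wrapper_type, xs("sequence"))
    item = ET.SubElement(
        sequence,
        xs("element"),
        {"name": singular, "minOccurs": min_occurs, "maxOccurs": "unbounded"},
    )
    item_type = ET.SubElement(item, xs("complexType"))
    for name in attributes:
        ET.SubElement(
            item_type,
            xs("attribute"),
            {"name": name, "type": "xs:string", "use": "required"},
        )
    return wrapper


grades = node_to_element(
    "Semester-Course-Grades",
    "Semester-Course-Grade",
    ["Semester", "Course", "Grade"],
)
print(ET.tostring(grades, encoding="unicode"))
```

The printed element has the same nesting as the Semester-Course-Grades excerpt, apart from the simplified attribute declarations noted above.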
Root Element
<xs:schema>
  <xs:element name="Root">
    <xs:complexType>
      <xs:all>
        <xs:element ref="Students"/>
        <xs:element ref="Courses"/>
        <xs:element ref="GradStudents"/>
        <xs:element ref="UndergradStudents"/>
      </xs:all>
    </xs:complexType>
    ...
  </xs:element>
  ...
</xs:schema>
Uniqueness Constraints
<xs:element name="Students">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="Student" maxOccurs="unbounded">
        <xs:complexType>
          ...
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
  <xs:key name="StudentOID-Key">
    <xs:selector xpath="./Student"/>
    <xs:field xpath="@StudentOID"/>
  </xs:key>
  <xs:key name="StudentID-Key">
    <xs:selector xpath="./Student"/>
    <xs:field xpath="@StudentID"/>
  </xs:key>
</xs:element>
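An `xs:key` with a single attribute field requires that every selected element has that attribute and that its values are unique within the scope of the declaring element. The following sketch checks that condition on instance data with the standard library only; `key_violations` is a hypothetical helper, and the sample document is an invented, abbreviated fragment that omits the child elements the full schema requires.

```python
import xml.etree.ElementTree as ET


def key_violations(scope, selector, attribute):
    """Return attribute values that are missing (None) or repeated.

    `scope` plays the role of the element declaring the xs:key, `selector`
    mirrors xs:selector, and `attribute` mirrors an xs:field of the form @name.
    """
    seen = set()
    problems = []
    for element in scope.findall(selector):
        value = element.get(attribute)
        if value is None or value in seen:
            problems.append(value)
        seen.add(value)
    return problems


# Invented sample data: two students share the same StudentID.
students = ET.fromstring(
    "<Students>"
    '<Student StudentOID="s1" StudentID="100"/>'
    '<Student StudentOID="s2" StudentID="100"/>'
    "</Students>"
)

print(key_violations(students, "./Student", "StudentOID"))
print(key_violations(students, "./Student", "StudentID"))
```

A validating parser performs this check automatically; the sketch only makes the meaning of StudentOID-Key and StudentID-Key explicit.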
Generalization/Specialization
<xs:keyref name="UndergradStudentOID-Keyref" refer="StudentOID-Key">
  <xs:selector xpath="./UndergradStudents/UndergradStudent"/>
  <xs:field xpath="@UndergradStudentOID"/>
</xs:keyref>
<xs:keyref name="GradStudentOID-Keyref" refer="StudentOID-Key">
  <xs:selector xpath="./GradStudents/GradStudent"/>
  <xs:field xpath="@GradStudentOID"/>
</xs:keyref>
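The `xs:keyref` declarations express that every specialization identifier must also appear as a StudentOID, which is how the schema encodes the is-a relationship. A minimal sketch of that subset check on instance data follows; `dangling_refs` is a hypothetical helper and the sample document is invented and abbreviated.

```python
import xml.etree.ElementTree as ET


def dangling_refs(root, key_path, key_attr, ref_path, ref_attr):
    """Return referencing values that do not match any key value."""
    keys = {element.get(key_attr) for element in root.findall(key_path)}
    return [
        element.get(ref_attr)
        for element in root.findall(ref_path)
        if element.get(ref_attr) not in keys
    ]


# Invented sample data: undergraduate "s9" has no matching Student.
root = ET.fromstring(
    "<Root>"
    '<Students><Student StudentOID="s1"/><Student StudentOID="s2"/></Students>'
    '<GradStudents><GradStudent GradStudentOID="s1" Advisor="a1"/></GradStudents>'
    "<UndergradStudents>"
    '<UndergradStudent UndergradStudentOID="s9"/>'
    "</UndergradStudents>"
    "</Root>"
)

print(dangling_refs(root, "./Students/Student", "StudentOID",
                    "./GradStudents/GradStudent", "GradStudentOID"))
print(dangling_refs(root, "./Students/Student", "StudentOID",
                    "./UndergradStudents/UndergradStudent", "UndergradStudentOID"))
```

An empty list means the keyref is satisfied; any returned value identifies a specialization instance with no corresponding generalization instance.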
XML Schema to C-XML
Algorithm Overview • Generate object sets for each element and attribute • Specify built-in data types and simple types in the data frame • XML parent-child connections become binary relationship sets • minOccurs, maxOccurs, and use become participation constraints
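The last step of this overview can be illustrated with the XML Schema defaults (minOccurs and maxOccurs default to 1, and use defaults to optional). The sketch below is an assumption about how the mapping might look, not the dissertation's algorithm: the function names are hypothetical, and the `min:max` output format with `*` for unbounded is borrowed from the `[0:*]` notation in the C-XML comments on earlier slides.

```python
def participation(min_occurs="1", max_occurs="1"):
    """Map minOccurs/maxOccurs to a min:max participation constraint.

    The parameter defaults are the XML Schema defaults for both attributes.
    """
    upper = "*" if max_occurs == "unbounded" else max_occurs
    return "%s:%s" % (min_occurs, upper)


def attribute_participation(use="optional"):
    """Map an attribute's use value to a min:max participation constraint."""
    if use == "required":
        return "1:1"
    if use == "prohibited":
        return "0:0"
    return "0:1"  # "optional" is the XML Schema default


# Examples taken from the schema on the earlier slides.
print(participation("0", "unbounded"))      # Semester-Course-Grade
print(participation("0", "2"))              # MiddleName
print(participation(max_occurs="unbounded"))  # Student
print(attribute_participation("required"))  # StudentOID
```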
Observation on Transformations • Our transformations to and from C-XML are not inverses of one another • However, inverse transformations exist in both directions: C-XML to XML Schema and XML Schema to C-XML
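C-XML is More Expressive than XML Schema Extra, unneeded sequence structure
Desired structure (not valid in XML Schema 1.0, where xs:all may contain only element declarations with maxOccurs of at most 1):
<xs:all>
  <xs:element name="e1" maxOccurs="3"/>
  <xs:sequence> … </xs:sequence>
  <xs:choice> … </xs:choice>
</xs:all>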
C-XML is More Expressive than XML Schema Extra, unneeded sequence structure
Valid alternative, which imposes an ordering that the conceptual model does not require:
<xs:sequence>
  <xs:element name="e1" maxOccurs="3"/>
  <xs:sequence> … </xs:sequence>
  <xs:choice> … </xs:choice>
</xs:sequence>
C-XML is More Expressive than XML Schema Generalization/specialization constraints
<!-- C-XML: forall x (StudentOID(x) => UndergradStudentOID(x) or GradStudentOID(x)) -->
<!-- C-XML: forall x (UndergradStudentOID(x) => not GradStudentOID(x)) -->
<!-- C-XML: forall x (Instructor(x) and Advisor(x) => InstructorAdvisor(x)) -->
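Because XML Schema cannot state these three constraints, they survive only as comments. Their meaning can be made concrete with plain set operations. This is a sketch under stated assumptions: the function names are hypothetical, and the identifier sets are invented sample data.

```python
def union_holds(general, specializations):
    """forall x (G(x) => S1(x) or S2(x) ...): G is covered by its specializations."""
    return set(general) <= set().union(*specializations)


def mutually_exclusive(first, second):
    """forall x (S1(x) => not S2(x)): the two specializations share no member."""
    return not (set(first) & set(second))


def intersection_holds(first, second, target):
    """forall x (A(x) and B(x) => C(x)): members of both A and B are in C."""
    return (set(first) & set(second)) <= set(target)


# Invented sample identifiers.
students = {"s1", "s2", "s3"}
undergrads = {"s1", "s2"}
grads = {"s3"}

print(union_holds(students, [undergrads, grads]))
print(mutually_exclusive(undergrads, grads))
print(intersection_holds({"p1", "p2"}, {"p2", "p3"}, {"p2"}))
```

All three checks succeed for the sample data; moving "s2" into `grads` as well would break mutual exclusion while still satisfying the union constraint.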
C-XML is More Expressive than XML Schema Participation constraints for child elements <!-- C-XML: forall x (State(x) => exists[0:*] y(State(x) has Order(2) in Sequence-k(y))) -->
Recommendations • Extending XML Schema • Extend the all structure • Support generalization/specialization constraints
XML Schema is More Expressive than Traditional Conceptual Models • Traditional conceptual model languages do not support: • Sequence structure • Choice structure • Mixed-content • Any and anyAttribute structures
Recommendations • Enrich conceptual modeling languages • Order lists of concepts • Choose alternative from among several • Specify mixed content • Use content from another data model
Conclusions • Extended conceptual modeling for XML • Developed transformation algorithms: • C-XML to XML Schema • XML Schema to C-XML • Explored the equivalence of C-XML and XML Schema • Basic transformations are not inverses • But inverse transformations exist • Observations and Insights • Expressive Power • Recommendations
Future Work • Provide mathematical proofs • Make the prototype tool practical • Continue with this work in several activities in system analysis, design, development, and evolution • XML database design and development • Reverse Engineering • Integration