First IEEE International Conference on Digital Information Management (ICDIM). A Unified Framework for the Semantic Integration of XML Databases. Doan Dai Duong and Le Thi Thu Thuy {Duong_Dai.Doan, Thuy_Thi_Thu.Le}@unb.ca The University of New Brunswick, Fredericton, NB, Canada.
First IEEE International Conference on Digital Information Management (ICDIM). A Unified Framework for the Semantic Integration of XML Databases. Doan Dai Duong and Le Thi Thu Thuy, {Duong_Dai.Doan, Thuy_Thi_Thu.Le}@unb.ca, The University of New Brunswick, Fredericton, NB, Canada. Presented by Virendrakumar C. Bhavsar, December 06-08, 2006.
Agenda • Introduction • XML Declarative Description (XDD) • Modeling of Data Components • Modelling of Processing Components • Conclusion
Introduction • General model of XML database integration • Step 1: Schema Integration [Figure labels: XML schema1, XML schema2, ..., XML schemaN; RDS and OODS (convert); Ontology; XML Database Schema Integration System; Integrated XML schema; Set of mappings]
Introduction • Step 2: Query Processing [Figure labels: Users; query; Integrated schema; Integrated data; Local data (three sources)] • Sample records shown in the figure, each tagged with its source:
<student source="B"> <Fname>Xuan</Fname> <room>G26</room> <nationality>Vietnam</nationality> </student>
<student source="B"> <Fname>Phuoc</Fname> <room>A12</room> <nationality>Campuchia</nationality> </student>
Proposed Integration Framework [Figure labels: Database sources; XML data; XML database integration system; Metadata; Integrated schema (XML Schema); Integrated data; User query (XML query); XDD as underlying model] • Powerful: XDD supports all tasks of the framework • Input XML queries, input XML data, output XML data • Rules, constraints, mappings • Metadata • Because it is based on the standard XML format, XDD ties all tasks of the framework together tightly and makes data easy to manipulate • Reduces syntax errors and the time and effort required of programmers and users
XML Declarative Description* • XML Declarative Description (XDD) is an XML-based information representation • Ordinary XML expressions (ground XML expressions) + variables = non-ground XML expressions • Variables enhance expressive power and allow implicit information to be represented • XML clauses have the form H ← B1, …, Bm, C1, …, Cn and can express conditions and constraints *Wuwongse, V., Anutariya, C., Akama, K., and Nantajeewarawat, E. XML Declarative Description (XDD): A Language for the Semantic Web. IEEE Intelligent Systems, Vol. 16, No. 3, (2001) 54-65
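The difference between ground and non-ground XML expressions can be illustrated with a minimal Python sketch. It is not an XDD engine: it only handles string variables written in the `$S:name` style used on later slides, and the helper names `is_ground` and `instantiate`, the regular expression, and the sample bindings are assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: a variable looks like "$S:name" (type letter, colon, identifier).
VAR = re.compile(r"\$[A-Z]:\w+")

def is_ground(expr: str) -> bool:
    """An expression is ground when it contains no variables."""
    return VAR.search(expr) is None

def instantiate(expr: str, bindings: dict) -> str:
    """Replace each bound variable with its value; unbound variables stay as-is.
    Values are inserted verbatim (no XML escaping), which is enough for a sketch."""
    return VAR.sub(lambda m: bindings.get(m.group(0), m.group(0)), expr)

# A non-ground XML expression: ordinary XML plus variables.
non_ground = '<student id="$S:id"><name>$S:name</name></student>'

# Hypothetical bindings chosen only for this example.
ground = instantiate(non_ground, {"$S:id": "42", "$S:name": "Duong"})
root = ET.fromstring(ground)  # the ground result is ordinary, parseable XML
```

Here `non_ground` stands for a whole family of XML elements, and `instantiate` picks one member of that family, which is the sense in which variables let an expression carry implicit information.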
Modeling of Data Components • XML Databases • Extension (actual data values): ground XML expressions • Intension (schemas, logical specifications, relationships, indexes and constraints): non-ground XML expressions • XML Queries • Consist of a constructor, patterns, and filters • These correspond to the three parts (H, Bi, Cj) of an XDD rule H ← B1, …, Bm, C1, …, Cn
Modeling of Data Components • Query modelled by XDD [Figure: an XDD rule annotated with its constructor, pattern, and filter parts]
Query Execution Example • Data source:
<Student> <name>John</name> <nationality>Canadian</nationality> <GPA>4</GPA> <phone>234-7856</phone> <ID>3224567</ID> </Student>
<Student> <name>Duong</name> <nationality>Vietnamese</nationality> <GPA>4.2</GPA> <phone>456-3241</phone> </Student>
• Query results:
result1: <Student> <name>John</name> <nationality>Canadian</nationality> <GPA>4</GPA> </Student>
result2: <Student> <name>Duong</name> <nationality>Vietnamese</nationality> <GPA>4.2</GPA> </Student>
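The example above can be approximated in Python to show how the pattern, filter, and constructor parts of a query cooperate. This is a sketch under assumptions, not the XDD query from the slide (which is not reproduced in the text): the `<Students>` wrapper element, the `run_query` function, and the always-true default filter are all hypothetical.

```python
import xml.etree.ElementTree as ET

# The two records from the data source, wrapped in an assumed <Students> root.
DATA = """<Students>
  <Student><name>John</name><nationality>Canadian</nationality><GPA>4</GPA>
    <phone>234-7856</phone><ID>3224567</ID></Student>
  <Student><name>Duong</name><nationality>Vietnamese</nationality><GPA>4.2</GPA>
    <phone>456-3241</phone></Student>
</Students>"""

FIELDS = ("name", "nationality", "GPA")  # elements the pattern must bind

def run_query(xml_text, fields=FIELDS, keep=lambda bindings: True):
    """Pattern: bind the requested children of each <Student>.
    Filter: keep(bindings) decides whether the match survives.
    Constructor: build a new <Student> holding only the bound fields."""
    results = []
    for student in ET.fromstring(xml_text).iter("Student"):
        children = {f: student.find(f) for f in fields}
        if any(c is None for c in children.values()):
            continue  # pattern does not match this record
        bindings = {f: (c.text or "").strip() for f, c in children.items()}
        if not keep(bindings):
            continue  # filter rejects this match
        out = ET.Element("Student")
        for f in fields:
            ET.SubElement(out, f).text = bindings[f]
        results.append(out)
    return results

results = run_query(DATA)
high_gpa = run_query(DATA, keep=lambda b: float(b["GPA"]) > 4.0)
```

With the default filter both records are returned without their `phone` and `ID` elements, as in the slide; passing a stricter `keep` function shows where a condition on the bound values would apply.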
Modeling of Data Components • Mappings • Describe the correspondence between an object in the integrated schema and its corresponding objects in the local schemas • Support decomposing XML queries and converting data • Modeled by non-ground XML expressions
Sample of Mappings [Figure: a mapping that relates an object in the integrated schema to the corresponding object in schema A and the corresponding object in schema B]
Modelling of Processing Components • Schema Integration Component • The main task is to resolve conflicts between the schemas of the participating databases • Conflicts among all schemas are resolved in a single pass (one-shot strategy) • Each local schema is treated as one large non-ground XML expression (an $E variable)
Schema Integration Component • XDD can interactively process all schemas as $E expressions
<Integrating_schema> <schema name="1">…</schema> <schema name="2">…</schema> … <schema name="n">…</schema> </Integrating_schema>
[Figure: the content of each <schema name="1">, <schema name="2">, ..., <schema name="n"> element is an $E expression]
Schema Conflict Classification • Conflicts between schemas can be classified into four main kinds • Naming conflicts: synonyms; acronyms; homonyms • Structural conflicts: missing-item conflicts; internal path discrepancy conflicts; aggregation conflicts; generalization/specialization • Constraint conflicts: numbers of occurrences of elements; fixed vs. default values; constraints on attributes • Data type conflicts: disjoint or incompatible data types; compatible data types; IDREF and IDREFS
Aggregation conflict [Figure: one schema models Professor with the elements FName, MName, LName, another with a single Name element. Labels: Aggregation checking and data type constructing rule; New data type is created; Union rule]
Query Decomposition • The main task is to derive n local subqueries from a global query
• Integrated schema: student (id, name, country, position, field)
<student id="$S:id"> <name>$S:name</name> <country>$S:country</country> </student>
• Schema for source B: SATstudent (key, fullname, country, fieldStudy)
<SATstudent key="$S:id" source="B"> <fullname>$S:name</fullname> <country>$S:country</country> </SATstudent>
• Schema for source A: SOMstudent (id, name, nation, position, program)
<SOMstudent id="$S:id" source="A"> <name>$S:name</name> <nation>$S:country</nation> </SOMstudent>
Query Decomposition • A. Brief view: one query is split into a subquery for each local source • B. Solution: the input XML query is combined with XML metadata (mappings from global to local) and XDD rules for transformation to produce the output XML queries
• Input XML query:
<student id="$S:id"> <name>$S:name</name> <country>$S:country</country> </student>
• Output XML queries:
<SOMstudent id="$S:id" source="A"> <name>$S:name</name> <nation>$S:country</nation> </SOMstudent>
<SATstudent key="$S:id" source="B"> <fullname>$S:name</fullname> <country>$S:country</country> </SATstudent>
Query Decomposition Example
• Rule: <answer> $E:expression </answer> ← <Mapping> <student> <country>$S:country</country> </student> <local>$E:expression</local> </Mapping>
• Step 1: the rule body matches the stored mapping
<Mapping> <student> <country>$S:country</country> </student> <local> <SATstudent source="B"> <country>$S:country</country> </SATstudent> <SOMstudent source="A"> <nation>$S:country</nation> </SOMstudent> </local> </Mapping>
• Step 2: $E:expression is bound to the content of <local>
• Step 3: the rule head is inferred with that binding
• Step 4: the result contains the local query for source B and the local query for source A
<answer> <SATstudent source="B"> <country>$S:country</country> </SATstudent> <SOMstudent source="A"> <nation>$S:country</nation> </SOMstudent> </answer>
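The global-to-local rewriting in the query decomposition slides can be sketched in Python. This is not XDD rule inference: the mapping is stored as a plain dictionary and applied by lookup, and the names `MAPPINGS` and `decompose` are hypothetical. The element and attribute correspondences come from the schemas for sources A and B shown in the slides.

```python
import xml.etree.ElementTree as ET

# Assumed encoding of the global-to-local mappings for the student object.
MAPPINGS = {
    "A": {"element": "SOMstudent", "id_attr": "id",
          "children": {"name": "name", "country": "nation"}},
    "B": {"element": "SATstudent", "id_attr": "key",
          "children": {"name": "fullname", "country": "country"}},
}

def decompose(global_query: str) -> dict:
    """Produce one local subquery per source from a global <student> query.
    Variables such as $S:country are copied through unchanged."""
    g = ET.fromstring(global_query)
    subqueries = {}
    for source, m in MAPPINGS.items():
        local = ET.Element(m["element"])
        if "id" in g.attrib:
            local.set(m["id_attr"], g.attrib["id"])
        local.set("source", source)
        for child in g:
            if child.tag in m["children"]:  # rename elements that are mapped
                ET.SubElement(local, m["children"][child.tag]).text = child.text
        subqueries[source] = ET.tostring(local, encoding="unicode")
    return subqueries

GLOBAL_QUERY = ('<student id="$S:id"><name>$S:name</name>'
                '<country>$S:country</country></student>')
subqueries = decompose(GLOBAL_QUERY)
```

Because every source is handled in the same loop over one mapping table, all n subqueries come out of a single pass over the global query, which mirrors the slides' point that subqueries are produced together rather than one source at a time.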
Query Decomposition • Query decomposition uses the special structure of the mappings together with XDD rules • Subqueries for all distributed data sources are produced simultaneously • Data conversion works the same way: the extracted data are simultaneously converted to the global schema format
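The reverse direction, converting extracted local data to the global schema format, can be sketched the same way. Again this is a plain Python illustration rather than XDD: `LOCAL_TO_GLOBAL`, `to_global`, and the sample records (including the id values) are assumptions, while the element names follow the schemas for sources A and B.

```python
import xml.etree.ElementTree as ET

# Assumed local-to-global correspondences, keyed by local element name.
LOCAL_TO_GLOBAL = {
    "SOMstudent": {"id_attr": "id",
                   "children": {"name": "name", "nation": "country"}},
    "SATstudent": {"id_attr": "key",
                   "children": {"fullname": "name", "country": "country"}},
}

def to_global(local_xml: str) -> ET.Element:
    """Convert one local record into a global <student> element,
    keeping the source attribute so the origin stays visible."""
    local = ET.fromstring(local_xml)
    m = LOCAL_TO_GLOBAL[local.tag]
    student = ET.Element("student")
    if m["id_attr"] in local.attrib:
        student.set("id", local.attrib[m["id_attr"]])
    if "source" in local.attrib:
        student.set("source", local.attrib["source"])
    for child in local:
        if child.tag in m["children"]:  # unmapped local elements are dropped
            ET.SubElement(student, m["children"][child.tag]).text = child.text
    return student

# Hypothetical extracted records, one per source.
LOCAL_RECORDS = [
    '<SATstudent key="7" source="B"><fullname>Xuan</fullname>'
    '<country>Vietnam</country></SATstudent>',
    '<SOMstudent id="9" source="A"><name>Phuoc</name>'
    '<nation>Campuchia</nation></SOMstudent>',
]
integrated = [to_global(r) for r in LOCAL_RECORDS]
```

The table here is simply the inverse of a global-to-local table, which suggests why a single bidirectional mapping structure can serve both query decomposition and data conversion.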
Conclusion • XDD is used to model all data components and processing components of the XML database integration framework • Components modeled by XDD can communicate and exchange data easily • A special structure for XDD-based bidirectional mappings is designed, so the information needed for both query decomposition and data conversion is produced efficiently and without data redundancy • The framework can • Integrate n participating schemas • Decompose a query into n subqueries at a time