Sangam: A Transformation Modeling Framework
Kajal T. Claypool (U Mass Lowell) and Elke A. Rundensteiner (WPI)

The Era of Electronic Information
• Age of electronic information
• Data exists in many different formats
  • Different data models
  • Different schemas
• Users need to
  • Publish data in many formats
  • Integrate and transform data
  • Query and expect results in a common format
• Underlying problem
  • Need to express the mapping of data from one format to another
  • Need to transform data based on the expressed mappings

Schema Translation: State of the Art
• Naïve approach [Zhang01, Shanmugasundram99]
  • Write specific programs to translate data from one format to another
  • Examples:
    • Algorithms that translate XML documents into relational data [Zhang01, Shanmugasundram99]
    • LaTeX2HTML: converts LaTeX documents into HTML documents

Schema Translation: State of the Art
• Matching approach [Milo98]
  • Automatically discover the semantic correspondences between two schemas
  • Generate translations based on the discovered matches
• Modeling approach [Bernstein00, Atzeni96]
  • Transform the local schema into a common data model
  • Use a translation language to express mappings between schemas in the middle layer

The Sangam Framework
• Goals: flexible, extensible, and re-usable
• Allow users to:
  • Explicitly model translations between schemas
  • Compose translations from an existing library of modeled translation patterns
  • Choose from a library of translation operators
  • Generate a translation model based on the schema match process
• For all modeled translations: transform the data based on the translation

Overview of Sangam Framework
[Figure: architecture of the Sangam transformation framework. Labeled elements: Schema S1, Data D1, Schema S2, Data D2, Matcher, Matches, ToolSet, Pattern Interface, Transformation Patterns (displayed to user), User feedback, Transformation Model, Evaluator, Transformed Schema, Transformed Data. Legend: system input, user input, system-generated output.]

Outline
• Sangam graphs
• Cross algebra operators
• Composition techniques
• Cross algebra graphs
• Execution strategies
• Architecture
• Conclusions

Sangam Graphs
[Figure labels: RDB, XML, Import, Export, Sangam graph, Cross Algebra for translation]
• Sangam
  • Common data model: Sangam graph model
  • Translation language: algebra-based

Requirements for a Common Data Model
• Graph-based
  • Common denominator for most data models
• Expressiveness
  • Represent schemas from different data models
• Fundamental constraints
  • Represent constraints such as quantifier, order, and key constraints
• Existing data models are not completely suitable
  • Relational and OO models cannot represent order in a clean manner
  • XML (older spec) cannot represent key constraints

Sangam Graph Model
• Satisfies the requirements
  • Graph-based
    • Based on SIGs [Miller93]
  • Can model schemas from different data models
  • Can represent quantifier, order, key, and foreign key constraints
• Graph
  • Nodes represent entities
    • E.g., relation, attribute, element
  • Edges represent relationships between them
    • E.g., the containment relationship between a relation and an attribute

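The graph model described above can be sketched as a small data structure. This is only an illustration, not the authors' implementation: the class and field names (`Node`, `Edge`, `SangamGraph`, `min_occurs`, `max_occurs`, `order`) and the `Customer` relation are assumptions introduced here. Nodes stand for entities, edges stand for containment, and each edge carries an order and a quantifier annotation.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass(frozen=True)
class Node:
    label: str  # name of a relation, attribute, or element


@dataclass(frozen=True)
class Edge:
    parent: Node
    child: Node
    order: int                  # position of the child among its siblings
    min_occurs: int = 1         # quantifier lower bound
    max_occurs: Optional[int] = 1  # quantifier upper bound, None = unbounded


@dataclass
class SangamGraph:
    nodes: Set[Node] = field(default_factory=set)
    edges: List[Edge] = field(default_factory=list)

    def add_edge(self, parent, child, min_occurs=1, max_occurs=1):
        """Add a containment edge; order follows insertion order per parent."""
        p, c = Node(parent), Node(child)
        self.nodes.update({p, c})
        order = 1 + sum(1 for e in self.edges if e.parent == p)
        edge = Edge(p, c, order, min_occurs, max_occurs)
        self.edges.append(edge)
        return edge

    def children(self, parent):
        """Return child labels of a node in their recorded order."""
        p = Node(parent)
        owned = sorted((e for e in self.edges if e.parent == p),
                       key=lambda e: e.order)
        return [e.child.label for e in owned]


# Hypothetical relation "Customer" with three attributes.
g = SangamGraph()
g.add_edge("Customer", "id")
g.add_edge("Customer", "name")
g.add_edge("Customer", "phone", min_occurs=0)  # optional attribute
```

Key and foreign key constraints are left out of this sketch; they could be attached to nodes in the same way the quantifier is attached to edges.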
Example: Sangam Graph
<!ELEMENT item (location, mailbox, name)>
<!ATTLIST item id ID #REQUIRED featured CDATA #IMPLIED>
<!ELEMENT location (#PCDATA)>
<!ELEMENT mailbox (mail*)>
<!ATTLIST mailbox id CDATA>
<!ELEMENT mail (from, to, date)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT date (#PCDATA)>
<!ELEMENT name (firstName, lastName)>
<!ELEMENT firstName (#PCDATA)>
<!ELEMENT lastName (#PCDATA)>

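As a rough illustration of how the element declarations above could be read into graph edges, the following sketch extracts one edge per parent/child pair together with its order and quantifier. It is an assumption-laden toy, not the Sangam import tool: the function name `dtd_edges` and the tuple layout are invented here, only comma-separated sequence content models are handled, and the `ATTLIST` declarations are ignored.

```python
import re

DTD = """
<!ELEMENT item (location, mailbox, name)>
<!ELEMENT location (#PCDATA)>
<!ELEMENT mailbox (mail*)>
<!ELEMENT mail (from, to, date)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT date (#PCDATA)>
<!ELEMENT name (firstName, lastName)>
<!ELEMENT firstName (#PCDATA)>
<!ELEMENT lastName (#PCDATA)>
"""

# DTD occurrence suffix -> (min, max); None means unbounded.
QUANTIFIERS = {"": (1, 1), "?": (0, 1), "*": (0, None), "+": (1, None)}


def dtd_edges(dtd):
    """Return (parent, child, order, (min, max)) for each containment edge."""
    edges = []
    for parent, model in re.findall(r"<!ELEMENT\s+(\w+)\s+\(([^)]*)\)>", dtd):
        if model.strip() == "#PCDATA":
            continue  # leaf element: no child edges
        for order, part in enumerate(model.split(","), start=1):
            child, suffix = re.fullmatch(r"\s*(\w+)([?*+]?)\s*", part).groups()
            edges.append((parent, child, order, QUANTIFIERS[suffix]))
    return edges


edges = dtd_edges(DTD)
for edge in edges:
    print(edge)
```

The order field records the position of each child in the sequence, and the quantifier pair records how often it may occur, which are two of the constraints the Sangam graph model is meant to capture.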
Cross Algebra
• Requirements for a transformation language
  • Node and edge manipulations
  • Minimal granularity
    • E.g., a relation has a name and attributes
  • Allow composition
• Unique contribution: algebra-based translation language
  • Translates from one Sangam graph to another
• Four operators
  • Represent the core set of linear graph transformations [GBook00]
  • Can be composed to formulate more complex operations such as a join (not our focus)

Cross Algebra Operators
• cross, connect, smooth, and subdivide

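The slide only names the four operators, so the semantics in the sketch below are an assumption based on their usual graph-theoretic meanings: cross maps a node to a node, connect maps an edge to an edge, smooth replaces two adjacent edges with one, and subdivide splits one edge in two by introducing a new node. The graph representation (a dict of node and edge sets), the function signatures, and the `address` node are all hypothetical.

```python
def make_graph(nodes=(), edges=()):
    """A graph is a dict holding a set of node labels and a set of (a, b) edges."""
    return {"nodes": set(nodes), "edges": set(edges)}


def cross(src, out, node, rename=None):
    """Map one input node to one output node."""
    if node not in src["nodes"]:
        raise KeyError(node)
    out["nodes"].add(rename or node)
    return out


def connect(src, out, edge):
    """Map one input edge to one output edge."""
    if edge not in src["edges"]:
        raise KeyError(edge)
    out["nodes"].update(edge)
    out["edges"].add(edge)
    return out


def smooth(src, out, edge1, edge2):
    """Replace the path a -> m -> c in the input by the single edge a -> c."""
    (a, m1), (m2, c) = edge1, edge2
    if edge1 not in src["edges"] or edge2 not in src["edges"] or m1 != m2:
        raise ValueError("smooth needs two adjacent input edges")
    out["nodes"].update({a, c})
    out["edges"].add((a, c))
    return out


def subdivide(src, out, edge, new_node):
    """Split the input edge a -> c into a -> new_node -> c."""
    if edge not in src["edges"]:
        raise KeyError(edge)
    a, c = edge
    out["nodes"].update({a, new_node, c})
    out["edges"].update({(a, new_node), (new_node, c)})
    return out


src = make_graph(
    nodes={"item", "name", "firstName", "location"},
    edges={("item", "name"), ("name", "firstName"), ("item", "location")},
)
out = make_graph()
smooth(src, out, ("item", "name"), ("name", "firstName"))
subdivide(src, out, ("item", "location"), "address")  # "address" is hypothetical
```

In this example, smooth flattens the nested `name` element so that `firstName` hangs directly off `item`, while subdivide inserts an intermediate node between `item` and `location`.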
Composition of Operators
• Context dependency
  • Output = union of the outputs of all operators
• Derivation
  • Output = output of the root operator

Evaluating a Cross Algebra Graph
Function EvaluateCAT (input: Operator op, Sangam Graph G; output: Sangam Graph G')
{
  if (!op.hasChildren()) {
    G' ← op.evaluate(G, G')
    op.markDone()
    out ← G'                      // cached local output
    return G'
  }
  while (op.hasChildren()) {
    Operator opC ← op.getNextChild()
    if (e:<op, opC> == derivation) {
      G_local ← EvaluateCAT(opC, G, G')
      G' ← op.evaluate(G_local, G')
      op.markDone()
      out ← G'                    // cached local output
      return G'
    }
    elseif (e:<op, opC> == contextdependency) {
      G_local ← EvaluateCAT(opC, G, G')
      G'_local ← op.evaluate(G, G')
      G' ← G_local ∪ G'_local
      op.markDone()
      out ← G'_local              // cached local output
    }
  }
  return G'
}

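A minimal runnable sketch of this evaluation scheme follows. It is one reading of the pseudocode, not the authors' code: a graph is reduced to a frozenset of (parent, child) edges, and the names `Operator`, `evaluate_cat`, `keep`, and `smooth_via` are assumptions made for the example. For a derivation edge the parent operator is applied to the child's output; for a context dependency edge the parent is applied to the original input and the two outputs are unioned.

```python
from dataclasses import dataclass, field
from typing import Callable, FrozenSet, List, Optional, Tuple

Graph = FrozenSet[Tuple[str, str]]  # simplified: a graph is a set of edges


@dataclass
class Operator:
    name: str
    apply: Callable[[Graph], Graph]
    # Each child is paired with the kind of edge that links it to this operator.
    children: List[Tuple["Operator", str]] = field(default_factory=list)
    cache: Optional[Graph] = None  # cached local output


def evaluate_cat(op: Operator, g: Graph) -> Graph:
    if not op.children:
        op.cache = op.apply(g)
        return op.cache
    result: Graph = frozenset()
    for child, kind in op.children:
        child_out = evaluate_cat(child, g)
        if kind == "derivation":
            op.cache = op.apply(child_out)      # parent consumes child output
            result = result | op.cache
        elif kind == "context":
            op.cache = op.apply(g)              # parent works on the input
            result = result | child_out | op.cache
        else:
            raise ValueError(f"unknown edge kind: {kind}")
    return result


def keep(edge):
    """Operator body that copies one edge if it is present (connect-like)."""
    return lambda g: frozenset({edge}) & g


def smooth_via(mid):
    """Operator body that replaces every path a -> mid -> d by a -> d."""
    return lambda g: frozenset(
        (a, d) for a, b in g for c, d in g if b == mid and c == mid
    )


G = frozenset({("item", "name"), ("name", "firstName")})

# Context dependency: output is the union of both operators' outputs.
ctx_root = Operator(
    "connect name-firstName", keep(("name", "firstName")),
    [(Operator("connect item-name", keep(("item", "name"))), "context")],
)

# Derivation: output is the root's output, computed from the child's output.
der_root = Operator(
    "smooth via name", smooth_via("name"),
    [(Operator("copy", lambda g: g), "derivation")],
)

ctx_result = evaluate_cat(ctx_root, G)
der_result = evaluate_cat(der_root, G)
```

Unlike the pseudocode, this sketch returns a new graph rather than updating an output parameter, which keeps each operator free of side effects apart from its cache.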
Conclusions
• Sangam: a flexible, extensible, and re-usable transformation modeling framework
• Key contributions:
  • Concept of Sangam
  • Cross Algebra
    • An algebra for modeling linear transformations
    • Composition techniques to deal with finer granularity
    • Evaluation techniques
• Future work
  • Modeling the data model layer
  • Optimization of evaluation strategies
  • Non-linear transformations