
Agenda




Presentation Transcript


  1. Agenda • Introduction • Schema Matching in General • Why • How • Who • MOMIS • Architecture • The Global Schema • Common Thesaurus • Clustering of Elements • Construction of the Global Schema

  2. Introduction • Industrial Electronics and Automatic Control, TU-Wien (1983-1988) • With Siemens since 1988 • Expert System Prototypes for Payload Monitoring • SW for Power Utilities • Data Model for Hydro Scheduling • Load Forecast • Cargo Billing System for the ÖBB • Customizing Product Lifecycle Management System • Thesis on "Schema Matching" • rainer.dobiasch@siemens.com

  3. Schema Matching in General - Why • Enterprise Integration • Billing System • Customer Care System • Intranet • Web Data Extraction • .....

  4. Schema Matching in General - How • (Analyse Data to generate Schemas) • Try to find „affinity“ between the Elements of the involved Schemas • Lexical Affinity (Acronyms, Abbreviations, Derivations) • Structural Affinity • Refine „Affinity“ Values • (Generate a Global Schema) • Set up Rules for the Mapping • (Generate Wrappers for the Data Sources)

  5. Schema Matching in General - Who • MOMIS: Sonia Bergamaschi - University of Modena • ARTEMIS: Silvana Castano - University of Milano • Similarity Flooding: Sergey Melnik - Stanford University • Cupid: Philip A. Bernstein - Microsoft • COMA: Erhard Rahm - University of Leipzig • http://www.ifi.unizh.ch/stff/pziegler/IntegrationProjects.html

  6. MOMIS: Mediator envirOnment for Multiple Information Sources • ARTEMIS: Analysis and Reconciliation Tool Environment for Multiple Information Sources

  7. MOMIS Architecture • [Diagram; only the labels survive] • User • Mediator • Global Schema Builder • Query Manager • Artemis • ODB-Tools • ODLI3 • Wrappers • Sources: FileSystem, DB

  8. Global Schema Builder Architecture • [Diagram; only the labels survive] • Source schemata • SLIM (Source Lexical Integrator Module) • WordNet • Intensional inter-schema relationships • Intensional intra-schema relationships • Designer • SI-Designer • SIM (Source Integrator Module) • ODB-Tools • Inferred relationships • Clusters • Artemis • Common Thesaurus • Global Schema & Mapping Tables

  9. SI-Designer Architecture • [Diagram; only the labels survive] • Wrappers • Local Schemata Acquisition • Local Schemata • Intensional intra/inter-schema relationships extraction (SLIM, SIM) • Acquisition of relations provided by the designer (Designer) • Inferred and validated relationships (ODB-Tools) • Common Thesaurus Generation • Cluster Generation (Artemis) • Global Clusters & Mapping Tables

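  10. Schema Concept • Schemata: $S = \{S_1, S_2, \ldots, S_N\}$ • $S_i = \{e_{1i}, e_{2i}, \ldots, e_{mi}\}$ • $e_{ji} = \langle n(e_{ji}), SP(e_{ji}), DP(e_{ji}) \rangle$ • $P(e_{ji}) = SP(e_{ji}) \cup DP(e_{ji})$ • $p_k \in P(e_{ji})$ • $p_k = \langle n_{p_k}, d_{p_k}, (mc, MC)_{p_k} \rangle$ • $d_{p_k} \in PRE \lor d_{p_k} \in REF$ • $PRE = \{\text{integer}, \text{smallint}, \text{decimal}, \text{float}, \text{char}[n]\}$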
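The notation on slide 10 can be read as a small data model. The sketch below is one possible reading in Python, not MOMIS code: the class and field names are hypothetical, and it assumes that SP and DP are two lists of properties of an element, that (mc, MC) is a minimum/maximum cardinality pair, and that any domain outside PRE is a reference (REF) to another element.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Predefined domains listed on slide 10; char[n] is matched by its base name.
PRE = {"integer", "smallint", "decimal", "float", "char"}


@dataclass(frozen=True)
class Property:
    """p_k = <n_pk, d_pk, (mc, MC)_pk> (field names are hypothetical)."""
    name: str
    domain: str
    cardinality: Tuple[int, int]  # (mc, MC), assumed to be min/max cardinality

    def is_predefined(self) -> bool:
        # "char[30]" -> "char"; anything not in PRE is treated as a reference.
        return self.domain.split("[")[0] in PRE


@dataclass
class SchemaElement:
    """e_ji = <n(e_ji), SP(e_ji), DP(e_ji)>."""
    name: str
    sp: List[Property] = field(default_factory=list)
    dp: List[Property] = field(default_factory=list)

    def properties(self) -> List[Property]:
        # P(e_ji) = SP(e_ji) ∪ DP(e_ji)
        return self.sp + self.dp


# Illustrative element; the names are invented for this example.
customer = SchemaElement(
    name="Customer",
    sp=[Property("name", "char[30]", (1, 1))],
    dp=[Property("orders", "Order", (0, 5))],
)
```

Here `customer.properties()` returns both properties, and only the first one has a domain in PRE.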

  11. Building the Global Schema • Common Thesaurus • Clustering of Elements • Construction of the Global Schema

  12. Building the Common Thesaurus • MOMIS makes use of WordNet • WordNet contains • Synsets (Collection of Words associated to a meaning) • Relations between Synsets (BT/NT/RT) • Relations between Words • MOMIS + User assign schema elements to meanings • Term = <e, meaning> • Common Thesaurus • Terms • Associations from WordNet • Derived Associations • User-Inserted Associations

  13. Affinity Calculation - Weights • Association Type Weights: $0 < \sigma \le 1$ • $\sigma_{SYN} \ge \sigma_{BT/NT}$ • $\sigma_{SYN} = 1$ • $\sigma_{BT/NT} = 0.8$

  14. Affinity Calculation – Name Affinity • Name Affinity • Name Affinity Coefficient
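The formulas on slide 14 were images and did not survive in the transcript. As a rough sketch only, the code below assumes the ARTEMIS-style reading: the affinity of two names is the highest product of association weights along any thesaurus path that connects them, and the name affinity coefficient is that value if it reaches a threshold `alpha` (the name-affinity threshold of slide 24) and 0 otherwise. The weights come from slide 13; the function names, the default `alpha`, and the sample associations are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple

# Association weights from slide 13.
WEIGHTS = {"SYN": 1.0, "BT/NT": 0.8}

Graph = Dict[str, List[Tuple[str, float]]]


def build_graph(associations: List[Tuple[str, str, str]]) -> Graph:
    """Turn (name, association type, name) triples into an undirected graph."""
    graph: Graph = {}
    for a, kind, b in associations:
        w = WEIGHTS[kind]
        graph.setdefault(a, []).append((b, w))
        graph.setdefault(b, []).append((a, w))
    return graph


def name_affinity(graph: Graph, n1: str, n2: str, alpha: float = 0.5) -> float:
    """Best path strength between n1 and n2, zeroed below the threshold alpha."""
    if n1 == n2:
        return 1.0
    best = {n1: 1.0}
    heap = [(-1.0, n1)]  # max-heap on path strength via negated keys
    while heap:
        neg, node = heapq.heappop(heap)
        strength = -neg
        if node == n2:
            return strength if strength >= alpha else 0.0
        if strength < best.get(node, 0.0):
            continue  # stale heap entry
        for nxt, w in graph.get(node, []):
            s = strength * w  # weights are <= 1, so strength never grows
            if s > best.get(nxt, 0.0):
                best[nxt] = s
                heapq.heappush(heap, (-s, nxt))
    return 0.0  # no connecting path in the thesaurus


# Invented sample thesaurus entries.
thesaurus = build_graph([
    ("Client", "SYN", "Customer"),
    ("Customer", "BT/NT", "Person"),
    ("Person", "BT/NT", "Employee"),
])
```

With these entries, `Client` and `Person` are connected by one SYN and one BT/NT association, so their affinity is 1 × 0.8; raising `alpha` above a path's strength turns the coefficient into 0.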

  15. Affinity Calculation - Structural Affinity - Correspondence

  16. Affinity Calculation – Structural Affinity – Affinity Classes • Properties grouped in Affinity Classes • Well-formed Affinity Classes

  17. Affinity Calculation – Structural Affinity

  18. Affinity Calculation – Global Affinity

  19. Clustering of Elements • Affinity Matrix • Initial Cluster for each element • Iteration till Dimension of Matrix = 1 • Search for clusters with highest Affinity • Merge the clusters • Compute new Affinity values

  20. Hierarchical Clustering Algorithm
      Let SE be the set of schema elements to be clustered
      Let k be the number of schema elements in SE
      1. /* Compute k*(k-1)/2 Global Affinity coefficients */
         For i := 1 to k do
           M[i,i] := 1
           For j := i+1 to k do
             M[j,i] := M[i,j] := GA(ei,ej)
      2. Place each schema element ei into a cluster Cli
      3. Repeat
           Select the pair of current clusters Cli, Clj (i ≠ j) such that M[i,j] is the maximum of all off-diagonal entries of M
           Cli := Cli ∪ Clj
           For l := 1 to k do
             If l ≠ i and l ≠ j then M[l,i] := M[i,l] := max{M[l,i], M[l,j]}
           Remove row and column j from M
           k := k-1
         until k = 1
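The procedure on slide 20 can be sketched in Python as follows. This is an illustration, not the ARTEMIS implementation: the function name, the `ga` callback, and the sample affinity values are hypothetical, and `ga` is assumed to be symmetric. The function returns the sequence of merges with the affinity at which each merge happened, which is the information an affinity tree (slide 21) would display.

```python
from typing import Callable, List, Sequence, Tuple

Merge = Tuple[float, Tuple[str, ...], Tuple[str, ...]]


def hierarchical_clustering(
    elements: Sequence[str], ga: Callable[[str, str], float]
) -> List[Merge]:
    k = len(elements)
    # Step 1: affinity matrix, 1 on the diagonal, GA(ei, ej) elsewhere.
    M = [[1.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            M[i][j] = M[j][i] = ga(elements[i], elements[j])

    # Step 2: one initial cluster per element.
    clusters: List[List[str]] = [[e] for e in elements]
    merges: List[Merge] = []

    # Step 3: merge the two closest clusters until one remains.
    while len(clusters) > 1:
        n = len(clusters)
        best, bi, bj = -1.0, 0, 1
        for i in range(n):
            for j in range(i + 1, n):
                if M[i][j] > best:
                    best, bi, bj = M[i][j], i, j
        merges.append((best, tuple(clusters[bi]), tuple(clusters[bj])))

        # New affinity to every other cluster: max of the two old values.
        for l in range(n):
            if l != bi and l != bj:
                M[l][bi] = M[bi][l] = max(M[l][bi], M[l][bj])

        # Remove row and column bj (bi < bj, so index bi stays valid).
        del M[bj]
        for row in M:
            del row[bj]
        clusters[bi] = clusters[bi] + clusters[bj]
        del clusters[bj]
    return merges


# Invented affinity values for four elements; unlisted pairs default to 0.1.
_GA = {
    frozenset(("a", "b")): 0.9,
    frozenset(("c", "d")): 0.7,
    frozenset(("a", "c")): 0.3,
}
tree = hierarchical_clustering(
    ["a", "b", "c", "d"], lambda x, y: _GA.get(frozenset((x, y)), 0.1)
)
```

Taking the maximum when two clusters merge means that two clusters are considered as close as their closest pair of members.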

  21. Affinity Tree

  22. Unification Rules • Name

  23. Unification Rules (cont) • Domain • Cardinality

  24. Tuning Parameters Summary • Weights for Associations: $\sigma_{SYN} \ge \sigma_{BT/NT}$ • Threshold for Name Affinity: $\alpha$ • Weights for Affinity Computation (NA <-> SA): $w_{NA} + w_{SA} = 1$
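The last bullet weighs name affinity (NA) against structural affinity (SA) with two weights that sum to 1, which suggests a linear combination for the global affinity. The sketch below assumes exactly that form; the slide with the actual formula (slide 18) did not survive, so the function name and the default weights are hypothetical.

```python
import math


def global_affinity(na: float, sa: float, w_na: float = 0.5, w_sa: float = 0.5) -> float:
    """Assumed form: GA = w_na * NA + w_sa * SA, with w_na + w_sa = 1."""
    if not math.isclose(w_na + w_sa, 1.0):
        raise ValueError("w_na and w_sa must sum to 1")
    return w_na * na + w_sa * sa


# Illustrative values: name affinity 0.8, structural affinity 0.5.
example = global_affinity(0.8, 0.5, w_na=0.6, w_sa=0.4)
```

Shifting weight toward `w_na` makes the clustering rely more on the thesaurus, while shifting it toward `w_sa` makes it rely more on shared properties.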
