Methodologies and techniques for translating information from source to target data models. Responsible Unit: CS-RC. Units Involved: CS-RC. D2I, Modena, 27 April 2001.
Synthesis
• SDR-Network: a new conceptual model for representing information sources having different formats and structures
• The SDR-Network is a rooted labeled graph: Net(IS) = < NS(IS), AS(IS) >
• NS(IS) is the set of nodes; each node is characterized by a name
• AS(IS) is the set of arcs; each arc is represented by a triplet < S, T, L_ST >
  • S is the source node
  • T is the target node
  • L_ST = [d_ST, r_ST] is the label associated with the arc
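As an illustration of the structure just described, the sketch below stores an SDR-Network as a set of node names plus a list of labeled arcs. The class and method names (`Arc`, `SDRNetwork`, `add_arc`, `outgoing`) are hypothetical and not part of the original work; the two arcs use coefficient values from the University example in this deck, and treating `student` as the root is an assumption made only for the demo.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Arc:
    """One arc < S, T, L_ST > with label L_ST = [d_ST, r_ST]."""
    source: str
    target: str
    d: float  # semantic distance coefficient d_ST
    r: float  # semantic relevance coefficient r_ST


@dataclass
class SDRNetwork:
    """Net(IS) = < NS(IS), AS(IS) >, rooted at `root` (hypothetical layout)."""
    root: str
    nodes: set = field(default_factory=set)
    arcs: list = field(default_factory=list)

    def __post_init__(self):
        # The root is always a node of the network.
        self.nodes.add(self.root)

    def add_arc(self, source, target, d, r):
        # Adding an arc also registers both endpoints as nodes.
        self.nodes.update((source, target))
        self.arcs.append(Arc(source, target, d, r))

    def outgoing(self, node):
        """All arcs whose source node is `node`."""
        return [a for a in self.arcs if a.source == node]


# Assumption: `student` is used as the root purely for illustration.
net = SDRNetwork(root="student")
net.add_arc("student", "name", 0, 1)
net.add_arc("student", "address", 0.25, 1)
```

Keeping arcs as immutable records makes it easy to place them in sets or compare them, which is convenient when computing sets of arcs such as neighborhoods.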
Synthesis
• d_ST is the semantic distance coefficient:
  • it indicates how semantically close the concept expressed by T is to the concept expressed by S
  • this depends on the capability of the concept associated with T to characterize the concept associated with S
• r_ST is the semantic relevance coefficient: it denotes the fraction of instances of the concept denoted by S whose complete definition requires at least one instance of the concept denoted by T
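Because r_ST is defined as a fraction of instances, it can be estimated by counting when instance data is available. The sketch below is a minimal illustration under two assumptions: each instance of S is a hypothetical dictionary mapping related concept names to lists of instances, and "requires at least one instance of T" is approximated by "has at least one instance of T". The four student records are invented toy data.

```python
def relevance(instances_of_s, t_name):
    """Fraction of S-instances that have at least one T-instance."""
    if not instances_of_s:
        raise ValueError("S has no instances, so r_ST is undefined")
    having_t = sum(1 for inst in instances_of_s if inst.get(t_name))
    return having_t / len(instances_of_s)


# Invented toy data: four students, one without a recorded birthplace.
students = [
    {"name": ["Ann"], "birthplace": ["Rome"]},
    {"name": ["Bob"], "birthplace": ["Modena"]},
    {"name": ["Cai"], "birthplace": []},
    {"name": ["Dee"], "birthplace": ["Cosenza"]},
]

r_name = relevance(students, "name")              # every student has a name
r_birthplace = relevance(students, "birthplace")  # three of four have one
```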
Synthesis
• Suitable metrics can be defined on the SDR-Network for measuring the strength of the semantic relationships holding among concepts of the same information source
• The Path Semantic Distance PSD_P of a path P in Net(D) is the sum of the semantic distance coefficients of the arcs constituting the path
• The Path Semantic Relevance PSR_P of a path P in Net(D) is the product of the semantic relevance coefficients of the arcs constituting the path
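The two path metrics are simple to compute once a path is given as a sequence of arcs. The sketch below assumes a hypothetical tuple layout `(source, target, d, r)` for each arc and uses two arcs from the University example in this deck.

```python
from math import prod


def path_semantic_distance(path):
    """PSD_P: sum of the semantic distance coefficients along the path."""
    return sum(d for _s, _t, d, _r in path)


def path_semantic_relevance(path):
    """PSR_P: product of the semantic relevance coefficients along the path."""
    return prod(r for _s, _t, _d, r in path)


# Path student -> exam -> grade, with labels [1, 0.75] and [0.25, 1].
path = [("student", "exam", 1, 0.75), ("exam", "grade", 0.25, 1)]

psd = path_semantic_distance(path)   # 1 + 0.25
psr = path_semantic_relevance(path)  # 0.75 * 1
```

Since every relevance coefficient is a fraction, the product can only stay equal or shrink as the path grows, while the distance can only stay equal or grow.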
Synthesis
• The CD-Shortest-Path (Conditional D-Shortest-Path) between two nodes N and N' in Net(D) and including an arc A (denoted by < N, N' >_A) is the path having the minimum Path Semantic Distance among those connecting N and N' and including A
• A D-Path_n is a path P in Net(D) such that n ≤ PSD_P < n+1
• The i-th neighborhood of an SDR-Network node x is: nbh(x, i) = { A | A = < z, y, L_zy > is an arc of Net(D) and < x, y >_A is a D-Path_i }
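One way to compute the Path Semantic Distance of a CD-Shortest-Path is to split it at the mandatory arc A = < S, T >: a cheapest route from N to S, then A itself, then a cheapest route from T to N'. The sketch below follows that idea with Dijkstra's algorithm. It is an illustration, not the authors' algorithm, and it assumes that all distance coefficients are non-negative, that arcs are hypothetical `(source, target, d, r)` tuples, and that a path may revisit a node.

```python
import heapq
from math import floor, inf


def shortest_distances(arcs, start):
    """Dijkstra on the d coefficients (assumed non-negative)."""
    adjacency = {}
    for s, t, d, _r in arcs:
        adjacency.setdefault(s, []).append((t, d))
    dist = {start: 0}
    heap = [(0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist[node]:
            continue  # stale queue entry
        for nxt, d in adjacency.get(node, []):
            if cost + d < dist.get(nxt, inf):
                dist[nxt] = cost + d
                heapq.heappush(heap, (cost + d, nxt))
    return dist


def cd_shortest_psd(arcs, n, n_prime, arc):
    """Minimum PSD over paths from n to n_prime that traverse `arc`."""
    s, t, d, _r = arc
    to_source = shortest_distances(arcs, n).get(s, inf)
    from_target = shortest_distances(arcs, t).get(n_prime, inf)
    return to_source + d + from_target


def d_path_index(psd):
    """The n with n <= psd < n + 1, or None if no such path exists."""
    return None if psd == inf else floor(psd)


# Four arcs taken from the University example in this deck.
arcs = [
    ("student", "exam", 1, 0.75),
    ("exam", "grade", 0.25, 1),
    ("student", "course", 1, 1),
    ("course", "year", 0.25, 0.66),
]
psd = cd_shortest_psd(arcs, "student", "grade", ("exam", "grade", 0.25, 1))
index = d_path_index(psd)  # which D-Path_n class the path falls in
```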
Synthesis
[Figure: an SDR-Network relative to a University]
Synthesis
• nbh(student,0) = { < student, id, [0,1] >, < student, name, [0,1] >, < student, birthdate, [0,1] >, < student, birthplace, [0,0.75] >, < student, enrollment_year, [0,0.75] >, < student, address, [0.25,1] > }
• nbh(student,1) = { < student, course, [1,1] >, < course, id, [0,1] >, < course, name, [0,1] >, < course, year, [0.25,0.66] >, < course, n_students, [0.25,0.33] >, < course, argument, [0.5,1] >, < student, exam, [1,0.75] >, < exam, id, [0,1] >, < exam, date, [0.25,1] >, < exam, grade, [0.25,1] >, < student, thesis, [1,0.25] >, < thesis, title, [0.25,1] >, < thesis, subject, [0.25,1] > }
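The two neighborhoods above can be reproduced mechanically from the listed arcs. The sketch below reads the i-th neighborhood of x as the set of arcs A = < z, y > whose cheapest path from x ending with A has a Path Semantic Distance in [i, i+1); this reading is an assumption that is consistent with the listed sets. Two further assumptions: only the 19 arcs listed on this slide are used (the full University network may contain more), and same-named attributes of different concepts are kept apart with hypothetical qualified names such as `course.id`.

```python
from math import floor, inf

# The arcs listed above, as (source, target, d, r) tuples.
arcs = [
    ("student", "id", 0, 1), ("student", "name", 0, 1),
    ("student", "birthdate", 0, 1), ("student", "birthplace", 0, 0.75),
    ("student", "enrollment_year", 0, 0.75), ("student", "address", 0.25, 1),
    ("student", "course", 1, 1), ("course", "course.id", 0, 1),
    ("course", "course.name", 0, 1), ("course", "year", 0.25, 0.66),
    ("course", "n_students", 0.25, 0.33), ("course", "argument", 0.5, 1),
    ("student", "exam", 1, 0.75), ("exam", "exam.id", 0, 1),
    ("exam", "date", 0.25, 1), ("exam", "grade", 0.25, 1),
    ("student", "thesis", 1, 0.25), ("thesis", "title", 0.25, 1),
    ("thesis", "subject", 0.25, 1),
]


def distances_from(arcs, start):
    """Cheapest PSD from `start` to every reachable node (d >= 0 assumed)."""
    dist = {start: 0}
    changed = True
    while changed:  # repeated relaxation until nothing improves
        changed = False
        for s, t, d, _r in arcs:
            if s in dist and dist[s] + d < dist.get(t, inf):
                dist[t] = dist[s] + d
                changed = True
    return dist


def nbh(arcs, x, i):
    """Arcs whose cheapest path from x ending with that arc is a D-Path_i."""
    dist = distances_from(arcs, x)
    return [a for a in arcs
            if a[0] in dist and floor(dist[a[0]] + a[2]) == i]


nbh0 = nbh(arcs, "student", 0)  # the six direct attribute arcs of student
nbh1 = nbh(arcs, "student", 1)  # course, exam, thesis and their attributes
```

With these arcs, `nbh0` and `nbh1` contain the same arcs as the two sets listed above, up to the qualified attribute names.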
Synthesis
• Suitable algorithms can be derived that exploit the SDR-Network to extract terminological and structural relationships among concepts belonging to different information sources
• Any source basically reduces to the representation of a set of concepts and a set of relationships among those concepts
• Using the nodes and arcs of an SDR-Network we can represent both sets and, therefore, model any given source
• In addition, the semantic distance and relevance coefficients make it possible to describe some implicit intra-source semantics and possibly to derive inter-source semantics as well
Synthesis
• We have defined translation rules for obtaining an SDR-Network from:
  • an XML document
  • an OEM-Graph
  • an E/R scheme
• However, we argue that analogous rules for translating into an SDR-Network can be defined for almost all conceptual models proposed in the literature for representing semi-structured information sources
• We have compared the features of the SDR-Network with those of some other conceptual models proposed in the literature
Open Problems and Future Work
• In the future we plan to:
  • complete the definition of techniques for both reconstructing the relative semantics and obtaining a global view of a group of information sources represented by the corresponding SDR-Networks
  • exploit the semantics derived with the support of the SDR-Network for:
    • E-commerce
    • Semantic Query Processing
    • Data and Web Warehouses
    • Advanced Web Search Engines