Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca

This presentation discusses the challenges of enterprise information integration and explains how RDF (Resource Description Framework) can serve as a lingua franca for information exchange. It covers four benefits: a focus on semantics, easier data integration, easier bridging of other formats and models, and looser coupling. It also presents a proof of concept for a SPARQL adaptor for UCMDB (Universal Configuration Management Database).

Presentation Transcript


  1. Enterprise Information Integration using Semantic Web Technologies: RDF as the Lingua Franca • David Booth, Ph.D., HP Software • Semantic Technology Conference, 20-May-2008 • In collaboration with Steve Battle, HP Labs • Latest version of these slides: http://dbooth.org/2008/stc/slides.ppt

  2. Disclaimer This work reflects research and is presented for discussion purposes only. No product commitment whatsoever is expressed or implied. Furthermore, views expressed herein are those of the author and do not necessarily reflect those of HP.

  3. Outline • PART 0: The problem • PART 1: RDF: The lingua franca for information exchange • Why • Focus on semantics • Easier data integration • Easier to bridge other formats/models • Looser coupling • How • RDF message semantics • REST-based SPARQL endpoints • XML with GRDDL transformations • Aggregators • PART 2: POC: A SPARQL adaptor for UCMDB • What is UCMDB • SPARQL adaptor

  4. PART 0: The problem

  5. Problem 1: Integration complexity • Multiple producers/consumers need to share data • Tight coupling hampers independent versioning [Diagram labels: Discovery, Incident Management, Provisioning, Change Management, Release Management, Compliance Management, Release Managers, Operation Centers, Monitoring, Ticketing, Compliance Managers, Source Control, Networking Engineers, Storage Administrators, Unix System Administrators, Windows System Administrators, Networking Administrators]

  6. Problem 2: Babelization • Proliferation of data models (XML schemas, etc.) • Parsing issues influence data models • No consistent semantics • Data chaos [Image: Tower of Babel, Abel Grimmer (1570-1619)]

  7. PART 1: RDF: The lingua franca for information exchange

  8. Why? Four reasons . . .

  9. Why? 1. Focus on semantics • XML: • Schema is focused on how to serialize • Constrains more than the model • Parent/child and sibling relationships are not named • Are their semantics documented? E.g., does sibling order matter? • RDF: • One URI per concept • Syntax independent • Who cares about syntax?

  10. Why? 2. Easier data integration

  11. Why? 2. Easier data integration • Blue App has model

  12. Why? 2. Easier data integration • Red App has model • Need to integrate Red & Blue models

  13. Why? 2. Easier data integration • Step 1: Merge RDF • Same nodes (URIs) join automatically
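
The merge step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: triples are modelled as plain (subject, predicate, object) string tuples, blank nodes are ignored, and every URI and property name below is hypothetical rather than taken from the Red or Blue models on the slides.

```python
# Triples are (subject, predicate, object) tuples of plain strings.
# All URIs and property names here are hypothetical, for illustration only.
EX = "http://example.org/"

blue_model = {
    (EX + "cust/42", EX + "blue#name", "Alice"),
    (EX + "cust/42", EX + "blue#email", "alice@example.org"),
}

red_model = {
    (EX + "cust/42", EX + "red#accountStatus", "active"),
    (EX + "order/7", EX + "red#placedBy", EX + "cust/42"),
}


def merge(*graphs):
    """Merge graphs as a set union of their triples (blank nodes ignored)."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def properties_of(graph, subject):
    """Return {predicate: object} for every triple about `subject`."""
    return {p: o for (s, p, o) in graph if s == subject}


merged = merge(blue_model, red_model)

# Both apps used the same URI for the customer, so the facts join without
# any mapping code: one node now carries Blue and Red properties.
about_customer = properties_of(merged, EX + "cust/42")
for predicate in sorted(about_customer):
    print(predicate, "->", about_customer[predicate])
```

The join happens only because both models use the same URI for the customer; if they used different URIs, an explicit relationship between the two would have to be added, which is what Step 2 addresses.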

  14. Why? 2. Easier data integration • Step 2: Add relationships and rules • (Relationships are also RDF)

  15. Why? 2. Easier data integration • Step 3: Define Green model • (Making use of Red & Blue models)

  16. Why? 2. Easier data integration • What the Blue app sees: • No difference!

  17. Why? 2. Easier data integration • What the Red app sees: • No difference!

  18. Why? 3. RDF helps bridge other formats/models • Producers and consumers may use different formats/models • Rules can specify transformations • Inference engine finds path to desired result model [Diagram labels: X, Y, Z; models A1, A2, A3, B1, B2, C1, C2; RDF Model Transform; Ontologies & Rules]
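
As a rough analogy for "finds path to desired result model", the sketch below treats each transformation rule as an edge between two models and runs a breadth-first search for a chain of rules from a source model to a target model. A real inference engine works on triples and rules rather than on a graph of model names, so this is a simplified stand-in. The model names reuse the diagram labels, but the rule set itself is invented.

```python
from collections import deque

# Hypothetical transformation rules: each key can be transformed
# into any of the models listed for it.
RULES = {
    "A1": ["A2", "B1"],
    "A2": ["A3"],
    "A3": [],
    "B1": ["B2"],
    "B2": ["C2"],
}


def find_path(rules, source, target):
    """Breadth-first search for the shortest chain of transformations."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for next_model in rules.get(path[-1], []):
            if next_model not in seen:
                seen.add(next_model)
                queue.append(path + [next_model])
    return None  # no chain of rules reaches the target model


path = find_path(RULES, "A1", "C2")
no_path = find_path(RULES, "A3", "C2")
print(path)
```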

  19. Why? 4. Looser coupling • Without breaking consumers: • Ontologies can be mixed and extended • Triples can be added • Producer & consumer can be versioned more independently

  20. Example of looser coupling • RedCust and GreenCust ontologies added • Blue app is not affected [Diagram labels: Producer, Consumer (Blue app)]
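
A small sketch of why added triples need not break a consumer: the consumer selects only the predicates it understands and ignores the rest. The namespace and property names are hypothetical; the "red" and "green" properties merely stand in for the RedCust and GreenCust additions.

```python
EX = "http://example.org/"  # hypothetical namespace
BLUE_NAME = EX + "blue#name"


def blue_app_names(graph):
    """The Blue consumer reads only the one predicate it knows about."""
    return sorted(o for (s, p, o) in graph if p == BLUE_NAME)


# Version 1 of the producer's output: Blue vocabulary only.
v1 = {(EX + "cust/42", BLUE_NAME, "Alice")}

# Version 2 adds triples from other (hypothetical) vocabularies.
v2 = v1 | {
    (EX + "cust/42", EX + "red#accountStatus", "active"),
    (EX + "cust/42", EX + "green#tier", "gold"),
}

before = blue_app_names(v1)
after = blue_app_names(v2)
print(before == after)
```

The same consumer written against a fixed XML schema would typically need a schema change, or at least an extension point, to tolerate the new elements.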

  21. How? Four ways . . .

  22. How? 1. RDF message semantics • Interface contract specifies RDF, regardless of serialization • RDF pins the semantics [Diagram labels: Producer, RDF, Consumer]

  23. How? 2. REST-based SPARQL endpoints [Diagram labels: Producer, Consumer, SPARQL, RDF, HTTP]

  24. REST-based SPARQL endpoints • Why REST: • HTTP is ubiquitous • Simpler than SOAP-based Web services (WS*) • Looser process coupling

  25. REST-based SPARQL endpoints • Why SPARQL: • One endpoint supports multiple data needs • Each consumer gets what it wants • Insulates consumers from internal model changes • Inferencing transforms data to consumer's desired model • Looser data coupling
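
For illustration, here is how a consumer might build a query request for such an endpoint using only the Python standard library. In the SPARQL protocol a query can be sent as an HTTP GET with the query text in a URL-encoded `query` parameter. The endpoint URL is hypothetical, and the request is only constructed here, not sent.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request

ENDPOINT = "http://example.org/sparql"  # hypothetical endpoint URL


def build_sparql_request(endpoint, query):
    """Build (but do not send) an HTTP GET request carrying a SPARQL query."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for SELECT results in the SPARQL JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


query = "SELECT ?s WHERE { ?s ?p ?o } LIMIT 5"
request = build_sparql_request(ENDPOINT, query)

# Decoding the URL again shows the query survives the round trip intact.
decoded = parse_qs(urlsplit(request.full_url).query)["query"][0]
print(request.get_method(), request.full_url)
```

Each consumer sends whatever query matches its own data needs to the same URL, which is the sense in which one endpoint serves many consumers. Passing `request` to `urllib.request.urlopen` would actually send it.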

  26. How?3. XML with GRDDL transformations • GRDDL is a W3C standard • GRDDL permits RDF to be "gleaned" from XML • XML document or schema specifies desired GRDDL transformation • GRDDL transformation produces RDF from XML document • Mostly intended for getting microformat and other data/metadata from HTML pages

  27. Using GRDDL for XML document semantics • Each XML format can be viewed as a custom serialization of RDF! • GRDDL transformation produces semantics of the XML document • Helps bridge XML and RDF worlds • Same XML document can be consumed by: • Legacy XML app • RDF app • App interface contract can specify RDF • Serializations can vary • Semantics are pinned by RDF
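
The idea of gleaning RDF from an XML document can be sketched as follows. GRDDL transformations are typically written in XSLT and referenced from the document or its schema; the Python function below is only a stand-in for such a transformation, and the XML format, namespace, and property names are all hypothetical.

```python
import xml.etree.ElementTree as ET

EX = "http://example.org/"  # hypothetical namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# A made-up XML format that a legacy application might already consume.
XML_DOC = """
<host id="h1">
  <dnsname>web01.example.org</dnsname>
  <os>Linux</os>
</host>
"""


def glean_triples(xml_text):
    """Stand-in for a GRDDL transformation: XML in, RDF triples out."""
    root = ET.fromstring(xml_text)
    subject = EX + root.tag + "/" + root.attrib["id"]
    triples = {(subject, RDF_TYPE, EX + "vocab#" + root.tag.capitalize())}
    for child in root:
        # Each child element becomes one property of the subject.
        triples.add((subject, EX + "vocab#" + child.tag, (child.text or "").strip()))
    return triples


triples = glean_triples(XML_DOC)
for triple in sorted(triples):
    print(triple)
```

The legacy XML application keeps parsing `XML_DOC` as before, while an RDF application consumes the gleaned triples; the transformation is what pins down the semantics of the XML.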

  28. Using GRDDL for XML document semantics [Diagram labels: Client; Service: Normalize to RDF, Core App Processing, RDF Engine / Store, Serialize as XML/other/RDF] See: http://dbooth.org/2007/rdf-and-soa/rdf-and-soa-paper.htm

  29. How? 4. Aggregators • Gets data from multiple sources • Provides data to consumers [Diagram labels: X, Y, Z; models A1, A2, A3, B1, B2, C1, C2; Aggregator; SPARQL; Ontologies & Rules]

  30. Aggregator • Conceptual component • Not necessarily a separate physical service • Handles mechanics of getting data • Different adaptors for different sources • REST, WS*, Relational, XML, etc. • Diverse data models • Might do caching and query distribution (federation) • Provides model transformation • Plug in ontologies and inference rules as needed
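
A minimal sketch of the aggregator as a conceptual component, assuming triples are plain string tuples. The class name, the adaptor functions, and the `ex:` identifiers are all hypothetical; model transformation and query federation are left out, and a simple pattern match stands in for a SPARQL interface.

```python
from typing import Callable, Iterable, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Adaptor = Callable[[], Iterable[Triple]]


class Aggregator:
    """Pulls triples from registered adaptors and answers pattern queries."""

    def __init__(self) -> None:
        self._adaptors: List[Adaptor] = []
        self._cache: Optional[Set[Triple]] = None

    def register(self, adaptor: Adaptor) -> None:
        self._adaptors.append(adaptor)
        self._cache = None  # a new source invalidates the cached merge

    def graph(self) -> Set[Triple]:
        if self._cache is None:
            merged: Set[Triple] = set()
            for adaptor in self._adaptors:
                merged.update(adaptor())
            self._cache = merged
        return self._cache

    def match(self, s: Optional[str] = None, p: Optional[str] = None,
              o: Optional[str] = None) -> List[Triple]:
        """Return triples matching the pattern; None acts as a wildcard."""
        return sorted(
            t for t in self.graph()
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        )


def relational_adaptor() -> List[Triple]:
    rows = [("h1", "web01")]  # stands in for rows read from a database
    return [("ex:host/" + host_id, "ex:dnsname", name) for host_id, name in rows]


def xml_adaptor() -> List[Triple]:
    return [("ex:host/h1", "ex:os", "Linux")]  # stands in for parsed XML


aggregator = Aggregator()
aggregator.register(relational_adaptor)
aggregator.register(xml_adaptor)
about_h1 = aggregator.match(s="ex:host/h1")
print(about_h1)
```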

  31. PART 2: Proof-of-Concept: A SPARQL adaptor for UCMDB

  32. IT Service Management (ITSM) • Manage IT environment • Configuration Management Data Base (CMDB) is central

  33. The HP Universal CMDB (UCMDB) • Goal: Maintain a comprehensive and current record of all configuration items (CIs) and their relationships • CMDB: Configuration Management DB

  34. Example: host information • http://cmdb.mercury.com#nt.35014541: one particular host machine [Figure: host properties]

  35. SPARQL adaptor • Uses existing SOAP interface to UCMDB • Enables SPARQL queries • Results can be RDF • No model transformation (yet) [Diagram labels: SPARQL adaptor, SOAP interface, HP UCMDB]

  36. Architecture of SPARQL adaptor [Diagram labels: Run Time: SPARQL adaptor, SPARQL, compile, TQL, submit, SOAP interface, HP UCMDB, XML, lift, RDF; Design Time: Database, CMDB metadata, export, OWL]

  37. UCMDB ontology • The HP UCMDB ontology defines CI types and relationship hierarchies. • Derived automatically from HP UCMDB metadata.

  38. Jena based implementation • Jena: Semantic Web toolkit (RDF, OWL, inference) • ARQ: Query engine (SPARQL query algebra, evaluation) • Joseki: SPARQL server (SPARQL protocol) • Jena, ARQ, Joseki developed at HP Labs* • * http://www.hpl.hp.com/semweb/

  39. Query returning a table • Select the names of host servers on the network with addresses from 192.168.81.0
      SELECT ?host_name
      WHERE {
        [ a object:network ]
          attr:network_netaddr "192.168.81.0" ;
          link:member [ a object:host ;
                        attr:host_dnsname ?host_name ]
      }
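
To make the shape of this query concrete, the sketch below evaluates the same pattern by hand over a few in-memory triples. The prefixed names are kept as opaque strings because the slides do not give the namespace URIs, and the node identifiers and sample data are invented; a real deployment would send the SPARQL query to the adaptor instead.

```python
# Invented sample data; prefixed names are treated as opaque strings.
TRIPLES = {
    ("net1", "rdf:type", "object:network"),
    ("net1", "attr:network_netaddr", "192.168.81.0"),
    ("net1", "link:member", "hostA"),
    ("hostA", "rdf:type", "object:host"),
    ("hostA", "attr:host_dnsname", "web01"),
    ("net2", "rdf:type", "object:network"),
    ("net2", "attr:network_netaddr", "10.0.0.0"),
    ("net2", "link:member", "hostB"),
    ("hostB", "rdf:type", "object:host"),
    ("hostB", "attr:host_dnsname", "db01"),
}


def host_names(triples, netaddr):
    """Hand-evaluate the query pattern: network -> member host -> DNS name."""
    # [ a object:network ] attr:network_netaddr "<netaddr>"
    networks = {
        s for (s, p, o) in triples
        if p == "attr:network_netaddr" and o == netaddr
        and (s, "rdf:type", "object:network") in triples
    }
    # link:member [ a object:host ; ... ]
    hosts = {
        o for (s, p, o) in triples
        if p == "link:member" and s in networks
        and (o, "rdf:type", "object:host") in triples
    }
    # attr:host_dnsname ?host_name
    return sorted(
        o for (s, p, o) in triples
        if p == "attr:host_dnsname" and s in hosts
    )


names = host_names(TRIPLES, "192.168.81.0")
print(names)
```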

  40. Query returning an RDF subgraph Describe a network (192.168.81.0) with host servers containing a DB.

  41. Example RDF result set [Figure labels: Database, Host]

  42. Outline • PART 0: The problem • PART 1: RDF: The lingua franca for information exchange • Why • Focus on semantics • Easier data integration • Easier to bridge other formats/models • Looser coupling • How • RDF message semantics • REST-based SPARQL endpoints • XML with GRDDL transformations • Aggregators • PART 2: POC: A SPARQL adaptor for UCMDB • What is UCMDB • SPARQL adaptor

  43. Questions?
