An Introduction to Track 4: SOA and Metadata (Semantics)

An Introduction to Track 4: SOA and Metadata (Semantics). 2nd SOA for E-Government Conference, 30-31 October 2006. Chuck Mosher, Senior Enterprise Architect, cmosher @ metamatrix.com. Agenda: The drivers for data (& metadata) integration; Metadata in an SOA


Presentation Transcript


  1. An Introduction to Track 4: SOA and Metadata (Semantics) 2nd SOA for E-Government Conference 30-31 October 2006 Chuck Mosher Senior Enterprise Architect cmosher @ metamatrix.com

  2. Agenda • The drivers for data (& metadata) integration • Metadata in an SOA • Data services: using active metadata to drive data integration • Beyond metadata: dictionaries, vocabularies, domain models, ontologies (semantics) • Why ontologies? • Overview of Track 4 Presentations • Q & A

  3. Acknowledgements • Dave McComb*, Semantic Arts • Atif Kureishy*, Booz | Allen | Hamilton • John Salasin*, NIST • Jeff Pollock, Oracle • Brand Niemann, EPA • Andy Evans, Revelytix * Track 4 Speaker, 2:45-4:15 pm tomorrow

  4. Data Interoperability Lies At The Very Core of DoD Transformation One of the three enablers which drives domain-wide visibility: “… is a standard enterprise data architecture — the foundation for effective and rapid data transfer and the fundamental building block to enable a common logistical picture.” Army Lt. Gen. Claude Christianson “If you look at all the trends in the IT arena over the past 30 to 40 years, we’ve moved into an environment where we’ve got faster networks, more powerful processors, but it really comes down to the data” Michael Todd, DOD CIO office

  5. Dr. Linton Wells, as quoted in September’s NDIA Magazine, “…data compatibility may be an issue. Enabling digital interaction with nontraditional partners may require middleware or other programs that convert data from totally different formats …”

  6. Problem Scope • Incompatible data meanings are the largest, most expensive, and time-consuming portion of IT visibility and IT interoperability projects: • Gartner… Forrester… NIST… • IDC… CIO Magazine… • The classic “n-squared” problem of interfaces is even more severe at the data layer: • Data-to-data interfaces outnumber “pipes” • Tightly-coupled is brittle, and requires code • Information growth is accelerating – FAST! • 2002-2005 – more new data than all of history • 5 exabytes of new digital data created in 2002 – enough for 0.5 million new Libraries of Congress Jeff Pollock – 2004 White House Conference on Semantic Technology
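
The “n-squared” point on this slide can be made concrete with a little arithmetic. The sketch below assumes every system needs a directed mapping to every other system, and compares that with the case where each system maps only to and from one shared model. The shared-model alternative and both function names are illustrative assumptions, not something the slide specifies.

```python
def point_to_point_interfaces(n: int) -> int:
    """Directed mappings needed when each of n systems maps to every other one."""
    return n * (n - 1)


def shared_model_interfaces(n: int) -> int:
    """Mappings needed when each system maps only to and from one shared model."""
    return 2 * n


for n in (5, 20, 100):
    print(n, point_to_point_interfaces(n), shared_model_interfaces(n))
```

For 20 systems this gives 380 point-to-point mappings against 40 through a shared model, which is the gap the later data-service slides try to close.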

  7. Agenda • The drivers for data (& metadata) integration • Metadata in an SOA • Data services: using active metadata to drive data integration • Beyond metadata: dictionaries, vocabularies, domain models, ontologies (semantics) • Why ontologies? • Overview of Track 4 Presentations • Q & A

  8. Why Does SOA Need Metadata? • An architectural style enabling loose-coupling • Cornerstone of E-Government reengineering • Web Services and their related standards (SOAP, WSDL, UDDI) provide an implementation framework for several key features of SOA • BUT: Web Service technologies do not provide all the requirements for Dynamic USE of Discoverable Services • Discovery – Yes – UDDI/ebXML • Use – No – requires service consumers and providers to agree on a pre-defined standard interface for the service

  9. SOA is Easy, It’s Metadata That’s Hard • SOA focuses on the interoperability between application interfaces & protocols • Data (and service) meaning, integrity, and transformation have to be addressed elsewhere • This information is found in the metadata • SOA makes getting control over the metadata critical to success • Or you will end up with SOA silos!

  10. Metadata Is Everywhere • Many of the problems & issues around SOA implementations & governance boil down to getting a solid handle on all of the types & forms of metadata involved • Diagram labels: Integration, Syntactic, Semantic, Application, Process, Accessibility, Visibility, Discoverability, Management, Governance, Auditing, Lineage, Quality, Compliance, Change Mgmnt, Impact Analysis, Performance

  11. What Are Semantic Conflicts? • Conflict types: Data Type; Labeling; Aggregation; Structure; Cardinality; Generalization; Value Representation; Impedance Mismatch; Naming; Scaling and Unit; Confounding; Domain; Integrity • Descriptions: Different primitive or abstract types for same information; Synonyms/antonyms have different text labels; Different conceptions about the relationships among concepts in similar data sets; Collections or constraints have been modeled differently for same information; Different abstractions are used to model same domain; Different choices are made about what concepts are made explicit; Fundamentally different data representations are used; Synonyms/antonyms exist in same/similar concept instance values; Different units of measures with incompatible scales; Similar concepts with different definitions; Fundamental incompatibilities in underlying domains; Disparity among the integrity constraints • Jeff Pollock – 2004 White House Conference on Semantic Technology

  12. Metadata Management Maturity • Level 1: Inventory of information assets • Necessary 1st step – what data do we have • Typically stored in repositories, registries, spreadsheets, implicit in data itself (relational DB’s) • Level 2: Impact analysis • Develop domain vocabularies and data models • Discover or create relationships between system artifacts • Level 3: Metadata-driven integration • Design-time metadata repository + run-time integration • Example of Model-Driven Architecture • Level 4: Semantic Web • Dynamic, machine-based inferencing at the concept level

  13. Data Evolution Timeline • Age of Programs (1945-1970): program-data • Age of Proprietary Data (1970-1994): text, office docs, databases (proprietary schema) • Age of Open Data (1994-2000): HTML, XML (open schema) • Age of Open Metadata (2000-2003): namespaces, taxonomies, RDF • Age of Semantic Models (2003- ): ontologies & inference • Milestones: GIGO/minis/micros, www / Netscape, Web services, OWL • Procedural Programming: “Data is less important than code” • Object-Oriented Programming: “Data is as important as code” • Model-Driven Programming: “Data is more important than code” • Michael Daconta, Creating Relevance and Reuse with Targeted Semantics, XML 2004 Conference Keynote, November 16, 2004.

  14. Agenda • The drivers for data (& metadata) integration • Metadata in an SOA • Data services: using active metadata to drive data integration • Beyond metadata: dictionaries, vocabularies, domain models, ontologies (semantics) • Why ontologies? • Overview of Track 4 Presentations • Q & A

  15. Information Challenges • Agency Challenges: • 100’s/1000’s of data sources • 100’s/1000’s of applications • Multiple access points/modes for apps • Understanding relationships/semantics • Data consistency • Data reuse – bridging data silos • Support for Web Services & SQL • Control & manageability, compliance • Security & auditing • Program Challenges: • Multiple sources • Different interfaces/drivers • Different physical structures • Different semantics • Single interface to data desired • Real-time access to data • Performance • Maintainability as data changes • Maintainability as apps change • Mission Challenges: • Time-to-deploy • Agility - Responsiveness to change • Automation – Reduce cost of new development and operations • ROI of enterprise information • Diagram labels: Communities of Interest, ?, Information Resources

  16. Information Virtualization Communities of Interest Information Virtualization Layer Information Resources

  17. Information Virtualization • Information Virtualization Layer, top to bottom: • Unified Semantic Layer: unification of different concepts across systems • Data Federation Layer: single-query access to heterogeneous systems • Data Access/Connectivity Layer: uniform, standardized access to any system • Enterprise Data Sources

  18. Metadata-Based Data Service • Decouple data sources from application • Data implementation shielded from application • Semantic/Format Mediation • Standard vocabulary • Single access point • Web Service/XML • SQL • Federation • Single source or multi-source • Scalability • Security, performance • Diagram labels: XML/SOAP, SQL, Bridge the Gap, Data Service, SQL, SQL, API Call, Master Data, Agency Application, Operational Data Store

  19. FEA DRM View on Data Services DRM Version 2 Data Access Services • Context Awareness Services • Structural Awareness Services • Transactional Services • Data Query Services • Content Search and Discovery Services • Retrieval Services • Subscription Services • Notification Services Service Types include: • Metadata / Data • Structured / Unstructured • Read / Write • Push / Pull

  20. Modeling Information Services for SOA • Designing data services • Information Consumers: Web Services, Business Processes; EAI, Data warehouses; Packaged Apps; Custom Apps; Reporting, Analytics • Access: SOAP, ODBC, JDBC • Exposed Data Services, each with a <WSDL> (contract) • Reusable, Integrated Data Objects • Enterprise Information Sources (EIS): services, warehouses, databases, spreadsheets, xml, geo-spatial, rich media, … • Other diagram labels: Logistics, Intelligence, <sale/> <value/> </sale>

  21. Data Service Abstraction Layers • Transformations from one or more sources • Transformations defined with: • Joins/unions • Criteria • Functions • Elements mapped to dictionary • Business definitions captured
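
The ingredients listed on this slide (a transformation over more than one source, defined with a join, criteria, and functions, with elements renamed to dictionary terms) can be sketched with an ordinary SQL view. The example below uses Python's built-in `sqlite3` module purely as a stand-in for a data service layer. The table names, column names, sample rows, and the pounds-to-kilograms conversion are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Two hypothetical physical sources with different naming conventions.
    CREATE TABLE depot (depot_number INTEGER, nm TEXT);
    CREATE TABLE stock (sitenum INTEGER, item TEXT, qty_lbs REAL);
    INSERT INTO depot VALUES (1, 'north yard'), (2, 'river dock');
    INSERT INTO stock VALUES (1, 'fuel', 2204.62), (2, 'fuel', 0.0), (2, 'tires', 440.924);

    -- The data service view: a join, a criterion, and two functions, with
    -- columns renamed to the (assumed) shared vocabulary.
    CREATE VIEW facility_inventory AS
    SELECT d.depot_number                AS facility_id,
           UPPER(d.nm)                   AS facility_name,
           s.item                        AS item,
           ROUND(s.qty_lbs / 2.20462, 1) AS quantity_kg
    FROM depot d JOIN stock s ON s.sitenum = d.depot_number
    WHERE s.qty_lbs > 0;
""")

# Consumers query the view and never see the source tables or their names.
rows = conn.execute(
    "SELECT facility_id, facility_name, item, quantity_kg "
    "FROM facility_inventory ORDER BY facility_id, item"
).fetchall()
print(rows)
```

A consumer written against `facility_inventory` keeps working if a source table is renamed or replaced, as long as the view definition is updated; that is the decoupling the slide is describing.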

  22. Data Service Layer in SOA • Layers, top to bottom: • Client Process & Applications (Apps) • Business Process Services • Business Services • Message Services (ESB) • Data Services Layer (Data Services) • Data Sources

  23. Data Services Approaches • Data Services for Multiple Purposes: • Simplified access to value-added (tagged) data in real-time • Value-added (tagged) data materialized & staged • Phased-in migration from legacy to new • Managed archiving via classification, retention tags • Enhanced search via consistent content tags • Diagram labels: Agile Information Services Model-Driven Integration Layer Logical Data Model Logical Data Model T Org, Person, Image, Location T Organization, Customer, Imagery, Location Materialized Logical Model Materialized Logical Model Data, Content Sources Data, Content Sources Enriched Data/Content Store

  24. Leveraging COI Data Dictionaries Location_ID Location_Type bldg_type bldg_id Depot_Number SITENUM Facility_ID Business Intelligence Applications Search Applications Web Services ODBC/JDBC JDBC SOAP Application views of information: • Relational, XML XML Document <a> … <b> </b> </a> T T T C2, Logistics, Intelligence, … Logical Data Model: • Agency or COI-specific • Rationalize, harmonize, mediate T T T Authoritative Sources: • Mapped to logical Multiple Internal/External Information Sources
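
One way to picture the role of a COI data dictionary is as a lookup from each source's field name to a logical element. In the sketch below the field names come from the slide, but which of them are source fields, which are logical elements, and which source system each belongs to are assumptions; the source system names are invented.

```python
# Hypothetical dictionary: (source system, source field) -> logical element.
DICTIONARY = {
    ("facilities_db", "bldg_id"): "Location_ID",
    ("facilities_db", "bldg_type"): "Location_Type",
    ("logistics_db", "Depot_Number"): "Location_ID",
    ("legacy_site_file", "SITENUM"): "Location_ID",
    ("property_system", "Facility_ID"): "Location_ID",
}


def to_logical(source: str, record: dict) -> dict:
    """Rename a source record's fields to logical names; drop unmapped fields."""
    return {
        DICTIONARY[(source, field)]: value
        for field, value in record.items()
        if (source, field) in DICTIONARY
    }


print(to_logical("facilities_db", {"bldg_id": 17, "bldg_type": "warehouse", "floor_cnt": 2}))
print(to_logical("legacy_site_file", {"SITENUM": "0042"}))
```

Note that renaming alone does not reconcile the values: `17` and `"0042"` still differ in type and format, which is the kind of conflict the vocabulary and datatype slides address.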

  25. Agenda • The drivers for data (& metadata) integration • Metadata in an SOA • Data services: using active metadata to drive data integration • Beyond metadata: dictionaries, vocabularies, domain models, ontologies (semantics) • Why ontologies? • Overview of Track 4 Presentations • Q & A

  26. Beyond Mere Metadata • Vocabularies/lexicons, Domain Models, Taxonomies, Ontologies • All are means of beginning to define the context and scope of the domain of interest • All specify artifacts in some way • The “Semantics” word often means the relationships between artifacts are also specified

  27. Semantics = Meaning = Relationships • Humans (and therefore our machines) only ever understand anything in so far as it is related to other things ID

  28. Semantics = Meaning = Relationships • Humans (and therefore our machines) only ever understand anything in so far as it is related to other things VA NY ID MD

  29. Semantics = Meaning = Relationships • Humans (and therefore our machines) only ever understand anything in so far as it is related to other things SUPEREGO EGO ID ANALYSIS

  30. Semantics = Meaning = Relationships • Humans (and therefore our machines) only ever understand anything in so far as it is related to other things LICENSE CARD ID BADGE

  31. Data Dictionary -> Vocabulary • The data alone does not have sufficient context • Using metadata is not enough - you must be able to leverage domain concepts and terminologies • Example problem – potentially similar data elements, but dissimilar constructs/datatypes/descriptions • How do we relate common constructs with uncommon datatypes? • Solution requires that vocabulary relate those constructs across models with transformation relationships, logic • Define business use/semantics of similar information • Datatypes describe a set of values • Defines the technical constraints on values • Enables integrating information, as datatypes can be referenced by any models (relational, XML, object, …)

  32. Benefits of Building a Vocabulary • Develop reusable information models and schemas • Capture business and technology requirements in a single vocabulary • Capture institutional knowledge • Enables semantic mining techniques for deeper data discovery and information sharing • Accelerate interoperability, web services and SOA development and deployment • Establish and maintain a common relationship across data sources • Establish and maintain compliance with industry exchange models • Reduce IT expenses by leveraging data in its native source • Reduce IT expenses associated with building and maintaining partner integration • Improved information sharing directly enhances decision making

  33. Example Vocabulary Development Process MDA DS COI Pilot - John Shea PEO C4I, PMW180 ISR/IO NMCI Auto Generate XSD - XML Develop UML Use-Case Class Relationship Diagram Determine Pilot Demonstration Vocabulary Handbook UNCLASSIFIED

  34. Agenda • The drivers for data (& metadata) integration • Metadata in an SOA • Data services: using active metadata to drive data integration • Beyond metadata: dictionaries, vocabularies, domain models, ontologies (semantics) • Why ontologies? • Overview of Track 4 Presentations • Q & A

  35. “Ideal” Semantics • Formal definition of meaning • Unambiguous • Machine-processable • Decidable • Automated classification • Membership based on properties • Inference • Can increase what you know based on classification
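
“Membership based on properties” can be illustrated with a few lines of set logic. This is a toy sketch, not a description-logic reasoner: the class names, property names, and the rule that a thing belongs to a class when it has every required property are assumptions chosen for the example.

```python
# Hypothetical class definitions: a thing belongs to a class when it has
# every property the class requires.
CLASS_DEFINITIONS = {
    "Vehicle": {"has_wheels"},
    "MotorVehicle": {"has_wheels", "has_engine"},
    "FuelTruck": {"has_wheels", "has_engine", "carries_fuel"},
}


def classify(properties: set) -> set:
    """Return every class whose required properties are all present."""
    return {
        name
        for name, required in CLASS_DEFINITIONS.items()
        if required <= properties
    }


tanker = {"has_wheels", "has_engine", "carries_fuel", "painted_green"}
bicycle = {"has_wheels"}
print(sorted(classify(tanker)))
print(sorted(classify(bicycle)))
```

Nobody declared the tanker to be a `MotorVehicle`; that fact falls out of its properties, which is the sense in which classification can increase what you know.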

  36. Ontologies • Ontology is an explicit formal specification of the terms in a domain and the relationships between them • Others are special cases • Formal conceptual model • W3C standard (OWL/RDF) implementation • Concepts, definitions, properties, relationships • Machines can draw inferences from the properties and relationships captured in the model

  37. Ontologies • Ontologies bring rigorous definitions of meaning to (meta)data • More abstraction from lower levels of detail • Key to loose-coupling • With OWL/RDF, part of the W3C Semantic Web vision

  38. W3C Semantic Web Stack

  39. RDF • Resource Description Framework • A mechanism to make assertions about things • In the form of a triple: subject -> predicate -> object • Resource (URI) -> Property (URI) -> Resource (URI or literal) • URIs establish a unique namespace; they do not have to be addressable

  40. RDF Examples • Airport123 -> name -> “ORD” • Airport123 -> locatedIn -> “Chicago, IL” • Airport123 and Business345 linked by closestTo
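
The triples on this slide can be held in plain Python tuples to show how pattern matching over subject, predicate, and object works. This is a minimal sketch rather than an RDF library. The `http://example.org/` namespace is invented, and the direction of `closestTo` (business to airport) is an assumption, since the flattened diagram does not show which way the arrow points.

```python
# Triples as (subject, predicate, object). Resources and properties are URIs;
# the objects "ORD" and "Chicago, IL" are literals.
EX = "http://example.org/"  # hypothetical namespace

triples = {
    (EX + "Airport123", EX + "name", "ORD"),
    (EX + "Airport123", EX + "locatedIn", "Chicago, IL"),
    (EX + "Business345", EX + "closestTo", EX + "Airport123"),  # direction assumed
}


def match(s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Everything asserted about Airport123 as a subject.
for triple in match(s=EX + "Airport123"):
    print(triple)

# Which resources point at Airport123?
print(match(o=EX + "Airport123"))
```

Because the identifiers are URIs rather than local column names, triples from different sources can be merged into one set without name collisions.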

  41. OWL • OWL extends RDF by allowing us to create and make assertions about classes of things • Mammal: has Hair • Feline: is a Mammal; has Retractable Claws
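
The Mammal/Feline example shows why class-level assertions matter: a fact stated once about a class applies to every member of its subclasses. The sketch below is a toy approximation in plain Python, not OWL syntax and not an OWL reasoner. The individual `Tom` and the dictionary layout are assumptions for illustration.

```python
# Class-level assertions from the slide.
SUBCLASS_OF = {"Feline": "Mammal"}
CLASS_PROPERTIES = {
    "Mammal": {"hasHair"},
    "Feline": {"hasRetractableClaws"},
}

# Hypothetical individual; only its most specific class is stated.
INSTANCE_OF = {"Tom": "Feline"}


def inferred_properties(individual: str) -> set:
    """Collect properties from the individual's class and all its ancestors."""
    props = set()
    cls = INSTANCE_OF.get(individual)
    while cls is not None:
        props |= CLASS_PROPERTIES.get(cls, set())
        cls = SUBCLASS_OF.get(cls)
    return props


print(sorted(inferred_properties("Tom")))
```

The only stated fact about `Tom` is that it is a `Feline`; that it has hair is inferred from the class hierarchy.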

  42. Semantic Mapping Challenge Location_ID Location_Type bldg_type bldg_id Depot_Number SITENUM Facility_ID Business Intelligence Applications Search Applications Web Services ODBC/JDBC JDBC SOAP Application views of information: • Relational, XML XML Document <a> … <b> </b> </a> T T T C2, Logistics, Intelligence, … Logical Data Model: • Agency or COI-specific • Rationalize, harmonize, mediate T T T Authoritative Sources: • Mapped to logical Multiple Internal/External Information Sources

  43. Contextualize (Interpret) ArticleAmount Amount Article Synonym Creation Sum Type-of Assets Automated term tokenization Automated semantic linking using the default knowledge-base contained within MatchIT
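
Term tokenization, the first step named on this slide, means splitting a compound identifier such as `ArticleAmount` into the words it is built from so each word can be looked up separately. The function below is one possible way to do that with a regular expression. It is an illustrative sketch and makes no claim about how MatchIT itself tokenizes terms.

```python
import re


def tokenize(term: str) -> list:
    """Split a camelCase, PascalCase, or snake_case identifier into lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", term)
    return [p.lower() for p in parts]


for term in ("ArticleAmount", "Location_ID", "bldg_type", "SITENUM"):
    print(term, tokenize(term))
```

An all-caps name such as `SITENUM` stays a single token, so splitting it into "site" and "num" would need a dictionary of known words rather than a pattern.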

  44. Semantic Matching (Mediate) • With relationships pre-established within the knowledge-base… • Identify the Target and the Source(s) and run the match. ArticleAmount Automatically linked by a specific % distance ProductShares

  45. Facilitate Decision Making (Mediate) Target element for matching Automatically calculated semantic distance between terms Helps facilitate rapid decision making Source candidate for matching
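
To give a feel for how a percentage score between a target and a source candidate could be produced, the sketch below expands each term into a set of related words and measures the overlap (Jaccard similarity). This is a stand-in technique chosen for brevity, not the distance measure the tool on the slide uses, and the small `RELATED` table is a made-up knowledge base.

```python
import re

# Hypothetical knowledge base of related words; not taken from any real product.
RELATED = {
    "article": {"product", "item"},
    "amount": {"sum", "quantity", "shares"},
}


def concepts(term: str) -> set:
    """Split a PascalCase term into words and add any related words."""
    words = {w.lower() for w in re.findall(r"[A-Z]?[a-z]+", term)}
    expanded = set(words)
    for w in words:
        expanded |= RELATED.get(w, set())
    return expanded


def similarity_percent(target: str, source: str) -> float:
    """Jaccard overlap of the two expanded word sets, as a percentage."""
    a, b = concepts(target), concepts(source)
    return round(100 * len(a & b) / len(a | b), 1)


target = "ArticleAmount"
candidates = ["ProductShares", "FacilityName"]
ranked = sorted(candidates, key=lambda c: similarity_percent(target, c), reverse=True)
for c in ranked:
    print(c, similarity_percent(target, c))
```

Ranking candidates by score and leaving the final accept or reject to a person matches the slide's framing of the score as an aid to decision making rather than an automatic answer.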

  46. Integration Driven By Semantics Ontology Models (e.g. OWL, RDF) XML XML XML Relate information in different domains/models Search within and across domains for related information Enterprise Model (UML) Model & Relate information within any domain Data Models (Relational, XML) Physical Sources

  47. Ontology-Driven Integration Example equivalence equivalence equivalence equivalence Logical Views Ontology Physical Sources Transportation T Land T 4 Wheel 2 Wheel T Bus Truck Car T Cargo Truck Fuel Truck
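
A class hierarchy like the one in this diagram lets a query for a general concept reach records that were tagged with more specific ones, even when they sit in different sources. In the sketch below the parent/child arrangement is read off the slide's labels and is an assumption; the source names and record identifiers are invented.

```python
# Hierarchy read off the slide's diagram labels; the exact parent/child
# arrangement is an assumption.
SUBCLASSES = {
    "Transportation": ["Land"],
    "Land": ["4 Wheel", "2 Wheel"],
    "4 Wheel": ["Bus", "Truck", "Car"],
    "Truck": ["Cargo Truck", "Fuel Truck"],
}

# Hypothetical records from two physical sources, each mapped to one class.
RECORDS = [
    ("motor_pool_db", "M-101", "Fuel Truck"),
    ("motor_pool_db", "M-102", "Bus"),
    ("depot_xml", "D-7", "Cargo Truck"),
    ("depot_xml", "D-9", "Car"),
]


def descendants(cls: str) -> set:
    """Return the class itself plus everything below it in the hierarchy."""
    found = {cls}
    for child in SUBCLASSES.get(cls, []):
        found |= descendants(child)
    return found


def find(cls: str) -> list:
    """Return (source, record id) pairs for records in the class or any subclass."""
    wanted = descendants(cls)
    return [(src, rid) for src, rid, c in RECORDS if c in wanted]


print(find("Truck"))
print(find("4 Wheel"))
```

Neither source has a record labeled plain "Truck", yet the query finds one truck in each because the hierarchy supplies the link.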

  48. Agenda • The drivers for data (& metadata) integration • Metadata in an SOA • Data services: using active metadata to drive data integration • Beyond metadata: dictionaries, vocabularies, domain models, ontologies (semantics) • Why ontologies? • Overview of Track 4 Presentations • Q & A

  49. Track 4 Talks Tomorrow: 2:45-4:15pm • Predictive Metrics To Guide SOA-Based System Development • John Salasin, NIST • Integrating SOA and Ontologies for Information Sharing • Atif Kureishy, BAH • SOA & Semantics • Dave McComb, Semantic Arts

  50. Predictive Metrics To Guide SOA Development John Salasin, NIST • Will propose a set of metrics (vocabulary) to characterize SOA-based systems • These metrics can be assessed at different points in the development lifecycle • Early stage (concept development) • Architecture/Construction (system charac.) • Operations (robustness, perf, usage, govern.) • Evolution (extensibility, change mgmnt) • Analysis can lead to ongoing refinement at every stage • Quantitative, incremental Verification & Validation
