IDEAS – DM2 Support for Semantic Webs
11 July 2011
DoDAF Team
Topics
• Data Linking and Access (data-centric paradigm)
• Beyond Linking and Access
• OWL-enabled Machine Reasoning
• Formal Ontologies
• Syntax and Semantics
• Requirements and Cost Realism
Data Linking and Access
• Linkability of datasets via URIs.
• The old DBMS way was “keys”: the basis for many RDBMS joins, and it works.
• Managed URIs and managed GUIDs, e.g., the Army EID.
• Linking works: it reduces unintended redundant data and enables more accurate and complete queries.
• Web-published data files, whether in OWL or just plain XML, are usually more accessible than data held in an RDBMS.
• On the Army EID: “Globally Unique Enterprise Identifiers (EID). An enterprise identifier guarantees a key that is unique enterprise-wide. The use of Enterprise IDs will ensure exact record-matching between heterogeneous databases even when the databases were designed independently. Establishing EIDs as the standard identifiers across the Army and DoD will improve battlespace resolution for Joint Task Force (JTF) commanders. EIDS will allow a horizontal interoperability (across Services) that does not currently exist.”
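To make the record-matching claim concrete, here is a minimal Python sketch of two independently designed datasets linked on a shared enterprise identifier. The field name `eid`, the identifier format, and every record are invented for illustration; nothing here reflects the actual Army EID scheme.

```python
# Hypothetical records from two independently designed databases.
# The "eid" field name and all values are illustrative assumptions.
personnel_db = [
    {"eid": "EID-0001", "name": "Unit A", "strength": 120},
    {"eid": "EID-0002", "name": "Unit B", "strength": 95},
]
logistics_db = [
    {"eid": "EID-0002", "unit_label": "B Co.", "vehicles": 14},
    {"eid": "EID-0003", "unit_label": "C Co.", "vehicles": 9},
]


def join_on_eid(left, right, key="eid"):
    """Return merged records for identifiers present in both datasets."""
    right_by_id = {row[key]: row for row in right}
    merged = []
    for row in left:
        match = right_by_id.get(row[key])
        if match is not None:
            # Exact match on the shared identifier; no fuzzy name matching.
            merged.append({**row, **match})
    return merged


linked = join_on_eid(personnel_db, logistics_db)
print(linked)
```

The two sources label the same unit differently ("Unit B" and "B Co."), so a name-based join would miss the match; the shared identifier is what makes the join exact.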
Beyond Linking and Access
• The idea that you will link a bunch of data together on the web, and that unassociated citizens will then build reasoners that do beneficial things, needs careful examination.
• The limits of linking have long been known; see, e.g., William A. Woods's well-known paper "What's in a Link: Foundations for Semantic Networks", published in 1975!
• If needed, IDEAS and DM2 can help make the links semantically stronger, since every "link" is supertyped to something mathematical*.
* representedBy is not strictly a math dept. relationship.
OWL-enabled Machine Reasoning
• Merely encoding data in OWL is insufficient. For example, take the Joint Consultation, Command and Control Information Exchange Data Model (JC3IEDM, STANAG 5525) in OWL.
• JC3IEDM is a great Entity-Relationship model.
• The OWL version: JC3IEDM 3.1a Entities extracted as Classes; JC3IEDM 3.1a Relationships extracted as ObjectProperties; JC3IEDM 3.1a Attributes extracted as DatatypeProperties and ObjectProperties; JC3IEDM 3.1 Domain Codes extracted as Enumeration Classes.
• I.e., only OWL syntax was applied!
• Conforming JC3IEDM to IDEAS would add meaning but would be a LOT of work.
• When the IDEAS Group analyzed CADM, it often took over a day to decode just a half-dozen domain values.
• It could take many ontologists and JC3IEDM SMEs years to decode the thousands of data elements.
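The "only OWL syntax was applied" point can be illustrated with a rough Python sketch of such a mechanical extraction. The function `er_to_owl_triples`, the `ex:` names, and the example entity are all hypothetical; they are not taken from JC3IEDM or from any real conversion tool, and the triples are plain tuples rather than the output of an RDF library.

```python
def er_to_owl_triples(entity, attributes, relationships, domain_codes):
    """Mechanically map one ER entity to OWL-style (subject, predicate, object) triples."""
    # Entity -> Class
    triples = [(entity, "rdf:type", "owl:Class")]
    # Attribute -> DatatypeProperty whose domain is the entity
    for attr in attributes:
        triples.append((attr, "rdf:type", "owl:DatatypeProperty"))
        triples.append((attr, "rdfs:domain", entity))
    # Relationship -> ObjectProperty from the entity to the target entity
    for rel, target in relationships:
        triples.append((rel, "rdf:type", "owl:ObjectProperty"))
        triples.append((rel, "rdfs:domain", entity))
        triples.append((rel, "rdfs:range", target))
    # Domain code set -> enumeration class listing its members
    for code_set, values in domain_codes.items():
        triples.append((code_set, "rdf:type", "owl:Class"))
        triples.append((code_set, "owl:oneOf", tuple(values)))
    return triples


# Hypothetical ER fragment; none of these names come from JC3IEDM.
triples = er_to_owl_triples(
    entity="ex:Facility",
    attributes=["ex:facilityName"],
    relationships=[("ex:locatedAt", "ex:Location")],
    domain_codes={"ex:FacilityCategoryCode": ["ex:CodeA", "ex:CodeB"]},
)
for triple in triples:
    print(triple)
```

The output is well-formed in OWL terms, yet nothing in it says what `ex:CodeA` means or how a facility relates to anything outside the model. That missing meaning is what the decoding effort described above would have to supply.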
Formal Ontologies
• There is no line in the sand on what is called an ontology and what is not; e.g., UCORE Digest, NIEM, UDEF*, JC3IEDM, or CADM could all call themselves ontologies.
• And if you conformed to any of them, you would get the benefits of commonality.
• But they will not advance you toward machine reasoning.
• If you want automated reasoning, you will need a formal foundation.
• The reason we like IDEAS is that its set-theoretic and 4-D mereotopological foundations could allow us to employ predicate calculus, first-order logic, geometry, topology, etc. in reasoning over datasets.
• But it will be a lot of work, whether with IDEAS or, for that matter, any other formal ontology foundation, e.g., SUMO or BFO**.
• DM2-IDEAS is a good choice because:
• IDEAS accommodated SUMO and ISO 15926
• IDEAS knows how BFO fits in IDEAS
• The DoDAF team made some DoD extensions to IDEAS to make DM2
• The DCIO and DCMO teams have been working together for years
* Universal Data Element Framework, an Open Group project.
** Basic Formal Ontology (BFO) is the Barry Smith one that the Army used for the UCORE Semantic Layer. Barry, Chris Partridge, and Matthew West are all old colleagues.
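As a rough illustration of what a set-theoretic foundation buys, assume type-instance is read as set membership, super-subtype as the subset relation, and whole-part as a transitive relation written here as $\sqsubseteq$ (a notation chosen for this sketch, not taken from the IDEAS documentation). A reasoner can then rely on rules such as:

```latex
\begin{align*}
  x \in A \;\land\; A \subseteq B &\implies x \in B
    && \text{(an instance of a subtype is an instance of the supertype)} \\
  A \subseteq B \;\land\; B \subseteq C &\implies A \subseteq C
    && \text{(super-subtype is transitive)} \\
  a \sqsubseteq b \;\land\; b \sqsubseteq c &\implies a \sqsubseteq c
    && \text{(whole-part is transitive)}
\end{align*}
```

A model whose relationships are not tied to such mathematics gives a reasoner no licence to draw these conclusions, however widely the model is shared.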
Syntax and Semantics
• OWL can be argued to be more syntactic than semantic. For example, John Sowa's response to "Safe upon the solid rock, the ugly houses stand. Come and see my shining castle, built upon the sand." (Edna St. Vincent Millay) is: "the Semantic Web has built a huge castle of syntax with no foundation in semantics. SQL may be ugly, but it is based on FOL, and it successfully runs the world economy. Instead of going with SQL, the DL crowd chose the sand."
• But then there's Roger Schank's famous quip, "There is no difference between syntax and semantics".
• Is the "Semantic Web" really a "Syntactic Web", or maybe just "the Web"?
• After all, Google does a lot of linking with JSON.
• The word "semantic" conjures up many mysteries, such as what “machine meaning” actually means.
Requirements and Cost Realism
• Errors. In automated reasoning, slight errors in the data can result in absurd assertions by the reasoning system.
• It may be very costly to make datasets with tolerable error levels and types.
• Requirements. Determine what kind of reasoning needs to be done, e.g.:
• Aggregation queries (or reports) using classification (or categorization or taxonomies).
• IDEAS' type-instance and super-subtype are good foundations for such.
• Aggregation queries (or reports) spatio-temporally (e.g., by geographic area and/or time period).
• The 4-D mereotopology will support that.
• Inconsistency detection, e.g., between financial and real property reports.
• Some of this might just be linking, but classification and the mereotopology will help if the data reasoned over are at different type, geographic, or temporal boundaries.
• An ROI is important.
• Precise datasets and reasoning thereupon are usually more costly than one would think.
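Two of the reasoning requirements above, aggregation by classification and inconsistency detection between reports, can be sketched in a few lines of Python. All type names, asset identifiers, and values are invented for illustration, and the super-subtype table is a plain dictionary rather than an IDEAS or DM2 dataset.

```python
# Hypothetical super-subtype assertions: subtype -> supertype.
SUPERTYPE_OF = {
    "Warehouse": "Building",
    "Barracks": "Building",
    "Building": "RealProperty",
    "Runway": "RealProperty",
}

# Hypothetical real property report: (asset id, asserted type, reported value).
PROPERTY_REPORT = [
    ("asset-1", "Warehouse", 400),
    ("asset-2", "Barracks", 250),
    ("asset-3", "Runway", 900),
]

# Hypothetical financial report: asset id -> booked value.
FINANCIAL_REPORT = {"asset-1": 400, "asset-2": 300, "asset-4": 120}


def ancestors(type_name, supertype_of):
    """Return the type itself plus every supertype reachable from it."""
    seen = []
    current = type_name
    while current is not None and current not in seen:
        seen.append(current)
        current = supertype_of.get(current)
    return seen


def total_by_type(report, supertype_of):
    """Roll each asset's value up to its type and all of its supertypes."""
    totals = {}
    for _asset, type_name, value in report:
        for t in ancestors(type_name, supertype_of):
            totals[t] = totals.get(t, 0) + value
    return totals


def inconsistencies(report, financial):
    """List assets whose values differ or that appear in only one report."""
    reported = {asset: value for asset, _type, value in report}
    issues = []
    for asset in sorted(set(reported) | set(financial)):
        if reported.get(asset) != financial.get(asset):
            issues.append((asset, reported.get(asset), financial.get(asset)))
    return issues


totals = total_by_type(PROPERTY_REPORT, SUPERTYPE_OF)
issues = inconsistencies(PROPERTY_REPORT, FINANCIAL_REPORT)
print(totals)
print(issues)
```

The sketch also hints at the error problem: a single wrong entry in `SUPERTYPE_OF`, say `Runway` filed under `Building`, would silently raise the `Building` total from 650 to 1550, and every report built on that total would inherit the mistake.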
Summary
• Data linking is an established practice with many business benefits.
• However, there are many tar pits for automated reasoning.
• Consequently, it is important to:
• Determine precisely what kind of automated reasoning is required over what datasets
• Cost realistically and evaluate ROI
• Employ a formal foundation based on established mathematics, e.g., IDEAS
• The DCIO wants to help.