
Outline


Presentation Transcript


  1. Outline • EDEN Project Overview • InfoSleuth in a microsecond • The Ontology in InfoSleuth • Value Mapping and the Environmental Data Registry • Virtual demo

  2. Environmental Data Exchange Network • The challenge: • Acquisition, use and dissemination of environmental information is of increasing strategic importance to EPA, DOD, DOE, and EEA • EDEN is an application of MCC's InfoSleuth technology • Employs intelligent agent technology through the Internet to conduct concept-based searches of heterogeneous, distributed information • The EDEN Project demonstrates how organizations can save time and money: • Provides easy access over intranet or the Internet • Enables users to access information from multiple sources • Simplifies the exchange and sharing of data • Reduces the reporting burden • Brings information together for presentation and analysis

  3. Common Set of Requirements • Reduce the reporting burden imposed by the parties on each other • Share the best available and most timely information • Enable users to access information from multiple sources • Coordinate only the common vocabulary – not the end use of information resources; focus on the inputs, with each participant individually interpreting and communicating outputs

  4. Pilot Databases • CERCLIS-3: EPA Superfund (Oracle, VA) • ITT: EPA Remediation Technology (MS-Access, TX) • HazDat: EPA Hazardous Substances (Sybase, GA) • ERPIMS: Air Force Env. Restoration (Oracle, TX) • EEA: Basel Convention (MS-Access, TX) • IRDMIS: Army Installation Restoration (Oracle, MD) • DOE INEEL (Oracle, ID) • DOE ORNL (Oracle, TN)

  5. InfoSleuth • System of “competent” agents for dynamic, scalable (SQL-based) access to heterogeneous distributed information sources • Ontology-based information management • Advertise-discover paradigm supported by brokering over semantic constraints
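The advertise-discover paradigm on this slide can be sketched roughly as follows. This is a minimal illustration, not InfoSleuth's actual broker: the `Broker` and `Advertisement` classes, the advertisement format (an ontology class plus per-slot sets of served values), and the agent names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Advertisement:
    agent: str            # hypothetical agent name
    ontology_class: str   # ontology class the agent can answer queries about
    # slot name -> set of values the agent serves (assumed constraint format)
    constraints: dict = field(default_factory=dict)


class Broker:
    def __init__(self):
        self.ads = []

    def advertise(self, ad):
        self.ads.append(ad)

    def discover(self, ontology_class, **wanted):
        """Return agents whose advertised constraints do not rule out the request."""
        matches = []
        for ad in self.ads:
            if ad.ontology_class != ontology_class:
                continue
            # An agent with no constraint on a slot is assumed to serve all values.
            if all(slot not in ad.constraints or value in ad.constraints[slot]
                   for slot, value in wanted.items()):
                matches.append(ad.agent)
        return matches


broker = Broker()
broker.advertise(Advertisement("texas_sites_agent", "site", {"state": {"TX"}}))
broker.advertise(Advertisement("all_sites_agent", "site"))
broker.advertise(Advertisement("chem_agent", "chemical_substance"))

found = broker.discover("site", state="MD")
```

Here a request for Maryland sites skips the Texas-only agent because its advertised constraint excludes the requested value, which is the sense in which brokering works over semantic constraints rather than agent names.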

  6. InfoSleuth System • Java-based agents • Knowledge Query and Manipulation Language (KQML) message layer provides speech-act agent interface • Agent conversation shell provides structure for KQML messages • Open Knowledge Base Connectivity (OKBC) language provides semantic communication layer • Brokering reasoning provided by Logical Data Language, LDL++

  7. More InfoSleuth Agents • JDBC Resource agents translate between application domain ontology and database schemata • Multi-resource query agent uses either LDL++ or Oracle to support query decomposition and result recomposition • Value mapper translates to/from canonical value domains • Text agent supports ontology-based query • Task execution agent manages CLIPS rule base for task planning and subscription maintenance • Sentinel and Deviation detection agents cooperate to detect complex event patterns

  8. Basic InfoSleuth Application Recipe • 6 cups ontology • 3 cups resource agent configuration • 1-3 cups user interface development • Lightly brown the multi-resource query agent • Pour in other agents out of the box • Stir • Serve • ... • add or remove resource agents as desired • add other functionality with more configuration effort

  9. A Distributed Query [Diagram components: Users, Viewer Applets, user agents, broker agents, ontology agent, valuemap agent, task agents, multi-query agents, resource agents with mapping info, Resources (SQL, text), Refined Data]

  10. Purpose of The Ontology in InfoSleuth • To describe the domain with minimal ambiguity • the structure defines the domain • documentation strings • To be the integration hub for the DB schema • query relaxation through the taxonomy • vertical fragmentation • multi-resource path expressions • To provide the language of the queries and the language of expression of the results • value mapping

  11. Expressing the Ontology • OKBC (Open Knowledge Base Connectivity): a standard for Knowledge Representation • Classes, Slots, Facets: • (class Observed_Contamination) • (template-slot-of analysis_method Observed_Contamination) • (template-facet-value :VALUE-TYPE analysis_method Observed_Contamination :STRING) • (template-slot-of site Observed_Contamination) • (template-facet-value :VALUE-TYPE site Observed_Contamination Eden_Site) • Subclass and Instance-Of Links

  12. Ontology Features • Value Mapping Modelling [Diagram labels: Quantity, Unit Of Measure, canonical unit, Person, height, Distance, Meter, Foot, unit, data-type STRING]
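The quantity and unit labels on this slide suggest that a measured value such as a person's height is stored against a canonical unit and converted on demand. A minimal sketch of that idea follows, assuming Meter is the canonical unit for Distance (the slide does not say which unit is canonical). The table and function names are hypothetical; 0.3048 is the exact definition of the international foot in meters.

```python
# Canonical unit per quantity (assumption: Meter is canonical for Distance).
CANONICAL_UNIT = {"Distance": "Meter"}

# Multiplicative factors that convert a unit into the canonical unit.
TO_CANONICAL = {
    ("Distance", "Meter"): 1.0,
    ("Distance", "Foot"): 0.3048,
}


def to_canonical(quantity, value, unit):
    """Convert a value in any known unit into the quantity's canonical unit."""
    return value * TO_CANONICAL[(quantity, unit)], CANONICAL_UNIT[quantity]


def from_canonical(quantity, value, unit):
    """Convert a canonical value back into the requested unit."""
    return value / TO_CANONICAL[(quantity, unit)]


# A height recorded in feet, normalized to the canonical unit.
height_m, unit = to_canonical("Distance", 6.0, "Foot")
```

Because every unit only needs a factor to the canonical unit, adding a new unit means adding one table entry rather than a conversion to every other unit.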

  13. Value mapping requirements • Translate terms in queries • Allow users to choose a coding scheme for querying • Query each database in terms of its own coding scheme • Translate results of queries • Facilitate merging of data from different sources • Display results according to user preference

  14. Value mapping and the ontology • A class has one or more slots • Each slot has a conceptual domain name • Each slot has a preferred value domain • Resource Agents must advertise in the preferred value domain • possibly translating to/from a different value domain • Users may query and view data in a different value domain • User Agent handles translation to/from preferred value domain
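One way to read this slide is that the preferred value domain acts as a hub: each other value domain only needs a mapping to and from the preferred one. The sketch below illustrates that reading under stated assumptions. The domain names (`state_name`, `usps_code`, `fips_code`), the choice of `usps_code` as the preferred domain, and the function names are all hypothetical.

```python
PREFERRED = "usps_code"  # assumed preferred value domain for a "state" slot

# Each non-preferred value domain defines only a mapping to the preferred one.
TO_PREFERRED = {
    "state_name": {"Texas": "TX", "Maryland": "MD"},
    "fips_code": {"48": "TX", "24": "MD"},
}


def to_preferred(value, domain):
    """User agent or resource agent side: translate into the preferred domain."""
    if domain == PREFERRED:
        return value
    return TO_PREFERRED[domain][value]


def from_preferred(value, domain):
    """Translate a preferred-domain value into another value domain."""
    if domain == PREFERRED:
        return value
    inverse = {pref: actual for actual, pref in TO_PREFERRED[domain].items()}
    return inverse[value]


def translate(value, source, target):
    """Translate between any two value domains via the preferred domain."""
    return from_preferred(to_preferred(value, source), target)


# A user thinks in state names; a resource stores FIPS codes.
resource_value = translate("Texas", "state_name", "fips_code")
```

With this layout, a user agent and a resource agent never need to know each other's coding scheme; both translate only to and from the preferred value domain.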

  15. EDR contents We use a specialized resource agent (map agent) to access the EDR

  16. Additions to EDR • Downloaded files of permissible values for CAS number and Chemical name (Merck index) from EPA site • Assigned value meanings • Created value domains for CAS code, CAS padded, ycode; loaded permissible values • Added 3 extra chemical names because Merck index file was incomplete

  17. Linking EDR to EDEN ontology

  18. View of the EDR CREATE VIEW edr_map (conceptual_domain, coding_scheme, preferred_value, actual_value) AS SELECT emc.conceptual_domain, emc.value_domain, pref.pv_nm, act.pv_nm FROM edr_map_class emc, cd_vm_assoc a, permissible_value pref, permissible_value act WHERE a.cd_id = emc.cd_id AND a.vm_id = act.vm_id AND a.vm_id = pref.vm_id AND emc.vd_id = act.vd_id AND emc.pd_id = pref.vd_id

  19. Query Processing

  20. Query translation SELECT name FROM site WHERE state = ‘Texas’ translated to SELECT name FROM site WHERE state = ‘TX’

  21. Result translation [Figure: a query result table before and after translation]
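Query translation and result translation (slides 20 and 21) can be sketched together as a round trip: the user's constant is rewritten into the database's coding scheme before the query runs, and the returned values are rewritten back afterwards. The example below is only an illustration using an in-memory SQLite table. The `site` rows, the mapping dictionaries, and the function name are made up for the example.

```python
import sqlite3

# Illustrative mapping between the user's coding scheme and the database's.
NAME_TO_CODE = {"Texas": "TX", "Maryland": "MD"}
CODE_TO_NAME = {code: name for name, code in NAME_TO_CODE.items()}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE site (name TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO site VALUES (?, ?)",
    [("Site A", "TX"), ("Site B", "MD"), ("Site C", "TX")],  # made-up rows
)


def query_sites_by_state(user_value):
    # Query translation: 'Texas' becomes 'TX' before the database sees it.
    db_value = NAME_TO_CODE[user_value]
    rows = conn.execute(
        "SELECT name, state FROM site WHERE state = ? ORDER BY name",
        (db_value,),
    ).fetchall()
    # Result translation: codes in the result go back to the user's scheme.
    return [(name, CODE_TO_NAME[state]) for name, state in rows]


results = query_sites_by_state("Texas")
```

The user never sees the stored code 'TX', and the database is never queried with 'Texas'.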

  22. EDR lookup SELECT preferred_value FROM edr_map WHERE actual_value = ‘Benzene’ AND coding_scheme = ‘chemical_name’ AND conceptual_domain = ‘chemical_substance’
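The lookup on this slide can be tried out against a small stand-in for `edr_map`. In the sketch below, an in-memory SQLite table replaces the real EDR view, the rows are illustrative, and the assumption that the preferred value for a chemical is its CAS number is ours, not the slide's. The `edr_lookup` helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in table with the columns that the slide's query refers to.
conn.execute(
    """CREATE TABLE edr_map (
           conceptual_domain TEXT,
           coding_scheme     TEXT,
           actual_value      TEXT,
           preferred_value   TEXT)"""
)
conn.executemany(
    "INSERT INTO edr_map VALUES (?, ?, ?, ?)",
    [
        ("chemical_substance", "chemical_name", "Benzene", "71-43-2"),
        ("chemical_substance", "chemical_name", "Toluene", "108-88-3"),
        ("state", "state_name", "Texas", "TX"),
    ],
)


def edr_lookup(actual_value, coding_scheme, conceptual_domain):
    """Return the preferred value for a database value, or None if unmapped."""
    row = conn.execute(
        "SELECT preferred_value FROM edr_map "
        "WHERE actual_value = ? AND coding_scheme = ? AND conceptual_domain = ?",
        (actual_value, coding_scheme, conceptual_domain),
    ).fetchone()
    return row[0] if row else None


preferred = edr_lookup("Benzene", "chemical_name", "chemical_substance")
missing = edr_lookup("BENZENE", "chemical_name", "chemical_substance")
```

The second call returns no match because the comparison is exact, so a value that differs only in case is not found.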

  23. Outstanding issues • No match in EDR for database value • differences in case (‘Texas’, ‘TEXAS’) • CAS number format (dashes, leading zeros) • word order (‘n-Propyl benzene’, ‘Benzene, n-Propyl’) • bad data • Functional mapping needed • Approximate string matching
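The mismatches listed on this slide could be handled by normalizing values before the lookup and falling back to approximate matching. The following is a rough sketch of such functional mappings, not something the slides describe as implemented. All function names and the 0.8 similarity cutoff are assumptions; `difflib` is Python's standard-library matcher.

```python
import difflib


def normalize_case(value):
    """Make 'Texas' and 'TEXAS' compare equal."""
    return value.casefold()


def normalize_cas(value):
    """Strip dashes and leading zeros, then re-insert dashes CAS-style."""
    digits = value.replace("-", "").lstrip("0")
    # CAS numbers end with a two-digit group and a single check digit.
    return f"{digits[:-3]}-{digits[-3:-1]}-{digits[-1]}"


def normalize_word_order(name):
    """Turn inverted names like 'Benzene, n-Propyl' into 'n-propyl benzene'."""
    if "," in name:
        head, tail = name.split(",", 1)
        name = f"{tail.strip()} {head.strip()}"
    return name.casefold()


def closest_value(value, known_values, cutoff=0.8):
    """Approximate string matching: best candidate above the cutoff, else None."""
    matches = difflib.get_close_matches(value, known_values, n=1, cutoff=cutoff)
    return matches[0] if matches else None


padded = normalize_cas("0000071432")
same_name = normalize_word_order("Benzene, n-Propyl") == normalize_word_order("n-Propyl benzene")
guess = closest_value("Benzine", ["Benzene", "Toluene"])
```

Approximate matching can also produce wrong matches on bad data, so a real deployment would likely want the cutoff tuned and low-confidence matches reviewed.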

  24. DEMO
