
Ontological Framework for Enabling Free-Form Search in Scientific Discovery

Leveraging web semantics for user-friendly free-form search in scientific data, using NLP and ontology model matching. Explore system components and performance benefits in this innovative framework.


Presentation Transcript


  1. Ontological Framework for Enabling Free-Form Search in Scientific Discovery • Chaitali Gupta, Madhusudhan Govindaraju • Grid Computing Research Laboratory, SUNY Binghamton • E-science Microsoft Workshop 2008: Semantics Birds of a Feather Session

  2. Motivation • Most computer users today do not have to write programs • most end users of Grid and scientific data sets should likewise be shielded from low-level details • Web search engines search billions of web pages • use Natural Language Processing (NLP) and Information Retrieval (IR) technologies • return many links for any given search • XML-based technology and ontologies can be used to categorize and organize information • in a machine-readable and machine-understandable manner • retrieve specific information from Grid/scientific services.

  3. Project Vision • Our vision is that Web semantics can be leveraged to build search-engine-like interfaces even for Grid/scientific application metadata. • Abstract away the fundamental complexity of XML-based service specifications and toolkits • Add a search box on portal dashboards • Automatically convert queries to job description specification formats

  4. Related Work • MDS. • WSRF-compliant service to publish/retrieve resource information • Condor ClassAds. • Combines schema, data, and query in a simple but powerful query specification language. • Condor Gangmatching. • Overcomes the bilateral matching limitations of ClassAds.

  5. Comparing with SPARQL

  6. Scope of Free-Form Queries • The problem of processing and acting upon arbitrary English is extremely challenging • actively addressed in the AI community • uses many techniques from NLP and the Semantic Web • Scope of our work is therefore limited • cannot accept arbitrary free-form queries • designed to accept a limited form of English with a vocabulary taken from the ontology.

  7. Example queries for New York State Grid (NYSGrid) • List all sites of NYSGrid • All sites of NYSGrid with Xeon processors • Processor configuration of nodes at Binghamton site of NYSGrid • All machine names in NYSGrid with CPU speed greater than 2.0 GHz • Status of job ID 117 running on NYSGrid • Names of 16 free nodes on the NYSGrid with at least 4 GB of memory • List all nodes of NYSGrid having CPU speed greater than 1 GHz and less than 4 GHz
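
The numeric conditions in queries like these can be pulled out with simple pattern matching. The sketch below is only an illustration of that idea, not the authors' Query Processor: the comparison phrases, the unit list, and the function name `extract_constraints` are all assumptions made for this example.

```python
import re

# Hypothetical comparison phrases; the slides do not list the supported vocabulary.
COMPARATORS = {
    "greater than": ">",
    "less than": "<",
    "at least": ">=",
    "at most": "<=",
}

# A comparison phrase, a number, and an optional unit from a small assumed list,
# e.g. "greater than 2.0GHz", "less than 4 Ghz", "at least 4GB".
CONSTRAINT = re.compile(
    r"(greater than|less than|at least|at most)\s+(\d+(?:\.\d+)?)\s*(ghz|mhz|gb|mb)?",
    re.IGNORECASE,
)

def extract_constraints(query):
    """Return (operator, value, unit) tuples found in a free-form query."""
    found = []
    for phrase, number, unit in CONSTRAINT.findall(query):
        # Units are lower-cased so that "Ghz" and "GHz" compare equal.
        found.append((COMPARATORS[phrase.lower()], float(number), unit.lower()))
    return found

if __name__ == "__main__":
    print(extract_constraints(
        "List all nodes of NYSGrid having CPU speed greater than 1Ghz and less than 4 Ghz"
    ))
```

A query with no numeric condition, such as "List all sites of NYSGrid", simply yields an empty list, so the remaining words can be handed to the vocabulary matching stages.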

  8. Example ontology model

  9. System Components • WSDL Processor • User Query Interface • Query Processor • Match Processor • Ontology Matcher • Dictionary Matcher • direct matching, stripped matching, hypernyms, hyponyms • Lexicon • how people use words, etc. • Relevance Checker • Glossary, input and output parameters of the Web service
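
One way to picture how the matching stages might cooperate is a cascade that tries the cheapest match first. This is a minimal sketch under stated assumptions: the vocabulary, the related-word table (a stand-in for hypernym/hyponym lookups in a lexical resource), and the names `match_term` and `strip_word` are hypothetical and do not come from the slides.

```python
# Hypothetical ontology vocabulary; the real system would read it from its ontologies.
ONTOLOGY_TERMS = {"cpu", "memory", "storage", "job", "site", "node"}

# Toy table of related words, standing in for hypernym/hyponym lookups.
RELATED_TERMS = {"processor": "cpu", "machine": "node", "ram": "memory"}

def strip_word(word):
    """Very rough plural stripping; a stand-in for real stemming."""
    if word.endswith("s") and len(word) > 3:
        return word[:-1]
    return word

def match_term(word):
    """Return (ontology_term, stage), or (None, 'unmatched') if nothing fits."""
    w = word.lower()
    if w in ONTOLOGY_TERMS:                 # stage 1: direct match
        return w, "direct"
    stripped = strip_word(w)
    if stripped in ONTOLOGY_TERMS:          # stage 2: stripped match
        return stripped, "stripped"
    for candidate in (w, stripped):         # stage 3: related-word lookup
        if candidate in RELATED_TERMS:
            return RELATED_TERMS[candidate], "dictionary"
    return None, "unmatched"

if __name__ == "__main__":
    for token in ["CPU", "nodes", "processors", "status"]:
        print(token, match_term(token))
```

Ordering the stages from cheapest to most expensive means a word that is already in the ontology vocabulary never triggers the dictionary lookup.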

  10. Example query that lights up the model • The Ontology Matcher retrieves the ontologies from the ontology repository and matches them with the user query. • Ontologies built in OWL for storing the vocabularies • concepts include “CPU”, “memory”, “storage”, “job”, etc. • use Jena to process OWL models/statements • <subject, predicate, object>
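
RDF statements are triples in subject, predicate, object order, and matching a query against a model amounts to finding statements that fit a pattern. The sketch below illustrates that with a plain Python list rather than Jena or OWL; every resource and property name in it (`node01`, `locatedAt`, `hasCpuSpeedGHz`, and so on) is invented for the example and is not taken from the authors' ontology.

```python
# Toy statements in (subject, predicate, object) order. All names are made up.
TRIPLES = [
    ("node01", "locatedAt", "Binghamton"),
    ("node01", "hasCpuSpeedGHz", 2.4),
    ("node02", "locatedAt", "Binghamton"),
    ("node02", "hasCpuSpeedGHz", 1.8),
    ("Binghamton", "partOf", "NYSGrid"),
]

def find(subject=None, predicate=None, obj=None):
    """Return statements matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in TRIPLES
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]

def fast_nodes(min_ghz):
    """Names of nodes whose recorded CPU speed exceeds min_ghz."""
    return sorted(s for (s, p, o) in find(predicate="hasCpuSpeedGHz") if o > min_ghz)

if __name__ == "__main__":
    print(find(predicate="locatedAt", obj="Binghamton"))
    print(fast_nodes(2.0))
```

A query such as "machine names with CPU speed greater than 2.0 GHz" then reduces to one pattern lookup plus a numeric filter, which is roughly what `fast_nodes` does here.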

  11. System Components • Queries resolved by the Ontology Matcher alone show, on average, a 95% to 96% performance benefit over queries that require both the Ontology Matcher and the Dictionary Matcher.

  12. Performance of System Components • Execution time taken by the major components

  13. System Components • Recall and precision increase when domain-dependent ontologies are considered.
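
For reference, precision is the fraction of returned results that are relevant, and recall is the fraction of relevant results that were returned. The helper below is a generic sketch of those standard definitions; the function name and the sample data are illustrative and are not measurements from this work.

```python
def precision_recall(retrieved, relevant):
    """Compute (precision, recall) for one query from two collections of result IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Guard against empty sets so the ratios are always defined.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

if __name__ == "__main__":
    # Illustrative IDs only: four results returned, three relevant, two in common.
    print(precision_recall(["a", "b", "c", "d"], ["a", "b", "e"]))
```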

  14. Research Challenges • Design algorithms to automatically infer the context of user queries and map them to an appropriate set of Grid and scientific services. • Automatically extend and update domain knowledge using Semantic Web techniques and WordNet. • Build a feedback loop for cases that don’t work • Enable construction of simple workflows • multiple Grid services may be needed for a query • merging results from different services
