Semantic Search Agent System applying Semantic Web Techniques. 2004.10.21 Jung-Jin Yang Intelligent Distributed Information System (IDIS) Lab. School of Computer Science & Information Engineering The Catholic University of Korea jungjin@catholic.ac.kr http://idis.catholic.ac.kr/jungjin.
Agenda • Semantic Search • Ontology • Ontology-based Semantic Search Agent • OnSSA • Conclusion
Searching Semantically • How to handle problems in searching for information? • Time-intensive: e.g., for the query “disease and remedy”, a user cannot find a relevant result • What can be the problem: 1. the query is too ambiguous 2. the terms used do not match the repository 3. the results are not properly ranked …
Moreover • Cognitive demand on users in a professional domain: e.g., for the query “hearing deficit” when searching medical literature through the MEDLINE DB, a user cannot find adequate results • What can be the problem: 1. the query is too ambiguous 2. the terms used do not match the repository 3. the results are not properly ranked 4. the user lacks knowledge of professional terms …
Semantic Search • An ontology introduces new possibilities for query/answering over an information repository: cooperative answering • User: “I need info about deafness” • Tip: There are 30330 documents for the disease, BUT only 23 with relevant gene names • DiseaseName(x) and gene(x,Caused)
Semantic Search • Develop an intelligent agent system that produces more precise search results • Combine a search engine and an ontology: corpus-based & concept-based • Support continual improvement of information retrieval according to its usage
Activities in Searching for Information (flowchart) • User's information need → Query → Information repository • Decision 1: Does a relevant resource exist? • Decision 2: Is it found by the machine agent? • Decision 3: Is it top-ranked? • Three times yes: the user has found a resource relevant for the query • Each "no" leads either to refinement of the query or to the outcome "User's request is not satisfied"
Challenges (annotations on the same flowchart) • The query reflects the user's need! • The information repository contains resources relevant to the user's need! • Resources are annotated properly! • Resources are ranked according to their relevance to the user's need! • Query refinement closes the gap between the query and the user's information need!
Agenda • Semantic Search • Ontology • Ontology-based Semantic Search Agent • OnSSA • Conclusion
Semantic Web Modeling (figure by Jim Hendler at Semantic Web Conf. 2003) • Formalism labels: Graph, Labeled graph, Graph + limited logic, Logic • Model labels: Data Dictionary, Data Schema, Ontology • Language labels: RDF, RDF Schema, OWL, KIF?
Ontology • Philosophy: A systematic account of existence • An ontology is a formal conceptualization of the world. (T. R. Gruber) • An ontology specifies a set of constraints, which declare what should necessarily hold in any possible world. • An ontological commitment is an agreement to use a vocabulary (i.e., ask queries and make assertions) in a way that is consistent (but not complete) with respect to the theory specified by an ontology: Knowledge Sharing • An ontology specifies a rich description of the following, relevant to a particular domain or area of interest: • Terminology • Concepts • Relationships between the concepts • Rules
Upper-, Mid-level and Lower-Ontologies • An upper-ontology defines very broad, universal Classes and properties • Example: Cyc Upper Ontology • http://www.opencyc.org • A mid-level ontology is an upper ontology for a specific domain • A lower-ontology is an ontology for a specific domain, with specific Classes and properties. • You can merge your ontology into an umbrella, upper-level ontology by defining your ontology's root class as a subClassOf a class in the upper-ontology.
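The merge described in the last bullet can be sketched in a few lines of Python. This is a minimal illustration, not the deck's implementation: every class name (Thing, Event, MedicalCondition, Disease, Deafness) and both helper functions are hypothetical. Only the idea of attaching a domain root class to an upper-ontology class through subClassOf comes from the slide.

```python
# Each ontology is modelled as a dict: class name -> set of direct superclasses.
# All class names below are hypothetical examples.

def merge(upper, lower, lower_root, upper_class):
    """Merge two ontologies by declaring lower_root subClassOf upper_class."""
    merged = {cls: set(parents) for cls, parents in upper.items()}
    for cls, parents in lower.items():
        merged.setdefault(cls, set()).update(parents)
    merged.setdefault(lower_root, set()).add(upper_class)
    return merged

def superclasses(cls, ontology):
    """Return all ancestors of cls by following subClassOf links transitively."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in ontology.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

upper = {"Event": {"Thing"}, "PhysicalObject": {"Thing"}}
domain = {"Disease": {"MedicalCondition"}, "Deafness": {"Disease"}}

merged = merge(upper, domain, lower_root="MedicalCondition", upper_class="Event")
ancestors = superclasses("Deafness", merged)
```

After the merge, a domain class such as Deafness inherits the upper-ontology classes as ancestors, which is what lets tools that only know the upper ontology still handle domain terms.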
Knowledge Representation • Representation of knowledge • Description of world of interests • Usable by machines to make conclusions about that world • Intelligent System • Computational system • Uses an explicitly represented store of knowledge • To reason about its goals, environment, other agents, itself • Expressiveness vs. tractability tradeoff • How to express what we know • How to reason with what we express
Processing Knowledge = “Reasoning” • Representation of Knowledge • Access represented knowledge and process it. • Access alone is, in general, insufficient • Implicit knowledge has to be made explicit: deduction methods • The results should depend only on the semantics … • And not on accidental syntactic differences in representations
Ontology Modeling & Technologies • A systematic account of existence of knowledge and intelligence for a particular domain • Ontology modeling using appropriate Tools and Language • e.g., OntoEdit, OilEd, Protégé, VOM (Visual Ontology Modeler) • e.g., XML, RDF, OWL • Reasoning capabilities: Description Logics • Provide theories and systems for expressing structured information and for accessing and reasoning with it in a principled way. • Ontology query/update for ontology repositories
Ontology Modeling (Protégé 2000): http://protege.stanford.edu
Remark • Ontology • Standards • Integration: Semantic Integration • A language for writing data • Reaching out onto the Web • Ontology Modeling • No one correct way to model a domain • Iterative ontology development process • Natural correspondence to objects and relationships in your domain of interest.
Agenda • Semantic Search • Ontology • Ontology-based Semantic Search Agent • OnSSA • Conclusion
Architecture of Intelligent Information Agent An agent is anything that can be viewed as perceiving its environment through sensors and acting upon that environment through effectors. (by Russell & Norvig) (by Enrico Franconi, Univ. of Manchester, UK)
Agenda • Semantic Search • Ontology • Ontology-based Semantic Search Agent • OnSSA • Conclusion
OnSSA: Ontology-based Semantic Search Agent • Requirements: 1. Users are reluctant or unable to provide explicit feedback about the "quality" of the ontology => use implicit relevance feedback: suggested lists of broader/narrower terms 2. There are many types of related information, represented in different forms => distributed Information Agents with different search strategies
OnSSA The System (architecture diagram) • Components shown: User, GUI, User query, Consulting Agent, Query Engine, Query Models & Ontology, Search engine, IR Agent, Information Agents 1 to 4, Ranking, Result Ranking, Ensemble, Mining Engine, Search/requery, Output, Search Result • Sources shown: PubMed, OMIM, HUGO
Query management: What is a user searching for? • OnSSA Consulting Agent: 1. Query Refinement 2. Ranking Management • Note: A user's query is just an approximation of the user's often ill-defined information need [Saracevic75]
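Query refinement with suggested broader/narrower terms can be illustrated as follows. This is a minimal sketch, not the Consulting Agent's actual logic: the function name, the thesaurus layout and the two thresholds are assumptions. The terms are taken from the UMLS facts shown later in the deck, and 30330 is the hit count from the deafness example.

```python
# Hypothetical thesaurus fragment built from terms that appear in the deck.
THESAURUS = {
    "DEAFNESS": {
        "narrower": [
            "Total_transitory_deafness",
            "Middle_ear_deafness",
            "Bilateral_Deafness",
        ],
        "broader": ["Disability_NOS"],
    }
}

def suggest_refinements(term, hit_count, thesaurus, too_many=1000, too_few=10):
    """Suggest narrower terms for oversized result sets, broader terms for tiny ones.

    The thresholds too_many and too_few are illustrative assumptions.
    """
    entry = thesaurus.get(term.upper(), {})
    if hit_count > too_many:
        return list(entry.get("narrower", []))
    if hit_count < too_few:
        return list(entry.get("broader", []))
    return []

# A query with 30330 hits is too broad, so narrower terms are offered.
suggestions = suggest_refinements("deafness", 30330, THESAURUS)
```

Which suggestion the user then picks is a form of implicit relevance feedback: no explicit rating of the ontology is needed.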
QueryModels Architecture • The QueryModel is a concept-based rule engine • It consists of Jena, SweetJess and Jess • Diagram layers: Logic (Jess), Translation (SweetJess), Translation (Jena), Restrict (Jena) • Diagram inputs and formats: UMLS, RuleML, ontology, RDF + rdfschema, XML + ns + xmlschema
Jena • Stores RDF data, represents RDF graphs and writes them in N-Triples format • Loads a DAML+OIL ontology in Java using Jena • Navigates an RDF graph within Jena using RDQL • Figures: Jena Architecture, RDQL Grammar
Jess • is a rule engine and scripting environment written entirely in Java • uses the Rete algorithm to process rules, a very efficient mechanism for solving the difficult many-to-many matching problem
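Navigating an RDF graph means matching triple patterns that contain variables. The sketch below imitates that idea in plain Python and does not use Jena or RDQL. The function `match`, the predicate names and the tuple layout are assumptions made for illustration; the terms come from the UMLS facts shown later in the deck.

```python
# An RDF graph reduced to (subject, predicate, object) tuples.
TRIPLES = [
    ("DEAFNESS", "narrower", "Middle_ear_deafness"),
    ("DEAFNESS", "narrower", "Bilateral_Deafness"),
    ("DEAFNESS", "broader", "Disability_NOS"),
    ("DEAFNESS", "otherRelation", "Lipreading"),
]

def match(triples, pattern):
    """Return one variable binding per triple that matches the pattern.

    Pattern slots starting with '?' are variables; other slots must match exactly.
    """
    results = []
    for triple in triples:
        binding = {}
        for slot, value in zip(pattern, triple):
            if slot.startswith("?"):
                # A variable used twice must bind to the same value both times.
                if binding.setdefault(slot, value) != value:
                    break
            elif slot != value:
                break
        else:
            results.append(binding)
    return results

# "Which terms are narrower than DEAFNESS?"
narrower = match(TRIPLES, ("DEAFNESS", "narrower", "?x"))
```

A query language such as RDQL expresses the same kind of pattern declaratively and leaves the matching to the toolkit.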
SweetJess • is a new system for Semantic Web rules to be used in Jess • provides translation (DamlRuleML, RuleML, JessRule) • Provided by UMBC
UMLS (Unified Medical Language System) • What is it? • A project that develops and distributes multi-purpose, electronic "Knowledge Sources" and associated lexical programs
OnSSA The QueryModel (diagram) • Labels: Corpus-based (UMLS), Concept-based • Components shown: GUI, Search Engine, Consulting Agent, MetaRule, SweetJess, Jena, Jess, Rule, Ontology
QueryModel Processing

Diagram components (repeated across animation frames): GUI (UserInterface), UMLS Search Engine, QueryModel, MetaRule, Jena (Jena Semantic Web Toolkit), Jess, SweetJess, Rule, UMLS, Ontology

Rule in RuleML:

    <?xml version="1.0" encoding="UTF-8"?>
    <rulebase xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://userpages.umbc.edu/~mgandh1/2002/06/RuleML/ruleml-sclp-prag-v13.xsd" direction="forward">
      <imp>
        <_rlab> <ind>rule1</ind> </_rlab>
        <_body>
          <and>
            <atom> <_opr> <rel>GeneDisease</rel> </_opr> <var>type</var> <var>query</var> </atom>
            <atom> <_opr> <rel>UserInput</rel> </_opr> <var>query</var> </atom>
          </and>
        </_body>
        <_head>
          <atom> <_opr> <rel>Result</rel> </_opr> <var>query</var> <ind>gene</ind> </atom>
        </_head>
      </imp>
    </rulebase>

The same rule in Jess:

    (reset)
    (defrule rule1
      (GeneDisease ?type ?query)
      (UserInput ?query)
      =>
      (assert (Result ?query gene))
    )

User input: deafness ("Let’s Go!")

Script passed to Jess: (deffacts data(http://idis… (reset) (defrule rule1… (run)

Facts retrieved from UMLS for the query:

    (deffacts data
      (http://idiscatholicackr/umlsRetrieveNarrower DEAFNESS Total_transitory_deafness)
      (http://idiscatholicackr/umlsRetrieveNarrower DEAFNESS Middle_ear_deafness)
      (http://idiscatholicackr/umlsRetrieveNarrower DEAFNESS Bilateral_Deafness)
      (http://idiscatholicackr/umlsRetrieveNarrower DEAFNESS Deafness_permanent_partial)
      (http://idiscatholicackr/umlsRetrieveOtherRelation DEAFNESS Cockayne_Syndrome)
      . . .
      (http://idiscatholicackr/umlsRetrieveOtherRelation DEAFNESS Lipreading)
      (http://idiscatholicackr/umlsRetrieveNarrower DEAFNESS Hearing_Loss_Sensorineural)
      (http://idiscatholicackr/umlsRetrieveBroader DEAFNESS Disability_NOS)
      (UserInput DEAFNESS)
    )

Outcome: New fact & ReQuery
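To make the effect of rule1 concrete, the sketch below evaluates it with naive pattern matching in Python. It is not Jess and does not implement the Rete algorithm; it only shows which new fact the rule derives. The `GeneDisease` fact is a hypothetical addition, since the fact list above shows no fact of that kind, and the helper names `unify` and `fire` are likewise assumptions.

```python
def unify(pattern, fact, bindings):
    """Extend bindings so that pattern equals fact, or return None on mismatch."""
    if len(pattern) != len(fact):
        return None
    extended = dict(bindings)
    for slot, value in zip(pattern, fact):
        if slot.startswith("?"):
            if extended.setdefault(slot, value) != value:
                return None
        elif slot != value:
            return None
    return extended

def fire(body, head, facts):
    """Return the facts derived by one rule: all body patterns must match."""
    candidates = [{}]
    for pattern in body:
        next_candidates = []
        for bindings in candidates:
            for fact in facts:
                extended = unify(pattern, fact, bindings)
                if extended is not None:
                    next_candidates.append(extended)
        candidates = next_candidates
    # Substitute bound variables into the head; constants stay as they are.
    return {tuple(b.get(term, term) for term in head) for b in candidates}

# rule1 from the slide: GeneDisease(?type, ?query) and UserInput(?query) => Result(?query, gene)
RULE1_BODY = [("GeneDisease", "?type", "?query"), ("UserInput", "?query")]
RULE1_HEAD = ("Result", "?query", "gene")

facts = {
    ("GeneDisease", "disease", "DEAFNESS"),  # hypothetical fact, not in the slide
    ("UserInput", "DEAFNESS"),
    ("umlsRetrieveBroader", "DEAFNESS", "Disability_NOS"),
}

new_facts = fire(RULE1_BODY, RULE1_HEAD, facts)
```

The derived `Result` fact corresponds to the "New fact" in the slide, which then drives the requery. A production rule engine avoids re-scanning all facts for every pattern, which is the many-to-many matching problem that Rete addresses.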
Introduction to the Databases • MEDLINE • A database of indexed journal citations and abstracts. • PubMed • A service of the National Library of Medicine that includes over 14 million citations for biomedical articles back to the 1950s. These citations are from MEDLINE and additional life science journals. • OMIM • Online Mendelian Inheritance in Man is a database of human genes and genetic disorders. • HUGO • Human gene nomenclature
OnSSA The System
OnSSA Information Agents
OnSSA Agent Ontology
Agenda • Semantic Search • Ontology • OnSSA • Conclusion
Conclusion • Results of OnSSA in publications • The marriage of Semantic Web and Agent technology is promising for more intelligent search strategies
Future: Agent-based Service Ontology Structure (diagram) • Components shown: Web Service (WS), Agent Platform, Other Agent Space Platform, Other Applications, Other Agents, Gateways, Agent Servers, Server API, Ontology Repository
Conclusion • Semantic Web + Web Service + Agent Technology • The real benefit is yet to come, or is already here…