The Use of Machine-Generated Ontologies in Dynamic Information Seeking

Giovanni Modica, Avigdor Gal, Hasan M. Jamil
CoopIS’2001, Trento, Italy

Motivating example

Preliminaries
• Definition: An ontology is an explicit representation of a conceptualization (Gruber, 1993).
• Conjecture I: Applications in a given domain base their information exchange on some (shared) underlying ontology.
• Observation: Applications in a given domain use different ontology representations.
• Conjecture II: Given an application A that utilizes an ontology representation O_A, and an ontology O, there exists an invertible mapping f_A such that f_A(O_A) = O.

Problem description
• Given two applications A and B, such that A utilizes an ontology representation O_A and B utilizes an ontology representation O_B, introduce a mapping f_BA such that f_BA(O_B) = O_A.
• In a perfect world:
  • O is known.
  • f_A is known.
  • f_B is known.
  • Then O_A = f_A^{-1}(f_B(O_B)), so f_BA is f_B followed by the inverse of f_A.
• Alas:
  • O is unknown. At best, an approximation of O exists, in the form of a standard.
  • f_A and f_B are unknown: lack of documentation, the mental state of a designer, etc.
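As a small illustration of the perfect-world case, the sketch below models each representation mapping as a one-to-one dictionary over term names and composes them into f_BA. The term names and the dictionaries `f_A` and `f_B` are hypothetical; as the slide notes, neither mapping is actually known in practice.

```python
# Hypothetical mappings from each application's terms to a shared ontology O.
f_A = {"Pick-up Date": "pickup_date", "Return Date": "return_date"}
f_B = {"Pickup Day": "pickup_date", "Dropoff Day": "return_date"}


def invert(mapping):
    """Invert a one-to-one mapping; fail loudly if it is not invertible."""
    inverse = {value: key for key, value in mapping.items()}
    if len(inverse) != len(mapping):
        raise ValueError("mapping is not one-to-one, so it has no inverse")
    return inverse


def compose_f_BA(f_A, f_B):
    """Build f_BA = f_A^{-1} o f_B, taking terms of B to terms of A."""
    f_A_inverse = invert(f_A)
    return {term_b: f_A_inverse[shared] for term_b, shared in f_B.items()}


f_BA = compose_f_BA(f_A, f_B)
print(f_BA)
```

The sketch assumes every shared term reached by `f_B` is also covered by `f_A`; a term missing on either side raises a `KeyError`, which mirrors the situation where no crisp mapping exists.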

Proposed solution
• Given two applications A and B, such that A utilizes an ontology representation O_A and B utilizes an ontology representation O_B, introduce a mapping f_BA such that:
  • f_BA depends on the ontology representation.
  • Each matching is associated with a “degree of confidence”.
    • 0 identifies non-matching terms.
    • 1 identifies a crisp matching.

Ontology representation
• Dynamic information seeking:
  • HTML forms
    • Labels
    • Input fields
    • Scripts
• Assumptions:
  • Labels represent terms in an ontology (e.g., Pick-up Date).
  • Input fields provide constraints on the value domains (e.g., {Day, 1,…31}).
  • Scripts, among other things, suggest a precedence relationship (e.g., Pick-up Locations is required before selecting a Car Type).
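The first two assumptions can be sketched with Python's standard `html.parser` module: label text becomes a candidate term, and the options of a `select` element become a value domain. This is only an illustration and not the tool's actual parser; the class name `FormExtractor`, the sample form, and the reliance on explicit `<label>` tags are all assumptions made for the example.

```python
from html.parser import HTMLParser


class FormExtractor(HTMLParser):
    """Collect label texts and field value domains from an HTML form."""

    def __init__(self):
        super().__init__()
        self.labels = []      # candidate ontology terms
        self.domains = {}     # field name -> list of allowed values
        self._in_label = False
        self._current_select = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "label":
            self._in_label = True
        elif tag == "select":
            self._current_select = attrs.get("name")
            self.domains[self._current_select] = []
        elif tag == "option" and self._current_select is not None:
            self.domains[self._current_select].append(attrs.get("value"))
        elif tag == "input" and "name" in attrs:
            # Free-text inputs carry no enumerated domain.
            self.domains.setdefault(attrs["name"], [])

    def handle_endtag(self, tag):
        if tag == "label":
            self._in_label = False
        elif tag == "select":
            self._current_select = None

    def handle_data(self, data):
        if self._in_label and data.strip():
            self.labels.append(data.strip())


# Hypothetical form fragment, loosely modelled on a car-rental page.
SAMPLE_FORM = """
<form>
  <label>Pick-up Date</label>
  <select name="Day"><option value="1">1</option><option value="2">2</option></select>
  <label>Pick-up Location</label>
  <input type="text" name="Location">
</form>
"""

extractor = FormExtractor()
extractor.feed(SAMPLE_FORM)
print(extractor.labels)
print(extractor.domains)
```

Real forms often place label text in table cells or plain text next to a field rather than in `<label>` tags, which is presumably why the deck treats labeling as a separate phase. Precedence extraction from scripts is not covered here.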

Ontology representation
• Conceptual modeling approach
• Based on Bunge:
  • Terms (things)
  • Values
  • Composition
  • Precedence
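One possible way to hold these four constructs in memory is sketched below. The class and field names (`Term`, `Ontology`, `parts`, `must_precede`) are hypothetical and are not taken from the authors' tool; the example terms reuse the car-rental labels mentioned earlier in the deck.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """A term (thing) with its value domain and its composed sub-terms."""
    name: str
    values: list = field(default_factory=list)  # allowed values, if enumerated
    parts: list = field(default_factory=list)   # composition: sub-terms


@dataclass
class Ontology:
    """A set of terms plus precedence constraints between them."""
    terms: dict = field(default_factory=dict)
    precedence: list = field(default_factory=list)  # (earlier, later) pairs

    def add(self, term):
        self.terms[term.name] = term
        return term

    def must_precede(self, earlier, later):
        self.precedence.append((earlier, later))


ontology = Ontology()
day = Term("Day", values=list(range(1, 32)))      # value domain 1..31
ontology.add(Term("Pick-up Date", parts=[day]))   # composition
ontology.add(Term("Pick-up Location"))
ontology.add(Term("Car Type"))
ontology.must_precede("Pick-up Location", "Car Type")
```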

Ontology extraction and matching
[Figure: processing pipeline. Input: URL (e.g. http://www.avis.com). Phase 1: Parsing; Phase 2: Labeling; Phase 3: Ontology Creation; Phase 4: Merging. Components shown: HTML Parsing, DOM Tree, Form Rendering, Label Identification, HTML Elements rules, FORM Elements, KB, Thesaurus, Matching Algorithms, Submission, Target Ontology, Candidate Ontology, Refined Ontology.]

Phase 1: Parsing

Phase 2: Labeling

Merging
Heuristics for ontology merging (Frakes and Baeza-Yates, 1992):
• Textual matching: Date → date; Pickup → pickup
• Ignorable characters removal: *Country → country
• De-hyphenation: Pick-up → Pickup; Pickup → Pick up
• Stop terms removal: Date of Return → Return Date
  Stop terms: a, to, do, does, the, in, or, and, this, those, that, …
• Substring matching: Pickup Location Code → Pick-up location (66%)
• Content matching: Dropoff Day (1,..,31) → Return Day (1,..,31) (100%)
  Dropoff → Return
• Thesaurus matching: Dropoff Location → Return Location (100%)
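A rough sketch of how several of these heuristics could be chained into a single confidence score follows. It is not the authors' algorithm: the word-overlap ratio is an assumption chosen because it gives roughly the 66% figure in the substring example, "of" is added to the stop terms on the strength of the "Date of Return" example, and the one-entry `THESAURUS` is hypothetical. Content matching on value domains is left out.

```python
import re

# Stop terms from the slide; "of" is an assumption based on its example.
STOP_TERMS = {"a", "to", "do", "does", "the", "in", "or", "and",
              "this", "those", "that", "of"}

# Hypothetical thesaurus mapping a synonym to a canonical word.
THESAURUS = {"dropoff": "return"}


def normalize(label):
    """Lowercase, de-hyphenate, drop ignorable characters and stop terms."""
    text = label.lower().replace("-", "")       # Pick-up -> pickup
    text = re.sub(r"[^a-z0-9 ]", " ", text)     # *Country -> country
    words = [w for w in text.split() if w not in STOP_TERMS]
    return [THESAURUS.get(w, w) for w in words]


def match_confidence(label_a, label_b):
    """Return a confidence in [0, 1] that two labels denote the same term."""
    words_a, words_b = normalize(label_a), normalize(label_b)
    if not words_a or not words_b:
        return 0.0
    # Same words in any order, or same text once spaces are ignored.
    if sorted(words_a) == sorted(words_b) or "".join(words_a) == "".join(words_b):
        return 1.0
    # Partial match: shared words over the size of the longer label.
    set_a, set_b = set(words_a), set(words_b)
    return len(set_a & set_b) / max(len(set_a), len(set_b))


for pair in [("Date of Return", "Return Date"),
             ("Pickup Location Code", "Pick-up location"),
             ("Dropoff Location", "Return Location")]:
    print(pair, round(match_confidence(*pair), 2))
```

Returning a number rather than a yes/no answer keeps the result compatible with the degree-of-confidence idea from the proposed solution, where 0 means no match and 1 means a crisp match.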

Phase 4: Merging

Preliminary Results
• Two metrics are used for performance analysis (Frakes and Baeza-Yates, 1992):
  • Recall (completeness): Recall = t_m / t_r
  • Precision (soundness): Precision = t_e / t_m
• Parameters:
  • t_r: number of terms retrieved
  • t_m: number of terms matched
  • t_e: number of terms effectively matched
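The two metrics can be written as small functions, reading recall as matched over retrieved terms and precision as effectively matched over matched terms. That reading is inferred from the deck's worked example (15/20 and 10/15); the function and parameter names are illustrative.

```python
def recall(terms_matched, terms_retrieved):
    """Completeness: share of retrieved terms for which a match was found."""
    if terms_retrieved == 0:
        return 0.0
    return terms_matched / terms_retrieved


def precision(terms_effective, terms_matched):
    """Soundness: share of identified matches that are actually correct."""
    if terms_matched == 0:
        return 0.0
    return terms_effective / terms_matched


# Figures from the deck's example: 20 terms, 15 matches, 10 effective matches.
print(recall(15, 20))
print(precision(10, 15))
```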

Preliminary Results
Example:
• # of terms in Ontology1: 20
• # of matches identified: 15 → Recall: 75% (15/20)
• # of effective matches: 10 → Precision: 66% (10/15)
A third metric combines recall and precision. For a precision value P, a recall value R and an importance measure b, the combined metric E is calculated as (Frakes and Baeza-Yates, 1992):
E = 1 - ((1 + b^2) P R) / (b^2 P + R)
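For the example figures, the combined metric can be evaluated as below. The sketch assumes the standard van Rijsbergen form of the E measure, E = 1 - ((1 + b^2) P R) / (b^2 P + R), which is the form given in the cited source; the function name `e_measure` and the choice b = 1 are illustrative.

```python
def e_measure(precision, recall, b=1.0):
    """van Rijsbergen's E measure; lower is better, 0 means perfect P and R."""
    if precision == 0 and recall == 0:
        return 1.0  # nothing correct was found, so the error is maximal
    return 1 - ((1 + b ** 2) * precision * recall) / (b ** 2 * precision + recall)


P = 10 / 15   # precision from the example
R = 15 / 20   # recall from the example
E = e_measure(P, R, b=1.0)
print(round(E, 3))  # 0.294
```

With b = 1, precision and recall are weighted equally and E is one minus their harmonic mean. Other values of b shift the weight between the two.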

Preliminary Results

Summary and Future Work
• We have introduced:
  • Automatic ontology creation
  • Automatic matching process
  • Preliminary results
• Future work oriented towards:
  • Incorporation of query facilities into the tool
  • Automatic navigation of web sites for ontology extraction
  • Dynamic translation of queries against the target ontology into queries against the multiple candidate ontologies
