PROMPT: Algorithm and Tool for Automated Ontology Merging and Alignment Natalya F. Noy Stanford Medical Informatics Stanford University
Outline • Definitions and motivation • The PROMPT ontology-merging algorithm • Incremental algorithm (PROMPT) • Statistical algorithm (Anchor-PROMPT) • The tools • Evaluation • Future work
Ontologies • Characterize concepts and relationships in an application area, providing a domain of discourse • Enumerate concepts, attributes of concepts, and relationships among concepts • Define constraints on relationships among concepts
Why do we need ontologies? • An ontology provides a shared vocabulary for different applications in a domain • An ontology enables interoperation among applications using disparate data sources from the same domain
Ontologies Are Everywhere • Ontologies have been used in academic projects for a long time • Knowledge sharing and reuse • Reuse of problem-solving methods • Ontologies are becoming widely used outside of academia • Categorization of Web sites (e.g. Yahoo!) • Product catalogs
Need for Ontology Merging • There is significant overlap in existing ontologies • Yahoo! and DMOZ Open Directory • Product catalogs for similar domains
Need for Ontology Merging and Integration • Need to merge or align overlapping ontologies • Chemdex™—a portal for accessing life-science–supply catalogs • Workshop on “Ontologies and Information Sharing” at IJCAI’2001 • 6 out of 18 papers (1/3) are about ontology merging and integration
Existing Approaches • Ontology design and integration • term matching (Stanford SKC, ISI) • graph-based analysis (Stanford SKC) • transformation operators (Ontomorph at ISI) • merging tools (Chimaera at Stanford KSL) • Object-oriented Programming • subject-oriented programming (IBM) • “subjective” views of classes • transformation operations • concentrates on methods rather than relations
Existing Approaches (II) • Databases • develop mediators and provide wrappers • define a common data model and mappings • define matching rules to translate directly • Most of these approaches do not provide any guidance to the user and do not use structural information
Outline • Definitions and motivation • The PROMPT ontology-merging algorithm • Incremental algorithm (PROMPT) • Statistical algorithm (Anchor-PROMPT) • The tools • Evaluation • Future work
PROMPT • Our approach is: • Partial automation • Algorithms based on • concept-representation structure • relations between concepts • user’s actions • Our approach is not: • Complete automation • Algorithm for matching concept names
Knowledge Model • A generic knowledge model of OKBC (Open Knowledge-Base Connectivity Protocol) • Classes • Collections of objects with similar properties • Arranged in a subclass–superclass hierarchy • Instances • Slots • First-class objects in a knowledge base • Binary relations describing properties of classes and instances • Facets • Constraints on slot values (cardinality, min, max)
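The knowledge model above can be sketched as a few Python data classes. This is only an illustration and not the OKBC API: the type names (Facet, Slot, Frame, Instance) and their fields are assumptions, and the example values are borrowed from the travel-agent example used in the later slides.

```python
from dataclasses import dataclass, field


@dataclass
class Facet:
    """Constraint on a slot's values (for example cardinality, min, max)."""
    name: str
    value: object


@dataclass
class Slot:
    """First-class binary relation: it exists independently of any class."""
    name: str
    facets: list = field(default_factory=list)


@dataclass
class Frame:
    """A class placed in the subclass-superclass hierarchy (hypothetical layout)."""
    name: str
    superclasses: set = field(default_factory=set)
    template_slots: set = field(default_factory=set)  # slot names attached to the class


@dataclass
class Instance:
    """An individual object that belongs to a class."""
    name: str
    type_name: str
    own_slot_values: dict = field(default_factory=dict)


# Example values taken from the merge-classes slides.
has_client = Slot("has client", facets=[Facet("allowed class", "Customer")])
agent = Frame("Agent", superclasses={"Employee"}, template_slots={"has client"})
```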
The PROMPT Algorithm • Make initial suggestions • Select the next operation • Perform automatic updates • Find conflicts • Make suggestions
Example: merge-classes [Diagram: classes Employee, Agency employee, Agent, Traveler, Customer; relations: subclass of, has client, agent for]
Example: merge-classes (II) [Diagram: classes Employee, Agency employee, Agent, Traveler, Customer; relations: subclass of, has client, agent for]
Analyzing Global Properties Locally • Global properties • classes that have the same sets of slots • classes that refer to the same set of classes • slots that are attached to the same classes • Local context • incremental analysis • consider only the concepts that were affected by the last operation
The PROMPT Operation Set • Extends the OKBC operation set with ontology-merging operations • merge classes • merge slots • merge instances • copy of a class • deep or shallow • with or without subclasses • with or without instances • …
After a User Performs an Operation • For each operation • perform the operation • consider possible conflicts • identify conflicts • propose solutions • analyze local context • create new suggestions • reinforce or downgrade existing suggestions
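The interactive cycle described above might be sketched as follows. The function name prompt_cycle and its four callback parameters are hypothetical placeholders for the user's choice and for PROMPT's internal steps; the sketch shows only the control flow, not the tool's actual implementation.

```python
def prompt_cycle(initial_suggestions, choose, perform, find_conflicts, suggest):
    """Run the suggest / select / update loop until nothing is left to do.

    All four callbacks are assumptions made for this sketch:
      choose(suggestions)      -> the operation picked by the user, or None to stop
      perform(operation)       -> the set of concepts affected by the operation
      find_conflicts(affected) -> conflicts found in the local context
      suggest(affected)        -> new suggestions derived from the local context
    """
    suggestions = list(initial_suggestions)
    log = []
    while suggestions:
        operation = choose(suggestions)
        if operation is None:
            break
        if operation in suggestions:
            suggestions.remove(operation)
        affected = perform(operation)          # automatic updates
        conflicts = find_conflicts(affected)   # only the local context is inspected
        for new in suggest(affected):
            if new not in suggestions:
                suggestions.append(new)
        log.append((operation, conflicts))
    return log


log = prompt_cycle(
    ["merge Agent with Agent"],
    choose=lambda pending: pending[0],
    perform=lambda operation: {"Agent"},
    find_conflicts=lambda affected: [],
    suggest=lambda affected: [],
)
```

Reinforcing or downgrading existing suggestions is left out here; it could be added by attaching a score to each suggestion and adjusting it inside the loop.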
Conflicts • Conflicts that PROMPT identifies • name conflicts • dangling references • redundancy in a class hierarchy • slot-value restrictions that violate class inheritance
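As one illustration of the conflicts listed above, redundancy in a class hierarchy can be read as a direct superclass link that is already implied by another, longer path. That reading is an assumption, as are the function names and the dictionary layout (class name mapped to its set of direct superclasses).

```python
def ancestors(cls, parents):
    """All superclasses reachable from cls through the parents mapping."""
    seen = set()
    stack = list(parents.get(cls, ()))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(parents.get(parent, ()))
    return seen


def redundant_parents(cls, parents):
    """Direct superclasses of cls that are also reachable via another direct superclass."""
    direct = parents.get(cls, set())
    return {
        p for p in direct
        if any(p in ancestors(q, parents) for q in direct if q != p)
    }


# Hypothetical hierarchy built from the class names on the slides.
parents = {
    "Agent": {"Employee", "Agency employee"},
    "Agency employee": {"Employee"},
}
redundant = redundant_parents("Agent", parents)
```

In this example the direct link from Agent to Employee is reported, because Employee is already reachable through Agency employee.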
Example: merge-classes [Diagram: three Agent class nodes]
Operation Steps: merge-classes • Own slots and their values for the new class: ask the user in case of conflicts or use preferences • Template slots for the new class: union of the template slots of the original classes • Subclasses and superclasses for the new class • Conflicts • Suggestions
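The first two steps above can be sketched in a few lines. The dictionary layout, the function name merge_classes, and the "documentation" own slot are assumptions made for illustration; only the rules (union of template slots, ask the user or use the preferred ontology for conflicting own-slot values) come from the slide.

```python
def merge_classes(first, second, preferred=None):
    """Merge two class descriptions; return the merged class and the slots needing user input.

    Each class is a dict with "template_slots" (a set of slot names) and
    "own_slots" (slot name -> value). `preferred` is either `first` or `second`
    when a preferred ontology has been set, otherwise None.
    """
    merged = {
        "template_slots": first["template_slots"] | second["template_slots"],
        "own_slots": {},
    }
    ask_user = []
    for slot in sorted(set(first["own_slots"]) | set(second["own_slots"])):
        values = [c["own_slots"][slot] for c in (first, second) if slot in c["own_slots"]]
        if len(values) == 1 or values[0] == values[1]:
            merged["own_slots"][slot] = values[0]      # no conflict
        elif preferred is not None:
            merged["own_slots"][slot] = preferred["own_slots"][slot]
        else:
            ask_user.append(slot)                      # conflict left for the user
    return merged, ask_user


agent_1 = {"template_slots": {"has client"}, "own_slots": {"documentation": "A travel agent"}}
agent_2 = {"template_slots": {"agent for"}, "own_slots": {"documentation": "An agent"}}

merged, ask_user = merge_classes(agent_1, agent_2)
merged_pref, _ = merge_classes(agent_1, agent_2, preferred=agent_1)
```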
Template Slots • Copy template slots that don’t exist in the merged ontology [Diagram: Agent classes with the slot agent for]
Template Slots • Attach the slots that have already been mapped [Diagram: Agent classes with the slots client, client, and has client]
Subclasses and Superclasses • If a superclass (subclass) exists, re-establish the links [Diagram: Agent classes with the superclasses Agency employee and Employee]
Dangling References • For example, allowed class [Diagram: Agent classes with the slot agent for; facet value Customer; facet value dummy frame Customer_temp]
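One reading of the dangling-reference slide is that a facet value pointing at a class missing from the merged ontology is redirected to a dummy frame. The sketch below follows that reading; the function name, the data layout, and the "_temp" naming rule (modelled on Customer_temp in the diagram) are assumptions.

```python
def resolve_dangling(merged_classes, allowed_class):
    """Replace references to missing classes with dummy frames.

    merged_classes: set of class names already present in the merged ontology
    allowed_class:  dict mapping a slot name to the class used as its facet value
    Returns the rewritten mapping and the set of dummy frames that were created.
    """
    resolved, dummies = {}, set()
    for slot, target in allowed_class.items():
        if target in merged_classes:
            resolved[slot] = target
        else:
            dummy = f"{target}_temp"   # hypothetical naming rule
            dummies.add(dummy)
            resolved[slot] = dummy
    return resolved, dummies


resolved, dummies = resolve_dangling({"Agent"}, {"agent for": "Customer"})
```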
Additional Suggestions: Merge Slots • If slot names at the merged class are similar, suggest merging the slots [Diagram: Agent with the slots client and has client]
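The slide does not say how name similarity is measured. As a stand-in, the sketch below uses the ratio from Python's difflib.SequenceMatcher with an arbitrary threshold of 0.7; both the measure and the threshold are assumptions, not PROMPT's own matcher.

```python
from difflib import SequenceMatcher
from itertools import combinations


def suggest_slot_merges(slot_names, threshold=0.7):
    """Return pairs of slot names whose string similarity reaches the threshold."""
    suggestions = []
    for a, b in combinations(sorted(slot_names), 2):
        score = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if score >= threshold:
            suggestions.append((a, b))
    return suggestions


slot_suggestions = suggest_slot_merges({"client", "has client"})
```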
Additional Suggestions: Merge Classes • If the set of classes referenced by the merged class is the same as the set of classes referenced by another class, suggest a merge [Diagram: classes Agency employee, Agent, Reservation, Client; slots has clients, handles reservations]
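The rule above compares sets of referenced classes. A minimal sketch, assuming the references are available as a dictionary from class name to the set of classes it refers to (the function name and the Traveler entry are hypothetical):

```python
def suggest_class_merges(merged_class, references):
    """Classes whose set of referenced classes equals that of the merged class."""
    target = references.get(merged_class, set())
    if not target:
        return []  # an empty reference set carries no evidence
    return sorted(
        name for name, refs in references.items()
        if name != merged_class and refs == target
    )


references = {
    "Agent": {"Client", "Reservation"},
    "Agency employee": {"Client", "Reservation"},
    "Traveler": {"Reservation"},  # hypothetical extra class for contrast
}
class_suggestions = suggest_class_merges("Agent", references)
```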
Additional Suggestions: Merge Classes • If names of superclasses (subclasses) of the merged class are similar, suggest merging the classes [Diagram: Agent with the superclasses Employee and Agency employee]
Check for Cycles • If there is a cycle, suggest removing one of the parents [Diagram: Agent, Person, Employee, and Agency employee linked by superclass relations]
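A cycle check over the superclass links can be done with a depth-first search. The sketch below is a generic version of that technique, not PROMPT's code; the function name and the example hierarchy (which deliberately contains a cycle) are assumptions.

```python
def find_cycle(parents):
    """Return one cycle in the superclass graph as a list of class names, or None.

    parents: dict mapping a class name to the set of its direct superclasses.
    """
    visiting, done = set(), set()

    def visit(cls, path):
        if cls in done:
            return None
        if cls in visiting:                      # cls is already on the current path
            return path[path.index(cls):] + [cls]
        visiting.add(cls)
        for parent in sorted(parents.get(cls, ())):
            cycle = visit(parent, path + [cls])
            if cycle:
                return cycle
        visiting.discard(cls)
        done.add(cls)
        return None

    for cls in sorted(parents):
        cycle = visit(cls, [])
        if cycle:
            return cycle
    return None


# Hypothetical result of a merge that introduced a cycle.
cyclic = {"Agent": {"Employee"}, "Employee": {"Person"}, "Person": {"Agent"}}
cycle = find_cycle(cyclic)
```

Any superclass link on the returned path is a candidate for the "remove one of the parents" suggestion.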
To Summarize • Perform the actual operation • For the concepts (classes, slots, and instances) directly attached to the operation arguments • perform global analysis for new suggestions • Perform global analysis for new conflicts
Context • Slots in C • Classes directly referenced by C • Non-local context [Diagram: class C and its context]
Anchor-PROMPT: Using Non-Local Contexts • Input: • A set of anchor pairs • Output: • A set of related terms with similarity scores • Where do anchors come from? • Lexical matching • Interactive tools • User-specified [Diagram: Ontology 1 and Ontology 2]
Similarity Score • Generate a set of all paths (of length < L) • Generate a set of all possible pairs of paths of equal length • For each pair of paths and for each pair of nodes in the identical positions in the paths, increment the similarity score • Combine the similarity score for all the paths
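The four steps above can be sketched as follows. Several details are assumptions because the slide does not state them: paths are taken between anchors (from one anchor to another in each ontology), path length is counted in edges, each position match adds 1 to the score, and scores are combined by simple summation. The edges in the two example graphs are made up; only the term names come from the results slide.

```python
from collections import Counter
from itertools import permutations


def simple_paths(graph, start, goal, max_len):
    """All simple paths from start to goal with fewer than max_len edges."""
    paths, stack = [], [[start]]
    while stack:
        path = stack.pop()
        if path[-1] == goal and len(path) > 1:
            paths.append(path)
            continue
        if len(path) < max_len:                 # extending keeps edges < max_len
            for nxt in graph.get(path[-1], ()):
                if nxt not in path:
                    stack.append(path + [nxt])
    return paths


def similarity_scores(onto1, onto2, anchors, max_len):
    """Score pairs of terms that occupy the same position on equal-length paths."""
    scores = Counter()
    for (a1, a2), (b1, b2) in permutations(anchors, 2):
        for p1 in simple_paths(onto1, a1, b1, max_len):
            for p2 in simple_paths(onto2, a2, b2, max_len):
                if len(p1) == len(p2):
                    # Skip the endpoints: they are anchors and already matched.
                    for n1, n2 in zip(p1[1:-1], p2[1:-1]):
                        scores[(n1, n2)] += 1
    return scores


onto1 = {"TRIAL": ["PROTOCOL"], "PROTOCOL": ["PERSON"]}   # hypothetical edges
onto2 = {"Trial": ["Design"], "Design": ["Person"]}       # hypothetical edges
anchors = [("TRIAL", "Trial"), ("PERSON", "Person")]
scores = similarity_scores(onto1, onto2, anchors, max_len=3)
```

With these inputs the only scored pair is PROTOCOL and Design, which sit in the same position on the single pair of equal-length paths between the anchors.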
Anchor-PROMPT: Initial Results • Related term pairs (term in one ontology, term in the other): • TRIAL, Trial • PERSON, Person • CROSSOVER, Crossover • PROTOCOL, Design • TRIAL-SUBJECT, Person • INVESTIGATORS, Person • POPULATION, Action_Spec • PERSON, Character • TREATMENT-POPULATION, Crossover_arm
Knowledge Model Assumptions The only assumption: An OKBC-compliant knowledge model
Outline • Definitions and motivation • The PROMPT ontology-merging algorithm • Incremental algorithm (PROMPT) • Statistical algorithm (Anchor-PROMPT) • The tools • Evaluation • Future work
Protégé-2000 • An environment for • Ontology development • Knowledge acquisition • Intuitive direct-manipulation interface • Extensibility • Ability to plug in new components
Protégé-2000 plugins • Domain-specific user-interface plugins • Alternative back ends for archival storage • Utility programs for knowledge-acquisition tasks • End-user applications
Protégé-based PROMPT tool • Protégé-2000 • has an OKBC-compatible knowledge model • allows building extensions through a plug-in mechanism • can work as a knowledge-base server for the plug-ins
The PROMPT tool features • Setting a preferred ontology • Maintaining the user’s focus • Providing feedback to the user • Preserving original relations • subclass-superclass relations • slot attachment • facet values • Linking to the direct-manipulation ontology editor • Logging operations
Outline • Definitions and motivation • The PROMPT ontology-merging algorithm • Incremental algorithm (PROMPT) • Statistical algorithm (Anchor-PROMPT) • The tools • Evaluation • Future work
Evaluation • Knowledge-based systems are rarely evaluated • We can use software-engineering approaches to empirical evaluation of tools • We need to develop additional knowledge-base measurements
Questions we asked • How good are PROMPT’s suggestions and conflict-resolution strategies? • Does PROMPT provide any benefit when compared to a generic ontology-editing tool (Protégé-2000)?
What we were trying to find out • The benefit that the tool provides • Productivity benefit • Quality improvement in the resulting ontologies • User satisfaction • Precision and recall of the tool’s suggestions
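Precision and recall of the suggestions can be computed from two sets. The definitions used below are an assumption about how the measures apply here: precision is the share of the tool's suggestions that the user followed, and recall is the share of the user's operations that the tool had suggested. The operation labels are placeholders, not experimental data.

```python
def precision_recall(suggested, followed):
    """Return (precision, recall) for a set of suggestions and a set of performed operations."""
    suggested, followed = set(suggested), set(followed)
    hits = suggested & followed
    precision = len(hits) / len(suggested) if suggested else 0.0
    recall = len(hits) / len(followed) if followed else 0.0
    return precision, recall


# Placeholder labels: four suggestions, five operations performed, three in common.
precision, recall = precision_recall({"a", "b", "c", "d"}, {"a", "b", "c", "e", "f"})
```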