An Ontology Creation Methodology: A Phased Approach



Presentation Transcript


  1. An Ontology Creation Methodology: A Phased Approach • Jon Atle Gulla, Norwegian University of Science and Technology, Norway, jag@idi.ntnu.no • Vijay Sugumaran, Oakland University, USA, sugumara@oakland.edu

  2. Agenda • Ontology development • Traditional ontology learning • Limitations of ontology learning • A phased approach to ontology learning

  3. The Challenge • How to develop large complex ontologies? • How to keep ontologies updated in dynamic domains?

  4. Ontology Modeling vs. Learning
     Traditional ontology engineering approach:
     • Project: Form team of ontology and domain experts
     • Ontology & domain experts: Collaborative manual modeling process
     • Domain experts: Verify ontology against domain knowledge
     • Ontology experts: Verify ontology against syntactic and semantic quality measures
     • Expensive and time-consuming approach
     • Stable domains assumed
     Ontology learning approach:
     • Domain experts: Find representative domain text
     • Tool: Extract candidate classes, individuals and properties automatically from domain texts
     • Ontology & domain experts: Verify candidate structures and complete ontology
     • Can also be used to verify domain quality of existing ontology
     • Cost-effective approach
     • Not unproblematic in dynamic domains

  5. Agenda • Ontology development • Traditional ontology learning • Limitations of ontology learning • A phased approach to ontology learning

  6. Ontology Learning Basis
     • People communicate using domain-specific concepts
     • People document using domain-specific concepts
     • Ontology learning: Extract ontology structures from written documentation
     • Requirements:
       • Documents representative of domain terminology
       • Documents cover all the terminology
       • Well-defined and consistent use of terminology in domain
     [Figure labels: Realm of ontology engineering; Ontology discussions; Realm of ontology learning; Ontology in use]

  7. Levels of Ontology Learning (listed from highest to lowest degree of difficulty)
     • Rules: ∀x,y (manager(x,y) → report(y,x))
     • Relations: FINANCE(ag: SPONSOR, go: PROJECT)
     • Concept hierarchies: is_a(MANAGER, EMPLOYEE)
     • Concepts: PROJECT
     • Synonyms: (leader, manager, lead)
     • Terms: sponsors, costs, charter

  8. Ontology Learning Strategies
     • Term extraction: Linguistic analysis; Statistical analysis
     • Synonyms: Classification-based techniques; Distribution-based techniques
     • Concept formation: Structure recognition; Keyphrase generation; Instance learning
     • Concept hierarchy: Clustering; Lexico-syntactic patterns; Head-modifier approaches; Subsumption approaches; Classification-based techniques
     • Relations: Association rules; Concept vectors
     • Rules: Structure recognition for meta-property recognition; Dependency trees and path similarities

  9. Ontology Learning Process
     • Automatic extraction of concept and relationship candidates
     • Manual selection of candidates and completion of model
     [Figure labels: Scope management, WBS, Business need, Constituent components, Product description, ...; Abstract elements, Constraints, Properties, Rules; PMBOK; Domain text; Concept candidates; Search ontology; Reference set]

  10. Ex 1. Learning Concept/Individual Candidates
      • Input: Scope planning is the process of progressively elaborating and documenting the project work (project scope) that produces the product of the project.
      • POS tagging: Scope/NNP planning/NN is/VBZ the/DT process/NN of/IN progressively/RB elaborating/VBG and/CC documenting/VBG the/DT project/NN work/NN (/( project/NN scope/NN )/) that/WDT produces/VBZ the/DT product/NN of/IN the/DT project/NN ./.
      • Stopword removal (571 words) and lemmatization/stemming (POS tags not shown): Scope plan process progress elaborate document project work project scope produce product project
      • Select consecutive nouns as candidate phrases: {scope planning, process, project work, project scope, product, project}
      • Calculate tf.idf score for phrases: {(scope planning, 0.0097), (project scope, 0.0047), (product, 0.0043), (project work, 0.0008), (project, 0.0001), (process, 0.0000)}
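The candidate-selection and scoring steps of this example can be sketched in a few lines of Python. This is a minimal illustration and not the authors' tool: it starts from the tagged sentence shown on the slide, skips stopword removal and lemmatization, and uses made-up document frequencies (`DOC_FREQ`, `N_DOCS`), so the scores will not match the values on the slide.

```python
import math
from collections import Counter

# Tagged sentence copied from the slide.
TAGGED = ("Scope/NNP planning/NN is/VBZ the/DT process/NN of/IN progressively/RB "
          "elaborating/VBG and/CC documenting/VBG the/DT project/NN work/NN (/( "
          "project/NN scope/NN )/) that/WDT produces/VBZ the/DT product/NN of/IN "
          "the/DT project/NN ./.")

def noun_phrases(tagged):
    """Return maximal runs of consecutive noun tokens (tags starting with NN)."""
    phrases, run = [], []
    for token in tagged.split():
        word, _, tag = token.rpartition("/")
        if tag.startswith("NN"):
            run.append(word.lower())
        else:
            if run:
                phrases.append(" ".join(run))
            run = []
    if run:
        phrases.append(" ".join(run))
    return phrases

def tf_idf(phrases, doc_freq, n_docs):
    """Score each phrase as (relative frequency) * log(n_docs / document frequency)."""
    counts = Counter(phrases)
    total = sum(counts.values())
    return {p: (c / total) * math.log(n_docs / doc_freq.get(p, 1))
            for p, c in counts.items()}

# Hypothetical reference-collection statistics, for illustration only.
N_DOCS = 100
DOC_FREQ = {"process": 100, "project": 95, "product": 40,
            "project work": 20, "project scope": 10, "scope planning": 2}

candidates = noun_phrases(TAGGED)
scores = tf_idf(candidates, DOC_FREQ, N_DOCS)
for phrase, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{phrase}: {score:.4f}")
```

With these assumed frequencies, a phrase that occurs in every reference document (here "process") scores zero, while a rare phrase such as "scope planning" ranks first, which mirrors the ordering on the slide.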

  11. Classes Relevant to the Drama Genre • Data sources: IMDB, Wikipedia, Videoload • Keyphrase extraction technique • Noun phrases ranked according to various statistical measures

  12. Ex 2. Learning Relationship Candidates
      [Figure: processing pipeline. Component labels: Tokenizer; Light stemmer; Lucene Document indexer; Lucene Paragraph indexer; Lucene Sentence indexer; GATE Sentence splitter; GATE Tagger; GATE Lemmatizer; GATE Noun phrase extractor; Noun phrase indexer; Concept profile builder; Concept profiles; Concept similarity calculation; Association rules miner; Association rules; Relationship merger]

  13. Relationships Relevant to Drama Genre • Association rules on extracted concepts
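As a rough illustration of how association rules over extracted concepts can suggest relationship candidates, the sketch below counts concept co-occurrence per sentence and keeps rules A → B whose support and confidence exceed thresholds. The data in `SENTENCES`, the thresholds, and the function name are assumptions made for this example; the slides do not state which mining algorithm or parameters were used.

```python
from collections import Counter
from itertools import permutations

def association_rules(transactions, min_support, min_confidence):
    """Return (antecedent, consequent, support, confidence) for concept pairs."""
    n = len(transactions)
    item_count, pair_count = Counter(), Counter()
    for t in transactions:
        items = set(t)
        item_count.update(items)
        # Ordered pairs, so that A -> B and B -> A are evaluated separately.
        pair_count.update(permutations(sorted(items), 2))
    rules = []
    for (a, b), c in pair_count.items():
        support = c / n                 # share of sentences containing both
        confidence = c / item_count[a]  # share of sentences with a that also have b
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return sorted(rules, key=lambda r: (-r[3], r[0], r[1]))

# Hypothetical concept sets, one per sentence.
SENTENCES = [
    {"actor", "movie"},
    {"actor", "movie", "director"},
    {"movie", "director"},
    {"actor", "role"},
    {"movie"},
]

rules = association_rules(SENTENCES, min_support=0.4, min_confidence=0.6)
for a, b, s, c in rules:
    print(f"{a} -> {b} (support={s:.2f}, confidence={c:.2f})")
```

A rule such as director → movie only indicates that the two concepts are related; naming the relationship is left to the manual verification step.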

  14. Automatic OWL Generation
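The slide gives no detail on how the OWL output is produced. As one possible sketch, verified concept candidates and is_a pairs could be serialized as OWL classes in Turtle syntax. The function name, the `http://example.org/onto#` namespace, and the restriction to one parent per class are all assumptions for illustration; class names are taken from the is_a(MANAGER, EMPLOYEE) example earlier in the deck.

```python
def to_owl_turtle(classes, is_a, base="http://example.org/onto#"):
    """Serialize class names and (child, parent) pairs as OWL classes in Turtle.

    Assumes each class has at most one parent and that names are simple
    identifiers that are valid as Turtle local names.
    """
    lines = [
        f"@prefix : <{base}> .",
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "",
    ]
    parents = dict(is_a)
    for name in sorted(classes):
        stmt = f":{name} a owl:Class"
        if name in parents:
            stmt += f" ;\n    rdfs:subClassOf :{parents[name]}"
        lines.append(stmt + " .")
    return "\n".join(lines)

ttl = to_owl_turtle({"Manager", "Employee", "Project"},
                    [("Manager", "Employee")])
print(ttl)
```

A production tool would more likely build the model with an RDF or OWL library and also emit properties and individuals; plain string output is used here only to keep the example dependency-free.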

  15. Agenda • Ontology development • Traditional ontology learning • Limitations of ontology learning • A phased approach to ontology learning

  16. Limitations of Ontology Learning • Different techniques produce different results • Different data sources produce different results • Lost control over process • Extensive verification of final ontology needed • New data hard to combine with old data

  17. Agenda • Ontology development • Traditional ontology learning • Limitations of ontology learning • A phased approach to ontology learning

  18. Ontology Learning for Entertainment Domain • Ontology evolution for Deutsche Telekom’s Videoload download service • What does Brangelina mean? • Should Pitt be Brad Pitt or Michael Pitt? • Actor vs. Schauspieler? • All movies of Brad Pitt? • Last movie of Pitt?

  19. Ontology Learning Project • Duration: Nov 2007 – Nov 2009 • Domain: movie download service • Ontology analysis and creation based on indexed noun phrases from movie documents • Ontology used for search and navigation on top of FAST search platform • Ontology learning challenges: • Domain changes from one day to another • No consistent domain terminology • No professional domain terminology • Multiple languages • Movies about anything... unlimited domain • Ontology needs to be up to date to support search

  20. Ontology Workbench • 3 phases that are carried out independently • Crawling into Lucene indices • Supervised extraction of candidates • Combining candidates into ontology structures

  21. Interactive Ontology Development • Expandable indices • Subset of data source • Focus of analysis • List of techniques • Partial results • Stored results • Set operations for combining results
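The idea of combining stored partial results with set operations can be illustrated as follows. The result names and their contents are hypothetical, and the workbench's actual storage format is not described in the slides.

```python
# Hypothetical stored results: candidate sets from two techniques,
# plus candidates an expert has already rejected.
stored = {
    "tfidf": {"actor", "director", "drama", "movie"},
    "keyphrase": {"actor", "drama", "love story", "movie"},
    "rejected": {"movie"},
}

# Intersection: candidates both techniques agree on (favors precision).
agreed = stored["tfidf"] & stored["keyphrase"]

# Union minus rejections: every candidate still worth reviewing (favors recall).
to_review = (stored["tfidf"] | stored["keyphrase"]) - stored["rejected"]

# Difference: candidates found by only one technique.
only_tfidf = stored["tfidf"] - stored["keyphrase"]

print(sorted(agreed))
print(sorted(to_review))
print(sorted(only_tfidf))
```

Keeping each technique's output as a separate stored set lets new data or a new technique be added without rerunning the earlier steps.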

  22. Thank you
