Applications Chapter 9, Cimiano Ontology Learning Textbook Presented by Aaron Stewart
Typical Applications of Ontologies • Agent communication • Data integration • Description of service capabilities for matching and composition purposes • Formal verification of process descriptions • Unification of terminology across communities
Text Applications of Ontologies • Information Retrieval (IR) • Clustering and Classification of Documents • Semantic Annotation • Natural Language Processing
Task-Based Evaluation Requirements • Algorithm output can be quantified • Task can use background knowledge • Ontology is an additional parameter • Output can be traced to the ontology
Contents • Text Clustering and Classification • Information Highlighting for Supporting Search • Related Work
Text Clustering and Classification • What is the difference?
Text Classification • Example categories (from the slide figure): Arrows, Weather, Flat shapes, 3-D forms, Smile!
Dot.Kom Project • One of many competitions
Approaches • Bag of words • Manually engineered MeSH Tree Structures • Automatically constructed ontologies
What is a “Bag of Words” anyway? • An unordered collection of words, e.g.: quick, brown, the, fox
Bag of Words the quick brown fox jumps over the lazy dog (2)
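The bag-of-words idea on this slide can be sketched in a few lines of Python. This is only an illustration: the tokenizer (lowercasing, splitting on non-alphanumeric characters) and the function name `bag_of_words` are assumptions, not something the slides specify.

```python
import re
from collections import Counter

def bag_of_words(text: str) -> Counter:
    """Count lowercase word tokens; word order and punctuation are discarded."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(tokens)

bag = bag_of_words("The quick brown fox jumps over the lazy dog")
print(bag["the"])  # 2
print(bag["fox"])  # 1
```

Because only counts are kept, "the fox" and "fox the" produce the same bag.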
Note on Ontologies • Our ontologies (“micro”): like a database record schema • Their ontologies (“macro”): like WordNet
Clustering • Hierarchical Agglomerative Clustering • Bi-Section K-means • “A Comparison of Document Clustering Techniques” • www.cs.sfu.ca/~wangk/894report/chen1.pdf
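A minimal sketch of hierarchical agglomerative clustering, the first technique listed above. The slides do not say which linkage or distance was used; average linkage and Euclidean distance over small numeric vectors are assumptions made here for brevity (document clustering more commonly uses cosine similarity over term vectors). The function name `agglomerative` is hypothetical.

```python
from itertools import combinations
from math import dist

def agglomerative(points, k):
    """Start with one cluster per point and repeatedly merge the two
    closest clusters (average linkage) until only k clusters remain."""
    clusters = [[p] for p in points]

    def avg_link(a, b):
        # Mean pairwise Euclidean distance between members of a and b.
        return sum(dist(x, y) for x in a for y in b) / (len(a) * len(b))

    while len(clusters) > k:
        # combinations yields i < j, so deleting index j leaves i valid.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: avg_link(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
result = agglomerative(points, 2)
print(result)
```

Bi-Section K-means works in the opposite direction: it starts from one cluster holding all documents and repeatedly splits a cluster in two.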
Document Representations • Bag of Words • Certain words + ontology -> extended features • Strategies: add, replace, only
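One plausible reading of the three strategies, sketched in Python. The slide names them without defining them, so the semantics below are an assumption: "add" keeps the words and appends concept features, "replace" swaps mapped words for their concepts, and "only" keeps concept features alone. The lookup table `TERM_TO_CONCEPT` is a hypothetical stand-in for an ontology.

```python
from collections import Counter

# Hypothetical term-to-concept lookup standing in for an ontology.
TERM_TO_CONCEPT = {"beef": "meat", "pork": "meat", "oak": "tree"}

def extend_features(bag: Counter, strategy: str) -> Counter:
    """Combine a bag of words with concept features from the lookup table."""
    concepts = Counter()
    for term, count in bag.items():
        if term in TERM_TO_CONCEPT:
            # Prefix concept features so they never collide with plain words.
            concepts["concept:" + TERM_TO_CONCEPT[term]] += count
    if strategy == "add":
        return bag + concepts
    if strategy == "replace":
        unmapped = Counter(
            {t: c for t, c in bag.items() if t not in TERM_TO_CONCEPT}
        )
        return unmapped + concepts
    if strategy == "only":
        return concepts
    raise ValueError(f"unknown strategy: {strategy}")

doc = Counter({"beef": 2, "pork": 1, "tasty": 1})
for name in ("add", "replace", "only"):
    print(name, dict(extend_features(doc, name)))
```

Under this reading, "beef" and "pork" both contribute to the shared feature concept:meat, so documents that use different words for the same concept end up closer together.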
Cluster Metrics • P : computer-generated clusters • L : human-created clusters • P, L: sets of clusters (partitionings)
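The slide does not say which metrics are computed from P and L. As an illustration only, purity is one commonly used measure over two such partitionings: each computer-generated cluster is credited with its best overlap against the human-created clusters. Representing each cluster as a Python set of document ids is an assumption of this sketch.

```python
def purity(P, L):
    """Fraction of documents that fall in the best-matching cluster of L,
    summed over the clusters of P. Both arguments are lists of sets."""
    total = sum(len(p) for p in P)
    return sum(max(len(p & l) for l in L) for p in P) / total

P = [{1, 2, 3}, {4, 5}]   # computer-generated clusters
L = [{1, 2}, {3, 4, 5}]   # human-created clusters

print(purity(P, L))  # 0.8
print(purity(L, P))  # 0.8 (swapping the arguments gives inverse purity)
```

Here document 3 is the only one placed differently from the human partitioning, which accounts for the 4-out-of-5 score.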
Information Highlighting for Supporting Search • Challenge: • 10-minute limit • KMi Planet News web site • Compile a list of important: • People • Technologies
Information Highlighting for Supporting Search • Tools: • Regular browser • Magpie • ESpotter • C-PANKOW
Teams • A : web browser only • B : web browser with AKT information • C : web browser with AKT++ information
Conclusions (for this section) • Generated ontologies can be comparable to hand-crafted ontologies • Humans can trust the computer too much! (Group C drop in score)
Related Work • Query Expansion • Information Retrieval • Text Clustering and Classification • Natural Language Processing
Natural Language Processing • Ambiguity resolution (e.g., “bank”) • Compounds (e.g., “headache medicine”) • Vague words (e.g., “with”, “of”, “has”) • Selectional restrictions • Anaphora
More Applications • Word sense disambiguation • Classification of unknown words • Named Entity Recognition (NER) • Anaphora Resolution • Question Answering (e.g., “Who wrote the Hobbit?” answered from “Tolkien is the author of the Hobbit.”) • Information Extraction (e.g., AUTOSLOG, ASIUM)
Analysis/Conclusion • Pro/con: • Focused on two systems • Passing survey of others