Evaluating Ontology Search: Towards Benchmarking in Ontology Search
Paul Buitelaar, Thomas Eigner
Competence Center Semantic Web & Language Technology Lab, DFKI GmbH, Saarbrücken, Germany


Presentation Transcript


  1. Evaluating Ontology Search: Towards Benchmarking in Ontology Search. Paul Buitelaar, Thomas Eigner. Competence Center Semantic Web & Language Technology Lab, DFKI GmbH, Saarbrücken, Germany

  2. Overview
     • Ontology Search
       • Knowledge reuse (integration with Ontology Learning)
     • OntoSelect
       • Browse (ontologies, labels, classes, properties)
       • Search by topic
     • Evaluating Ontology Search
       • Benchmark (evaluation) data set
       • Experiment (compare SWOOGLE, OntoSelect)
     • Conclusions

  3. Ontology Search
     • More and more ontologies are being published on the (Semantic) Web
       • Available as RDFS or OWL files (also still DAML)
       • This opens up possibilities for reuse of knowledge
     • Access is through ontology search engines and/or (manual/automatic) organization in ontology libraries
     • But: it is becoming increasingly hard to find the right one for your application
     • Increasing research in ontology search/selection (Alani et al., Buitelaar et al., Ding et al., Sabou et al.) – SWOOGLE, OntoSelect, Watson

  4. OntoSelect
     • Ontology Library and Search Engine: http://olp.dfki.de/OntoSelect
     • Monitors the web for ontologies, with automatic harvesting and indexing
     • Browse and search
       • Over ontologies, classes, properties and (multilingual) labels
       • Ontology search integrates relevance feedback over Wikipedia for the search term
     • Ontology publishing
       • Submitted ontologies are automatically integrated
     • Statistics
       • On formats, languages, labels used and ontology publishing
     Reference: Paul Buitelaar, Thomas Eigner, Thierry Declerck: OntoSelect: A Dynamic Ontology Library with Support for Ontology Selection. In: Proc. of the Demo Session at the International Semantic Web Conference, Hiroshima, Japan, Nov. 2004.

  5. OntoSelect – Browse

  6. Ontology Search

  7. Keyword as Wikipedia Topic

  8. Keyword Expansion (Extraction): Relevance Feedback from Wikipedia
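
The slides do not spell out how the relevance feedback from Wikipedia is computed. The sketch below is a minimal, hypothetical reading of it: fetch the plain-text extract of the topic's Wikipedia article and take its most frequent content words as additional query terms. The helper names (fetch_wikipedia_extract, expand_keyword), the stopword list and the frequency heuristic are illustrative assumptions, not the actual OntoSelect implementation.

```python
# Hypothetical sketch of keyword expansion via Wikipedia relevance feedback.
# Assumption: the expansion terms are the most frequent content words of the
# topic's Wikipedia article; this is NOT the documented OntoSelect algorithm.
import json
import re
import urllib.parse
import urllib.request
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "and", "or", "in", "on", "to", "is",
             "are", "was", "were", "for", "with", "as", "by", "at", "from",
             "that", "this", "its", "it", "be", "which", "also", "such"}

def fetch_wikipedia_extract(topic: str) -> str:
    """Fetch the plain-text extract of the Wikipedia article for `topic`."""
    params = urllib.parse.urlencode({
        "action": "query", "prop": "extracts", "explaintext": 1,
        "format": "json", "titles": topic, "redirects": 1,
    })
    req = urllib.request.Request(
        "https://en.wikipedia.org/w/api.php?" + params,
        headers={"User-Agent": "ontology-search-demo/0.1"})
    with urllib.request.urlopen(req) as resp:
        pages = json.load(resp)["query"]["pages"]
    return next(iter(pages.values())).get("extract", "")

def expand_keyword(topic: str, n_terms: int = 10) -> list:
    """Return the n most frequent content words of the topic's article."""
    tokens = re.findall(r"[a-z]{3,}", fetch_wikipedia_extract(topic).lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return [term for term, _ in counts.most_common(n_terms)]

# Example: expand_keyword("Tourism") might return terms such as "travel",
# "tourists", "destinations" (the exact list depends on the current article).
```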

  9. Ranked Results (Browsable)

  10. Search Criteria
     • Relevance criteria address ontology content, structure and status:
       • Coverage – Term Matching: how many of the terms in a text collection are covered by labels for classes and properties?
       • Structure – Properties Relative to Classes: how detailed is the knowledge structure that the ontology represents?
       • Connectedness – Number of Included Ontologies: is the ontology connected to other ontologies, and how well established are these?
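
As a worked illustration of the three criteria, the snippet below scores an ontology by coverage, structure and connectedness and combines them linearly. The normalisations and the default weights (w_cov, w_str, w_con) are assumptions made for the example; the slide does not give the exact formulas used in OntoSelect.

```python
# Hypothetical scoring sketch for the three relevance criteria named above.
# The exact formulas and weights used by OntoSelect are not given on the
# slide; the normalisations and default weights below are assumptions.
from dataclasses import dataclass


@dataclass
class Ontology:
    labels: set          # lowercased class and property labels
    n_classes: int
    n_properties: int
    n_imports: int       # number of included (imported) ontologies


def coverage(onto: Ontology, query_terms: set) -> float:
    """Coverage: fraction of query terms matched by class/property labels."""
    return len(query_terms & onto.labels) / len(query_terms) if query_terms else 0.0


def structure(onto: Ontology) -> float:
    """Structure: properties relative to classes, as a proxy for detail."""
    total = onto.n_classes + onto.n_properties
    return onto.n_properties / total if total else 0.0


def connectedness(onto: Ontology, max_imports: int = 10) -> float:
    """Connectedness: number of included ontologies, capped and normalised."""
    return min(onto.n_imports, max_imports) / max_imports


def relevance(onto: Ontology, query_terms: set,
              w_cov: float = 0.6, w_str: float = 0.2, w_con: float = 0.2) -> float:
    """Weighted combination of the three criteria (weights are assumptions)."""
    return (w_cov * coverage(onto, query_terms)
            + w_str * structure(onto)
            + w_con * connectedness(onto))
```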

  11. Evaluation – Benchmark
     • Benchmark: 15 Wikipedia topics with 57 manually assigned ontologies, out of the 1056 ontologies cached by OntoSelect
     • The 15 Wikipedia topics were selected from the set of all (37284) class/property labels in OntoSelect as follows:
       • Filter out labels that do not correspond to a Wikipedia page > 5658 labels / topics
       • Use the 5658 labels as search terms in SWOOGLE and filter out labels that return fewer than 10 ontologies (out of the 1056 in OntoSelect) > 3084 labels / topics
       • From the 3084 labels, manually select useful topics, e.g. leaving out very short labels (‘v’) and very abstract ones (‘thing’) > 50 topics
       • Randomly select 15 topics and manually check the ontologies retrieved for them from OntoSelect and SWOOGLE > 15 topics with 57 assigned ontologies
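
The selection steps read as a simple filtering pipeline. The sketch below mirrors the counts on the slide (37284 > 5658 > 3084 > 50 > 15); the predicate functions has_wikipedia_page, swoogle_hit_count and manually_useful are hypothetical stand-ins for the Wikipedia lookup, the SWOOGLE query and the human judgement.

```python
# Sketch of the benchmark-construction pipeline described on this slide.
# has_wikipedia_page, swoogle_hit_count and manually_useful are hypothetical
# stand-ins for the Wikipedia lookup, the SWOOGLE query and the human
# judgement; they are not part of any published API.
import random


def build_benchmark(all_labels, has_wikipedia_page, swoogle_hit_count,
                    manually_useful, n_topics=15):
    # 37284 labels -> 5658: keep only labels that correspond to a Wikipedia page
    wiki_labels = [l for l in all_labels if has_wikipedia_page(l)]
    # 5658 -> 3084: keep labels that return at least 10 ontologies in SWOOGLE
    frequent = [l for l in wiki_labels if swoogle_hit_count(l) >= 10]
    # 3084 -> 50: manual selection, dropping very short ('v') or very
    # abstract ('thing') labels
    useful = [l for l in frequent if manually_useful(l)]
    # 50 -> 15: random sample of topics for manual relevance assessment
    return random.sample(useful, min(n_topics, len(useful)))
```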

  12. Evaluation – Benchmark by Topic
     • 15 (Wikipedia) topics with number of assigned ontologies:
       • Atmosphere (2)
       • Biology (11)
       • City (3)
         • http://www.mindswap.org/2003/owl/geo/geoFeatures.owl
         • http://www.glue.umd.edu/ katyn/CMSC828y/location.daml
         • http://www.daml.org/2001/02/geofile/geofile-ont
       • Communication (10)
       • Economy (1)
       • Infrastructure (2)
       • Institution (1)
       • Math (3)
       • Military (5)
       • Newspaper (2)
       • Oil (0)
       • Production (1)
       • Publication (6)
       • Railroad (1)
       • Tourism (9)

  13. Evaluation – Experiment
     • Comparison of (average) results between SWOOGLE and OntoSelect
     • Use the OntoSelect benchmark:
       • 15 topics (queries)
       • 57 assigned ontologies (relevance assessments)
       • 1056 ontologies (data set)
     • Use different configurations for OntoSelect:
       • With/without keyword expansion/extraction
       • With/without class names (in addition to labels)
       • With/without property labels
       • Weighting of relevance criteria
       • …
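
The slide does not state which effectiveness measure was reported, so the loop below assumes plain precision and recall at a fixed cut-off, averaged over the 15 benchmark topics; search_fn stands for any configuration of SWOOGLE or OntoSelect that returns a ranked list of ontology URIs.

```python
# Illustrative evaluation loop over the benchmark. The slides do not state
# which effectiveness measure was reported; plain precision/recall at a
# fixed cut-off, averaged over the 15 topics, is assumed here.
def precision_recall_at_k(retrieved, relevant, k=10):
    """retrieved: ranked list of ontology URIs; relevant: set of assessed URIs."""
    hits = sum(1 for uri in retrieved[:k] if uri in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def evaluate(search_fn, benchmark, k=10):
    """Average precision/recall of one search configuration.

    search_fn(topic) returns a ranked list of ontology URIs;
    benchmark maps each topic to its manually assigned ontologies."""
    precisions, recalls = [], []
    for topic, relevant in benchmark.items():
        p, r = precision_recall_at_k(search_fn(topic), relevant, k)
        precisions.append(p)
        recalls.append(r)
    n = len(benchmark)
    return sum(precisions) / n, sum(recalls) / n
```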

  14. Evaluation – Results

  15. Evaluation – Weighting of 'title'

  16. Conclusions
     • It is too early to draw final conclusions from the evaluation:
       • Many more configurations (weights) remain to be compared
       • The benchmark should be extended
       • Comparison with other ontology search engines is needed
     • Main contribution of the presented work:
       • The first comprehensive benchmark for topic-driven evaluation of ontology search
       • The (extended) benchmark will be made publicly available at http://olp.dfki.de/OntoSelect
