Report on internship @ DERI, Galway Jan Zemanek (DIKE, UEP; SmILE, DERI) jan.zemanek@deri.org
DERI, Galway in a nutshell • http://www.deri.ie & http://blog.deri.ie/ • DERI – Digital Enterprise Research Institute at National University of Ireland • „Making Semantic Web real.“ • Research areas • Semantic Web (cluster) • Semantic Web Services (cluster) • eLearning (cluster) • Director: Prof. Dr. Stefan Decker • Vice director: Prof. Dr. Manfred Hauswirth • around 100 members • 2 stable Czech members: Dr. Tomas Vitvar, Vit Novacek • Tomas recently launched his own weblog at http://www.vitvar.com/blog/
SmILE subcluster • http://smile.deri.ie/ & http://smile.deri.ie/blog/ • SmILE stands for Semantic Information Systems and Language Engineering Group • Group leader: Dr. Siegfried Handschuh • „focused around the application of Semantic Web and Language Engineering techniques to support knowledge acquisition and re-use in different settings“ • leading project: NEPOMUK (Networked Environment for Personal Ontology-based Management of Unified Knowledge) • NEPOMUK aims to build a Social Semantic Desktop which will present information in a well-defined, machine-processable manner and which will connect and exchange data with other desktops
DINO ontology lifecycle scenario and framework • DINO stands for „Dynamics, INtegration and Ontology“ or „Data and INtensive Ontology“ • it is a scenario and framework for the practical handling of dynamic and large data sets in an ontology lifecycle, focusing particularly on the dynamic integration of learned knowledge into collaboratively developed ontologies
Ontology development • ontologies are very likely subject to change given the dynamic nature of domain knowledge • ontology construction is usually the result of collaboration • it is not always feasible to process all the relevant data and extract the knowledge from them manually • this implies a need for (partial) automation of ontology extraction and management processes in dynamic and data-intensive environments • this can only be achieved by ontology learning
DINO ontology integration • based on the article • Dynamic Integration of Medical Ontologies in Large Scale, Novacek, V.; Laera, L.; Handschuh, S. • more details in • D2.3.8v1 Report and Prototype of Dynamics in the Ontology Lifecycle
DINO ontology integration • scheme of the integration process • phases of the integration • providing a master ontology • providing an extending ontology • alignment/negotiation • reasoning/management • ontology diff • triple sorter • mapping triples to natural language suggestions
DINO phases of integration • providing a master ontology • providing an extending ontology • ontology learning • machine learning and NLP methods are used to process relevant resources and extract knowledge from them • realised using Text2Onto • any “external” ontology can be provided • e.g. we can integrate different ontologies from the same domain, or specialised/general ontologies
DINO phases of integration • alignment/negotiation • the provided ontologies need to be reconciled since they cover the same domain, but might be structured differently • consists of mappings between the concepts, properties, and relationships in the provided ontologies • uses the ontology alignment API developed by INRIA Rhone-Alpes
DINO phases of integration • reasoning/management • used for merging the provided ontologies according to statements in an „alignment ontology“ • the „alignment ontology“ consists of axioms merging classes, individuals and properties • handles inconsistencies such as sub-class hierarchy cycles, disjointness-subsumption and disjointness-instantiation • the resulting ontology is passed to an ontology diff • uses the Jena 2 Ontology API
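One of the inconsistencies named on this slide, a sub-class hierarchy cycle, can be illustrated with a small plain-Java sketch. It does not use the Jena API and is not DINO's actual code: the class `SubClassCycles`, its methods and the example class names are hypothetical, and the hierarchy is assumed to be given as a map from each class to its direct superclasses.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper (not part of DINO or Jena): detects cycles in a
// sub-class hierarchy given as "class -> direct superclasses".
class SubClassCycles {

    // Returns true if following superclass links from some class leads back to it.
    static boolean hasCycle(Map<String, List<String>> superClassesOf) {
        Set<String> finished = new HashSet<>(); // fully explored, known cycle-free
        Set<String> onPath = new HashSet<>();   // classes on the current DFS path
        for (String cls : superClassesOf.keySet()) {
            if (visit(cls, superClassesOf, finished, onPath)) {
                return true;
            }
        }
        return false;
    }

    private static boolean visit(String cls, Map<String, List<String>> superClassesOf,
                                 Set<String> finished, Set<String> onPath) {
        if (onPath.contains(cls)) return true;    // back edge: cycle found
        if (finished.contains(cls)) return false; // already checked
        onPath.add(cls);
        for (String parent : superClassesOf.getOrDefault(cls, List.of())) {
            if (visit(parent, superClassesOf, finished, onPath)) return true;
        }
        onPath.remove(cls);
        finished.add(cls);
        return false;
    }

    public static void main(String[] args) {
        Map<String, List<String>> hierarchy = new HashMap<>();
        hierarchy.put("Aspirin", List.of("Drug"));
        hierarchy.put("Drug", List.of("Substance"));
        System.out.println(hasCycle(hierarchy)); // prints false
        hierarchy.put("Substance", List.of("Aspirin"));
        System.out.println(hasCycle(hierarchy)); // prints true
    }
}
```

A merge step could run such a check after applying the alignment axioms and report the offending classes instead of silently accepting the merged hierarchy.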
DINO phases of integration • ontology diff • possible ontology extensions are equal to the additions that the merged ontology brings into the master ontology • the added triples form the basis for possible ontology extension suggestions
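The idea of the diff can be sketched as a set difference over triples: whatever is in the merged ontology but not in the master ontology is a candidate extension. This is a simplified illustration only; `OntologyDiff`, `Triple` and `additions` are hypothetical names, triples are compared as plain strings, and a real RDF diff would also have to deal with blank nodes and inferred statements.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch of the diff idea, not DINO's implementation.
class OntologyDiff {

    // A triple reduced to three strings (an assumption made for brevity).
    record Triple(String subject, String predicate, String object) {}

    // Additions = triples present in the merged ontology but absent from the master.
    static Set<Triple> additions(Set<Triple> master, Set<Triple> merged) {
        Set<Triple> result = new LinkedHashSet<>(merged);
        result.removeAll(master);
        return result;
    }

    public static void main(String[] args) {
        Set<Triple> master = Set.of(new Triple("Drug", "subClassOf", "Substance"));
        Set<Triple> merged = Set.of(
                new Triple("Drug", "subClassOf", "Substance"),
                new Triple("Aspirin", "subClassOf", "Drug"));
        // Only the Aspirin triple is new with respect to the master ontology.
        System.out.println(additions(master, merged));
    }
}
```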
DINO phases of integration • triple sorter • orders the possible suggestions by a relevance measure (based on preferred and unwanted terms)
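The slide does not spell out the relevance measure, so the following is only an assumed toy scoring: a triple gains one point for each preferred term among its subject and object and loses one for each unwanted term, and triples are then sorted by descending score. `TripleSorter`, `relevance` and `sort` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; the scoring rule is an assumption, not DINO's actual measure.
class TripleSorter {

    record Triple(String subject, String predicate, String object) {}

    // +1 for every preferred term, -1 for every unwanted term in subject/object.
    static int relevance(Triple t, Set<String> preferred, Set<String> unwanted) {
        int score = 0;
        for (String term : List.of(t.subject(), t.object())) {
            if (preferred.contains(term)) score++;
            if (unwanted.contains(term)) score--;
        }
        return score;
    }

    // Returns a new list ordered from most to least relevant; ties keep input order.
    static List<Triple> sort(List<Triple> triples, Set<String> preferred, Set<String> unwanted) {
        List<Triple> sorted = new ArrayList<>(triples);
        sorted.sort(Comparator
                .comparingInt((Triple t) -> relevance(t, preferred, unwanted))
                .reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<Triple> triples = List.of(
                new Triple("Cat", "subClassOf", "Animal"),
                new Triple("Aspirin", "subClassOf", "Drug"),
                new Triple("Virus", "subClassOf", "Organism"));
        // Scores are -1, 1 and 0, so the order becomes Aspirin, Virus, Cat.
        System.out.println(sort(triples, Set.of("Drug"), Set.of("Cat")));
    }
}
```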
DINO phases of integration • mapping triples to natural language suggestions • the produced suggestions take the form of very simple natural language statements which are obtained directly from the sorted triples
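A minimal way to picture this step is one sentence template per predicate. The templates, the fallback sentence and the names `SuggestionWriter` and `toSuggestion` below are all assumptions made for illustration; the wording DINO actually produces is not given on the slide.

```java
import java.util.Map;

// Hypothetical sketch: turns a triple into a simple English suggestion.
class SuggestionWriter {

    record Triple(String subject, String predicate, String object) {}

    // Assumed templates, one per known predicate.
    private static final Map<String, String> TEMPLATES = Map.of(
            "rdfs:subClassOf", "The class %s is a sub-class of the class %s.",
            "rdf:type", "The individual %s is an instance of the class %s.",
            "owl:disjointWith", "The class %s is disjoint with the class %s.");

    static String toSuggestion(Triple t) {
        String template = TEMPLATES.get(t.predicate());
        if (template == null) {
            // Generic fallback for predicates without a dedicated template.
            return t.subject() + " is related to " + t.object()
                    + " by the property " + t.predicate() + ".";
        }
        return String.format(template, t.subject(), t.object());
    }

    public static void main(String[] args) {
        // prints: The class Aspirin is a sub-class of the class Drug.
        System.out.println(toSuggestion(new Triple("Aspirin", "rdfs:subClassOf", "Drug")));
    }
}
```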
DINO integration manager • original plans • DINO was initially meant to be part of the MarcOnt portal • MarcOnt portal (http://www.marcont.org/) • an environment for collaborative ontology development being developed at DERI, Galway • DINO as a set of cooperating Protege(-OWL) plugins • Semantic Version Manager plugin • Protege plugin built upon SemVersion • Collaborative Protege • problems with 3rd party libraries used in Text2Onto and GATE • reality • DINO as a stand-alone Java application
DINO integration manager • Demo
Semantic Web for Java developers • interesting Java tools for Semantic Web technologies that I encountered or had to deal with directly • SemVersion • RDF2Go • RDFReactor
SemVersion • http://wiki.ontoworld.org/wiki/SemVersion • developed mainly by Max Voelkel • a versioning system for RDF and RDF ontologies • backed by Sesame 2 (since v1.0.0 alpha) • lets you • version RDF models • commit and merge RDF models • Semantic Version Manager • an implementation of SemVersion as a Protege plugin
RDF2Go • http://wiki.ontoworld.org/wiki/RDF2Go • an abstraction over triple (and quad) stores • allows a programmer to code against the RDF2Go interface and thus to stay independent of the underlying RDF store • supported RDF stores • Jena 2.4 • Sesame 2.0 beta 6 (the latest release) • used in • SemVersion • Aperture
RDF2Go • RDF2Go example code:
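The sketch below is not RDF2Go's API. It only illustrates, in plain Java, the design idea described for RDF2Go: application code is written against a store-independent interface, so the backing store can be swapped. `TripleStore`, `InMemoryStore`, `Statement` and `describePerson` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of coding against a store-independent interface.
class StoreAbstractionSketch {

    record Statement(String subject, String predicate, String object) {}

    // Stand-in for an abstraction layer over different RDF stores.
    interface TripleStore {
        void add(Statement statement);
        List<Statement> find(String subject);
    }

    // One possible backend; another could delegate to a real RDF store.
    static class InMemoryStore implements TripleStore {
        private final List<Statement> statements = new ArrayList<>();

        @Override
        public void add(Statement statement) {
            statements.add(statement);
        }

        @Override
        public List<Statement> find(String subject) {
            List<Statement> result = new ArrayList<>();
            for (Statement s : statements) {
                if (s.subject().equals(subject)) result.add(s);
            }
            return result;
        }
    }

    // Application code sees only the interface, never the concrete store.
    static int describePerson(TripleStore store, String uri, String name) {
        store.add(new Statement(uri, "rdf:type", "foaf:Person"));
        store.add(new Statement(uri, "foaf:name", name));
        return store.find(uri).size();
    }

    public static void main(String[] args) {
        TripleStore store = new InMemoryStore();
        System.out.println(describePerson(store, "http://example.org/people#max", "Max")); // prints 2
    }
}
```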
RDFReactor • http://wiki.ontoworld.org/wiki/RDFReactor • a view of RDF data through object-oriented Java proxies, which makes working with RDF natural for Java developers • „Think in objects, not statements.“ • all state information is in an RDF model in an RDF store at all times • RDFReactor Java proxies are stateless • Java proxies are generated automatically from RDF Schema
RDFReactor • example code:
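The following is not code generated by RDFReactor. It is a hand-written plain-Java sketch of the stateless-proxy idea: the proxy object keeps no data of its own, only a reference to the model and a resource URI, and every getter and setter goes straight to the model. `Model`, `Person` and `Key` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of a stateless object-oriented proxy over RDF-like data.
class StatelessProxySketch {

    record Key(String subject, String predicate) {}

    // Minimal stand-in for an RDF model: one value per (subject, predicate).
    static class Model {
        private final Map<Key, String> values = new HashMap<>();

        void set(String subject, String predicate, String value) {
            values.put(new Key(subject, predicate), value);
        }

        String get(String subject, String predicate) {
            return values.get(new Key(subject, predicate));
        }
    }

    // The proxy holds only the model reference and the URI, never the name itself.
    static class Person {
        private static final String NAME = "http://xmlns.com/foaf/0.1/name";
        private final Model model;
        private final String uri;

        Person(Model model, String uri) {
            this.model = model;
            this.uri = uri;
        }

        void setName(String name) { model.set(uri, NAME, name); }

        String getName() { return model.get(uri, NAME); }
    }

    public static void main(String[] args) {
        Model model = new Model();
        Person a = new Person(model, "http://example.org/people#max");
        Person b = new Person(model, "http://example.org/people#max");
        a.setName("Max");
        System.out.println(b.getName()); // prints Max
    }
}
```

Because all state lives in the model, two proxies for the same URI always agree, which is the point of the "stateless" bullet above.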
The very last slide • Any (other) questions? Thank you for your attention!