Break Out Session on Infrastructure and Technology: A Report. Vipul Kashyap AOS Workshop, Rome, 15 November 2001 vipul_kashyap@yahoo.com. Outline. A “Template” Architecture for the AOS System Components of the Architecture Tools, Techniques, Algorithms and Software for the Architecture
Break Out Session on Infrastructure and Technology: A Report
Vipul Kashyap
AOS Workshop, Rome, 15 November 2001
vipul_kashyap@yahoo.com
Outline
• A “Template” Architecture for the AOS System
• Components of the Architecture
• Tools, Techniques, Algorithms and Software for the Architecture
• Recommendations: Priorities
A “Template” Architecture for the AOS System
[Figure: architecture diagram. Labeled components: User Query/Information Request; Ontology Server; Inter-Ontology Relationships Manager; Integration Infrastructure (J2EE, Agents); Information System 1 ... Information System N; Data Repositories.]
Components of the Architecture
• Tools and Techniques for Ontology Building
• Tools and Techniques to associate ontologies with underlying data
• Technologies for Distributed Query Processing/Search
• Tools and Techniques for Distributed Ontology Integration/Interoperation
• Tools and Techniques for Ontology Maintenance and Versioning
• Integration Infrastructure
• Back end technologies to store data and information repositories
• User Interface Issues
Tools and Techniques for Ontology Building
• Tools and process for building an ontology from scratch
  • Data model specific (e.g., EER, Object Oriented, RDF(S), DAML+OIL)
  • InfoSleuth Ontology Editor, OKBC Editor, Protégé, OntoEdit (free and commercial), Ontology Builder
  • I-logix, Uniting Software Design, Tigris (UML based tools)
  • Open source tools available from http://www.semanticweb.org, http://www.daml.org
• Enhancement of existing KOSs into domain specific ontologies
  • Enhancement of database schemas (relational, object oriented) to ontologies
    • ERWin: generates E-R models from database schemas
  • Enhancement of thesauri, glossaries, subject headings, controlled vocabularies, classification lists to develop ontologies
    • No known software tools
• Design a process of inter-agency collaboration for building AOS based on existing KOSs
  • Process designed in the context of the EDEN System
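The slide notes that no software tool was known for enhancing thesauri into ontologies. As a rough illustration of what a first pass could look like, the sketch below turns broader-term (BT), related-term (RT) and use-for (UF) entries into candidate triples for an ontology engineer to review. The `ThesaurusTerm` class, the relation names and the sample terms are all hypothetical and do not come from the workshop or from any existing tool.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ThesaurusTerm:
    """One thesaurus entry (hypothetical structure, for illustration only)."""
    label: str
    broader: List[str] = field(default_factory=list)   # BT links
    related: List[str] = field(default_factory=list)   # RT links
    use_for: List[str] = field(default_factory=list)   # UF (non-preferred labels)


def thesaurus_to_candidate_triples(terms: List[ThesaurusTerm]) -> List[Tuple[str, str, str]]:
    """Emit candidate triples; every triple still needs human review."""
    triples = []
    for term in terms:
        for parent in term.broader:
            # BT is only a *candidate* is-a link: it may really be part-of.
            triples.append((term.label, "candidate_subclass_of", parent))
        for other in term.related:
            # RT is untyped; flag it so a modeller can pick a precise relation.
            triples.append((term.label, "related_to_unrefined", other))
        for synonym in term.use_for:
            triples.append((term.label, "has_label", synonym))
    return triples


# Hypothetical sample data.
terms = [
    ThesaurusTerm("Maize", broader=["Cereals"], related=["Corn oil"], use_for=["Corn"]),
    ThesaurusTerm("Cereals", broader=["Crops"]),
]
triples = thesaurus_to_candidate_triples(terms)
```

The relation names are deliberately tentative: a thesaurus BT link does not distinguish is-a from part-of, so the output is a work list for refinement rather than a finished ontology.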
Tools and Techniques to associate ontologies with underlying data
• Tools for mapping ontological concepts to database schemas
  • OR mapping tools (J2EE suite)
  • InfoSleuth/EDEN mapping tools, Kaon-Reverse
• Tools for annotating documents (and fragments) with ontological concepts and relationships
  • IKA class of software, Onto-Mat
• Tools for dealing with multi-lingual ontologies
  • E.g., OntoEdit, Kaon-Soep
• Tools for annotating images with ontological concepts
  • No known software
• Generation of Websites from Ontology
  • E.g., Semantic Miner
• Data Mining/Classification/Ontology Learning Tools
  • E.g., Decision Tree based algorithms, C4.5, e.g., Whizbang!
  • Neural Networks, Statistical Clustering, Latent Semantic Indexing
• Learning based annotation of documents
  • E.g., TextToOnto
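To make "mapping ontological concepts to database schemas" concrete, here is a minimal sketch in which a concept and its properties are mapped to a table and its columns, and a query phrased in ontology terms is translated to SQL. The mapping table, the `crops` schema and the function name are assumptions made for illustration; they do not reflect how InfoSleuth/EDEN, Kaon-Reverse or any O/R mapping tool actually works.

```python
import sqlite3

# Hypothetical mapping: ontology concept -> (table, {ontology property: column}).
CONCEPT_MAP = {
    "Crop": ("crops", {"name": "crop_name", "family": "plant_family"}),
}


def concept_query(concept, properties):
    """Translate a request for a concept's properties into a SQL SELECT."""
    table, columns = CONCEPT_MAP[concept]
    # Alias each column back to the ontology property name it stands for.
    select_list = ", ".join(f"{columns[p]} AS {p}" for p in properties)
    return f"SELECT {select_list} FROM {table}"


# Hypothetical in-memory repository standing in for an information system.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE crops (crop_name TEXT, plant_family TEXT)")
conn.execute("INSERT INTO crops VALUES ('maize', 'Poaceae')")

sql = concept_query("Crop", ["name", "family"])
rows = conn.execute(sql).fetchall()
conn.close()
```

Keeping the mapping as data rather than code means a second information system with a different schema only needs its own mapping entry, not a new query generator.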
Technologies for Distributed Query Processing/Search
• Distributed indexing, meta-search
  • AGRIS multi-host search engine
• Ontology-based, multi-resource distributed query processing (pull)
  • Federated database technology, e.g., Carnot, Mermaid, InfoSleuth/EDEN, Interbase
• Ontology-based event notification and subscription (push)
  • E.g., InfoSleuth/EDEN agent-based approach, Oracle Triggers
• Multimedia Search: e.g., specify query using image, get images, text documents, etc.
  • E.g., IBM, Virage
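The push model listed above (ontology-based event notification and subscription) can be sketched as a small broker: a user subscribes to a concept, and any item published under that concept or one of its narrower concepts is delivered. The `Broker` class, the `PARENT` hierarchy and the sample items are hypothetical; this is not the InfoSleuth/EDEN agent protocol or an Oracle trigger.

```python
from collections import defaultdict

# Hypothetical is-a hierarchy: child concept -> parent concept.
PARENT = {"Maize": "Cereals", "Cereals": "Crops"}


def ancestors_and_self(concept):
    """Return the concept followed by all of its ancestors."""
    chain = [concept]
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain


class Broker:
    """Minimal in-process publish/subscribe keyed by ontology concept."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, concept, callback):
        self._subscribers[concept].append(callback)

    def publish(self, concept, item):
        # Notify subscribers of the concept itself and of every broader concept.
        for c in ancestors_and_self(concept):
            for callback in self._subscribers.get(c, []):
                callback(item)


received = []
broker = Broker()
broker.subscribe("Cereals", received.append)
broker.publish("Maize", "new maize yield report")   # narrower than Cereals: delivered
broker.publish("Crops", "general crops bulletin")   # broader than Cereals: not delivered
```

The ontology does the work here: the subscriber never mentions "Maize", yet receives the maize report because the hierarchy places it under "Cereals".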
Tools and Techniques for Distributed Ontology Integration/Interoperation
• Tools for Mappings between various Thesauri and enabling their convergence
  • No Known Software
• Tools for Ontology Brokering
  • No Known Software
• Identification of inter-ontology terminological relationships
  • E.g., 20 candidate subject relationships for information retrieval
• Identify the unit of re-use:
  • Inclusion of sub-ontologies, concepts, aggregations/reifications
• Integration of community partner subject ontologies with the AOS ontology, tools for managing a federated ontology structure
  • No Known Software
• Algorithm for query processing by re-using the above relationships (query re-writing)
  • E.g., ONION (Stanford), OBSERVER project
  • E.g., Protégé (Manual Merging for Ontologies, Stanford)
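Query re-writing over inter-ontology relationships can be illustrated with a toy example: each term of a query posed against one ontology is replaced by the terms it is linked to in a partner ontology. The relationship table, the `aos:` and `partner:` prefixes and the relation names are invented for this sketch, which is not the ONION or OBSERVER algorithm.

```python
# Hypothetical inter-ontology relationships:
# (term in source ontology, relation, term in partner ontology).
RELATIONSHIPS = [
    ("aos:Maize", "synonym", "partner:Corn"),
    ("aos:Cereals", "hypernym_of", "partner:Wheat"),
    ("aos:Cereals", "hypernym_of", "partner:Corn"),
]

# Relations that may be followed when rewriting. Synonyms preserve meaning;
# following hypernym links expands a broad term into its narrower partner terms.
USABLE_RELATIONS = {"synonym", "hypernym_of"}


def rewrite_query(terms, relationships=RELATIONSHIPS):
    """Rewrite a conjunctive query term by term.

    Returns one list per input term; the partner terms inside each list are
    alternatives (OR), and the lists themselves are combined with AND.
    An empty list means the term has no known translation.
    """
    rewritten = []
    for term in terms:
        targets = {
            target
            for source, relation, target in relationships
            if source == term and relation in USABLE_RELATIONS
        }
        rewritten.append(sorted(targets))
    return rewritten


result = rewrite_query(["aos:Cereals", "aos:Maize", "aos:Soil"])
```

Returning an empty list for an unmapped term, rather than silently dropping it, lets the caller decide whether to answer the query partially or to report the loss of information.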
Tools and Techniques for Ontology Maintenance and Versioning
• No Known Software
• Tools and Techniques for maintenance of inter-ontological relationships
  • No Known Software
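Since the session found no known software for this component, the following is only a sketch of the most basic operation a versioning tool would need: comparing two versions of a concept hierarchy. Each version is represented as a hypothetical dictionary from concept to parent concept; the sample versions are invented.

```python
def diff_versions(old, new):
    """Compare two ontology versions given as {concept: parent} dictionaries."""
    old_concepts, new_concepts = set(old), set(new)
    shared = old_concepts & new_concepts
    return {
        "added": sorted(new_concepts - old_concepts),
        "removed": sorted(old_concepts - new_concepts),
        # Concepts present in both versions whose parent changed.
        "reparented": sorted(c for c in shared if old[c] != new[c]),
    }


# Hypothetical versions of a small hierarchy (None marks a root concept).
v1 = {"Crops": None, "Cereals": "Crops", "Maize": "Crops"}
v2 = {"Crops": None, "Cereals": "Crops", "Maize": "Cereals", "Sorghum": "Cereals"}

changes = diff_versions(v1, v2)
```

A change report of this kind is also the natural input for maintaining inter-ontology relationships: any mapping that mentions a removed or reparented concept is a candidate for review.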
Integration Infrastructure
• Component based technologies:
  • E.g., J2EE, .NET, COM
• Agent based Systems
  • E.g., InfoSleuth/EDEN, FIPA
• Markup/Representation languages
  • Agent-based, e.g., OKBC, KIF, KQML
  • Web-based, e.g., XML, RDF(S), DAML+OIL, DRDFS (based on conceptual graphs)
• Web Services Technology
  • E.g., WSDL + UDDI + SOAP, ebXML
• Important Criteria for evaluation:
  • Scalability, Performance, Fail Over, Recovery
Back end technologies to store data and information repositories
• Structured Data
  • Relational Databases
    • E.g., Oracle, DB2, MySQL, SQL Server
  • Object Oriented Databases
    • E.g., ObjectStore, Versant, Poet
• Textual Data
  • E.g., Verity, Documentum, Isis, Basis
• Web Sites and Related Development Tools
  • Template driven websites, ASPs, JSPs…
• Knowledge Bases for Ontology Storage and Inferencing
  • ICS-FORTH (RDF Suite, No inference), SESAME, 4SUITE, RDF-DB
  • E.g., KL-ONE Systems, CLASSIC, BACK, …
  • E.g., Allegra (Common Lisp based)
  • Object Oriented Databases
  • Relational Databases
• Repositories for managing vocabularies and thesauri
  • E.g., LEXICON, MultiTes, Knowledge Map
User Interface Issues
• Type of Users
  • Browsing, Ontology-based Navigation, keywords
  • Exposure to query language, e.g., SQL, DL?, TQML
• Visualization
  • Ontologies
    • E.g., FRODO
  • Queries
  • Results
Priorities for Various Components
• Tools and Techniques (T&T) for Ontology Building (1)
• T&T for associating ontologies with underlying data (2)
• User Interface and Visualization (3)
• T&T for Multi-ontology interoperation (4)
• T&T for Ontology (and Inter-Ontology Relationships) Maintenance and Versioning (5)
• T&T for Distributed Query and Search (6)
• Underlying Integration Infrastructure (7)
• Back End Technologies (8)
Criteria
• Transaction/Scalability/Performance
• Classification Accuracy
• Open Source, Internationalization
• Industry Standards/Interoperability
• Easy to make future extensions
• Multimedia
• Other ontologies