Information Integration and the Semantic Web: Finding knowledge, data and answers
Tim Finin (1), Anupam Joshi (1), Li Ding (2)
(1) University of Maryland, Baltimore County
(2) Stanford University, Knowledge Systems Lab
Joint work with Yun Peng, Cynthia Parr, Andriy Parafinyk, Lushan Han, Pranam Kolari, Pavan Reddivari, Rong Pan, Akshay Java, Joel Sachs and others.
http://ebiquity.umbc.edu/resource/html/id/327/
http://creativecommons.org/licenses/by-nc-sa/2.0/
This work was partially supported by DARPA contract F30602-97-1-0215, NSF grants CCR007080 and IIS9875433 and grants from IBM, Fujitsu and HP.
But what about our agents? Agents still have a very minimal understanding of text and images.
But what about our agents? A Google for knowledge on the Semantic Web is needed by software agents and programs.
Information Integration and the Semantic Web
• The Semantic Web enables information integration with standards supporting shared semantic models, ontology mapping, common tools, etc.
• A Google-like global index can help people and programs to
  • Find Semantic Web ontologies and data
  • Understand how these are being used
  • Build trust and provenance models
  • Assemble ontology maps
  • Create new integration tools
http://swoogle.umbc.edu/
• Running since summer 2004
• 1.8M RDF docs, 320M triples, 10K ontologies, 15K namespaces, 1.3M classes, 175K properties, 43M instances, 600 registered users
Applications and use cases
1 Supporting Semantic Web developers
  • Ontology designers, vocabulary discovery, who uses what ontologies & data, use analysis, errors, statistics, etc.
2 Helping scientists publish and find data
  • Spire: aggregating observations and data from biologists
  • InferenceWeb: searching over and enhancing proofs
  • SemNews: Text Meaning of news stories
3 Supporting SW tools
  • Triple shop: finding data for SPARQL queries
80 ontologies were found that had these three terms. By default, ontologies are ordered by their ‘popularity’, but they can also be ordered by recency or size. Let’s look at this one.
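The three ordering options can be pictured with a small sketch. The records, field names and scores below are made up for illustration; they are not Swoogle's actual data model or ranking values.

```python
from datetime import date

# Hypothetical search hits; field names and values are illustrative only.
hits = [
    {"url": "http://example.org/onto/a.owl", "popularity": 0.42,
     "last_modified": date(2005, 3, 1), "triples": 1200},
    {"url": "http://example.org/onto/b.owl", "popularity": 0.87,
     "last_modified": date(2004, 11, 15), "triples": 300},
    {"url": "http://example.org/onto/c.owl", "popularity": 0.10,
     "last_modified": date(2006, 1, 20), "triples": 5400},
]

# One key function per ordering criterion mentioned on the slide.
SORT_KEYS = {
    "popularity": lambda hit: hit["popularity"],
    "recency": lambda hit: hit["last_modified"],
    "size": lambda hit: hit["triples"],
}


def order_hits(hits, by="popularity"):
    """Return the hits sorted in descending order of the chosen criterion."""
    return sorted(hits, key=SORT_KEYS[by], reverse=True)


for criterion in SORT_KEYS:
    names = [hit["url"].rsplit("/", 1)[-1] for hit in order_hits(hits, by=criterion)]
    print(criterion, names)
```

Keeping the criteria in a lookup table makes popularity the default while letting a caller switch to recency or size with a single argument.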
All of this is available in RDF form for the agents among us.
Here’s what the agent sees. Note the swoogle and wob (web of belief) ontologies.
2 • An NSF ITR collaborative project with
  • University of Maryland, Baltimore County
  • University of Maryland, College Park
  • University of California, Davis
  • Rocky Mountain Biological Laboratory
Invasive Species
• Invasive species cost the U.S. economy over $138 billion per year.
• By various estimates, these species contribute to the decline of 35% - 46% of U.S. endangered and threatened species.
• The invasive species problem is growing, as the number of pathways of invasion increases.
Pimentel et al. 2000. Environmental and economic costs associated with non-indigenous species in the United States. Bioscience 50:53-65.
Charles Groat, Director, U.S. Geological Survey, http://www.usgs.gov/invasive_species/plw/usgsdirector01.html
East River Valley Trophic Web http://www.foodwebs.org/
Biologists Gathering data • Increase utility • Maximize productivity • Foster discovery • Broaden participation
Representing and sharing data
• Journal articles
• Flat files
• Spreadsheets
• Local databases
• On the Web in HTML or XML
ELVIS: Ecosystem Localization, Visualization, and Integration System
• Food web constructor
• Species list constructor
• ? . . .
Example species list: Oreochromis niloticus (Nile tilapia), Bacteria, Microprotozoa, Amphithoe longimana, Caprella penantis, Cymadusa compta, Lembos rectangularis, Batea catharinensis, Ostracoda, Melanitta, Tadorna tadorna
ELVIS Food Web Constructor predicts basic network structure. Prelude to systems models.
Examine evidence for predicted links. The Evidence Provider lets users explore evidence (data, papers, reasoning) for food web links.
Supporting ontologies and their use
• SpireEcoConcepts, for
  • confirmed and potential food web links
  • bibliographic information of food web studies
  • ecosystem terms
  • taxonomic ranks
• California Wildlife Habitat Relationships Ontology
  • life history
  • geographic range
  • management information
• ETHAN (Evolutionary Trees and Natural History)
  • Natural history information on species derived from data in the Animal Diversity Web and other taxonomic sources
3 UMBC Triple Shop
• http://sparql.cs.umbc.edu/
• Online SPARQL RDF query processing with several interesting features (SPARQL is an RDF query language)
  • Automatically finds data for queries using Swoogle
  • Datasets, queries and results can be saved, tagged, annotated, shared, searched for, etc.
• RDF datasets as first class objects
  • Can be stored on our server or downloaded
  • Can be materialized in a database or (soon) as a Jena model
What are body masses of fishes that eat fishes? Triple Shop . . . leaving out the FROM clause
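As a rough illustration of leaving out the FROM clause, the sketch below writes the question as a SPARQL query with no dataset and then adds FROM clauses once candidate documents are known. The `ex:` vocabulary, the property names and the document URLs are hypothetical placeholders, not the ontology or data the Triple Shop actually uses, and the dataset-discovery step (done with Swoogle in the Triple Shop) is replaced here by a hard-coded list.

```python
# Sketch only: the ex: vocabulary and the document URLs are invented for illustration.
QUERY_WITHOUT_FROM = """\
PREFIX ex: <http://example.org/foodweb#>
SELECT ?predator ?mass
WHERE {
  ?predator a ex:Fish ;
            ex:eats ?prey ;
            ex:bodyMass ?mass .
  ?prey a ex:Fish .
}"""


def add_from_clauses(query: str, dataset_urls: list[str]) -> str:
    """Insert one FROM clause per dataset URL just before the WHERE keyword."""
    head, sep, tail = query.partition("\nWHERE")
    if not sep:
        raise ValueError("query has no WHERE clause on its own line")
    from_lines = "".join(f"\nFROM <{url}>" for url in dataset_urls)
    return head + from_lines + sep + tail


# Stand-in for documents a Swoogle-like search might return (hypothetical URLs).
candidate_documents = [
    "http://example.org/data/fish-diets.rdf",
    "http://example.org/data/fish-body-mass.rdf",
]

full_query = add_from_clauses(QUERY_WITHOUT_FROM, candidate_documents)
print(full_query)
```

The user states only what is wanted; the dataset becomes a separate object that can be chosen, inspected and swapped without touching the query body.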
We’ll run the query against this dataset to see if the results are as expected.
Results http://sparql.cs.umbc.edu/tripleshop2/
Looks like a useful dataset!
• Let’s annotate, tag and save it, and also materialize it in the TS triple store.
• Queries can also be annotated, tagged and shared.
Themes revisited
• The Web contains the world’s knowledge in forms accessible to people and computers
• The Semantic Web enables information integration with standards supporting shared semantic models, ontology mapping, common tools, etc.
• We need better ways to discover, index, search and reason over knowledge on the Semantic Web
• Swoogle-like systems help create consensus ontologies, foster best practices, find data and support tools.
For more information: http://ebiquity.umbc.edu/ (annotated in OWL)