The Semantic Web: A network of understanding. Jim Hendler Univ of MD/RPI http://www.cs.umd.edu/~hendler. Outline. The Semantic Web Past Present Future. May, 2001. March, 2000. May, 1994.
The Semantic Web: A network of understanding Jim Hendler Univ of MD/RPI http://www.cs.umd.edu/~hendler
Outline • The Semantic Web • Past • Present • Future May, 2001 March, 2000 May, 1994
Semantic Web hypothesis: Heterogeneous Web-based information resources can be connected by Web-based knowledge models. [Figure: a PubMed record labeled "PVT, Burkitt’s Lymphoma" with the annotated text "Rearrangement of a DNA sequence homologous to a <cell-type>cell-virus junction fragment</cell-type> in several <disease>Moloney murine leukemia</disease> virus-induced <organism>rat</organism> thymomas", shown connected through the Semantic Web (link labels: 8q24, PVT1) to the knowledge model Oncogene(MYC): Found_In_Organism(Human). Gene_Has_Function(Transcriptional_Regulation). Gene_Has_Function(Gene_Transcription). In_Chromosomal_Location(8q24). Gene_Associated_With_Disease(Burkitts_Lymphoma).]
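The hypothesis can be sketched in a few lines of Python: the inline annotations from the PubMed text and the ontology facts about MYC are both reduced to (subject, predicate, object) triples, so both sources end up in one graph that can be queried uniformly. This is only an illustration under stated assumptions; the `mentions_*` predicates, the `pubmed:abstract` identifier, and the helpers `annotations_to_triples` and `facts_about` are hypothetical names that do not appear on the slide.

```python
import re

# Annotated sentence as shown on the slide.
ABSTRACT = (
    "Rearrangement of a DNA sequence homologous to a "
    "<cell-type>cell-virus junction fragment</cell-type> in several "
    "<disease>Moloney murine leukemia</disease> virus-induced "
    "<organism>rat</organism> thymomas"
)

# Ontology facts from the slide, rewritten as triples.
ONTOLOGY = {
    ("MYC", "Found_In_Organism", "Human"),
    ("MYC", "Gene_Has_Function", "Transcriptional_Regulation"),
    ("MYC", "Gene_Has_Function", "Gene_Transcription"),
    ("MYC", "In_Chromosomal_Location", "8q24"),
    ("MYC", "Gene_Associated_With_Disease", "Burkitts_Lymphoma"),
}


def annotations_to_triples(doc_id, text):
    """Turn each <tag>value</tag> span into a (doc, mentions_<tag>, value) triple."""
    pattern = re.compile(r"<([\w-]+)>(.*?)</\1>")
    return {(doc_id, f"mentions_{tag}", value) for tag, value in pattern.findall(text)}


def facts_about(graph, subject):
    """Return all (predicate, object) pairs recorded for one subject, sorted."""
    return sorted((p, o) for s, p, o in graph if s == subject)


# One graph holding both the document annotations and the ontology facts.
graph = ONTOLOGY | annotations_to_triples("pubmed:abstract", ABSTRACT)

for predicate, obj in facts_about(graph, "pubmed:abstract"):
    print(predicate, "->", obj)
```

The same `facts_about` call works for `"MYC"`, which is the point: once everything is a triple, documents and knowledge models are queried the same way.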
Web ontologies • Web Ontologies are models allowing the linking of • multimedia • databases • services • Web services • Grid computing • meta-data repos • Or any other Web resource! • Other ontologies • Anything with a URI
The "layercake" T. Berners-Lee, 2001
2001 (timeline: Funded Research, WG activity, Recommendation) • Research, experimentation, early demonstrations • Reminiscent of the early days of the Web
2003 (timeline: Funded Research, WG activity, Recommendation) • Early government adoption • Emerging corporate interest
2005 (timeline: Funded Research, WG activity, Recommendation) • Commercial tools • Lots of open source software • Scalability
Significant Corporate Activity • Semantic (Web) technology companies starting & growing • Siderean, SandPiper, SiberLogic, Ontology Works, Intellidimension, Intellisophic, TopQuadrant, Data Grid, … • Bigger players buying in • Adobe, Cisco, HP, IBM, Nokia, Oracle, Sun, Vodafone… announcements/use in 2005-2006 • Gartner identifies Corporate Semantic Web as one of three "High impact" Web technologies • Tools being announced: AllegroGraph, Altova, TopBraid, … • Government projects in and across agencies • US, UK, EU, Japan, Korea, … • Life sciences/pharma an increasingly important market • Health Care and Life Sciences Interest Group at W3C • Many open source tools available • Kowari, RDFLib, Jena, Sesame, Protégé, SWOOP, Onto(xxx), Wilbur, …
Richer metadata Embedded meta-data Data harvesting & visualization Enterprise data integration "Corporate Semantic Web", Gartner "hot pick" for 2006
Digital asset management Semantic Web portals Ontology editors (and other tools) Semantic Web and social networking
Significant Corporate Activity 50+ Semantic Web press releases each month
Significant Government Activity • Agencies moving beyond the "talk" phase • primarily prototyping, but first acquisitions starting • Example: • NASA is developing an enterprise data strategy around using existing data via Semantic Web integration (A. Schain, 3/06)
There's a Lot Out There! Paid ads 2,120,000 hits on "RDF filetype:rdf" 13,600 hits on "ontology filetype:owl" (March, 2006)
Where we are today • Survey of 1300 OWL ontologies found by crawl • Wang 06 • 19 ontologies with 2000+ classes • 6 ontologies with 10000+ classes • 2 ontologies with 50000+ classes • CYC, NCI
Swoogle http://swoogle.umbc.edu
Some "Swoogle" observations • The OWL namespace has been declared by 113,000 SWDs (8%) and actually used by 108,000 (7%). • The RDFS namespace enjoys more use, being declared by 677,000 (47%) and used by 538,000 (37%) SWDs. • owl:Class is the most used term from the OWL namespace, with ~1,800,000 instantiations in 68,000 SWDs. • We also noticed significant use of two OWL equality assertions: owl:sameAs (280,000 assertions in 17,00 SWDs) and owl:equivalentClass (70,000 assertions in 4,300 SWDs). Their common use may be an indication of increased ontology alignment. (Ebiquity blog, Sept 1, 2006)
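The percentages quoted above imply a total corpus size, which gives a quick consistency check on the figures. The sketch below simply divides each count by its percentage. Because the percentages are rounded to whole numbers, the results are only rough estimates, and the dictionary keys are labels chosen here for illustration.

```python
# Counts and rounded percentages as quoted from the Ebiquity blog post.
observations = {
    "OWL declared": (113_000, 8),
    "OWL used": (108_000, 7),
    "RDFS declared": (677_000, 47),
    "RDFS used": (538_000, 37),
}


def implied_total(count, percent):
    """Estimate the total number of SWDs from a count and its rounded percentage."""
    return count / (percent / 100)


estimates = {name: implied_total(c, p) for name, (c, p) in observations.items()}

for name, total in estimates.items():
    print(f"{name}: about {total:,.0f} SWDs in total")
```

All four estimates land between roughly 1.4 and 1.55 million documents, so the quoted counts and percentages agree with one another to within rounding.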
The cake is evolving as well... (Tim Berners-Lee, 2001) (Tim Berners-Lee, 2006)
New languages underway • SPARQL • Query language for (distributed) RDF triple stores • The SQL of the Semantic Web • GRDDL/RDFa • Integration of HTML world and Semantic Web • Means for "embedding" RDF-based annotation on traditional Web pages • Means for generating RDF triple stores from (annotated) Web pages • RIF • Rule Interchange Format • Representing rules on the Web • Linking rule-based systems together • And more • Multimedia annotation, Web-page Metadata annotation, Health Care and Life Science (LSID), Privacy
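To give a feel for what a SPARQL query does, the sketch below matches a list of triple patterns against a small in-memory set of triples, binding names that start with `?` to values. It is a toy illustration in plain Python, not SPARQL syntax and not how any particular triple store is implemented; the sample data, the predicate names, and the `match` function are made up for this example.

```python
def match(patterns, triples, binding=None):
    """Yield every variable binding that satisfies all triple patterns."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # A variable must agree with any value it was bound to earlier.
                if candidate.setdefault(term, value) != value:
                    break
            elif term != value:
                # A constant in the pattern must equal the triple's term.
                break
        else:
            # All three positions matched; continue with the remaining patterns.
            yield from match(rest, triples, candidate)


# Hypothetical data, loosely based on the gene example earlier in the talk.
TRIPLES = [
    ("MYC", "associatedWith", "Burkitts_Lymphoma"),
    ("MYC", "locatedAt", "8q24"),
    ("PVT1", "locatedAt", "8q24"),
]

# Roughly: SELECT ?gene WHERE { ?gene locatedAt ?loc . MYC locatedAt ?loc }
query = [("?gene", "locatedAt", "?loc"), ("MYC", "locatedAt", "?loc")]
genes = sorted(b["?gene"] for b in match(query, TRIPLES))
print(genes)
```

The shared variable `?loc` is what joins the two patterns, much as a join column does in SQL.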
Built in pieces at different times Linked together for greater effect
The World Wide Web Built in pieces at different times Linking of "Web Islands" Linked together for greater effect
Linking is power!
<?xml version="1.0" encoding="UTF-8"?>
<!-- Entities give short names to the ontology namespaces being linked. -->
<!DOCTYPE rdf:RDF [
  <!ENTITY feleuk.owl "http://www.mindswap.org/ontologies/feleuk.owl">
  <!ENTITY owl "http://www.w3.org/2002/07/owl#">
  <!ENTITY rdf "http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <!ENTITY rdfs "http://www.w3.org/2000/01/rdf-schema#">
  <!ENTITY NCI "http://www.ncibi.nih.gov/NCIT/NCIT.owl#">
  <!ENTITY CYC "http://www.cyc.com/2004/06/04/cyc#">
]>
<rdf:RDF xml:base="&feleuk.owl;" xmlns:owl="&owl;" xmlns:rdf="&rdf;" xmlns:rdfs="&rdfs;" xmlns:NCI="&NCI;" xmlns:CYC="&CYC;">
  <owl:Ontology rdf:about="" rdfs:label="Feline Leukemia" owl:versionInfo="Feline Leuk 1.0"/>
  <owl:Class rdf:about="#Feline-Leukemia">
    <!-- Link into the NCI ontology. -->
    <rdfs:subClassOf rdf:resource="&NCI;Leukemia"/>
    <rdfs:subClassOf>
      <owl:Restriction>
        <!-- Link into (Open)CYC: the diseased organism must be a cat. -->
        <owl:allValuesFrom rdf:resource="&CYC;cat"/>
        <owl:onProperty rdf:resource="&NCI;diseased-organism"/>
      </owl:Restriction>
    </rdfs:subClassOf>
  </owl:Class>
</rdf:RDF>
Link to 45000 terms at NCI
Link to 47000 (Open)CYC terms
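One way to see the cross-ontology links in an OWL file like this is to collect every `rdf:resource` target and group the targets by namespace. The sketch below uses only Python's standard `xml.etree.ElementTree`. The embedded document is a trimmed variant of the slide's example with full URIs written out in place of entity references, and the function name `links_by_namespace` is an assumption for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Trimmed variant of the Feline Leukemia ontology, with full URIs spelled out.
DOC = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="#Feline-Leukemia">
    <rdfs:subClassOf rdf:resource="http://www.ncibi.nih.gov/NCIT/NCIT.owl#Leukemia"/>
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:allValuesFrom rdf:resource="http://www.cyc.com/2004/06/04/cyc#cat"/>
        <owl:onProperty rdf:resource="http://www.ncibi.nih.gov/NCIT/NCIT.owl#diseased-organism"/>
      </owl:Restriction>
    </rdfs:subClassOf>
  </owl:Class>
</rdf:RDF>
"""


def links_by_namespace(xml_text):
    """Group every rdf:resource target by the namespace part before '#'."""
    root = ET.fromstring(xml_text.strip())
    groups = defaultdict(list)
    for element in root.iter():
        target = element.get(f"{{{RDF_NS}}}resource")
        if target and "#" in target:
            namespace, _, local_name = target.partition("#")
            groups[namespace + "#"].append(local_name)
    return dict(groups)


links = links_by_namespace(DOC)
for namespace, names in links.items():
    print(namespace, names)
```

Each key in `links` is an external ontology that this small file reuses, which is the linking the slide is pointing at.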
Linking is power • Today we can find thousands of ontologies • Available on the Web • Linked to Web resources • Linked to data resources • Linked to each other • Linked to Web 2.0-like annotations • And billions of annotated (semi-Knowledge engineered) objects • Available on the Web • Linked to Web resources • Linked to data resources • Linked to each other • Linked to the ontologies We must link these together for great effect!!
A key opportunity • Vast amounts of "semi-engineered" knowledge • Flickr: tens of millions of keyword tagged photos • Wikipedia: thousands of carefully documented subjects (in a hierarchy, with disambiguation, …) • Etc. etc. etc. • With "persistent" URIs • "tank" http://en.wikipedia.org/wiki/Tank (armament) • "tank" http://en.wikipedia.org/wiki/Tank%2C_Pakistan (small town in Pakistan) • And anything with a URI can be linked to the Semantic Web!!!!!
For exciting linking possibilities • Linking of Web 2.0 and Semantic Web • Using informal KE to bootstrap "formal" KE • Extending formal KE from Web 2.0
Evolving vision Documents, linked to Images, annotated with Ontologies, linked to Other ontologies, describing Databases, exported as RDF graphs, as input to Services, which designate Documents, linked to … (ad infinitum) Stay tuned… 2001 2000 1994
Semantic Web Challenges • Today's Semantic Web Languages • Are not-very-expressive-KR-language standards • Not KIF, or even KL-ONE • Create non-persistent knowledge bases • Servers come and go • Ontologies change over time • And can't be kept consistent • Disagreement, error, dishonesty…
Semantic Web opportunities • Today's Semantic Web Languages • Are not-very-expressive-KR-language standards • Like HTML is to SGML • Create non-persistent KBs • Like the 404 error (w/o which there is no Web) • And can't be kept consistent • Like blog-space and Web 2.0 • We need to accept, and more importantly exploit, these features
Note to Grad students (and their advisors) • The Semantic Web today, esp at the ontology layer, is like the Web with no one using <a href=…> • What makes the Web, the Web • Please, No more one ontology, one domain, one set of services, one … Theses • There's a reason we built this stuff on top of RDF and URIs The network effect is where the power is!
A few of the many things I've left out • Semantic Web Services • Crucial for linking "programs" into the mix • Semantic Web tools and scaling issues • Engineering approaches being used to scale Semantic Web stores to database sizes • Information extraction and Semantics • Can we "retrofit" semantics on the existing Web? • Semantic Web Information Creations • Can we make it so we don't have to retrofit the future Web? • Other information resources • Personal data, unstructured resources, off-line collection information, digital libraries, … • There's more that isn't on the Web than is on it! • New Web use patterns • Social networks, blogs, wikis, … • … are all fertile areas for Semantic Web exploration
Conclusion • The Semantic Web is real • Tremendous progress in the past five years • Lots of it is out there • Growing support in industry and govt use • Development continues • Easy to get involved • Many open source tools • New languages and techniques reaching critical mass • The next steps are exciting • The "network effect" of linking to other Semantic Web resources • … and to non-Semantic Web resources • And research opportunities still abound • Scaling • Inconsistency • Access and acquisition