The Semantic Web (State of the art and implications for language processing)
Deborah McGuinness
Associate Director and Senior Research Scientist
Knowledge Systems Laboratory, Stanford University, Stanford, CA, USA
dlm@ksl.stanford.edu
http://www.ksl.stanford.edu/people/dlm
AAAI July 28, 2002
Outline
• Vision of the web today and tomorrow
• The key to tomorrow’s web is semantics
• Semantics on the web requires:
  • Language for encoding meaning (DAML+OIL, OWL)
  • Ontologies
  • Tools
• Conclusion and Pointers
Today: Rich Information Source for Human Manipulation/Interpretation
[Figure: the web connecting Human and Human]
“I know what was input”
• Global documents and terms indexed and available for search
• Search engine interfaces
• Entire documents retrieved according to relevance (instead of answers)
• Human input, review, assimilation, integration, action, etc.
• Special-purpose interfaces required for user-friendly applications
The web knows what was input but does little interpretation, manipulation, integration, or action. It is analogous to a new assistant who is thorough yet lacks common sense, context, and adaptability.
Tomorrow: Rich Information Source for Agent Manipulation/Interpretation
[Figure: the web connecting Human, Agent, and Agent]
“I know what was meant”
• Understand term meaning and user background
• Interoperable (can translate between applications)
• Programmable (thus agent operational)
• Explainable (thus maintains context and can adapt)
• Capable of filtering (thus limiting display and human intervention requirements)
• Capable of executing services
Layer Cake Foundation
Stated goals of the Semantic Web
• Define conventions for applications that exchange metadata on the Web
• Enable vocabulary semantics to be defined by communities of expertise, not W3C
• Provide for the fine-grained mixing of diverse metadata
• Make it cost-effective for people to record their knowledge
• Ultimate goal: design enabling technologies to support machine-facilitated global knowledge exchange
Semantic Markup Languages such as OWL, DAML+OIL (http://www.w3.org/2001/sw/WebOnt/, http://www.daml.org)
[Figure labels: Ontologies; DAML/OWL-enabled web pages]
• Encoding background info
• User modeling info
• Annotating web pages
• Annotating services
(thereby limiting the need for human disambiguation input, human interpretation, multiple answer display, translation assistance, agent assistance, adaptivity support, etc.)
XML
• World Wide Web Consortium (W3C) standard
• Provides an important solution to the syntax problem, plus simple semantics and schemas: <SSN>555-17-1234</SSN>
• Now we can describe the meaning of words
• Many applications of XML appearing:
  • Geography Markup Language (GML)
  • Extensible rights Markup Language (XrML)
  • Chemical Markup Language (CML)
Problem: Limited semantics, limited ontology creation
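The "limited semantics" problem can be illustrated with a minimal Python sketch. The two records, the `SocialSecurityNumber` tag, and the `SAME_AS` mapping are hypothetical and not from the talk; the point is only that an XML parser matches tag names as strings, so any equivalence between vocabularies has to be supplied from outside the XML.

```python
import xml.etree.ElementTree as ET

# Two hypothetical records that mean the same thing but use different tag names.
doc_a = ET.fromstring("<person><SSN>555-17-1234</SSN></person>")
doc_b = ET.fromstring(
    "<person><SocialSecurityNumber>555-17-1234</SocialSecurityNumber></person>"
)


def find_value(root, tag):
    """Return the text of the first child element named `tag`, or None."""
    node = root.find(tag)
    return node.text if node is not None else None


# Plain XML processing: only the literal tag name matches.
ssn_a = find_value(doc_a, "SSN")
ssn_b = find_value(doc_b, "SSN")  # no match, although the meaning is the same

# Hypothetical hand-written mapping standing in for what an ontology would declare.
SAME_AS = {"SocialSecurityNumber": "SSN"}


def find_value_with_mapping(root, tag):
    """Like find_value, but normalizes child tag names through SAME_AS first."""
    for child in root:
        if SAME_AS.get(child.tag, child.tag) == tag:
            return child.text
    return None


print(ssn_a)
print(ssn_b)
print(find_value_with_mapping(doc_b, "SSN"))
```

The mapping table here is maintained by hand; an ontology language aims to let such equivalences be published and shared instead.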
DARPA Agent Markup Language
• http://www.daml.org/about.html
• Extends the vocabulary of XML and RDF/S
• Provides a rich ontology representation language
• Language features chosen so that the language can have efficient implementations
DARPA DAML Program
• Began with a kickoff meeting in August 2000
• 19 research groups supported
• Initial ontology language aims to extend XML and RDF/S and to benefit from frames and from principled KR systems like Description Logics
• DAML-ONT released in Oct. 2000
• DAML+OIL released in March 2001
DAML Language
[Figure: language lineage]
• Web Languages: RDF/S, XML
• DAML-ONT, OIL, DAML+OIL (OWL)
• Formal Foundations: Description Logics, Frame Systems
• FACT, CLASSIC, DLP, …
DAML+OIL -> W3C
• W3C WebOnt working group formed with the DAML+OIL submission as its starting point: http://www.w3.org/Submission/2001/12/
W3C Web Ontology Working Group
• Web Ontology Working Group in the W3C Semantic Web Activity, aimed at “extending the semantic reach of current XML and RDF meta-data efforts.”
• History
  • W3C announcement in November 2001: http://lists.w3.org/Archives/Public/www-rdf-logic/2001Nov/0000.html
  • Weekly teleconferences starting in November 2001
WEBONT catches on…
• Includes over 50 members from over 30 organizations
• Industry, including:
  • Large companies such as Daimler Chrysler, EDS, Fujitsu, HP, IBM, Intel, Lucent, Nokia, Philips Electronics, Sun, Unisys, …
  • Newer/smaller companies such as IVIS Group, Network Inference, Stilo Technology, Unicorn Solutions, …
• Government and not-for-profits:
  • Defense Information Systems Agency, Interoperability Technology Association for Information Processing, Japan (INTAP), Intelink Mgt Office, Mitre, …
• Universities and research centers:
  • University of Bristol, University of Maryland, University of Southampton, Stanford University, …
  • DFKI (German Research Center for Artificial Intelligence), Forschungszentrum Informatik
• Invited experts
  • Well-known academics from non-W3C members
WEBONT cont.
• Quarterly face-to-face meetings in:
  • Murray Hill: http://www.w3.org/2001/sw/WebOnt/ftf1.html
  • Amsterdam: http://www.w3.org/2001/sw/WebOnt/ftf2.html
  • Stanford: http://www.w3.org/2001/sw/WebOnt/ftf3.html
  • Upcoming – Bristol, and xxx
• Interesting documents:
  • DAML+OIL submission (full spec with reference description, walkthrough, FOL and model-theoretic semantics): http://www.w3.org/TR/daml+oil-reference
  • Use case and requirements document: http://www.w3.org/TR/webont-req/
OWL Lite and OWL
• Feature Synopsis: http://www.w3.org/TR/owl-features/
• Reference Description: http://www.w3.org/TR/owl-ref/
• Abstract Syntax: http://www.w3.org/TR/owl-absyn/
OWL Lite Features
• RDF Schema Features
  • Class
  • rdf:Property
  • rdfs:subClassOf
  • rdfs:subPropertyOf
  • rdfs:domain
  • rdfs:range
  • Individual
• Equality and Inequality
  • sameClassAs
  • samePropertyAs
  • sameIndividualAs
  • differentIndividualFrom
• Restricted Cardinality
  • minCardinality (restricted to 0 or 1)
  • maxCardinality (restricted to 0 or 1)
  • cardinality (restricted to 0 or 1)
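To give a feel for what a reasoner does with rdfs:subClassOf, here is a minimal Python sketch, not an OWL implementation. The wine class names and the individual `bottle42` are hypothetical; the sketch only follows subclass links transitively and uses them to infer the types of an individual.

```python
# Hypothetical direct subClassOf statements: class -> set of direct superclasses.
DIRECT_SUPERCLASSES = {
    "Merlot": {"RedWine"},
    "RedWine": {"Wine"},
    "Wine": {"Drink"},
}

# Hypothetical assertion that an individual belongs to a class.
ASSERTED_TYPE = {"bottle42": "Merlot"}


def superclasses(cls, direct):
    """Return every class reachable from `cls` by following subClassOf links."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in direct.get(current, ()):
            if parent not in found:  # also guards against cycles
                found.add(parent)
                stack.append(parent)
    return found


def inferred_types(individual, asserted, direct):
    """Asserted class of the individual plus all of its superclasses."""
    cls = asserted[individual]
    return {cls} | superclasses(cls, direct)


print(sorted(inferred_types("bottle42", ASSERTED_TYPE, DIRECT_SUPERCLASSES)))
```

An agent that asks for every Drink can then find `bottle42` even though it was only ever annotated as a Merlot.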
OWL Lite Features (cont)
• Property Characteristics
  • inverseOf
  • TransitiveProperty
  • SymmetricProperty
  • FunctionalProperty (unique)
  • InverseFunctionalProperty (unambiguous)
  • allValuesFrom (universal local range restrictions; previously toClass)
  • someValuesFrom (existential local range restrictions; previously hasClass)
• Datatypes
  • Following the decisions of RDF Core
• Header Information
  • imports
  • Dublin Core Metadata
  • versionInfo
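The property characteristics above license new statements from existing ones. The following Python sketch applies inverseOf, SymmetricProperty, and TransitiveProperty to a small set of triples until nothing new can be derived. The property names, individuals, and the `infer` helper are hypothetical illustrations, not part of any OWL toolkit.

```python
def infer(triples, inverse_of, symmetric, transitive):
    """Return the input triples plus everything derivable from the declared
    property characteristics (naive fixpoint iteration)."""
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            if p in inverse_of:  # p inverseOf q: (s p o) gives (o q s)
                new.add((o, inverse_of[p], s))
            if p in symmetric:  # SymmetricProperty: (s p o) gives (o p s)
                new.add((o, p, s))
            if p in transitive:  # TransitiveProperty: (s p o), (o p x) give (s p x)
                for s2, p2, o2 in facts:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))
        if new <= facts:  # nothing new was derived
            return facts
        facts |= new


# Hypothetical example data.
triples = {
    ("alice", "hasParent", "bob"),
    ("alice", "siblingOf", "carol"),
    ("Stanford", "locatedIn", "California"),
    ("California", "locatedIn", "USA"),
}
facts = infer(
    triples,
    inverse_of={"hasParent": "hasChild", "hasChild": "hasParent"},
    symmetric={"siblingOf"},
    transitive={"locatedIn"},
)
for fact in sorted(facts - triples):
    print(fact)
```

The loop terminates because it can only ever combine the finitely many individuals and properties already present in the data.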
OWL Features
• Class Axioms
  • oneOf (enumerated classes)
  • disjointWith
  • sameClassAs applied to class expressions
  • rdfs:subClassOf applied to class expressions
• Boolean Combinations of Class Expressions
  • unionOf
  • intersectionOf
  • complementOf
• Arbitrary Cardinality
  • minCardinality
  • maxCardinality
  • cardinality
• Filler Information
  • hasValue (descriptions can include specific value information)
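The class constructors above behave like set operations on class extensions. The Python sketch below models them over a small, fully known domain. That closed, finite domain is a simplifying assumption made for illustration: OWL itself does not assume that all individuals are known. All names in the sketch are hypothetical.

```python
# Hypothetical, fully enumerated domain of individuals (a toy assumption).
DOMAIN = frozenset({"merlot", "chardonnay", "rose", "stout"})

RED_WINE = frozenset({"merlot"})
WHITE_WINE = frozenset({"chardonnay"})
WINE = frozenset({"merlot", "chardonnay", "rose"})


def one_of(*individuals):
    """oneOf: a class defined by enumerating its members."""
    return frozenset(individuals)


def union_of(*classes):
    """unionOf: individuals that belong to at least one of the classes."""
    return frozenset().union(*classes)


def intersection_of(first, *rest):
    """intersectionOf: individuals that belong to every class."""
    return frozenset(first).intersection(*rest)


def complement_of(cls, domain=DOMAIN):
    """complementOf, relative to the toy closed domain."""
    return domain - cls


def disjoint_with(a, b):
    """disjointWith: the two classes share no individuals."""
    return a.isdisjoint(b)


# "Wines that are neither red nor white", built from the constructors.
other_wine = intersection_of(WINE, complement_of(union_of(RED_WINE, WHITE_WINE)))
print(sorted(other_wine))
print(disjoint_with(RED_WINE, WHITE_WINE))
```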
Tools
In order to take off, we need tools: http://www.daml.org/tools/
Tool categories: Annotation, Ontology Translation, Browser, Persistence, Crawler, Query Tools, Editor, RDMS Mapping, Graph Visualizer, Report Generation, Transformation, Search, Validator, Ontology Analyzer, Importer, Ontology Editor, Inference Engine
Many are in research labs, but some are in companies: Network Inference, Sandpiper, Ontoprise, …
Ontologies Exploding
• Upper-level ontologies emerging:
  • UNSPSC, SUO, OpenCyc, OpenDirectory, TAP, …
• Specialized ontologies emerging:
  • UMLS, NPC
• Libraries
  • http://www.daml.org/ontologies/
  • http://www.ksl.stanford.edu/ontolingua
Some Observations…
• Markup languages are growing in acceptance and expressive power
• User base, tool base, and ontology base are growing
• Ontology-enhanced applications are springing up (not just in ivory towers like FindUR, eCyc, …)
Simple Ontology-Enhanced Apps
Applied Semantics
Applied Semantics uses a large-scale ontology, or knowledge base of concepts and their relationships, to bring semantic understanding to the processing of unstructured information. Our software products and services improve the business processes for publishing, enterprise applications, and internet infrastructure markets by automating content tagging, categorization, and summarization for more effective information sharing and retrieval.
• Founded in 1998
• Won Internet World Fall 1999 “Best of Show” for meaning-based search
• 40 employees
• Funding from: Zero Gravity, Ridgestone, others
• 50+ customers, including:
CIRCA in Publishing
[Figure labels: News Content; CIRCA Technology (Auto-Categorizer, Metadata Creator, Page Summarizer); Proprietary Content Management System; Users]
A major newspaper uses Auto-Categorizer to categorize by IPTC code (a standard publishing taxonomy), Metadata Creator to generate meaningful thematic keywords, and Page Summarizer to produce summaries of varying lengths.
Conclusion/Discussion
• The Semantic Web is in its infancy today but is ready for applications
• Markup languages, ontologies, and some tools are ready for use
• Hybrid applications, such as ontology-enhanced search and ontology-enhanced knowledge capture, may be the first to grow
• Let's get together…
Extras
What is an Ontology?
[Figure: ontology spectrum, from least to most expressive]
• Catalog/ID
• Terms/glossary
• Thesauri (“narrower term” relation)
• Informal is-a
• Formal is-a
• Formal instance
• Frames (properties)
• Value Restrs.
• Disjointness, Inverse, part-of…
• General Logical constraints
Some Pointers
• Ontologies Come of Age paper: http://www.ksl.stanford.edu/people/dlm/papers/ontologies-come-of-age-abstract.html
• OWL: http://www.w3.org/TR/owl-features/, http://www.w3.org/TR/owl-ref/
• DAML+OIL: http://www.daml.org/, http://www.w3.org/TR/daml+oil-reference
Contact Information
dlm@ksl.stanford.edu
www.ksl.stanford.edu/people/dlm