CS690L Semantic Web and Knowledge Discovery: Concept, Technologies, Tool
Yugi Lee
STB #555
(816) 235-5932
leeyu@umkc.edu
www.sice.umkc.edu/~leeyu
This presentation was designed based on the SWWS-01 symposium report and Farquhar’s Ontology tutorial.
CS690L - Lecture 2
Semantic Web?
• "The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation." -- Tim Berners-Lee, James Hendler, Ora Lassila, The Semantic Web, Scientific American, May 2001
Semantic Interoperability
• The interoperability layer to migrate from the syntactic to the semantic!
• From Data space to Knowledge space
• Integration and composition
Interoperability
• Object Interoperability:
  • This is the layer at which current industry middleware products are aimed.
  • However, these objects are primarily defined as containers for software and for streamlining the software development process.
  • The CORBA and EJB object models are examples of standards at this layer.
Interoperability
• Meta-Model Interoperability:
  • This is the layer at which the cross-over from the "data" space to the "knowledge" space takes place.
  • The objects here are viewed as containers of knowledge to be fleshed out by upper layers.
  • The OKBC and RDF(S) core models are examples of standards at this layer.
Interoperability
• Ontology Interoperability:
  • This is the layer where ontologies, schemas and classifications are built upon common underlying standardized meta-models.
  • The ability to use different ontologies to specify and query information constitutes interoperability at this layer.
  • Ontology Standardization
Interoperability
• Meta-Data (View/Query) Interoperability:
  • Semantic metadata descriptions can be constructed from one or more underlying ontologies.
  • The issue at this layer is decomposing information requests into sub-requests supported by the individual semantic metadata descriptions of the information sources.
  • Ontology Query Language
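The decomposition step named on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the source names (`crop_db`, `geo_db`), the ontology terms, and the `decompose` helper are all hypothetical, and a real system would match against far richer metadata descriptions than plain sets of terms.

```python
# Hypothetical metadata descriptions: the ontology terms each source can answer.
SOURCE_DESCRIPTIONS = {
    "crop_db": {"Crop", "plantedArea"},
    "geo_db": {"Country", "locatedIn"},
}


def decompose(request_terms, descriptions):
    """Split a request into per-source sub-requests.

    Returns (plan, unsupported): plan maps each source to the terms it
    will answer; unsupported holds the terms no source describes.
    """
    plan = {}
    remaining = set(request_terms)
    for source, supported in descriptions.items():
        covered = remaining & supported
        if covered:
            plan[source] = covered
            remaining -= covered
    return plan, remaining


plan, unsupported = decompose(
    {"Crop", "plantedArea", "Country", "price"}, SOURCE_DESCRIPTIONS
)
```

Here `plan` assigns the crop terms to one source and the country term to the other, while `price` is reported as unsupported rather than silently dropped.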
Interoperability
• Process/Services Interoperability:
  • Semantic process/service descriptions can be constructed from resources on the Web and one or more underlying ontologies.
  • The issue at this layer is enabling better discovery, selection, composition, monitoring, and interoperability of services and processes.
  • A resource description, informally called its “semantics”, includes the information about the resource that computers can use, not just for display purposes but for automatic processing in various applications.
  • Service Ontology, Semantic Web service workflow, Web Service Discovery, handling of semantic heterogeneity, QoS specification for Web Services and Processes.
Practical Motivation: Semantic Web/Application
• The Semantic Web is more than simply some sort of academic foolishness or rewarmed AI vision.
• The applications showed that real technology and tools are being built in the Semantic Web community, and that there is a lot of interest in these technologies on the part of industry and government.
• The web services community showed one area where there is tremendous industrial interest and where Semantic Web technology could be an important part of the work.
Challenging Example: Query Answering
• “How many acres of cotton are planted in China?”
  • Response from today’s Web: some documents, some of which may contain the answer, somewhere
  • Response from the Semantic Web of the future: “15,485,000 in 2000, says the USDA”
• Deductive query answering rather than document retrieval
• Ontologies will be a primary source of knowledge for reasoning: they enable derivation of answers not explicitly on a Web site
• Even simple Web sites may reference large and distributed ontologies: a challenge to query-answering reasoners
• Ontologies could include special-purpose query-answering reasoners
  • For proving instances of atomic formulas in the ontology’s vocabulary and making inferences from sentences in the ontology’s vocabulary
  • Requires an API for special-purpose reasoners
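To make "deductive query answering rather than document retrieval" concrete, here is a minimal sketch, assuming a toy triple store and a single subclass rule. The facts, the individual `field42`, and the helper names are hypothetical illustrations, not part of any system mentioned in the lecture.

```python
# Hypothetical facts as (subject, predicate, object) triples.
FACTS = {
    ("Cotton", "subClassOf", "FiberCrop"),
    ("FiberCrop", "subClassOf", "Crop"),
    ("field42", "type", "Cotton"),
}


def infer_types(facts):
    """Forward-chain one rule until nothing new is derived:
    if (x type C) and (C subClassOf D) then (x type D)."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        new = set()
        for (s, p, o) in derived:
            if p != "type":
                continue
            for (c, q, d) in derived:
                if q == "subClassOf" and c == o:
                    new.add((s, "type", d))
        if not new <= derived:
            derived |= new
            changed = True
    return derived


def ask(facts, triple):
    """Answer a yes/no query against stated and derived facts."""
    return triple in infer_types(facts)


is_crop = ask(FACTS, ("field42", "type", "Crop"))
```

The triple `("field42", "type", "Crop")` is never stated in `FACTS`; the query succeeds only because the ontology's subclass statements let it be derived, which is the point the slide makes about answers "not explicitly on a Web site".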
Challenging Example: Semantic Search
TAP’s Semantic Search (Stanford University)
• Retrieves real-time data relevant to a query
  • Determines the “semantic type” of individuals in the query
  • Uses models of relevancy based on types
• Uses a background ontology and a large KB of individuals
  • ~3,000 classes and ~72,000 individuals
  • Downloadable in RDF-S or DAML+OIL
• Can be augmented with use-specific and user-specific ontologies and used to retrieve data relevant to a task being performed
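The two steps on this slide (find the semantic type of the individual in the query, then apply a type-based relevancy model) might look roughly like the following. It is a sketch under assumptions only: the tiny knowledge base, the relevancy table, and `semantic_search_plan` are made up for illustration and do not reflect TAP's actual data or interfaces.

```python
# Hypothetical knowledge base: individual -> class in the background ontology.
KB_TYPES = {
    "yo-yo ma": "Musician",
    "paris": "City",
}

# Hypothetical relevancy model: which kinds of data matter for each type.
RELEVANT_PROPERTIES = {
    "Musician": ["albums", "concert dates"],
    "City": ["weather", "hotels"],
}


def semantic_search_plan(query):
    """Return (semantic type, properties worth retrieving) for a query.

    Unknown individuals yield (None, []), so a caller could fall back
    to ordinary keyword search.
    """
    individual = query.strip().lower()
    sem_type = KB_TYPES.get(individual)
    if sem_type is None:
        return None, []
    return sem_type, RELEVANT_PROPERTIES.get(sem_type, [])


plan_for_paris = semantic_search_plan(" Paris ")
```

The design choice worth noting is that relevance is decided per type rather than per individual, so adding a new individual to the KB needs no new retrieval logic.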
Ontology - What Is an Ontology? [A. Farquhar]
• To communicate, plan, and think, we need a conceptualization of the world
  • What kinds of things are there?
  • What are their properties?
  • What are their relationships?
• These things define our ontology
• We all have ontologies (e.g., of organizations, computers, animals)
• Some are very idiosyncratic. Some are shared!
• Communication and interaction require common shared ontologies.
Ontology - Problems in Communication
• People, organizations, software programs must communicate
• Different needs and backgrounds imply different viewpoints, assumptions, jargon
• This divergence is natural and valuable
• But leads to problems in communication, interaction, and understanding
• Explicit ontologies are crucial for
  • Communication
  • Education
  • Interoperation
  • Integration
  • Adaptive agents
Ontology - Example
• Researchers in molecular biology need to
  • share results and check consistency between their models, data, and reported models and data
• The Riboweb project (Stanford, SMI)
  • Building an ontology for ribosomes, models, data, reports
  • Molecular structure, experimental data, tests, …
  • Encoding (by hand) relevant literature
Ontology - Example
• Doctors, clinics, hospitals, insurance companies, government agencies need to share information
  • Clinical guidelines, drug interactions, covered procedures, best practices
• Several efforts are addressing aspects of this problem
  • UMLS (Unified Medical Language System)
  • SNOMED (Systematized Nomenclature of Medicine)
Ontology - Example
• There are many workflow management systems available
• In order to share information across them and support interoperation, we need to define an integrated ontology that covers
  • Processes, resources, products, services, organizations
• Several groups are involved in such an effort
  • NIST, WfMC, PIF, TOVE
Ontology - Example
• Collaborative engineering projects need to communicate across discipline boundaries
• Several projects (e.g., PACT, Boeing) have worked to build ontologies for the subdisciplines and to span them
• Goals include:
  • Automated notifications on design modifications
  • Cross-disciplinary simulation
  • Improved design process
Ontology - Benefits
• Explicit ontologies support
  • Shared understanding among people
  • Interoperability between tools
  • Systems engineering
  • Reusability
  • Declarative specification
Semantic Web Language: XML
• Language for describing the structure of document content, e.g., declare data to be a retail price, a sales tax, a book title, ...
• Uniform method for describing and exchanging data using HTTP
• Provides a “syntactic schema”
<Publication URL = "ftp://db.stanford … xml.ps">
  <Title> From Semistructured Data ... Language </Title>
  <Author> R. Goldman </Author>
  <Published> Proceedings of ... Databases </Published>
  <Location> <!-- Location of what? -->
    <City> Philadelphia </City>
    <State> Pennsylvania </State>
  </Location>
  <Date> <!-- When in June? -->
    <Month> June </Month>
    <Year> 1999 </Year>
  </Date>
</Publication>
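A short sketch of what a "syntactic schema" gives a program: the structure can be navigated, but nothing in the document says what the tags mean. The snippet below uses Python's standard `xml.etree.ElementTree` on a cut-down copy of the slide's record that keeps only the fields the slide spells out in full; `DOC` and the variable names are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Cut-down version of the slide's record (elided fields are left out).
DOC = """
<Publication>
  <Author>R. Goldman</Author>
  <Location><City>Philadelphia</City><State>Pennsylvania</State></Location>
  <Date><Month>June</Month><Year>1999</Year></Date>
</Publication>
""".strip()

root = ET.fromstring(DOC)

# Structure is easy to navigate with path expressions...
city = root.findtext("Location/City")
year = root.findtext("Date/Year")

# ...but the element itself carries no statement of what it denotes.
# The parser cannot tell whether <Location> is where the paper was
# presented, where the author works, or where the publisher is based.
location_attributes = root.find("Location").attrib
```

The parser happily returns `city` and `year`, yet `location_attributes` is empty: the answer to "Location of what?" lives only in the reader's head, not in the data.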
Semantic Web Language: XML Is Not Enough
• Ontologies enable independently developed programs to exchange data: XML provides “syntactic schema”
• Ontologies specify intended meaning in a computer-interpretable form: XML provides no means of specifying the intended meaning of tags
• “XML is like HTML, where you make up your own tags.”
• “But in XML, you can’t say what your tags mean.”
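One way to see the gap is to take two independently designed XML vocabularies that describe the same thing with different tags. The sketch below is an illustration under assumptions: the second vocabulary (`Paper`, `Writer`), the `TAG_TO_CONCEPT` table, and `to_concepts` are hypothetical, and the hand-written table stands in for the shared, machine-readable definitions an ontology would supply.

```python
import xml.etree.ElementTree as ET

# Hypothetical shared vocabulary: maps each producer's tag to a common concept.
TAG_TO_CONCEPT = {
    "Author": "creator",
    "Writer": "creator",
    "Title": "title",
    "Name": "title",
}


def to_concepts(xml_text):
    """Re-key a flat XML record by shared concept instead of local tag name."""
    root = ET.fromstring(xml_text)
    return {
        TAG_TO_CONCEPT[child.tag]: (child.text or "").strip()
        for child in root
        if child.tag in TAG_TO_CONCEPT
    }


record_a = to_concepts("<Publication><Author> R. Goldman </Author></Publication>")
record_b = to_concepts("<Paper><Writer>R. Goldman</Writer></Paper>")
```

Without the mapping, a program comparing the raw documents sees nothing in common; with it, `record_a` and `record_b` are equal. XML alone offers no place to publish that mapping, which is the role the slide assigns to ontologies.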
W3C Semantic Web Activity
• Semantic Web Activity (http://www.w3.org/2001/sw/)
  • “Established to serve a leadership role, in both the design of enabling specifications and the open, collaborative development of technologies that support the automation, integration and reuse of data across various applications.”
  • Successor to the W3C Metadata Activity
• RDF Core Working Group (http://www.w3.org/2001/sw/RDFCore/)
  • Responsible for the Resource Description Framework (RDF)
• Web Ontology Working Group (http://www.w3.org/2001/sw/WebOnt/)
  • Charter: Build upon the RDF Core work a language for defining structured web-based ontologies which will provide richer integration and interoperability of data among descriptive communities
  • Developing the Web Ontology Language (OWL)
  • Based on DAML+OIL, developed in DARPA’s Agent Markup Language program
Open Issues
• What can we do as individuals and as part of the Semantic Web community?
  • Everyone was frustrated by the "waiting around" for Semantic Web infrastructure to appear.
  • The view was that creating "some" infrastructure was more important than resolving the remaining "expressivity vs. tractability" dilemmas (for example).
  • There is always a risk in solving the easy parts of problems first, because that can make it harder to solve the harder parts later. Nevertheless, the consensus was "forge ahead!"
Open Issues
• Do we need to standardize on foundational models first?
  • Agree on minimalist semantics (expressivity) and a syntax in which to represent units of meaning,
  • leaving for distributed, incremental, and local development the problem of creating actual ontologies
  • that would be expressed, represented and communicated using the foundational model.
Open Issues
• Is the current Semantic Web standards development process adequate?
  • This addresses the dilemma posed by a general acknowledgement that the Semantic Web poses new challenges.
  • The current standards process may be the best that we know how to create, and it still may be inadequate because, for instance, it deals with distributed semantics.
  • At worst, it needs field-testing and feedback from actual use.
Open Issues
• Do we need Semantic Web glossaries? ("pumpkins?")
  • Even though there was no consensus on the definitions, all agreed that Semantic Web glossaries would be a big help;
  • they would be something to disagree with, and would catalyze alternative definitions for important concepts.
• Do we need some ontology ontologies?
  • Everyone recognized the "ontology ontology" problem, that its lack of resolution was an impediment to progress, and that "we are all part of the problem."
  • That is, it's hard to find out what ontologies exist, whether they are worth using, etc. This is part, but not all, of the deep ontology re-use challenge.
Open Issues
• How do we deal with the diversity of languages and tools that are starting to emerge for semantic content?
  • Currently XML, XML Schema, RDF(S), DAML+OIL, WebML, and various other tools are available for metadata storage, querying, etc.
  • It is clear that there is a need for unifying frameworks, toolkits, etc.
• Do we need well-defined semantics in the metadata languages?
  • Many of the applications were using ontology languages like DAML, or extensions of RDF(S).
  • The consensus was that completing the RDFS standard, and moving to a web ontology standard that extends RDFS and XML Schema, was important for these applications.
Open Issues
• Do we all believe that experimentation should continue?
  • Expressivity vs. tractability
  • We have no proof that proposed Semantic Web standards and tools are useful or even work at all.
• The chicken/egg problem:
  • Without semantic markup, there's not a lot of motivation for the industrial base to pay attention to the Semantic Web.
  • Without industry investment/support, the W3C and others have trouble developing standards and getting sources marked up.
  • Current government funding helps to jump start this level, but the Semantic Web community needs to figure out how to both publicize these efforts and increase the dissemination of this technology.
Discussion
• What will make the Semantic Web have a life of its own?
• What are key ontologies that need to be created?
• What are the “killer apps” for the Semantic Web?
• Do you have ontologies you could contribute?