Explore the theoretical foundations and application areas of the Semantic Web compared to today's Web, with implications for LogicBlox and a focus on standardized metadata for climate datasets. Discover how Semantic Web technologies can enhance data management in scientific communities.
Foundations of the Semantic Web. Rocky Dunlap, College of Computing, Georgia Tech. Advisors: Spencer Rugaber and Leo Mark
Things we can talk about • Earth System Curator • Including a short demo • Semantic Web... • ...versus today’s Web • ...theoretical foundations • ...application areas • Implications for LogicBlox • How could SW technologies be useful for LogicBlox?
The short story on Curator • Problem statement • Model-generated datasets in the climate community lack metadata that is required for understanding and analyzing the dataset • Premise • The descriptors used for comprehensively specifying a model configuration are also needed for a scientifically useful description of the model output data • Solution • Develop a standardized metadata formalism for describing climate datasets that is based on the model configuration used to generate the dataset
The Google Way [Figure: 10 lines of metadata versus a run script of 1500+ lines plus input files] • Keyword search... • Good for expert users who are familiar with the datasets • But... • Many details of what went into the dataset are left out • Not machine-readable (e.g., by analysis tools) • Limited standardization • Not semantic or conceptual in nature
NASA GEOS-5 Model http://gmao.gsfc.nasa.gov/systems/geos5/
Demo http://cdp.ucar.edu:28080/query/queryESC.htm
Beyond Curator • Curator is a representative project • E-science communities are emerging in many areas of science • Prediction • Scientific data management -- an emerging area that will continue to grow
Today’s Web • Primarily intended for human consumption • Lots of content, but very little about what the content means • One relationship: the hyperlink • No semantics really • HTML is almost completely about presentation
HTML <html> <body> <h1>Rocky’s Pizzeria</h1> <h2>Where you make your own pizza pie!</h2> <b>Crust options:</b> <ul> <li>Deep Pan</li> <li>Thin and Crispy</li> </ul> <br/> <b>Toppings:</b> <ul> <li>Mozzarella</li> <li>Anchovies</li> ....
(Non-semantic) Hyperlinks <a href="http://www.logicblox.com/">Home Page</a> [Figure: a graph of HTML pages joined by unlabeled hyperlinks, marked with "?"]
The Semantic Web • Not a new web • A layer on top of the existing web • Web-based content should be annotated with explicit semantics • Intended primarily for machine-consumption, i.e., “intelligent agents” • Markup for describing meaning of content, not just presentation
Giving meaning to content <restaurants> <restaurant> <name>Rocky’s Pizzeria</name> <description>Where you make your own pizza pie!</description> <phone>404-123-4567</phone> <address>123 Peachtree Street</address> <menu choice="crust"> <option>Deep Pan</option> <option>Thin and Crispy</option> </menu> <menu choice="toppings"> <option>Mozzarella</option> <option>Anchovies</option> </menu> </restaurant> </restaurants>
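Because the markup names what each value is, a program can pull out the phone number or the topping list without guessing from page layout. A minimal sketch using Python's standard xml.etree.ElementTree on a trimmed, well-formed copy of the restaurant document; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed copy of the restaurant markup from the slide.
DOC = """
<restaurants>
  <restaurant>
    <name>Rocky's Pizzeria</name>
    <phone>404-123-4567</phone>
    <menu choice="crust">
      <option>Deep Pan</option>
      <option>Thin and Crispy</option>
    </menu>
    <menu choice="toppings">
      <option>Mozzarella</option>
      <option>Anchovies</option>
    </menu>
  </restaurant>
</restaurants>
"""

root = ET.fromstring(DOC)
restaurant = root.find("restaurant")

# Element names say what each value is, so no layout guessing is needed.
name = restaurant.findtext("name")
phone = restaurant.findtext("phone")

# Group the options by the menu they belong to.
menus = {
    menu.get("choice"): [opt.text for opt in menu.findall("option")]
    for menu in restaurant.findall("menu")
}

print(name, phone)
print(menus["toppings"])
```

The same lookup against the HTML version would have to rely on the position of bold text and list items, which is exactly the presentation-only problem described for today's Web.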
Semantic Links [Figure: an HTML page whose outgoing links are typed with the labels hasPhotoAlbum, hasReview, and hasNewVersion]
SW Application Areas • Semantic interoperability and data integration • health care, drug industry, e-science • Semantic search • Can we outsmart Google? • Specific areas: • Geographic Information Systems • B2B Mediation • Legal • Automobile diagnostics and repair http://www.w3.org/2001/sw/sweo/public/UseCases/
SW Technology Stack (technology: what it provides, top layer first) • Rules: advanced reasoning • RDF(S)/OWL: conceptual descriptions • XML, XML Schema: structured data transfer • HTTP, URIs: hypertext transfer • Internet: reliable transport
SW Ontology Languages • Resource Description Framework (RDF) [1] • Provides the ability to make statements (propositions) about resources on the web • Web Ontology Language (OWL) [2] • More expressive features than RDF • Strong theoretical basis in Description Logics
RDF Statements “The Curator meeting is at GFDL.” • Subject: Curator meeting • Predicate: hasLocation • Object: GFDL
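RDF Statements “The Curator meeting is Oct 18-19.” • Curator meeting hasLocation GFDL (the object is a resource) • Curator meeting starts “18 Oct 2007” (the object is a literal) • Curator meeting ends “19 Oct 2007” (the object is a literal)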
RDF Statements “Balaji works at GFDL.” • Balaji worksAt GFDL • Curator meeting hasLocation GFDL • Curator meeting starts “18 Oct 2007” • Curator meeting ends “19 Oct 2007”
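The statements built up over these slides form a small graph of subject, predicate, object triples. A minimal sketch of that idea in plain Python, assuming short names in place of full URIs; the match helper is hypothetical and not part of any RDF library.

```python
# Each RDF statement is a (subject, predicate, object) triple.
# Short names stand in for full URIs here; that shorthand is an assumption.
triples = {
    ("CuratorMeeting", "hasLocation", "GFDL"),
    ("CuratorMeeting", "starts", "18 Oct 2007"),
    ("CuratorMeeting", "ends", "19 Oct 2007"),
    ("Balaji", "worksAt", "GFDL"),
}

def match(s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

# Everything that points at GFDL, regardless of the predicate used.
about_gfdl = match(o="GFDL")
print(about_gfdl)
```

Because GFDL is a shared resource rather than a repeated string, the meeting and the person end up connected through it, which is what makes the graph queryable.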
RDF XML Representation <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:esc="http://www.earthsystemcurator.org"> <rdf:Description rdf:about="http://....#OctCuratorMeeting"> <esc:hasLocation rdf:resource="http://....#GFDL"/> <esc:starts>18 Oct 2007</esc:starts> <esc:ends>19 Oct 2007</esc:ends> </rdf:Description> <rdf:Description rdf:about="http://....#Balaji"> <esc:worksAt rdf:resource="http://....#GFDL"/> </rdf:Description> </rdf:RDF>
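The XML form carries the same triples as the graph. A rough sketch of recovering them with Python's standard xml.etree.ElementTree, assuming a hypothetical base URI (http://example.org/curator#) in place of the elided one on the slide. It handles only the simple rdf:Description pattern shown here, not RDF/XML in general.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Hypothetical base URI standing in for the elided "http://....#" on the slide.
BASE = "http://example.org/curator#"

DOC = f"""
<rdf:RDF xmlns:rdf="{RDF_NS}"
         xmlns:esc="http://www.earthsystemcurator.org">
  <rdf:Description rdf:about="{BASE}OctCuratorMeeting">
    <esc:hasLocation rdf:resource="{BASE}GFDL"/>
    <esc:starts>18 Oct 2007</esc:starts>
    <esc:ends>19 Oct 2007</esc:ends>
  </rdf:Description>
  <rdf:Description rdf:about="{BASE}Balaji">
    <esc:worksAt rdf:resource="{BASE}GFDL"/>
  </rdf:Description>
</rdf:RDF>
"""

def local(name):
    """Keep only the part after a '}' namespace marker or a '#' fragment."""
    return name.split("}")[-1].split("#")[-1]

triples = []
for desc in ET.fromstring(DOC).findall(f"{{{RDF_NS}}}Description"):
    subject = local(desc.get(f"{{{RDF_NS}}}about"))
    for prop in desc:
        resource = prop.get(f"{{{RDF_NS}}}resource")
        # rdf:resource marks a link to another resource; otherwise the
        # element text is a literal value.
        obj = local(resource) if resource is not None else prop.text
        triples.append((subject, local(prop.tag), obj))

for triple in triples:
    print(triple)
```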
RDF Schema • Define a domain specific data model for RDF • Includes classes and properties (along with subclasses and subproperties) • Properties are first class (they are not defined as part of a particular class)
RDF Schema • Properties • hasLocation (domain: Event, range: Place) • starts (domain: Event, range: date) • ends (domain: Event, range: date) • worksAt (domain: Person, range: Place) • Classes • Event • Flight • Meeting • Person • Place
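Domain and range declarations let a reasoner assign types to resources that were never typed explicitly. A minimal sketch of the RDFS domain and range rules in plain Python, assuming short names for resources; the infer_types helper is hypothetical, and the date ranges are left out because they name a datatype rather than a class.

```python
# Schema from the slide: property -> (domain class, range class).
# "date" is a datatype, so the ranges of starts/ends are not used for typing.
SCHEMA = {
    "hasLocation": ("Event", "Place"),
    "worksAt": ("Person", "Place"),
    "starts": ("Event", None),
    "ends": ("Event", None),
}

data = [
    ("CuratorMeeting", "hasLocation", "GFDL"),
    ("CuratorMeeting", "starts", "18 Oct 2007"),
    ("Balaji", "worksAt", "GFDL"),
]

def infer_types(triples, schema):
    """Apply the RDFS domain and range rules to derive (resource, class) pairs."""
    types = set()
    for s, p, o in triples:
        domain, range_ = schema.get(p, (None, None))
        if domain:
            types.add((s, domain))   # the subject belongs to the property's domain
        if range_:
            types.add((o, range_))   # the object belongs to the property's range
    return types

inferred = infer_types(data, SCHEMA)
print(sorted(inferred))
```

Note that nothing in the data says GFDL is a Place; that fact follows from the schema alone, which is why properties being first class matters.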
OWL (Web Ontology Language) • Motivations • Expressive ontology language • Precise semantics • Understanding of formal properties such as decidability and complexity of inferencing • Ease of use for a wide audience • Influences • DAML+OIL (existing language) • Description Logics • Frames paradigm
Description Logics • “A formal language for representing knowledge and reasoning about it.” [3] • “Description” • Primary facilities are concept and role descriptions • “Logics” • Equipped with a FOL-based formal semantics • Inference procedures reveal implied knowledge
DL Knowledge Base • TBox • Contains the terminology, i.e., the vocabulary of an application domain, i.e., concepts and roles • Think database schema • ABox • Contains the assertions about named individuals • Think database data
In the TBox • Two kinds of symbols • atomic concepts (A, B) • atomic roles (R) • Complex descriptions (C, D) through • concept constructors • role constructors • Concept and role constructors determine: • The expressiveness of the DL • Decidability/complexity of reasoning
DL Language AL • Grammar for concept construction: $C, D \rightarrow$ • $A$ (atomic concept) • $\top$ (universal concept) • $\bot$ (bottom concept) • $\neg A$ (atomic negation) • $C \sqcap D$ (intersection) • $\forall R.C$ (value restriction) • $\exists R.\top$ (limited existential quantification)
DL Language AL • Atomic concepts: Food, Pizza, CheeseTopping • Atomic role: hasTopping • “Non-pizza foods”: $Food \sqcap \neg Pizza$ • “Pizza with only cheese toppings”: $Pizza \sqcap \forall hasTopping.CheeseTopping$ • “Pizza with some topping”: $Pizza \sqcap \exists hasTopping.\top$
DL Concept/Role Constructors (name: syntax, symbol) • Union: $C \sqcup D$, U • Existential quant.: $\exists R.C$, E • Unqual. cardinality restriction: $\geq n\,R$, $\leq n\,R$, $= n\,R$, N • Qual. cardinality restriction: $\geq n\,R.C$, $\leq n\,R.C$, $= n\,R.C$, Q • Role hierarchy: $R \sqsubseteq S$, H • Inverse properties: $R^{-}$, I • Functional properties: F • Nominals (enumeration): O
FOL Representations • $Food \sqcap \neg Pizza$ becomes $Food(x) \wedge \neg Pizza(x)$ • $Pizza \sqcap \forall hasTopping.CheeseTopping$ becomes $Pizza(x) \wedge \forall y.(hasTopping(x,y) \rightarrow CheeseTopping(y))$ • $Pizza \sqcap \exists hasTopping.\top$ becomes $Pizza(x) \wedge \exists y.hasTopping(x,y)$
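The mapping from AL concepts to first-order formulas is mechanical: each concept becomes a formula with one free variable. A small sketch of that translation in Python, assuming a hypothetical AST (Atom, Not, And, Forall, Exists) that covers only the constructors used in the pizza examples; the top and bottom concepts are omitted for brevity.

```python
from dataclasses import dataclass

# Hypothetical AST for the AL constructors used in the pizza examples.
@dataclass(frozen=True)
class Atom:
    name: str

@dataclass(frozen=True)
class Not:          # atomic negation: only atoms may be negated in AL
    atom: Atom

@dataclass(frozen=True)
class And:          # intersection
    left: object
    right: object

@dataclass(frozen=True)
class Forall:       # value restriction
    role: str
    filler: object

@dataclass(frozen=True)
class Exists:       # limited existential quantification (filler is top)
    role: str

VARS = "xyzuvw"  # one variable name per nesting depth

def to_fol(c, depth=0):
    """Translate a concept into a FOL formula whose free variable is VARS[depth]."""
    var = VARS[depth]
    if isinstance(c, Atom):
        return f"{c.name}({var})"
    if isinstance(c, Not):
        return f"¬{c.atom.name}({var})"
    if isinstance(c, And):
        return f"{to_fol(c.left, depth)} ∧ {to_fol(c.right, depth)}"
    new = VARS[depth + 1]
    if isinstance(c, Forall):
        return f"∀{new}.({c.role}({var},{new}) → {to_fol(c.filler, depth + 1)})"
    if isinstance(c, Exists):
        return f"∃{new}.{c.role}({var},{new})"
    raise TypeError(f"unsupported concept: {c!r}")

only_cheese = And(Atom("Pizza"), Forall("hasTopping", Atom("CheeseTopping")))
print(to_fol(only_cheese))
```

Nested value restrictions pick up a fresh variable at each level, which is why the translation tracks depth rather than reusing y.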
In the ABox • Assertions about individuals • Pizza(p1) • Pizza(p2) • CheeseTopping(mozzarella) • CheeseTopping(goatCheese) • hasTopping(p1, mozzarella) • hasTopping(p1, goatCheese) • hasTopping(p2, goatCheese)
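To see how these assertions relate to the AL concepts, the ABox can be read as one small finite interpretation and the concepts evaluated over it. A rough sketch in plain Python, with the caveat that this is a closed-world reading: it assumes the listed toppings are the only ones, whereas a DL reasoner works under the open-world assumption and would not conclude the "only cheese toppings" membership from these assertions alone. The helper names are hypothetical.

```python
# ABox from the slide, stored as plain sets.
concepts = {
    "Pizza": {"p1", "p2"},
    "CheeseTopping": {"mozzarella", "goatCheese"},
}
roles = {
    "hasTopping": {
        ("p1", "mozzarella"),
        ("p1", "goatCheese"),
        ("p2", "goatCheese"),
    },
}

def successors(role, x):
    """All individuals y such that role(x, y) is asserted."""
    return {b for (a, b) in roles[role] if a == x}

# Pizza with some topping: at least one hasTopping successor.
with_topping = {x for x in concepts["Pizza"] if successors("hasTopping", x)}

# Pizza with only cheese toppings, under the closed-world assumption that
# the asserted toppings are the complete list.
only_cheese = {
    x for x in concepts["Pizza"]
    if successors("hasTopping", x) <= concepts["CheeseTopping"]
}

# Pizzas topped with mozzarella specifically.
mozzarella_pizzas = {
    x for x in concepts["Pizza"] if "mozzarella" in successors("hasTopping", x)
}

print(sorted(with_topping), sorted(only_cheese), sorted(mozzarella_pizzas))
```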
Reasoning on the KB • TBox • Concept satisfiability • Are there possible instances? • Concept subsumption • Is concept C always more general than D? • Concept equivalence • ABox • Instance checking • Is individual X an instance of concept C? • Consistency checking • Are the assertions about individual X consistent with the TBox, i.e., do they have a model?
From DL to OWL • OWL has three “species” or dialects • OWL-DL equivalent to DL SHOIN(D) • OWL-Lite equivalent to DL SHIF(D) • OWL-Full -- messy... • Must adhere to constraints of the Web • Standard XML syntax • URIs for identifying concepts, roles, and individuals • Concrete datatypes based on XML Schema • Protégé Demo [4]
Adding Rules to OWL • Semantic Web Rule Language (SWRL) [5] • $hasParent(?x,?y) \wedge hasBrother(?y,?z) \rightarrow hasUncle(?x,?z)$ • $Artist(?x) \wedge artistStyle(?x,?y) \wedge Style(?y) \wedge creator(?z,?x) \rightarrow style/period(?z,?y)$
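A SWRL rule reads as "if the body atoms all hold for some binding of the variables, then the head holds too". A minimal sketch of applying the uncle rule once by joining on the shared variable ?y, in plain Python; the individuals (alice, bob, carl, dave) are invented for illustration, and this is not how a SWRL engine is implemented.

```python
# Hypothetical facts; the individual names are made up for illustration.
has_parent = {("alice", "bob")}
has_brother = {("bob", "carl"), ("bob", "dave")}

def apply_uncle_rule(has_parent, has_brother):
    """hasParent(?x,?y) ∧ hasBrother(?y,?z) → hasUncle(?x,?z)"""
    return {
        (x, z)
        for (x, y) in has_parent
        for (y2, z) in has_brother
        if y == y2  # both body atoms must agree on ?y
    }

has_uncle = apply_uncle_rule(has_parent, has_brother)
print(sorted(has_uncle))
```

The rule is a join of two role extensions on a shared variable, which is the kind of inference OWL-DL's class constructors cannot express directly.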
Semantic Web Apps • DBpedia • Converts Wikipedia articles into RDF • Freebase • Wikipedia + semantic links • Twine • Social networking + semantic web • GeoNames • Ontology of geospatial semantic information • WordNet Ontology • RDF/OWL representation of WordNet
Discussion • Implications for LogicBlox?
References [1] Manola et al. RDF Primer. http://www.w3.org/TR/rdf-primer/ [2] McGuinness et al. OWL Web Ontology Language Overview. http://www.w3.org/TR/owl-features/ [3] Baader et al. The Description Logic Handbook, 2007. [4] Horridge et al. A Practical Guide to Building OWL Ontologies Using the Protégé-OWL Plugin and CO-ODE Tools, Edition 1.0, 2004. http://www.co-ode.org/resources/tutorials/ProtegeOWLTutorial.pdf [5] Horrocks et al. SWRL: A Semantic Web Rule Language Combining OWL and RuleML. http://www.w3.org/Submission/SWRL/