CS6999 SWT Lecture 1: Introduction to the Semantic Web. Bruce Spencer, NRC-IIT Fredericton, Sept 12, 2002.
National Research Council • Research Institutes and Facilities across Canada • 17 research institutes • 4 innovation centres • 3,500 employees; 1,000 guest workers • National science facilities • S&T information for industry and scientific community • CISTI: Canada Institute for Scientific and Technical Information • Network of technology advisors supporting SMEs • IRAP: Industrial Research Assistance Program CS 6999 SW Semantic Web Techniques
Institute for Information Technology • There are two aspects to IIT • A mature research organization of ~80 people in Ottawa • New labs being developed in four cities in New Brunswick and Nova Scotia involving ~60 new people • The whole organization is evolving to accommodate our new distributed nature
NRC’s plans for New Brunswick • What? • NRC is building an e-business research team in New Brunswick • E-business includes e-learning, e-government, e-health. Using information and communication technology to help us to educate, govern and take care of ourselves, to create wealth. • New Brunswick and Canadian companies already have strengths in all three areas • NB’s communications infrastructure and interested telco • Bilingual workforce
NRC’s plans for New Brunswick • NRC will act locally, and think nationally and globally • Will work with the New Brunswick community to develop clusters in e-business • This is also NRC’s national lab in e-business • NRC will build international links • Where? • Main group (40 staff) in Fredericton, at UNBF • Satellite in Saint John (6 staff), at E-Comm Centre, UNBSJ • Satellite in Moncton (6 staff), at U. de Moncton
Bruce • MMath 83, BNR 83-86, Waterloo PhD 86-90, UNB prof 90-01, NRC 01-now • Automated reasoning • data structures in theorem proving • eliminate redundant searching • smallest proofs • deductive databases • Java in curriculum since 1997
Overview and Course Mindmap • Increasing demand for formalized knowledge on the Web: AI’s chance! • XML- & RDF-based markup languages provide a 'universal' storage/interchange format for such Web-distributed knowledge representation • Course introduces knowledge markup & resource semantics: we show how to marry AI representations (e.g., logics and frames) with XML & RDF [incl. RDF Schema]
[Figure: course mindmap with nodes Namespaces, CSS, DTDs, XSLT, DAML, Stylesheets, Agents, Transformations, Ontobroker, XQL, XML, HornML, Rules, Queries, XQuery, RuleML, XML-QL, SHOE, RDF[S], Frames, Acquisition, TopicMaps, Protégé]
The Semantic Web Activity of the W3C • “The Semantic Web is a vision: the idea of having data on the Web defined and linked in a way that it can be used by machines not just for display purposes, but for automation, integration and reuse of data across various applications.” (http://www.w3.org/2001/sw/Activity)
What your computer sees in HTML
<b>Joe’s Computer Store</b> <br> 365 Yearly Drive
Presentation information
What your computer sees in XML
<location>
  <name>Joe’s Computer Store</name>
  <address>365 Yearly Drive</address>
</location>
Content description (ambiguous)
What a computer could understand
<mail:address xmlns:mail="http://www.canadapost.ca">
  <mail:name>Joe’s Computer Store</mail:name>
  <mail:street>365 Yearly Drive</mail:street>
</mail:address>
• www.canadapost.ca could define address, name, street, … • Search engines could then identify mail addresses • Consider shopbots being able to find price, quantity, feature, model number, supplier, serial number, acquisition date • Assumes that namespaces will be used consistently
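As a minimal sketch of how a program could recognise the mail address once the tags are namespace-qualified, the snippet below parses the slide's markup with Python's standard library. The namespace URI is the one on the slide; the variable names are illustrative, and an ASCII apostrophe is used in the store name for simplicity.

```python
import xml.etree.ElementTree as ET

MAIL_NS = "http://www.canadapost.ca"  # namespace URI taken from the slide

doc = """
<mail:address xmlns:mail="http://www.canadapost.ca">
  <mail:name>Joe's Computer Store</mail:name>
  <mail:street>365 Yearly Drive</mail:street>
</mail:address>
"""

root = ET.fromstring(doc)
ns = {"mail": MAIL_NS}

# An element only counts as a mail address if it lives in the agreed namespace.
is_mail_address = root.tag == f"{{{MAIL_NS}}}address"
name = root.findtext("mail:name", namespaces=ns).strip()
street = root.findtext("mail:street", namespaces=ns).strip()

print(is_mail_address, name, street)
```

The check on the qualified tag name is the point of the slide: a plain `<address>` element from some other vocabulary would not match, so the program is not misled by a coincidental tag name.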
Semantic Web • Semantics = meaning • Good Idea: Dictionary • Create a dictionary of terms • Put it on the web • Mark up web pages so that terms are linked to these dictionary entries • This allows more precise matching • Better idea: Thesaurus • has hierarchies of terms • shades of meaning • Best idea: Ontology • hierarchy of terms and logic conditions
Semantic Web • An agent-enabled resource • “information in machine-readable form, creating a revolution in new applications, environments and B2B commerce” • W3C Activity launched Feb 9, 2001 • DAML: DARPA Agent Markup Language • US Gov funding to define languages, tools • 16 project teams • OIL is Ontology Inference Layer • DAML+OIL is joint DARPA-EU • Knowledge Representation is a natural choice
Smoked Salmon • SmokedSalmon is the intersection of Smoked and Salmon
Smoked Salmon • Gravalax is the intersection of Cured and Salmon, but not Smoked • SmokedSalmon is the intersection of Smoked and Salmon
Smoked Salmon • Lox is Smoked, Cured Salmon • SmokedSalmon is the intersection of Smoked and Salmon • Gravalax is the intersection of Cured and Salmon, but not Smoked
The Semantic Web is about having the Internet use common sense. • A search for keywords Salmon and Cured should return pages that mention Gravalax, even if they don’t mention Salmon and Cured • A search for Salmon and Smoked will return smoked salmon, and should also return Lox, but not Gravalax
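The search behaviour described above can be sketched with plain Python sets. The class definitions follow the slides; the page names and the `search` helper are hypothetical, and each page is assumed to mention only one term.

```python
# Class definitions from the slides, written as sets of defining properties.
DEFINITIONS = {
    "SmokedSalmon": {"Smoked", "Salmon"},
    "Gravalax": {"Cured", "Salmon"},        # cured but not smoked
    "Lox": {"Smoked", "Cured", "Salmon"},
}

# Hypothetical pages, each indexed only by the single term it mentions.
PAGES = {
    "page_a": "Gravalax",
    "page_b": "Lox",
    "page_c": "SmokedSalmon",
}

def search(*keywords):
    """Return pages whose term is defined to have every requested property."""
    wanted = set(keywords)
    return sorted(page for page, term in PAGES.items()
                  if wanted <= DEFINITIONS[term])

print(search("Salmon", "Cured"))
print(search("Salmon", "Smoked"))
```

None of the pages contains the literal keywords, yet the first query finds the Gravalax and Lox pages and the second finds the Lox and SmokedSalmon pages while excluding Gravalax.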
[Figure: Venn diagram with labels Gravalax, Smoked, Salmon, Lox]
Tim Berners-Lee’s Semantic Web
RDF Resource Description Framework • Beginning of Knowledge Representation influence on Web • Akin to Frames, Entity/Relationship diagrams, or Object/Attribute/Value triples
RDF Example
<rdf:ProductSpecs about="http://www.lemoncomputers.ca/model_2300">
  <specs:colour>yellow</specs:colour>
  <specs:size>medium</specs:size>
</rdf:ProductSpecs>
[Figure: graph in which model_2300 has a size arc to medium and a colour arc to yellow]
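Read as Object/Attribute/Value triples, the example amounts to two statements about one resource. The sketch below stores them as Python tuples; the `value_of` helper is hypothetical and is not part of any RDF library.

```python
RESOURCE = "http://www.lemoncomputers.ca/model_2300"

# Each statement is an (object, attribute, value) triple, as on the slide.
triples = [
    (RESOURCE, "specs:colour", "yellow"),
    (RESOURCE, "specs:size", "medium"),
]

def value_of(obj, attribute):
    """Return the value of the first matching triple, or None if absent."""
    for o, a, v in triples:
        if o == obj and a == attribute:
            return v
    return None

print(value_of(RESOURCE, "specs:colour"))
print(value_of(RESOURCE, "specs:size"))
```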
RDF Class Hierarchy • All lemon laptops get packed in cardboard boxes • Allows one to customize existing taxonomies • Example: palmtop computers still get packed in boxes
[Figure: the model_2300 graph (size medium, colour yellow) extended with an is_a arc involving lemon_palmtop_20000]
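A minimal sketch of the inheritance the slide describes follows. It assumes that model_2300 is_a lemon_palmtop_20000 and that the palmtop class specialises a laptop class; the name `lemon_laptop` and the `lookup` helper are hypothetical.

```python
# is_a links: child -> parent.
IS_A = {
    "model_2300": "lemon_palmtop_20000",    # assumed direction of the slide's arc
    "lemon_palmtop_20000": "lemon_laptop",  # assumed: palmtops specialise laptops
}

# Properties stated directly on a class or instance.
PROPERTIES = {
    "lemon_laptop": {"packaging": "cardboard box"},
    "model_2300": {"size": "medium", "colour": "yellow"},
}

def lookup(resource, prop):
    """Walk up the is_a chain until some class defines the property."""
    current = resource
    while current is not None:
        if prop in PROPERTIES.get(current, {}):
            return PROPERTIES[current][prop]
        current = IS_A.get(current)
    return None

print(lookup("model_2300", "packaging"))
```

The packaging rule is stated once, on the most general class, and the palmtop model picks it up without any extra statement.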
Tim Berners-Lee’s Semantic Web
Web Ontology Language (OWL): W3C • Previously known as DAML+OIL • US: DARPA Agent Markup Language • EU: Ontology Inference Layer (also expanded as Ontology Interchange Language) • Composed of a hierarchy with additional conditions • Based on description logic, limited expressiveness • Reasoning procedures are well-behaved • Just enough power
Identifying Resources • URL/URI • Uniform resource locator / identifier • Information sources, goods and services • financial instruments • money, options, investments, stocks, etc. • “Where do you want to go today?” • becomes “What do you want to find?”
Ontology • Branch of philosophy dealing with the theory of being • Tarski’s assumption: • individuals, relationships and functions • “A common vocabulary and agreed-upon meanings to describe a subject domain” • What real-world objects do my tags refer to? • How are these objects related? • Communication requires shared terms • others can join in
Ontology Layer • Widens interoperability and interconversion • knowledge representation • More meta-information • Which attributes are transitive, symmetric • Which relations between individuals are 1-1, 1-many, many-many • Communities exist • DL, OIL, SHOE (Hendler) • New W3C working group
Transitive, Subrole example • One wants to ask about modes of transportation from Sydney to Fredericton • “connected by Acadian Lines bus” is a role in a Nova Scotia taxonomy • “connected by SMT bus” from New Brunswick • Both are subroles of “connected” • “connected” is transitive • Note that ontologies can be combined at runtime
Combining Rich Ontologies • Only these facts are explicit • in separate ontologies • “Connected by bus” • is superset • is symmetric and transitive • Route from Sydney to Fredericton is inferred
[Figure: Sydney, Truro and Amherst linked by “Connected by Acadian Lines”; Amherst, Sussex and Fredericton linked by “Connected by SMT Lines”]
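The inference can be sketched as a graph search over the merged facts. The four edges below are an assumed reading of the slide's figure (Sydney, Truro, Amherst on Acadian Lines; Amherst, Sussex, Fredericton on SMT); the function names are hypothetical.

```python
from collections import deque

# Explicit facts, each held in a separate ontology (assumed from the figure).
ACADIAN = [("Sydney", "Truro"), ("Truro", "Amherst")]      # Nova Scotia
SMT = [("Amherst", "Sussex"), ("Sussex", "Fredericton")]   # New Brunswick

def connected_graph(*subroles):
    """Merge subrole facts into the symmetric superrole 'connected'."""
    graph = {}
    for facts in subroles:
        for a, b in facts:
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)  # symmetry
    return graph

def route(graph, start, goal):
    """Breadth-first search; by transitivity, any path is a connection."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(graph.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

combined = connected_graph(ACADIAN, SMT)
print(route(combined, "Sydney", "Fredericton"))
print(route(connected_graph(ACADIAN), "Sydney", "Fredericton"))
```

Neither ontology alone yields a route; it only appears once both subroles are lifted into the shared superrole.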
Tim Berners-Lee’s Semantic Web
Logic Layer • Clausal logic encoded in XML • RuleML, IBM CommonRules • Special cases of first-order logic • Horn Clauses for if-then type reasoning and integrity constraints • Standard inference rules based on Resolution • Various implementations: SQL, KIF, SLD (Prolog), XSB • J-DREW reasoning tools in Java. • Modus operandi: build tractable reasoning systems • trade away expressiveness, gain efficiency
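To illustrate if-then reasoning with Horn clauses, here is a small forward-chaining sketch over ground (variable-free) rules. It is a simpler technique than the resolution-based procedures named on the slide, and the facts and rules are hypothetical examples.

```python
def forward_chain(facts, rules):
    """Apply ground Horn rules (body -> head) until nothing new is derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if head not in derived and all(atom in derived for atom in body):
                derived.add(head)
                changed = True
    return derived

# Hypothetical ground facts and rules, each rule being (body atoms, head atom).
FACTS = {"salmon(lox)", "smoked(lox)"}
RULES = [
    (("salmon(lox)", "smoked(lox)"), "smoked_salmon(lox)"),
    (("smoked_salmon(lox)",), "fish_product(lox)"),
]

print(sorted(forward_chain(FACTS, RULES)))
```

Restricting rules to a single positive head is what keeps this loop simple and guaranteed to terminate on a finite set of atoms, which is the kind of expressiveness-for-efficiency trade the slide mentions.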
Logic Architecture Example • Contracting parties integrate e-businesses via rules
[Figure: a Seller E-Storefront and a Buyer’s ShopBot, each with its own Business Rules (rule engines labelled OPS5 and Prolog), linked by Contract Rules Interchange]
Negotiation via rules
usualPrice: price(per-unit, ?PO, $60) ← purchaseOrder(?PO, supplierCo, ?AnyBuyer) ∧ shippingDate(?PO, ?D) ∧ (?D 24April2001).
volumeDiscountPrice: price(per-unit, ?PO, $55) ← purchaseOrder(?PO, supplierCo, ?AnyBuyer) ∧ quantityOrdered(?PO, ?Q) ∧ (?Q 1000) ∧ shippingDate(?PO, ?D) ∧ (?D 24April2001).
overrides(volumeDiscountPrice, usualPrice).
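The effect of the overrides statement can be sketched as follows. The comparison operators did not survive on the slide, so the quantity test `>= 1000` is an assumption, the shipping-date condition is left out entirely, and the `unit_price` function is hypothetical.

```python
def unit_price(quantity):
    """Return (rule_label, price) after resolving conflicts by priority."""
    candidates = {"usualPrice": 60}          # treated as always applicable here
    if quantity >= 1000:                     # ASSUMED operator; lost on the slide
        candidates["volumeDiscountPrice"] = 55

    # overrides(winner, loser): the loser is dropped when both rules fire.
    overrides = [("volumeDiscountPrice", "usualPrice")]
    for winner, loser in overrides:
        if winner in candidates and loser in candidates:
            del candidates[loser]

    label, price = next(iter(candidates.items()))
    return label, price

print(unit_price(10))
print(unit_price(1000))
```

Both rules can fire for a large order and would then give conflicting prices; the priority statement is what lets the rule set stay consistent without rewriting the usual-price rule to exclude large orders.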
Hot Research Topics: • Tools to create ontologies • Ontolingua • Protégé-2000 (Stanford) • OilEd • … • Tools to learn ontologies from a large corpus such as corporate data • Merging / aligning two different ontologies from different sources on the same topic • Searching cum reasoning tools • SHOE
Eventual Goal of these Efforts • Agents locate goods, services • use ontologies • unambiguous • business rules • expressive language but reasoning tractable • combine from various sources • Gives rise to a need for trust, privacy and security • e.g. semantic web project to determine eligibility of patients for a clinical trial