Ontology support For the Semantic Web
The big picture • Diagram, page 9 • HTML5 • XML can be used as a syntactic model for RDF and DAML/OIL • RDF, RDF Schema (with data modeling) – RDF takes object specifications and flattens them into triples • DAML/OIL – used to specify the details of UPML components • UPML – architectural description language for components, adapters, connection configurations
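A minimal sketch of what "flattening an object specification into triples" can look like. The component record, the `ex:` names, and the `_:b1` blank-node label are all made up for illustration; real RDF tooling would use full URIs and typed literals.

```python
import itertools

def flatten(subject, spec, _ids=None):
    """Turn a nested dict into (subject, predicate, object) triples.

    Nested dicts become blank nodes (_:b1, _:b2, ...); list values are
    assumed to hold scalars and yield one triple per item.
    """
    ids = _ids if _ids is not None else itertools.count(1)
    triples = []
    for predicate, value in spec.items():
        if isinstance(value, dict):
            node = f"_:b{next(ids)}"          # hypothetical blank-node label
            triples.append((subject, predicate, node))
            triples.extend(flatten(node, value, ids))
        elif isinstance(value, list):
            for item in value:
                triples.append((subject, predicate, item))
        else:
            triples.append((subject, predicate, value))
    return triples

# Hypothetical object specification for a component.
component = {
    "type": "Component",
    "name": "Classifier",
    "uses": ["AdapterA", "AdapterB"],
    "author": {"name": "Alice"},
}

triples = flatten("ex:classifier", component)
for t in triples:
    print(t)
```

The nested structure disappears: every fact about the object, at any depth, ends up as one flat three-part assertion.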
DAML & OIL • DAML examples, pages 69 to 77 • OIL examples, page 99 • OIL constraints, pages 101 to 103 • Intriguing diagram, page 113
UPML • Diagram of UPML’s role, page 144 • Key function: “component markup” • UPML diagram, page 147 – a PSM is a “problem solving method” • Protégé is a free editor for ontology-related languages, page 160 & 162
Another Big view of the semantic web • Diagram, page 173 • Intriguing comparison diagram, page 175 • Extra capabilities of ontologies over lower level specifications • Consistency • Filling in semantic details • Interoperability support • Validation and verification • Configuration support • Support for structured searches • Generalization/specialization meta information
Interesting twist on how databases should be built • Old way – page 266 • New way – page 268 • The smarter DB architecture, page 273 • What are we adding? • Used to be data, schema, then SQL, then transaction manager, then apps, then UI • Now we are introducing more metadata? More schema? • Or is this a completely different kind of database? • Where data consists of assertions?
A “semantic portal” • Page 320 • Both humans and “agents” can access semantic portals • But how do humans interact with a semantic portal via a browser? • Comparison between ontologies and knowledge – page 322 • The idea of extensibility as a critical aspect of the semantic web • Not just new data, not just new metadata, but new inferences as well • Big picture diagram, page 333
Semantic Gadgets concept • Making smarts ubiquitous • The Internet of Things and Ambient Intelligence • For learning, mobile activities, using remote services • Mobile computing and mobile-based queries • Devices that can interact with our devices • Museum locations and a user with a sound device • Handheld devices for grocery store shopping by the cognitively disabled
Semantic annotation concept • Diagram – page 406 • Detailed diagram – page 415 • Example – pages 417 and 418 • We see the use of parallel databases that hold metadata that is searchable • And metadata can be applied in a personalized way to provide specific results to specific users • See page 420…
Task-achieving agents notion • Diagram, page 434 • Kinds of tasks • Automated planning • Computer-supported cooperative work • Multi-agent mixed-initiative planning • Workflow support • Example diagram, page 442 • This is a common way of viewing the new web • Smart agents replace browsers
A concrete component: SPARQL • Query language modeled after SQL • It can walk through semantic websites and across semantic websites • SPARQL thus creates new knowledge by creating inferences that can cross website boundaries
From - http://www.cambridgesemantics.com/2008/09/sparql-by-example/ • A SPARQL query comprises, in order: • Prefix declarations, for abbreviating URIs • Dataset definition, stating what RDF graph(s) are being queried • A result clause, identifying what information to return from the query • The query pattern, specifying what to query for in the underlying dataset • Query modifiers, slicing, ordering, and otherwise rearranging query results
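The five parts can be seen in a small query assembled piece by piece. This is only a sketch: the dataset URL and the choice of the FOAF vocabulary are assumptions made for illustration. Note that the SPARQL grammar places the dataset clause (FROM) after the result clause (SELECT), so the sketch writes them in that order rather than in the order of the list above.

```python
# Each entry pairs a part of the query with an example of that part.
# The dataset URL is hypothetical; the FOAF namespace is the real one.
PARTS = [
    ("prefix declarations", "PREFIX foaf: <http://xmlns.com/foaf/0.1/>"),
    ("result clause",       "SELECT ?name"),
    ("dataset definition",  "FROM <http://example.org/people.rdf>"),
    ("query pattern",       "WHERE { ?person foaf:name ?name . }"),
    ("query modifiers",     "ORDER BY ?name\nLIMIT 10"),
]

# Join the parts into the query text that would be sent to an endpoint.
query = "\n".join(text for _, text in PARTS)

for label, text in PARTS:
    print(f"# {label}")
    print(text)
```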
What can SPARQL do? • It can extend an ontology by adding new inferences as assertions • Retrieve triples that describe something • Ask true or false questions based on assertions
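These three capabilities correspond roughly to SPARQL's CONSTRUCT, DESCRIBE, and ASK query forms. The sketch below imitates them over an in-memory set of triples in plain Python; it is not a SPARQL engine, and the family data and `ex:` names are invented for illustration.

```python
# Toy graph: a set of (subject, predicate, object) assertions.
graph = {
    ("ex:alice", "ex:parentOf", "ex:bob"),
    ("ex:bob", "ex:parentOf", "ex:carol"),
}

def match(g, pattern):
    """Return triples matching a pattern; None acts as a wildcard."""
    s, p, o = pattern
    return [t for t in g
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

def ask(g, pattern):
    """ASK-like: is there at least one matching assertion?"""
    return bool(match(g, pattern))

def describe(g, resource):
    """DESCRIBE-like: every triple that mentions the resource."""
    return {t for t in g if t[0] == resource or t[2] == resource}

def construct_grandparents(g):
    """CONSTRUCT-like: derive new triples from a two-step pattern."""
    derived = set()
    for a, _, b in match(g, (None, "ex:parentOf", None)):
        for _, _, c in match(g, (b, "ex:parentOf", None)):
            derived.add((a, "ex:grandparentOf", c))
    return derived

# Adding the derived triples back is a separate step from deriving them.
graph |= construct_grandparents(graph)

print(ask(graph, ("ex:alice", "ex:grandparentOf", "ex:carol")))
print(sorted(describe(graph, "ex:bob")))
```

Deriving the triples and storing them are kept apart on purpose: a CONSTRUCT query only returns a graph, and writing it back is a separate operation.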
Another view: The open semantic framework • Layered architecture • Modular software • It is part of a four component approach: • Software • Structure • Documentation • Methods
Goals • Leverage existing data and apps • Build and validate incrementally • Use open software, standards, protocols • Link data • Use RDF as a unifying data model • Address high level IT management issues • Assumptions and techniques • Use URIs to identify information • All data is equal – text, media, relational dbs
Big picture – from: http://openstructs.org/open-semantic-framework/overview
Layers • Existing assets • Databases of all kinds • Web pages • Documents • Information Transformation (scones/irON) • Extraction of data and metadata • Scones – subject concept or named entities • Conversions – via irON (instance record Object Notation)
Layers continued • structWSF layer • The “workhorse” • Web services framework • Provides a common interface layer by which existing info assets can be mediated • Includes CRUD, browse, search, export, import primitives • Supports SPARQL • Rights and permissions controls • Each structWSF instance has a unique Web address that allows easy use/reuse and reconfiguration
Layers continued • Semantic Components layer • Takes computed results generated via queries from one or more structWSF instances and presents data visually using “semantic components” • Components include • Filter • Tabular templates • Bar, pie, other charts • Relationship browser • Annotator
Layers continued • Ontologies layer • Content Management System layer (conStruct) • Thin • Endpoints • Portals • Collaborative environments • Media rich
Hmm… you can download it • http://techwiki.openstructs.org/index.php/Open_Semantic_Framework_Installer
Another view of ontologies: http://www.cems.uwe.ac.uk/amrc/seeds/ModellingSemanticWeb.htm
What is it? • “DBpedia is a community effort to extract structured information from Wikipedia and to make this information available on the Web. DBpedia allows you to ask sophisticated queries against Wikipedia, and to link other data sets on the Web to Wikipedia data. We hope this will make it easier for the amazing amount of information in Wikipedia to be used in new and interesting ways, and that it might inspire new mechanisms for navigating, linking and improving the encyclopedia itself.”
Facts: http://dbpedia.org/About • The DBpedia knowledge base currently describes more than 3.64 million things • 416,000 persons, 526,000 places, 106,000 music albums, 60,000 films, 17,500 video games, 169,000 organisations, 183,000 species and 5,400 diseases • The DBpedia knowledge base allows you to ask quite surprising queries against Wikipedia, for instance “Give me all cities in New Jersey with more than 10,000 inhabitants” or “Give me all Italian musicians from the 18th century” • The DBpedia data set is interlinked with various other datasets on the Web.
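The New Jersey question might be phrased as a SPARQL query along the following lines. This is a sketch under assumptions: the properties `dbo:isPartOf` and `dbo:populationTotal` and the class `dbo:City` are taken to be how DBpedia models cities, which may not hold for every entry, and the `format` parameter is assumed to be accepted by the public endpoint. The code only builds the request URL; it does not contact the server.

```python
from urllib.parse import urlencode

# Public DBpedia SPARQL endpoint.
ENDPOINT = "https://dbpedia.org/sparql"

# "All cities in New Jersey with more than 10,000 inhabitants".
# The property and class names are assumptions about DBpedia's modeling.
QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>

SELECT ?city ?population
WHERE {
  ?city a dbo:City ;
        dbo:isPartOf dbr:New_Jersey ;
        dbo:populationTotal ?population .
  FILTER (?population > 10000)
}
ORDER BY DESC(?population)
"""

def build_request_url(query, endpoint=ENDPOINT):
    """Encode the query as a GET request URL for a SPARQL endpoint."""
    params = {"query": query, "format": "application/sparql-results+json"}
    return endpoint + "?" + urlencode(params)

url = build_request_url(QUERY)
print(url)
```

Pasting the printed URL into a browser, or fetching it with any HTTP client, would return whatever the endpoint currently holds; results depend on the state of the data set.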
A possible application of semantic web technology: citrus & more • Work with Brad Parks • HLB disease is caused by a bacterium • Spread by an insect called the “Asian citrus psyllid” • Attacks all citrus trees • Has infected 40% of trees in Florida, the largest orange producing state in the US • Has been found in Florida and Arizona; in California the insect has been found, but not the bacterium • Has wiped out many citrus orchards in Brazil (largest orange producer in the world) and Mexico • It’s too late for Florida, and since there is no treatment, tracking does little • But lots of pathogens and disease vectors can be tracked and modeled • Detectors in the field (DNA fingerprinting, organic chemical sensors, heat, imaging) • Volunteers on the ground who are connected
More… • Possible applications • Foodborne disease tracking • Infectious disease tracking • Other technology • Live coordination of testers in the field • Application of models to mathematically similar situations
Citrus and more, continued • Information collection and aggregation • Integration of heterogeneous forms of information • Internet of things: sensors and people (sorry) • Ambient intelligence (sensors have onboard computers and cellular connectivity devices) • Automatic collection of data into multiple sites and searched automatically via software • Automatic delivery of information aggregation and analysis results • Automatic creation of dynamic models
Semantically accurate advertising • A major commercial focus for some time now • Classic story of a person reading a NY Times article about a body hidden in a suitcase, and the Times served up an ad for Samsonite luggage • A similar story – there was an article about the shipment of a human head to someone and a UPS ad popped up • Another story of an anti-De Beers article claiming that diamond mining is brutal and people are being programmed to want diamonds – and an ad for De Beers jewelry pops up.
Semantically inaccurate: The problem is words • In most of these cases it was the repeated use of some particular keyword that the ad server was looking for that triggered the placement • Where does the semantic web fit in? – first, define the words • Articles could have proper tags saying what they are about, including URLs that point to relevant ontologies, so the article would have been marked as being about a severed head, not about the shipping of packages • The introduction of ontology information when a proper noun or an acronym is used. • Deliberate placement of specific ontological references the writer knows would trigger a semantically accurate ad placement
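The difference between keyword matching and topic tagging can be sketched in a few lines. Everything here is hypothetical: the ad record, the `ex:` topic identifiers standing in for ontology URLs, and the article text. No real ad server is claimed to work this way.

```python
import re

# Hypothetical ad inventory. "topics" and "avoid" hold ontology-style
# identifiers; "keywords" is what a word-based ad server would look for.
ADS = {
    "luggage-ad": {
        "keywords": {"suitcase", "luggage"},
        "topics": {"ex:Travel"},
        "avoid": {"ex:Crime"},
    },
}

def keyword_match(text, ads):
    """Pick ads whose keywords appear anywhere in the article text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return [name for name, ad in ads.items() if ad["keywords"] & words]

def topic_match(article_topics, ads):
    """Pick ads whose topics overlap the article's declared topics,
    skipping ads that explicitly avoid any of those topics."""
    return [name for name, ad in ads.items()
            if (ad["topics"] & article_topics)
            and not (ad["avoid"] & article_topics)]

crime_text = "Police found a body hidden in a suitcase"
crime_topics = {"ex:Crime"}            # what the article is tagged as being about

print(keyword_match(crime_text, ADS))  # the word "suitcase" triggers the ad
print(topic_match(crime_topics, ADS))  # the tag says crime, so no ad is chosen
```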
More … • The insertion of assertions (triples) that pertain to the article • This can explain why certain combinations of words appear, instead of just relying on a statistical guess at what the article is probably about • Assertions can be chained together to create new inferences that allow an ad server to find relevant ads that the article writer would never have imagined • Assertions can be chained across articles so that a common theme can be brought out and lead to an ad that might be heavily read and very accurately targeted • Ads can be targeted toward specific readers, as well • Currently, sites track the articles you read, the things you buy, the pages you spend more than a second or so looking at • But if information about us was properly structured, just imagine…
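Chaining assertions can be sketched as a simple forward-chaining loop. The rule used here (if an article is about X, and X is related to Y, then the article is also about Y), the `ex:` names, and the ad index are all assumptions for illustration; the citrus facts echo the earlier slides.

```python
# Assertions, possibly contributed by different articles.
facts = {
    ("ex:article1", "ex:about", "ex:AsianCitrusPsyllid"),     # from article 1
    ("ex:AsianCitrusPsyllid", "ex:relatedTo", "ex:HLB"),      # from article 1
    ("ex:HLB", "ex:relatedTo", "ex:CitrusTrees"),             # from article 2
}

# Hypothetical index from topic to ad.
AD_INDEX = {"ex:CitrusTrees": "orchard-supplies-ad"}

def infer(initial):
    """Apply the rule repeatedly until no new assertion appears."""
    known = set(initial)
    while True:
        derived = set()
        for article, p, topic in known:
            if p != "ex:about":
                continue
            for s, q, other in known:
                if q == "ex:relatedTo" and s == topic:
                    derived.add((article, "ex:about", other))
        if derived <= known:      # nothing new: fixed point reached
            return known
        known |= derived

def ads_for(article, known):
    """Look up ads for every topic the article is (now) known to be about."""
    topics = sorted(o for s, p, o in known if s == article and p == "ex:about")
    return [AD_INDEX[t] for t in topics if t in AD_INDEX]

inferred = infer(facts)
ads = ads_for("ex:article1", inferred)
print(ads)
```

The article never mentions citrus trees directly; the link comes from two chained assertions, one of them taken from a different article.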
A note on the semester project • Please let me know what you want to do • Only four weeks left • Time to sign up for a brief, informal slot to let us see your project • Using the last three weeks of the class will spread them out • We will also use the time slot scheduled for the final exam, but having people do this earlier will make the “final exam” shorter • No paper, no slides, no formal presentation of any kind, just simple visuals…
Project, continued • What to bring in • Have something visual • If your project is “under the waterline”, please prepare a diagram or something simple to give us an idea of what you are demoing • A user scenario would be great • An architectural diagram of how the system is built • Be prepared to tell us • What interesting data management problem is being addressed • What’s good about this • What are its limitations • ** Can you identify any very long term or intractable challenges? **