This paper explores the use of Situation Theory to generate simple, comprehensible Concept Maps from RDF data, with the aim of improving analyst productivity and aiding data analysis.
Comprehension of RDF Data Using Situation Theory and Concept Maps • Jakub J. Moskal, Mieczyslaw “Mitch” M. Kokar, Brian E. Ulicny • November 19, 2014
Outline • Lots of RDF data • Querying with SPARQL produces complicated RDF graphs • Objective: Generate “simple” Concept Maps • Situation Theory (Barwise, Perry, Devlin) • STO: Situation Theory Ontology • Process outline and processing steps • Examples • Conclusions
Abundance of Data • Analysts are required to sift through tremendously large amounts of data • Keyword-based queries yield poor results • Structured data is needed • The number of RDF data sets is growing rapidly • Even though RDF data is structured, it can be very difficult to analyze Source: http://lod-cloud.net/
Linked Open Data cloud diagram, as of 08/30/2014 • Linking Open Data cloud diagram 2014, by Max Schmachtenberg, Christian Bizer, Anja Jentzsch and Richard Cyganiak. http://lod-cloud.net/
Example Query • Query: • What were the circumstances of Richard H. Barter’s death? • RDF Data: • SPARQL Endpoint: http://dbpedia.org/sparql • SPARQL query: PREFIX dbpedia-owl: <http://dbpedia.org/ontology/> DESCRIBE ?resource WHERE { ?resource dbpedia-owl:abstract ?abstract . FILTER langMatches(lang(?abstract), "EN") . FILTER REGEX(str(?abstract), "Richard H. Barter") } LIMIT 10
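A minimal sketch of how the query on this slide might be prepared for submission, assuming the endpoint accepts the standard SPARQL protocol `query` parameter over HTTP GET. The helper names `build_describe_query` and `build_describe_url` are hypothetical and not part of the original work; the sketch only builds the request URL and does not contact the endpoint.

```python
from urllib.parse import parse_qs, urlencode, urlparse

ENDPOINT = "http://dbpedia.org/sparql"  # endpoint named on the slide


def build_describe_query(phrase: str, limit: int = 10) -> str:
    """Return the DESCRIBE query from the slide for a given search phrase."""
    # Escape backslashes and quotes so the phrase is a valid SPARQL string literal.
    escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>\n"
        "DESCRIBE ?resource WHERE {\n"
        "  ?resource dbpedia-owl:abstract ?abstract .\n"
        '  FILTER langMatches(lang(?abstract), "EN") .\n'
        f'  FILTER REGEX(str(?abstract), "{escaped}")\n'
        "}\n"
        f"LIMIT {limit}"
    )


def build_describe_url(phrase: str, endpoint: str = ENDPOINT) -> str:
    """URL-encode the query as the `query` parameter of a GET request (hypothetical helper)."""
    return endpoint + "?" + urlencode({"query": build_describe_query(phrase)})


url = build_describe_url("Richard H. Barter")
# Decode the URL again to confirm the query survives encoding unchanged.
round_trip = parse_qs(urlparse(url).query)["query"][0]
```

The resulting URL could be fetched with any HTTP client; the RDF graph that comes back is the input that the rest of the process simplifies.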
Our Approach • Objective: • Given a query, transform the input RDF graph to a Concept Map that: • Provides an answer to the query • Contains facts that are relevant to the query (context) • Is more abstract than the original RDF graph (easier to comprehend) • Approach: • Use key aspects of the Situation Theory of Barwise and Perry (extended and formalized by Devlin) • Map the problem to this theory and implement algorithms for constructing concept maps based on such a framework
Expected Benefits • Increased analyst productivity • Easier comprehension • Tailored visualization • Explanatory facts • Improved quality of analyst products • Fewer false alarms • More detections of relevant events • Enriched fact base via inference • Augmented with situation types and their instances • Integration with other analyst tools • Export to standard formats
A Bit of Situation Theory - Infon - S “supports” Infon - Situation Type - Abstract Situation - Definitional Query - Inferring situations and their types - “Relevance” – via entailment
Situation Theory – Relevance Reasoning • Relevant entities with respect to a given query Q are those entities that are necessary for proving that a specific set of facts S_Q supported by a situation satisfies Q.
Situation Theory – Why? • It grounds meaning in the world, rather than in the language (unlike in FrameNets) • It allows specifying views of the world (situations) that are globally inconsistent, but locally consistent • Situations are first-class citizens: they have their own relations and attributes • The meaning of a declarative sentence is a relation between utterances and described situations
Representing Queries (in terms of ElementaryInfon and Situation Type) • “Did an insurgent visit a weapons cache?” • Expressible in pure OWL: • InsurgentWeaponsCacheSituation ≡ Situation and (supportedInfon some (ElementaryInfon and (anchor1 some Insurgent) and (anchor2 some WeaponsCache) and (relation value visit))) • “Which insurgents spied on a relative?” • Not expressible in pure OWL; requires the use of variables • Rules are necessary, for instance: • Situation(s) ∧ ElementaryInfon(i) ∧ Object(a1) ∧ Object(a2) ∧ Relation(spiedOn) ∧ supportedInfon(s, i) ∧ anchor1(i, a1) ∧ anchor2(i, a2) ∧ relation(i, spiedOn) ∧ Insurgent(a1) ∧ Person(a2) ∧ relative(a1, a2) → RelativeSpySituation(s)
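To make the rule for “Which insurgents spied on a relative?” concrete, here is a small sketch that checks its body against a set of triples. It is an illustration only, not the authors' rule engine: the individuals (`s1`, `i1`, `ali`, `omar`) are made-up test data, the function name is hypothetical, and the `Object(...)` and `Relation(...)` typing atoms are left out for brevity.

```python
# Facts are (subject, predicate, object) triples; all individuals are made-up test data.
FACTS = {
    ("s1", "type", "Situation"),
    ("i1", "type", "ElementaryInfon"),
    ("s1", "supportedInfon", "i1"),
    ("i1", "anchor1", "ali"),
    ("i1", "anchor2", "omar"),
    ("i1", "relation", "spiedOn"),
    ("ali", "type", "Insurgent"),
    ("omar", "type", "Person"),
    ("ali", "relative", "omar"),
}


def objects_of(facts, subject, predicate):
    """All objects o such that (subject, predicate, o) is a fact."""
    return {o for (s, p, o) in facts if s == subject and p == predicate}


def relative_spy_situations(facts):
    """Return every s for which the rule body holds (Object/Relation atoms omitted)."""
    matches = set()
    for s, p, i in facts:
        if p != "supportedInfon":
            continue
        if (s, "type", "Situation") not in facts:
            continue
        if (i, "type", "ElementaryInfon") not in facts:
            continue
        if (i, "relation", "spiedOn") not in facts:
            continue
        # The variables a1 and a2 are shared between the anchor and relative atoms.
        for a1 in objects_of(facts, i, "anchor1"):
            for a2 in objects_of(facts, i, "anchor2"):
                if (
                    (a1, "type", "Insurgent") in facts
                    and (a2, "type", "Person") in facts
                    and (a1, "relative", a2) in facts
                ):
                    matches.add(s)
    return matches


spy_situations = relative_spy_situations(FACTS)
```

The nested loops over `a1` and `a2` are what pure OWL cannot express: the same two individuals must appear both as the infon's anchors and as the arguments of `relative`.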
A Running Example (based on SynCOIN) • Query: • Which known insurgents are connected to people who have been to a weapons cache? • WCSit ≡ Situation and (supportedInfon some (ElementaryInfon and (anchor1 some Insurgent) and (anchor2 some (Person and hasBeenTo some WeaponsCache)) and (relation value isConnectedTo))) • Initial facts:
(1) Domain Inference • Infer implicit facts about the domain • If necessary, add additional axioms to the dataset • We added a few axioms to SynCOIN:
(2) Situation Reasoning • Analyze situation type definitions (both in OWL and rules) • Extract relevant relations used in definitions (visit and spiedOn in previous examples). Then extract relevant individuals. • For each relation rel that is part of a situation type: • For each pair of individuals a1 and a2 that are associated with each other by the property rel: • Assert that there is an individual s of RDF type sto:Situation • Assert that there is an individual i of RDF type sto:ElementaryInfon, supported by situation s • Assert the following facts: (i anchor1 a1), (i anchor2 a2) and (i relation rel)
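The assertion loop in this step can be sketched as follows. This is a minimal illustration over plain triples, not the BaseVISor implementation: the function name, the `s1`/`i1` naming scheme for minted individuals, and the sample facts are all assumptions made for the example.

```python
from itertools import count


def assert_situations(facts, relevant_relations):
    """For each fact (a1, rel, a2) with a relevant rel, mint a situation and an infon."""
    ids = count(1)
    new_facts = set()
    # Sorting only makes the generated names deterministic for this example.
    for a1, rel, a2 in sorted(facts):
        if rel not in relevant_relations:
            continue
        n = next(ids)
        s, i = f"s{n}", f"i{n}"  # hypothetical naming scheme for new individuals
        new_facts |= {
            (s, "rdf:type", "sto:Situation"),
            (i, "rdf:type", "sto:ElementaryInfon"),
            (s, "supportedInfon", i),
            (i, "anchor1", a1),
            (i, "anchor2", a2),
            (i, "relation", rel),
        }
    return new_facts


# Made-up input facts; only `visit` and `spiedOn` occur in situation type definitions.
facts = {
    ("ali", "visit", "cache7"),
    ("ali", "spiedOn", "omar"),
    ("omar", "livesIn", "Baghdad"),
}
situation_facts = assert_situations(facts, {"visit", "spiedOn"})
```

Each relevant fact yields six new triples, so the reasoner can later classify the minted situations against the situation type definitions; the `livesIn` fact is ignored because no situation type mentions that relation.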
Example – cont. Initial Graph: Current Answer:
(3) Context Derivation • Derive the context for the answer • Find relevant facts: all individuals and relations that are relevant to the situation that represents the answer to the query • Derivation based on domain-independent rules, which backtrack OWL inference • Currently: Property chain, sub-property, transitive property • Example derivation rule for transitive property: • For a situation s, and a query q, if s satisfies the query: • For every fact (i1 rel i2) relevant to s and an individual i3, if rel is a transitive property and if (i1 rel i3) and (i3 rel i2) are facts asserted in the knowledge base: • Add (i1 rel i3) and (i3 rel i2) as facts relevant to s.
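The transitive-property derivation rule can be sketched as below. This is a hedged illustration, not the authors' rule set: it assumes the situation s has already been found to satisfy the query, represents facts as plain triples, and repeats the rule until nothing new is added (the slide does not say whether the rule is iterated). The property `locatedIn` and the individuals are hypothetical.

```python
def derive_transitive_context(relevant, asserted, transitive_props):
    """Backtrack transitive inference: add the asserted facts that explain a relevant fact."""
    relevant = set(relevant)
    changed = True
    while changed:  # assumption: apply the rule until a fixpoint is reached
        changed = False
        for i1, rel, i2 in list(relevant):
            if rel not in transitive_props:
                continue
            # Candidate intermediate individuals i3 with (i1 rel i3) asserted.
            for i3 in {o for (s, p, o) in asserted if s == i1 and p == rel}:
                if (i3, rel, i2) in asserted:
                    new = {(i1, rel, i3), (i3, rel, i2)} - relevant
                    if new:
                        relevant |= new
                        changed = True
    return relevant


# Made-up knowledge base; `locatedIn` is assumed to be declared transitive.
asserted = {
    ("cache7", "locatedIn", "district4"),
    ("district4", "locatedIn", "Baghdad"),
}
# The inferred fact that the answer situation depends on.
relevant = {("cache7", "locatedIn", "Baghdad")}
context = derive_transitive_context(relevant, asserted, {"locatedIn"})
```

The loop terminates because every fact it adds comes from the finite set of asserted facts.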
Example – cont. Previous Step: Current Answer
(4) Simplification • Context derivation is likely to produce a lot of “noise” • We need to remove facts that are relevant to a situation, but that are not necessary to comprehend the graph • Simplification based on domain-independent rules • Example simplification rule for sub-property relation between relevant relations: • For a situation s, and a query q, if s satisfies the query: • For every relation r1 and r2 relevant to s, if r1 is a sub-property of r2: • For every two facts (i1 r1 i2) and (i1 r2 i2) that are both relevant to s: • Remove (i1 r2 i2) from the context of s.
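The sub-property simplification rule can be sketched in a few lines. This is an illustration under assumptions: the context of s is a set of triples, the sub-property relation is given as a dictionary from each property to its super-properties, and the property names (`hasBrother`, `hasRelative`) and individuals are invented for the example.

```python
def simplify_subproperties(relevant, sub_property_of):
    """Drop (i1, r2, i2) whenever (i1, r1, i2) is also relevant and r1 is a sub-property of r2."""
    redundant = {
        (i1, r2, i2)
        for (i1, r1, i2) in relevant
        for r2 in sub_property_of.get(r1, ())
        if (i1, r2, i2) in relevant
    }
    return set(relevant) - redundant


# Hypothetical property hierarchy: hasBrother is a sub-property of hasRelative.
sub_property_of = {"hasBrother": {"hasRelative"}}
context = {
    ("ali", "hasBrother", "omar"),
    ("ali", "hasRelative", "omar"),  # implied by the more specific fact above
    ("ali", "hasRelative", "sami"),  # kept: no more specific fact covers it
}
simplified = simplify_subproperties(context, sub_property_of)
```

Only the more general of two parallel edges is removed, so the concept map loses a link but no information: the dropped fact can still be re-derived from the sub-property axiom.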
Example – cont. Previous step: Final answer:
Conclusions • Objective: simplify answers to queries against RDF data • Approach: use Situation Theory (Barwise, Perry, Devlin) • Approximate Situation Theory formalization by using STO: Situation Theory Ontology, OWL and Rules • Queries represented by STO:ElementaryInfon and STO:Situation • Used OWL axioms to enhance reasoning about the domain • Developed domain-agnostic rules for inferring relevant situations, situation types, relations and individuals in situations • Developed context derivation rules • Developed context simplification rules • Developed a prototype and showed (on examples) that it works • BaseVISor was used for inference • To make it practical, “meta-reasoning” was needed.
Future Work • More domain-independent inference rules needed • Clustering • Inference-driven generalization • Machine Learning • Feedback collected from GUI • Concept/Link removal (affects transformation rules) • Graphical arrangement (affects clustering) • Scalability • Very large scale graph databases • Integration with data analytics • Evaluate with analysts!