Conversational Case Base Recommender Systems for Metadata Discovery

Mehmet S. Aktas, Marlon Pierce, Geoffrey Fox and David Leake, Indiana University

Solid Earth Research Virtual Observatory Grid (SERVOGrid)
• SERVOGrid is a NASA project to integrate historical, measured, and calculated earthquake data (GPS, seismicity, fault) with simulation codes.
• SERVOGrid resources are located at various institutions across the country.
• The number of resources and services, and how often they are used, is expected to grow quickly.

Characteristics of Computing for Solid Earth Science
• Widely distributed datasets in various formats
  • GPS, fault data, seismic data sets, InSAR satellite data
  • Many are available in state-of-the-art tar files that can be FTP’d
• Distributed models and expertise
  • Many codes with different regions of validity, ranging from cellular automata to finite element to data mining applications (HMM)
  • Some codes also have export or IP restrictions
  • Other codes are highly specialized to their deployment environments
• Decomposable problems requiring interoperability for linking full models
  • The fidelity of fault modeling can vary considerably
  • Link codes (through data) to support multiple scales

SERVOGrid Applications
• Codes range from simple “rough estimate” codes to parallel, high-performance applications.
• Disloc: handles multiple arbitrarily dipping dislocations (faults) in an elastic half-space.
• Simplex: inverts surface geodetic displacements for fault parameters using simulated annealing downhill residual minimization.
• GeoFEST: three-dimensional viscoelastic finite element model for calculating nodal displacements and tractions. Allows for realistic fault geometry and characteristics, material properties, and body forces.
• VirtualCalifornia: simulates interactions between vertical strike-slip faults using an elastic layer over a viscoelastic half-space.
• RDAHMM: time series analysis program based on Hidden Markov Modeling. Produces feature vectors and probabilities for transitioning from one class to another.
• Preprocessors, mesh generators: AKIRA suite
• Visualization tools: RIVA, GMT, IDL

Motivation
• The most fundamental challenge is simply making these codes usable by other researchers, and hooking the codes to data sources.
• The first step is to express resources with descriptive metadata.
• The next step is to explore intelligent retrieval mechanisms that make these resources available.

SERVOGrid Ontology Overview
• We have a collection of codes, visualization tools, computing resources, and data sets that we want to combine in an ontology.
• Ontology instances can then be built to describe specific resources.
• After we have built instances, we can pose queries on the data to retrieve values.
• Values may be structured, so we can do “stepped” queries.
• We thus need to start by grouping together related resources.

An Instance for the Disloc Code
<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
         xmlns:rdfs='http://www.w3.org/2000/01/rdf-schema#'
         xmlns:servo='http://www.servogrid.org/schemas/SERVOGridOntology#'
         xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description rdf:ID="Disloc">
    <rdf:type rdf:resource="http://www.servogrid.org/schemas/servoOntology#ApplicationCode"/>
    <dc:creator>A. Donnellan</dc:creator>
    <servo:installedOn rdf:resource="http://www.servogrid.org/instances/ComputeResources/Grids"/>
    <servo:takesInputData rdf:resource="http://www.servogrid.org/instances/data/Faults"/>
    <servo:createsOuputData rdf:resource="http://www.servogrid.org/instances/data/SurfaceStress"/>
  </rdf:Description>
</rdf:RDF>
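
The instance above can be read as a set of (subject, predicate, object) triples. The following is a minimal sketch of that reading, using only the Python standard library. The embedded snippet is an abbreviated copy of the Disloc instance that uses the standard W3C RDF namespace URI, and the helper name `extract_triples` is hypothetical, not part of SERVOGrid. It handles only the flat `rdf:Description` form shown here, not RDF/XML in general.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Abbreviated copy of the Disloc instance (standard W3C RDF namespace assumed).
DISLOC_RDF = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:servo="http://www.servogrid.org/schemas/SERVOGridOntology#"
    xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description rdf:ID="Disloc">
    <dc:creator>A. Donnellan</dc:creator>
    <servo:installedOn rdf:resource="http://www.servogrid.org/instances/ComputeResources/Grids"/>
    <servo:takesInputData rdf:resource="http://www.servogrid.org/instances/data/Faults"/>
  </rdf:Description>
</rdf:RDF>"""


def extract_triples(rdf_xml):
    """Return (subject, predicate, object) tuples for each rdf:Description."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}ID") or desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree expands prefixed names to "{namespace}local";
            # dropping the braces yields the full predicate URI.
            predicate = prop.tag.replace("{", "").replace("}", "")
            # The object is either a resource reference or literal text.
            obj = prop.get(f"{{{RDF_NS}}}resource") or (prop.text or "").strip()
            triples.append((subject, predicate, obj))
    return triples


triples = extract_triples(DISLOC_RDF)
for triple in triples:
    print(triple)
```

Each printed tuple corresponds to one property element of the description, which is the unit the later slides call an RDF triple.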

From Semantic Web (SW) Representation to Case-Based Reasoning (CBR)
• Developing new tools, applications and architectures on top of the Semantic Web is the real challenge.
• Can we ensure consistency and correctness in the presentation of information?
• AI techniques could serve as the basis for a resource recommender system.
• CBR is the most suitable AI technique for the SERVOGrid domain.

What is Case-Based Reasoning? (CBR in a Nutshell)
• CBR is reasoning by remembering.
• In CBR, recommendations are made by reasoning from the current set of cases.
• Classification CBR
  • When a problem description is entered, it is compared and contrasted with the current set of cases, and the most similar cases are suggested to the user as results.

Conversational CBR (CCBR) (CCBR in a Nutshell)
• CCBR is a type of CBR that relies on question-answer sessions to recommend the most similar cases.
• The user interacts with the system to fill in the gaps and retrieve the right cases.
• The system responds with ranked cases and questions at each step.
• The question-answer-ranking cycle continues until success or failure.
  • Success: the user finds an answer to the query.
  • Failure: no satisfactory case is found.
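
The question-answer-ranking cycle can be sketched as a simple loop. This is an illustrative sketch under stated assumptions, not the SERVOGrid implementation: the case names, the (property, value) pairs, the fixed alphabetical question order, and the helpers `run_session` and `unique_match` are all hypothetical.

```python
# Hypothetical toy case base: case name -> set of (property, value) pairs.
CASES = {
    "resource-1": {("kind", "code"), ("input", "faults")},
    "resource-2": {("kind", "code"), ("input", "gps")},
    "resource-3": {("kind", "data"), ("input", "none")},
}


def run_session(case_base, user_answers, is_satisfied):
    """Run the question-answer-ranking cycle; return a case name or None."""
    known = set()  # (property, value) answers collected so far
    pending = sorted({prop for feats in case_base.values() for prop, _ in feats})
    while True:
        # Rank cases by how many of the collected answers they contain.
        ranked = sorted(case_base,
                        key=lambda name: (-len(case_base[name] & known), name))
        if is_satisfied(ranked, known):
            return ranked[0]   # success: the user accepts the top case
        if not pending:
            return None        # failure: no questions left to ask
        question = pending.pop(0)
        if question in user_answers:  # the user may leave a question unanswered
            known.add((question, user_answers[question]))


def unique_match(ranked, known):
    """Stand-in for the user: accept once exactly one case fits all answers."""
    matches = [name for name in ranked if known and known <= CASES[name]]
    return len(matches) == 1


found = run_session(CASES, {"kind": "code", "input": "gps"}, unique_match)
not_found = run_session(CASES, {"kind": "code"}, unique_match)
print(found, not_found)
```

In the second session the single answer still fits two cases when the questions run out, so the session ends in failure.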

What is a Case? (CCBR Case in SERVOGrid)
• A case is composed of:
  • Problem description: metadata concerning the desired characteristics of a SERVOGrid resource, e.g., RDF triples describing a resource
  • Solution: a pointer to the resource described by the metadata in the problem description
• A case base is a library of cases generated from a file store of RDF files, each representing one case.

CCBR Case with RDF Representation
[Diagram: a CCBR case, with boxes labeled Problem, Solution, and RDF Triple]
RDF Triple = (Subject, Predicate, Object)
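
One way to picture this case structure in code is a small record type. This is a sketch only: the `Case` class, the `build_case` helper, the prefixed predicate names, and the `urn:example:` solution pointer are assumptions made for illustration, while the three triples are taken from the Disloc instance.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


@dataclass(frozen=True)
class Case:
    problem: FrozenSet[Triple]  # RDF triples describing the resource
    solution: str               # pointer to the described resource


def build_case(solution, triples):
    """Bundle a resource pointer with the triples that describe it."""
    return Case(problem=frozenset(triples), solution=solution)


disloc_case = build_case(
    "urn:example:servo:Disloc",  # hypothetical pointer, for illustration only
    [
        ("Disloc", "dc:creator", "A. Donnellan"),
        ("Disloc", "servo:installedOn",
         "http://www.servogrid.org/instances/ComputeResources/Grids"),
        ("Disloc", "servo:takesInputData",
         "http://www.servogrid.org/instances/data/Faults"),
    ],
)

# A case base is then just a collection of such cases.
case_base = [disloc_case]
print(len(case_base), len(disloc_case.problem))
```

Storing the problem as a frozen set makes triple lookup and set intersection cheap, which suits ranking by matching triples.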

CCBR Recommender System
• Ranking of the cases
  • Cases are ranked by their number of consistent triples, i.e., triples that match those entered by the user.
  • A case that has a matching triple is ranked higher.
  • A case that does not have the entered triple keeps its ranking, unless the user asks for cases that lack this triple.
• Ranking of the questions
  • Ranking can be based on the number of times a (property, property value) pair appears in the triples stored in the case base.
  • The system must recommend good starting points for user specification of servoObject class properties.
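
The two ranking rules can be sketched as follows. This is one possible reading of the slide rather than the project's actual scoring: the function names, the toy case base, and the choice to add one point per matching pair (and one point per excluded pair that a case lacks) are assumptions.

```python
from collections import Counter

# Hypothetical toy case base: case name -> set of (property, value) pairs.
CASES = {
    "case-1": {("type", "ApplicationCode"), ("takesInputData", "Faults")},
    "case-2": {("type", "ApplicationCode"), ("takesInputData", "GPS")},
    "case-3": {("type", "DataSet")},
}


def rank_cases(case_base, wanted, unwanted=frozenset()):
    """Return case names, best first.

    A case gains a point for each wanted pair it contains. Lacking a wanted
    pair leaves its score unchanged. A case also gains a point for each pair
    the user explicitly does not want and that the case lacks.
    """
    def score(name):
        pairs = case_base[name]
        return len(pairs & wanted) + len(unwanted - pairs)
    return sorted(case_base, key=lambda name: (-score(name), name))


def rank_questions(case_base, answered=frozenset()):
    """Order unanswered (property, value) pairs by how many cases contain them."""
    counts = Counter(pair for pairs in case_base.values()
                     for pair in pairs if pair not in answered)
    return sorted(counts, key=lambda pair: (-counts[pair], pair))


case_order = rank_cases(CASES,
                        wanted={("type", "ApplicationCode")},
                        unwanted={("takesInputData", "GPS")})
question_order = rank_questions(CASES)
print(case_order)
print(question_order[0])
```

Ties are broken alphabetically here only to keep the output deterministic; a real system would need its own tie-breaking policy.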

CCBR Recommender System
Case = <Problem, Solution>
[Diagram: two cases, A and B, are compared; one is the query case and the other is a case from the CCBR case base. One lists Feature 1, Feature 2, Feature 3, and Feature 4; the other lists Feature 1, Feature 2, and Feature 5.]
IF ((A.Feature1.Solution = B.Feature1.Solution) & (A.Feature2.Solution = B.Feature2.Solution)) THEN Consistency # = 2
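
The consistency rule on this slide amounts to counting the features that two cases share with equal values. A minimal sketch, assuming each case is a mapping from feature name to value; the feature values below are placeholders, and the function name `consistency` is hypothetical.

```python
def consistency(case_a, case_b):
    """Count the features present in both cases that carry equal values."""
    shared = case_a.keys() & case_b.keys()
    return sum(1 for feature in shared if case_a[feature] == case_b[feature])


# Placeholder values; only the feature names come from the slide.
a = {"Feature 1": "x", "Feature 2": "y", "Feature 3": "z", "Feature 4": "w"}
b = {"Feature 1": "x", "Feature 2": "y", "Feature 5": "v"}

score = consistency(a, b)
print(score)
```

Features that appear in only one case (Feature 3, 4, and 5 here) neither raise nor lower the count, matching the rule that a missing triple leaves the ranking unchanged.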

Recap: SERVOGrid Case Base Recommender System
• Goal: locating resources in a large-scale environment (the SERVOGrid project)
• Approach:
  • SERVOGrid ontology instances (metadata) to describe resources
  • A recommender system to aid metadata discovery
  • Conversational CBR, with Semantic Web markup languages providing a standard form for case representation

More Information
• SERVOGrid/QuakeSim: http://quakesim.jpl.nasa.gov/
• SERVOGrid Recommender Systems project: http://tambora.ucs.indiana.edu/~maktas/servo/project.html
• SERVOGrid Recommender Systems demo: http://ripvanwinkle.ucs.indiana.edu:4780/cbr/selection.jsp
• Publications: http://grids.ucs.indiana.edu/ptliupages/publications/

Questions/Comments
• Any questions and/or comments?
• Thanks!
