Conversational Case Base Recommender Systems for Metadata Discovery
Mehmet S. Aktas, Marlon Pierce, Geoffrey Fox and David Leake
Indiana University

Solid Earth Research Virtual Observatory Grid (SERVOGrid)
• SERVOGrid is a NASA project to integrate historical, measured, and calculated earthquake data (GPS, seismicity, fault) with simulation codes.
• SERVOGrid resources are located at various institutions across the country.
• The number of resources and services, and how often they are used, is expected to grow quickly.

Characteristics of Computing for Solid Earth Science
• Widely distributed datasets in various formats
  • GPS, fault data, seismic data sets, InSAR satellite data
  • Many available in state-of-the-art tar files that can be FTP'd
• Distributed models and expertise
  • Lots of codes with different regions of validity, ranging from cellular automata to finite element to data mining applications (HMM)
  • Some codes also have export or IP restrictions
  • Other codes are highly specialized to their deployment environments
• Decomposable problems requiring interoperability for linking full models
  • The fidelity of your fault modeling can vary considerably
  • Link codes (through data) to support multiple scales

SERVOGrid Applications
• Codes range from simple "rough estimate" codes to parallel, high-performance applications.
• Disloc: handles multiple arbitrarily dipping dislocations (faults) in an elastic half-space.
• Simplex: inverts surface geodetic displacements for fault parameters using simulated annealing downhill residual minimization.
• GeoFEST: three-dimensional viscoelastic finite element model for calculating nodal displacements and tractions. Allows for realistic fault geometry and characteristics, material properties, and body forces.
• VirtualCalifornia: program to simulate interactions between vertical strike-slip faults using an elastic layer over a viscoelastic half-space.
• RDAHMM: time series analysis program based on Hidden Markov Modeling. Produces feature vectors and probabilities for transitioning from one class to another.
• Preprocessors, mesh generators: AKIRA suite
• Visualization tools: RIVA, GMT, IDL

Motivation
• The most fundamental challenge is simply making these codes usable by other researchers, and connecting them to data sources.
• The first step is to describe resources with descriptive metadata.
• The next step is to explore intelligent retrieval mechanisms that make these resources available.

SERVOGrid Ontology Overview
• We have a collection of codes, visualization tools, computing resources, and data sets that we want to combine in an ontology.
• Ontology instances can then be built to describe specific resources.
• After we have built instances, we can pose queries on the data to retrieve values.
• Values may be structured, so we can do "stepped" queries.
• We thus need to start by grouping together related resources.

An Instance for Disloc code
<rdf:RDF
    xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
    xmlns:rdfs='http://www.w3.org/2000/01/rdf-schema#'
    xmlns:servo='http://www.servogrid.org/schemas/SERVOGridOntology#'
    xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description rdf:ID="Disloc">
    <rdf:type rdf:resource="http://www.servogrid.org/schemas/servoOntology#ApplicationCode"/>
    <dc:creator>A. Donnellan</dc:creator>
    <servo:installedOn rdf:resource="http://www.servogrid.org/instances/ComputeResources/Grids"/>
    <servo:takesInputData rdf:resource="http://www.servogrid.org/instances/data/Faults"/>
    <servo:createsOuputData rdf:resource="http://www.servogrid.org/instances/data/SurfaceStress"/>
  </rdf:Description>
</rdf:RDF>
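
To make the link between an instance document and RDF triples concrete, the following is a minimal sketch that reads an abridged copy of the Disloc instance and lists its (subject, predicate, object) triples. It uses only the Python standard library and handles just the flat rdf:Description shape shown on this slide; it is not a general RDF parser and is not the parser used by the SERVOGrid project. The function name `extract_triples` is an illustrative assumption, and the sample uses the standard RDF namespace URI (http://www.w3.org/1999/02/22-rdf-syntax-ns#).

```python
import xml.etree.ElementTree as ET

# Standard RDF syntax namespace.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Abridged copy of the Disloc instance from the slide.
DISLOC_RDF = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:servo="http://www.servogrid.org/schemas/SERVOGridOntology#"
    xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description rdf:ID="Disloc">
    <dc:creator>A. Donnellan</dc:creator>
    <servo:takesInputData rdf:resource="http://www.servogrid.org/instances/data/Faults"/>
  </rdf:Description>
</rdf:RDF>"""


def extract_triples(rdf_xml):
    """Return (subject, predicate, object) tuples for flat rdf:Description blocks.

    Hypothetical helper: it only understands property elements that carry
    either an rdf:resource attribute or plain text content.
    """
    root = ET.fromstring(rdf_xml)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}ID") or desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree tags look like "{namespace}local"; join them into one URI.
            namespace, local = prop.tag[1:].split("}")
            obj = prop.get(f"{{{RDF_NS}}}resource") or (prop.text or "").strip()
            triples.append((subject, namespace + local, obj))
    return triples


for triple in extract_triples(DISLOC_RDF):
    print(triple)
```

Each printed tuple corresponds to one property element of the Disloc description, which is the unit the later slides treat as a case feature.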

From Semantic Web (SW) Representation to Case-Based Reasoning (CBR)
• Developing new tools, applications, and architectures on top of the Semantic Web is the real challenge.
• Can we ensure consistency and correctness in the presentation of information?
• AI techniques could serve as the basis for a resource recommender system.
• CBR is the most suitable AI technique for the SERVOGrid domain.

What is Case-Based Reasoning? (CBR in a Nutshell)
• CBR is reasoning by remembering.
• In CBR, recommendations are made by reasoning from the current set of cases.
• Classification CBR: when a problem description is entered, it is compared and contrasted with the current set of cases, and the most similar cases are suggested to the user as results.

Conversational CBR (CCBR) (CCBR in a Nutshell)
• CCBR is a type of CBR that relies on question-answer sessions to recommend the most similar cases.
• The user interacts with the system to fill in the gaps and retrieve the right cases.
• The system responds with ranked cases and questions at each step.
• The question-answer-ranking cycle continues until success or failure:
  • success: the user finds an answer to the query
  • failure: no satisfactory case is found
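
The question-answer-ranking cycle can be sketched as a small loop. This is an illustrative simulation under stated assumptions, not the SERVOGrid implementation: the case base contents, the scripted `user_answers`, and the `is_satisfied` callback (which stands in for the user deciding whether the top-ranked case answers the query) are all hypothetical.

```python
def rank(case_base, known):
    """Order case names by how many known (property, value) answers they match."""
    def score(name):
        return sum(case_base[name].get(prop) == value for prop, value in known.items())
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(case_base, key=lambda name: (-score(name), name))


def converse(case_base, user_answers, is_satisfied):
    """Run one scripted conversation; return the accepted case name or None."""
    known = {}
    questions = sorted({prop for props in case_base.values() for prop in props})
    for question in questions:
        ranked = rank(case_base, known)
        if is_satisfied(ranked[0]):
            return ranked[0]                      # success: user accepts a case
        if question in user_answers:              # user answers the next question
            known[question] = user_answers[question]
    ranked = rank(case_base, known)
    return ranked[0] if is_satisfied(ranked[0]) else None  # None signals failure


# Hypothetical case base: resource name -> {property: value}.
CASE_BASE = {
    "Disloc": {"type": "ApplicationCode", "takesInputData": "Faults"},
    "Simplex": {"type": "ApplicationCode", "takesInputData": "SurfaceDisplacements"},
    "GMT": {"type": "VisualizationTool"},
}

answers = {"type": "ApplicationCode", "takesInputData": "SurfaceDisplacements"}
print(converse(CASE_BASE, answers, lambda name: name == "Simplex"))
```

In a real session the answers would come from the user interactively, and the system would also show the full ranked list at each step rather than only the top case.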

What is a Case? (CCBR Case in SERVOGrid)
• A case is composed of:
  • problem description: metadata concerning the desired characteristics of a SERVOGrid resource, e.g., RDF triples describing a resource
  • solution: a pointer to the resource described by the metadata in the problem description
• A case base is a library of cases generated from a file store of RDF files, each representing one case.

CCBR Case with RDF Representation
[Figure: a CCBR case consists of a Problem part, made up of RDF triples, and a Solution part.]
RDF Triple = (Subject, Predicate, Object)
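
One possible in-memory form of such a case is sketched below. The `Case` class and the use of the string "Disloc" as the solution pointer are assumptions made for illustration; the triples reuse predicates and objects from the Disloc instance, written with the `servo:` prefix for brevity.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

# An RDF triple is (subject, predicate, object).
Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class Case:
    """Hypothetical case record: a problem description plus a solution pointer."""
    problem: FrozenSet[Triple]   # RDF triples describing the resource
    solution: str                # pointer to the described resource


disloc_case = Case(
    problem=frozenset({
        ("Disloc", "servo:installedOn",
         "http://www.servogrid.org/instances/ComputeResources/Grids"),
        ("Disloc", "servo:takesInputData",
         "http://www.servogrid.org/instances/data/Faults"),
    }),
    solution="Disloc",  # assumed pointer format; the slides do not specify one
)

print(len(disloc_case.problem), disloc_case.solution)
```

Storing the problem as a set of triples makes the later matching step a plain set intersection between a query and each stored case.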

CCBR Recommender System
• Ranking of the cases
  • Cases are ranked by their number of consistent triples, i.e., triples that agree with what the user has entered.
  • If a case has a matching triple, it is ranked higher.
  • If a case does not have the entered triple, its ranking does not change, unless the user asks for cases that do not have this triple.
• Ranking of the questions
  • Ranking can be based on how often each (property, property value) pair appears in the triples stored in the case base.
  • The system must recommend good starting points for user specification of servoObject class properties.
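
The two ranking rules can be sketched as follows. This is one reading of the slide, not the project's code: a case earns one point per wanted (property, value) pair it contains, is unaffected by wanted pairs it lacks, and earns one point per pair the user explicitly asked to be absent. Questions are ordered by how often each pair occurs across the case base. The case base contents and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical case base: resource name -> set of (property, value) pairs.
CASE_BASE = {
    "Disloc": {("type", "ApplicationCode"), ("takesInputData", "Faults")},
    "Simplex": {("type", "ApplicationCode"), ("takesInputData", "SurfaceDisplacements")},
    "GMT": {("type", "VisualizationTool")},
}


def rank_cases(case_base, wanted, unwanted=frozenset()):
    """Return (name, score) pairs, best first.

    wanted:   pairs the user entered; each match adds one point.
    unwanted: pairs the user wants absent; each absence adds one point.
    """
    scores = {}
    for name, pairs in case_base.items():
        scores[name] = len(pairs & wanted) + len(unwanted - pairs)
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def rank_questions(case_base, answered=frozenset()):
    """Return unanswered (pair, count) entries, most frequent pair first."""
    counts = Counter(pair for pairs in case_base.values() for pair in pairs)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(pair, n) for pair, n in ordered if pair not in answered]


query = {("type", "ApplicationCode"), ("takesInputData", "Faults")}
print(rank_cases(CASE_BASE, query))
print(rank_questions(CASE_BASE)[0])
```

Asking about the most frequent pair first is only one possible heuristic for "good starting points"; a pair that splits the case base more evenly could discriminate better, and the slides do not say which the system uses.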

CCBR Recommender System
[Figure: a case A from the CCBR case base is compared with a query case B. Case A has Features 1, 2, 3, and 4; query case B has Features 1, 2, and 5.]
IF ((A.Feature1.Solution = B.Feature1.Solution) & (A.Feature2.Solution = B.Feature2.Solution)) THEN Consistency # = 2
Case = <Problem, Solution>
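
The consistency count in this example can be worked through in a few lines. The sketch reads the slide as case A having Features 1 to 4 and query case B having Features 1, 2, and 5; the feature values ("v1", "v2", and so on) are placeholders invented for illustration.

```python
def consistency_number(case, query):
    """Count query features that the case also has with the same value."""
    return sum(
        1
        for feature, value in query.items()
        if feature in case and case[feature] == value
    )


# Placeholder values; only the pattern of shared features follows the slide.
case_a = {"Feature1": "v1", "Feature2": "v2", "Feature3": "v3", "Feature4": "v4"}
query_b = {"Feature1": "v1", "Feature2": "v2", "Feature5": "v5"}

print(consistency_number(case_a, query_b))  # prints 2
```

Features 1 and 2 agree, Feature 5 is absent from case A, and Features 3 and 4 are not part of the query, so the consistency number is 2.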

Recap: SERVOGrid Case Base Recommender System
• Goal: locating resources in a large-scale environment (the SERVOGrid project)
• Approach:
  • SERVOGrid ontology instances (metadata) to describe resources
  • A recommender system to aid metadata discovery
  • Conversational CBR, with Semantic Web markup languages providing a standard form for case representation

More Information
• SERVOGrid/QuakeSim: http://quakesim.jpl.nasa.gov/
• SERVOGrid Recommender Systems project: http://tambora.ucs.indiana.edu/~maktas/servo/project.html
• SERVOGrid Recommender Systems demo: http://ripvanwinkle.ucs.indiana.edu:4780/cbr/selection.jsp
• Publications: http://grids.ucs.indiana.edu/ptliupages/publications/

Questions/Comments
• Any questions and/or comments?
• Thanks!