Phase II Additions to LSG
• Search capability added to Gene Browser
  • Through the GUI in Gene Browser
• BLAST plugin that invokes the remote EBI BLAST service
• Working set manager
  • State retention between sessions
• Provenance viewer
  • Displays annotated provenance
  • Accepts RDF representations
Knowledge Discovery through Provenance Collection, Representation, and Use in the Life Science Grid (LSG)
Phase II Final Report: Architectural and Technical Details
Beth Plale
Director, Center for Data and Search Informatics
Indiana University
Key contribution to LSG proper (provenance aside)
• Introduction of state
[Architecture diagram. Labels shown: LSG Space; user; Annotated Provenance Graph; Working Set Manager; RDF Viewer; Lilly CAB Bus; S-OGSA Service; Entrez; Karma Framework; Provenance Graphs (OPM*); RDF Interface; Gene Ontology; Karma Services; Lilly to Karma Events Reflector; Provenance DB; Semantic Binding; Gene Browser; BLAST; Events Capture; Karma structure ontology; Proxy; Karma Events Bus (WS+LSG); Services and data ontology (myGrid); SAWSDL Registry; Service Annotations; Resources (public + private)]
*Open Provenance Model v1.01
Demo I: Phase II Use Case
• Select “gene” from the database list; the list will show in Gene Browser
• Submit gene to NCBI
• Open the BLAST plugin tab and download the FASTA sequence
• Run BLAST
• Add results to the Working Set
• Annotate the Working Set
The Working Set Manager listens to the CAB bus. It takes an Entrez ID and/or all or part of a BLAST result as input to a working set, or imports a CSV file into a working set. Working set WS2 was generated from working set WS1 by Delete Rows. WS2 can be exported as a CSV file.
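The import, Delete Rows, and export steps described above can be sketched in a few lines of Python. This is only an illustration: the `WorkingSet` class, its fields, the column names, and the sample rows are all hypothetical, and the slides do not show how the Working Set Manager is actually implemented.

```python
import csv
import io
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class WorkingSet:
    """Hypothetical model: named rows plus a record of how the set was derived."""
    name: str
    rows: list
    derived_from: Optional[str] = None  # name of the parent working set, if any
    operation: Optional[str] = None     # operation that produced this set

    @classmethod
    def from_csv(cls, name: str, text: str) -> "WorkingSet":
        # Import CSV text; each row becomes a dict keyed by the header line.
        return cls(name, list(csv.DictReader(io.StringIO(text))))

    def delete_rows(self, new_name: str, predicate: Callable[[dict], bool]) -> "WorkingSet":
        # Build a new working set without the rows matching the predicate,
        # and remember which set and operation it came from.
        kept = [dict(r) for r in self.rows if not predicate(r)]
        return WorkingSet(new_name, kept, derived_from=self.name, operation="Delete Rows")

    def to_csv(self) -> str:
        # Export the rows as CSV text; an empty set exports as an empty string.
        if not self.rows:
            return ""
        buf = io.StringIO()
        writer = csv.DictWriter(buf, fieldnames=list(self.rows[0].keys()), lineterminator="\n")
        writer.writeheader()
        writer.writerows(self.rows)
        return buf.getvalue()


# Invented sample data; column names are assumptions, not the LSG format.
SAMPLE_CSV = "entrez_id,hit,e_value\n1001,hitA,1e-50\n1001,hitB,0.5\n"

ws1 = WorkingSet.from_csv("WS1", SAMPLE_CSV)
ws2 = ws1.delete_rows("WS2", lambda r: float(r["e_value"]) > 0.01)
print(ws2.derived_from, ws2.operation)
print(ws2.to_csv())
```

Keeping `derived_from` and `operation` on the derived set is one simple way to retain the "WS2 was generated from WS1 by Delete Rows" relationship alongside the data.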
Demo II: Query provenance database
Query BLAST-related data.

Query 1: get the latest Blast_Plugin process.

    create or replace view v1 as
    select process_id, service_id, process_initialization_time
    from process
    where service_id like '%Blast_Plugin'
      and process_initialization_time =
          (select max(process_initialization_time)
           from process
           where service_id like '%Blast_Plugin');

    select * from v1;

Query 2: get the service (Blast_Ebi_Web_Service) invoked by Blast_Plugin.

    select invoker_id, invokee_id, p.service_id
    from invocation, process p, v1
    where invoker_id = v1.process_id
      and invokee_id = p.process_id;
Demo II: Query provenance database

Query 3: get the input to Blast_Ebi_Web_Service.

    select p.service_id, artifact_id, artifact_value
    from artifact_used au, artifact a, process p, v1
    where au.artifact_no = a.artifact_no
      and au.process_id = p.process_id
      and p.process_id = v1.process_id + 1
      and p.service_id like '%Blast_Ebi_Web_Service';

Query 4: get the output from Blast_Ebi_Web_Service.

    select p.service_id, artifact_id, artifact_value
    from artifact_generated ag, artifact a, process p, v1
    where ag.artifact_no = a.artifact_no
      and ag.process_id = p.process_id
      and p.process_id = v1.process_id + 2
      and p.service_id like '%Blast_Ebi_Web_Service';
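To try the four demo queries without access to the provenance database, they can be run against an in-memory SQLite database. The sketch below makes several assumptions: the table layouts are inferred only from the column names used in the queries, the column types and all sample rows are invented, and the process IDs are chosen so that the `+ 1` and `+ 2` offsets in Queries 3 and 4 resolve. SQLite has no `create or replace view`, so the view is dropped and recreated instead.

```python
import sqlite3

# Assumed schema: column names come from the demo queries, types are guesses.
SCHEMA = """
CREATE TABLE process (process_id INTEGER PRIMARY KEY, service_id TEXT,
                      process_initialization_time TEXT);
CREATE TABLE invocation (invoker_id INTEGER, invokee_id INTEGER);
CREATE TABLE artifact (artifact_no INTEGER PRIMARY KEY, artifact_id TEXT,
                       artifact_value TEXT);
CREATE TABLE artifact_used (artifact_no INTEGER, process_id INTEGER);
CREATE TABLE artifact_generated (artifact_no INTEGER, process_id INTEGER);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Invented sample rows: an older plugin run (1), the latest plugin run (3),
# and web-service process records at ids 4 and 5.
conn.executemany("INSERT INTO process VALUES (?, ?, ?)", [
    (1, "urn:example:Blast_Plugin", "2000-01-01 10:00:00"),
    (3, "urn:example:Blast_Plugin", "2000-01-02 09:00:00"),
    (4, "urn:example:Blast_Ebi_Web_Service", "2000-01-02 09:00:01"),
    (5, "urn:example:Blast_Ebi_Web_Service", "2000-01-02 09:00:02"),
])
conn.execute("INSERT INTO invocation VALUES (3, 4)")
conn.executemany("INSERT INTO artifact VALUES (?, ?, ?)", [
    (1, "input-1", ">query ACGT"),
    (2, "output-1", "hit list"),
])
conn.execute("INSERT INTO artifact_used VALUES (1, 4)")
conn.execute("INSERT INTO artifact_generated VALUES (2, 5)")

# Query 1: view over the most recently initialized Blast_Plugin process.
conn.executescript("""
DROP VIEW IF EXISTS v1;
CREATE VIEW v1 AS
SELECT process_id, service_id, process_initialization_time
FROM process
WHERE service_id LIKE '%Blast_Plugin'
  AND process_initialization_time =
      (SELECT MAX(process_initialization_time)
       FROM process WHERE service_id LIKE '%Blast_Plugin');
""")
latest = conn.execute("SELECT * FROM v1").fetchall()

# Query 2: the service invoked by the latest Blast_Plugin process.
q2_rows = conn.execute("""
SELECT invoker_id, invokee_id, p.service_id
FROM invocation, process p, v1
WHERE invoker_id = v1.process_id AND invokee_id = p.process_id
""").fetchall()

# Queries 3 and 4 differ only in the artifact link table and the id offset.
ARTIFACT_QUERY = """
SELECT p.service_id, artifact_id, artifact_value
FROM {link} l, artifact a, process p, v1
WHERE l.artifact_no = a.artifact_no
  AND l.process_id = p.process_id
  AND p.process_id = v1.process_id + {offset}
  AND p.service_id LIKE '%Blast_Ebi_Web_Service'
"""
q3_rows = conn.execute(ARTIFACT_QUERY.format(link="artifact_used", offset=1)).fetchall()
q4_rows = conn.execute(ARTIFACT_QUERY.format(link="artifact_generated", offset=2)).fetchall()

for label, rows in [("Q1", latest), ("Q2", q2_rows), ("Q3", q3_rows), ("Q4", q4_rows)]:
    print(label, rows)
```

Note that Queries 3 and 4 locate the web-service records by adding a fixed offset to the plugin's process ID, so they depend on the order in which process records were inserted.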
Suggested Future Work
• Next steps:
  • Engagement of users, or
  • Expanded functionality set
    • Build in development time if we must do this ourselves
• Represent to user combined visualization and process provenance
• Write research-quality paper
  • Requires user study, or comparison, or …
• Formally integrate provenance collection tools into non-public LSG
Suggested Future Work: Technical
• Support BLAST in asynchronous mode
• Extend NCBI Entrez to work on other NCBI databases
• Design rich provenance queries
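As one possible direction for richer provenance queries, a recursive query can follow invocation edges to any depth rather than looking up a single invoker and invokee pair. The sketch below is an assumption-laden illustration, not part of the Phase II work: it uses SQLite, a reduced schema with only the `process` and `invocation` columns it needs, and invented rows, including a made-up `Hypothetical_Downstream_Service`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced, assumed schema: only the columns this example needs.
conn.executescript("""
CREATE TABLE process (process_id INTEGER PRIMARY KEY, service_id TEXT);
CREATE TABLE invocation (invoker_id INTEGER, invokee_id INTEGER);
""")

# Invented chain of invocations: 1 -> 2 -> 3 (acyclic).
conn.executemany("INSERT INTO process VALUES (?, ?)", [
    (1, "Blast_Plugin"),
    (2, "Blast_Ebi_Web_Service"),
    (3, "Hypothetical_Downstream_Service"),
])
conn.executemany("INSERT INTO invocation VALUES (?, ?)", [(1, 2), (2, 3)])

# Recursive common table expression: start at one process and repeatedly
# follow invoker -> invokee edges, recording the distance from the start.
# With UNION ALL this terminates only because the sample graph has no cycles.
DOWNSTREAM = """
WITH RECURSIVE downstream(process_id, depth) AS (
    SELECT ?, 0
    UNION ALL
    SELECT i.invokee_id, d.depth + 1
    FROM invocation i
    JOIN downstream d ON i.invoker_id = d.process_id
)
SELECT d.depth, p.service_id
FROM downstream d
JOIN process p ON p.process_id = d.process_id
ORDER BY d.depth
"""

chain = conn.execute(DOWNSTREAM, (1,)).fetchall()
for depth, service in chain:
    print(depth, service)
```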