Crux • flexible, structured data reporting • for funding agencies
The Challenge MJFF has a comprehensive, large-scale funding program (~40 programs, ~680 funded projects). Final reports are written documents, summarized as abstracts, e.g., Prochaintz & Harmann (Fast track 2007): ‘… the protein Engrailed is able to rescue the neurons most affected in Parkinson’s disease …’ * What is the data supporting this claim? * From MJFF Website
Current state-of-the-art ‘Data Sharing’ File sharing systems provide centralized, secure access to files. But this is ‘just’ a complex shared directory. * How can the science be tracked easily and efficiently? The https://brainfu.org site * with apologies to the brainfu designers.
Capturing Scientific Communication • Two primary areas of communication breakdown • Experimental Procedures • Experimental Data
Experimental Procedures • Poorly described at the beginning (grant proposal) and end of the process (publication) • Difficult to replicate experiments • Difficult to design the next experimental step • Difficult to track changes
Experimental Data • Only summary data available during grant assessment and after publication • Difficult to evaluate progress during grant assessments • Greatly diminishes utility of meta-analysis • Results are decoupled from the supporting raw data
Data Files • Lab Notebooks, Computer Files, Images, Spreadsheets, Text Documents, etc. • Sometimes imported into ad-hoc summary databases • Data decoupled from experimental procedure
Inconsistency + Evolution • Over the course of any project: • Procedures and data formats may change • Original requirements may change or be refined • The data schema will be updated to reflect this, but what happens to data in previous versions of the database?
Why? • Why is this happening? Capturing and communicating this information using currently available methods is burdensome and haphazard. • Why is this a problem? Too much critical data stays in the researcher’s head and private notebooks.
What Is Crux? • A user-friendly tool to record all critical data at the point of funding and throughout the course of research. • Uses standard terminology to describe data • Precise curation using community bio-ontologies • Formats are easily understood by scientists and funders • Data, procedures, and experiment designs are permanently linked and stored in a unified file system.
This presentation • Walkthrough • Crux web-application • Experimental design tool • OBI ontology curation • Designs and experimental data for Emborg + Codman • Upload / download of data using spreadsheets • Demonstration • Enter a new experimental design LIVE! • Overview of workload associated with model construction, data entry and ontology curation • Overall Accomplishments • Preliminary description of plans for year 2
How it Works – A Walkthrough • a demonstration of experiment management with crux *image by webtreats
1. Dashboard • Serves as a focal point summarizing the contents of the system • Each experiment listed separately • Links to: • metadata • design • data
2. Experimental Design • The Experimental Design Diagram • Protocol represented as a diagram • Clear complete description of the experiment
From Experimental Design to Data Design The dependency of each measurement on its parameters is found by tracing backwards through the protocol. Note that the model is not complete, but it works well for simple measurements + analysis. Codman in-vivo
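The backward trace can be sketched as a small graph walk. This is a minimal illustration only, not Crux's implementation: the node names, the `UPSTREAM` map and the `parameters_for` helper are all hypothetical and are not taken from the Codman design.

```python
from collections import deque

# Hypothetical protocol: each step maps to the upstream steps or inputs feeding it.
# These names are illustrative assumptions, not part of any real Crux design.
UPSTREAM = {
    "cell_count": ["stain_tissue"],
    "stain_tissue": ["section_tissue"],
    "section_tissue": ["infuse_animal"],
    "infuse_animal": ["dose", "animal_id"],
    "dose": [],
    "animal_id": [],
}

# Nodes treated as experimental parameters (again, assumed for the example).
PARAMETERS = {"dose", "animal_id"}


def parameters_for(measurement, upstream=UPSTREAM, parameters=PARAMETERS):
    """Walk upstream from a measurement and collect every parameter reached."""
    seen = set()
    found = set()
    queue = deque([measurement])
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # guard against revisiting shared upstream steps
        seen.add(node)
        if node in parameters:
            found.add(node)
        queue.extend(upstream.get(node, []))
    return sorted(found)


print(parameters_for("cell_count"))
```

Each measurement then gets a data table keyed by exactly the parameters returned for it.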
Simple Data Gathering • Low burden on the investigator • Spreadsheets used as the data forms • Files and images are uploaded into Crux with the data
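A rough sketch of how a spreadsheet might serve as a data form: the sheet is exported as CSV and its header is checked against the columns a design expects. The column names and the `read_data_form` helper are assumptions made for illustration; the slides do not describe Crux's actual import format.

```python
import csv
import io

# Hypothetical columns a design might expect; not taken from Crux.
EXPECTED_COLUMNS = ["animal_id", "dose", "cell_count"]


def read_data_form(text, expected=EXPECTED_COLUMNS):
    """Parse CSV text and keep only the expected columns, failing on missing ones."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [name for name in expected if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    # Values stay as strings; type conversion is left to a later step.
    return [{name: row[name] for name in expected} for row in reader]


SAMPLE = "animal_id,dose,cell_count\nA1,0.5,120\nA2,1.0,95\n"
rows = read_data_form(SAMPLE)
print(len(rows), "rows read")
```

Rejecting a sheet with missing columns at upload time keeps the burden on the investigator low while still tying every row to the design.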
Experiments in the demo Experiments in system: • Codman: in-vitro, in-vivo • Emborg: in-vitro, ex-vivo, in-vivo
Standardized Terminology • Professional curation produces accurate and clearly understandable experiment designs • Using NIH-funded community repositories of curation terms enables data mining and sharing • Crux allows a curator to associate professionally-curated terms with experimental elements
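One way to picture the association between curated terms and experimental elements is a simple annotation record. The classes below are hypothetical, and the term identifier is a placeholder rather than a real ontology ID.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OntologyTerm:
    """A curated term; the ID used below is a placeholder, not a real ontology ID."""
    term_id: str
    label: str


@dataclass
class DesignElement:
    """A named element of an experimental design with its curated annotations."""
    name: str
    terms: list = field(default_factory=list)

    def annotate(self, term):
        # Adding the same term twice is a no-op, so curation can be repeated safely.
        if term not in self.terms:
            self.terms.append(term)


PLACEHOLDER = OntologyTerm("EXAMPLE:0000001", "placeholder assay term")
element = DesignElement("cell_count")
element.annotate(PLACEHOLDER)
element.annotate(PLACEHOLDER)
print(element)
```

Because the annotation is stored separately from the element's name, a curator can refine terms later without touching the investigator's design.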
Curation In Parallel • Reviewers and Investigators are not burdened by the curation process • The system gets more and more useful as the curation gets more detailed
A Complete Solution • Clear evolvable experimental descriptions • Automatic database creation • Data import through spreadsheets • Data query, report and export Describe Gather Query
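The Describe / Gather / Query cycle can be sketched with an in-memory SQLite database. This is only an illustration of creating a table from a design description: the `DESIGN` dictionary and the `create_table` helper are assumptions, and the slides do not say which database Crux uses.

```python
import sqlite3

# Describe: a hypothetical design listing a table name and typed columns.
DESIGN = {
    "name": "infusion_study",
    "columns": {"animal_id": "TEXT", "dose": "REAL", "cell_count": "INTEGER"},
}

ALLOWED_TYPES = {"TEXT", "REAL", "INTEGER"}


def create_table(conn, design):
    """Create one table from a design, validating names and types first."""
    table = design["name"]
    columns = design["columns"]
    if not all(name.isidentifier() for name in [table, *columns]):
        raise ValueError("table and column names must be plain identifiers")
    if not set(columns.values()) <= ALLOWED_TYPES:
        raise ValueError("unsupported column type")
    column_sql = ", ".join(f'"{name}" {kind}' for name, kind in columns.items())
    conn.execute(f'CREATE TABLE "{table}" ({column_sql})')


conn = sqlite3.connect(":memory:")
create_table(conn, DESIGN)

# Gather: rows would normally come from an uploaded spreadsheet.
conn.executemany(
    "INSERT INTO infusion_study VALUES (?, ?, ?)",
    [("A1", 0.5, 120), ("A2", 1.0, 95)],
)

# Query: select measurements by parameter value.
result = conn.execute(
    "SELECT animal_id, cell_count FROM infusion_study WHERE dose < 1.0"
).fetchall()
print(result)
```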
Demo Walkthrough – From Design to Data SYSTEM DEMONSTRATION
MJFF Programs Experimental Programs: Discovery / Translational / Clinical / (+ some I.T. development, such as this project). The long-term objective of this system is to enable any scientist in an executive role (program officers, journal editors, foundation executives) to do more than address the question: ‘were the goals of the project fulfilled?’ We want to provide data management, sharing and analysis so that these individuals can guide and accelerate research by (a) gaining access to all the data and (b) performing meta-analysis and knowledge synthesis over that data.
Further Development • The 2010 system was funded as a proof-of-concept prototype • Future funding targets beta software for use at MJFF, the Kinetics Foundation and other agencies if possible • Developers will work closely with beta testers at MJFF and the Kinetics Foundation • Crux will also be developed and distributed as a capability within the Biomedical Informatics Research Network (BIRN, http://www.birncommunity.org/), a community organization building infrastructure for the biomedical community.
Year 2 Plans • Development of a system tailored to manage a well-specified set of grants pertaining to a family of experiments conforming to a well-developed experimental design (i.e. infusion studies) • ‘Dashboard’ display will form the central focal point and be based on a summary of designs, the evolution of these designs and the presence of data within them • Proposed additional features will include: - data visualization - structured coordination with professional curators - inter-experiment querying
Thank You *image by wasabicube