Biocuration is the process of collecting, organizing, and annotating biological data from various sources to make it accessible and understandable for further analysis by researchers. https://www.elucidata.io/book-a-demo
Biocuration: Breaking Barriers in the Use of Biomedical Data
Data is Available, But Not Usable
• Life sciences R&D relies on the 2 trillion GB of data generated every year.
• Public data and databases are available but not usable.
• In-house data, often caught in team-level silos, has low interoperability and reusability.
Drug Discovery Initiatives Need High-Quality Data
"The value is in the data, it is not in the tools. That is the one thing, it’s a bit of a hobby horse for me. One thing I always point to in these discussions around data, don’t underestimate the amount of time and value in doing what is really often difficult and not so rewarding directly work, like cleaning data sets isn’t always fun, but it is often the most valuable thing you can do." - Dr. Jeffrey Reid, Regeneron's Chief Data Officer
• Public Data (scRNA-Seq)
• In-house Experiments (scRNA-Seq)
• Metadata Files (csv, txt)
Getting to This High-Quality Data Pool is Not Trivial
80% of time: source and prepare high-quality data
• Determine relevance: 30 mins per dataset
• Download data files: 60 mins per dataset
• Process raw data: 8 hours per dataset
• QC and curate metadata: 8-16 hours per dataset
• Link data and metadata files: 1-2 hours per dataset
20% of time: analysis, once the data is ready
A scientist can spend anywhere from days to weeks per month getting their data ready for analysis.
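To make the per-dataset cost concrete, the step times on this slide can be added up. The sketch below is only illustrative: it assumes the five durations pair with the five preparation steps in the order they appear on the slide, and the names PREP_STEPS and total_prep_hours are made up for this example.

```python
# Per-dataset preparation times in hours as (low, high) ranges.
# Assumption: durations are paired with steps in slide order.
PREP_STEPS = {
    "Determine relevance": (0.5, 0.5),
    "Download data files": (1.0, 1.0),
    "Process raw data": (8.0, 8.0),
    "QC and curate metadata": (8.0, 16.0),
    "Link data and metadata files": (1.0, 2.0),
}


def total_prep_hours(steps):
    """Sum the low and high ends of every step's time range."""
    low = sum(lo for lo, _ in steps.values())
    high = sum(hi for _, hi in steps.values())
    return low, high


low, high = total_prep_hours(PREP_STEPS)
print(f"Preparation per dataset: {low}-{high} hours")
```

Under that pairing, preparation comes to roughly 18.5 to 27.5 hours per dataset, which is consistent with the slide's claim that a handful of datasets can consume days to weeks of a scientist's month.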
At Elucidata, we’re flipping this 80-20 ratio by building technology to harmonize biomedical data and make it ML-ready.
Elucidata’s Biocuration Platform: Polly
Making Semi-structured Biomedical Data ML-Ready
Data Sources → Polly Harmonization Engine → Polly Harmonized Data Stored on your Atlas
Biocuration Workflow
• Data Acquisition: Data collection is simplified via API or GUI upload to the curation infrastructure.
• Automated Metadata Curation using LLMs: Model-assisted curation leads to higher throughput and the ability to scale to hundreds of curators.
• Manual Curation of Custom Fields: Human-in-the-loop review allows for additional custom curation and extensive QC checks.
• Stream Harmonized Data: Elucidata’s data model is data-type agnostic; data from disparate sources is made interoperable.
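The idea of automated curation followed by human review can be sketched in a few lines. This is a generic illustration, not Polly's API or Elucidata's data model: the synonym table stands in for the model-assisted step, and FIELD_SYNONYMS, harmonize_record, and the sample record are all hypothetical.

```python
# Hypothetical synonym table standing in for a model-assisted curation step.
FIELD_SYNONYMS = {
    "organism": {"organism", "species", "org"},
    "tissue": {"tissue", "tissue_type", "source_tissue"},
    "disease": {"disease", "condition", "diagnosis"},
}


def harmonize_record(raw):
    """Map raw metadata keys to harmonized field names.

    Keys that cannot be mapped automatically are returned separately
    so that a human curator can review them.
    """
    harmonized = {}
    needs_review = {}
    for key, value in raw.items():
        normalized = key.strip().lower().replace(" ", "_")
        target = next(
            (field for field, names in FIELD_SYNONYMS.items() if normalized in names),
            None,
        )
        if target is None:
            needs_review[key] = value  # left for manual curation
        else:
            harmonized[target] = value
    return harmonized, needs_review


record = {"Species": "Homo sapiens", "Tissue Type": "liver", "batch": "B7"}
harmonized, needs_review = harmonize_record(record)
print(harmonized)
print(needs_review)
```

Here the automated pass resolves the fields it recognizes, while the unrecognized "batch" field is set aside for a curator, which is the division of labor the workflow describes.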
Impact: Millions of Datasets Harmonized in the Past 5 Years!
Harmonization with Polly
• 99% accurate, customizable & 10X faster than industry standard
• Multi-Omics, Bioassays supported
• Data delivered on a 360 Degree Platform (Polly), complete with APIs
• Allows public and in-house data integration
• On-going support for evolving data needs (Data Concierge)
Applications Powered
• Patient Stratification
• Biomarker Discovery
• Target Discovery, Validation & Qualification
• Data Management
• Knowledge Graphs
• Training Models
Reach out to us at info@elucidata.io or Book a Demo with us to learn more.