BIG DATA A Life Sciences Perspective. Scott Novogoratz, CIO College of Veterinary Medicine & Biomedical Sciences. Infectious Disease Research Center.
BIG DATA: A Life Sciences Perspective
Scott Novogoratz, CIO, College of Veterinary Medicine & Biomedical Sciences
Infectious Disease Research Center
Among the world's leaders in researching West Nile Virus, drug-resistant Tuberculosis, Yellow Fever, Dengue, Hantavirus, Plague, Tularemia, and other zoonotic and human diseases.
Radiological Cancer Treatment: Heavy Ion Therapy
“BIG DATA is data that exceeds the processing capacity of conventional database systems.”
Edd Dumbill, Editor in Chief, Big Data
Omics & Ologies: Life Sciences BIG DATA
Omics
• Genomics
• Transcriptomics
• Proteomics
• Metabolomics
• Metagenomics
BIG DATA Devices
• Gene sequencing
• Mass spectrometry
• Imaging
• Microarrays
• Liquid chromatography
Ology(ies)
• Radiology
• Gastroenterology
• Cardiology
• Pathology
Medical Imaging BIG DATA Demands
Increases due to:
• Avg. Size/Study
• More Digitized Data
• Pathology
• Endoscopy
• Pictures
• More Imaging Procedures
Radiographs: Canine Hip Dysplasia
Endoscopy: Canine Duodenum Endoscopy Procedure
How Big is a Genome?
• E. coli: 4 Million Base Pairs
• Human: 3 Billion Base Pairs
• Paris japonica: 152 Billion Base Pairs
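To put these base-pair counts in storage terms, here is a rough back-of-the-envelope sketch. It assumes the most compact plain encoding, 2 bits per base (A, C, G, T only), and decimal units; the function names are illustrative and not from the presentation. Real sequencing output (reads, quality scores, repeated coverage) is typically far larger than one packed copy of the sequence.

```python
# Base-pair counts as given on the slide.
GENOME_BASE_PAIRS = {
    "E. coli": 4_000_000,
    "Human": 3_000_000_000,
    "Paris japonica": 152_000_000_000,
}

# Assumption: 4 possible bases, so 2 bits are enough for each one.
BITS_PER_BASE = 2


def packed_size_bytes(base_pairs: int, bits_per_base: int = BITS_PER_BASE) -> int:
    """Bytes needed to store one copy of the sequence, rounded up."""
    return (base_pairs * bits_per_base + 7) // 8


def human_readable(n_bytes: int) -> str:
    """Format a byte count using decimal units (1 KB = 1000 B)."""
    units = ["B", "KB", "MB", "GB", "TB"]
    size = float(n_bytes)
    idx = 0
    while size >= 1000 and idx < len(units) - 1:
        size /= 1000
        idx += 1
    return f"{size:.1f} {units[idx]}"


for organism, base_pairs in GENOME_BASE_PAIRS.items():
    print(f"{organism}: {human_readable(packed_size_bytes(base_pairs))}")
```

Under these assumptions the three genomes come to roughly 1 MB, 750 MB, and 38 GB respectively.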
The scale of biological data is increasing exponentially, with sequencing technologies now producing data at a rate exceeding the growth in computing power predicted by Moore’s Law (10,000-fold improvement in sequencing vs. 16-fold improvement in computing).
From the Big Data article Unraveling the Complexities, Higdon et al.
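One way to read the 10,000-fold vs. 16-fold comparison is in terms of doublings. The short sketch below only re-expresses the two quoted figures; the variable and function names are illustrative, and no time period is assumed because the slide does not state one.

```python
import math


def doublings(fold_improvement: float) -> float:
    """Number of successive doublings that produce the given fold improvement."""
    return math.log2(fold_improvement)


# Figures quoted on the slide.
sequencing_fold = 10_000
computing_fold = 16

sequencing_doublings = doublings(sequencing_fold)  # about 13.3
computing_doublings = doublings(computing_fold)    # 4.0

# How far sequencing output has pulled ahead of computing power.
gap_factor = sequencing_fold / computing_fold

print(f"Sequencing: {sequencing_doublings:.1f} doublings")
print(f"Computing:  {computing_doublings:.1f} doublings")
print(f"Gap:        {gap_factor:.0f}x")
```

Over the same period, sequencing went through roughly 13 doublings while computing went through 4, which leaves a 625-fold gap for storage and analysis infrastructure to absorb.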
What Do Life Science Researchers Want?
• Reliable Data
• Statistically Valid Results
• Analysis Tools with User-Friendly Interfaces
• Transparent Reporting of Results
• Ability to Share Data
From a U of Washington study to assess the data & analysis needs of Life Scientists
Relative Importance for the Life Sciences
• Volume
• Veracity
• Velocity
• Variety
• Value
Conclusions
• Recognize that BIG DATA storage issues differ based on the purpose and use of the data
• Maximize the value of biological research by improving the capability to store, catalog, share, and compare research through:
  • Low-cost, shared storage mechanisms
  • Universal and easy-to-use tools that let researchers compare their findings with libraries of information