Oklahoma and Kansas BRICNET: a Bioinformatics Research Inspired Cyber NETwork

Oklahoma and Kansas BRICNET: a Bioinformatics Research Inspired Cyber NETwork. Presenter: Rakesh Kaundal, Oklahoma State University. Oklahoma EPSCoR Track II, Oklahoma State Regents for Higher Education Office, Oklahoma City, Wednesday, November 16, 2011.


Presentation Transcript


  1. Oklahoma and Kansas BRICNET: a Bioinformatics Research Inspired Cyber NETwork. Presenter: Rakesh Kaundal, Oklahoma State University. Oklahoma EPSCoR Track II, Oklahoma State Regents for Higher Education Office, Oklahoma City, Wednesday, November 16, 2011

  2. Oklahoma EPSCoR CI (Status and Goals) • RII Track-1: Building Oklahoma’s Leadership Role in Cellulosic Bioenergy • RII Track-2: A cyberCommons for Ecological Forecasting (OK, KS collaboration) • RII Cyber Connectivity (C2): Enhancement of inter-campus and intra-campus cyber connectivity and broadband access within an EPSCoR jurisdiction: Oklahoma Optical Initiative • Oklahoma Cyberinfrastructure Initiative (OCII): MoU between OU & OSU, OneNet connectivity • NSF Major Research Instrumentation (MRI) • Deploying Oklahoma PetaStore (PI: H. Neeman, OU) • New HPC cluster COWBOY (PI: D. Brunson, OSU)

  3. Existing Cyberinfrastructure: Many universities/institutes with intersecting bioinformatics needs • Education: OK universities • Biology, Computational Science, Engineering & other STEM: faculty at most OK schools • High Performance Computing: HPCC (OSU), OSCER (OU), TCS (TU); faculty at OU, OSU, TU • High Performance Networking: OneNet, NLR, Internet2 • Sensor Networks: Mesonet; CASA; CyberCommons, Eddy Flux towers, Biosensor projects; faculty at OU, OSU, TU • Grid Computing: OSCER (Computing), OCHEP (Physics), LEAD (Meteorology); BRICNET (Bioinformatics)? OK-BioGRID (Bioinformatics)?

  4. Bioinformatics: Why it’s useful… • All of the information needed to build an organism is contained in its DNA. If we could understand it, we would know how life works. • Preventing and curing diseases like cancer (which is caused by mutations in DNA) and inherited diseases. • Curing infectious diseases (everything from AIDS and malaria to the common cold). If we understand how a microorganism works, we can figure out how to block it. • Understanding genetic and evolutionary relationships between species. • Understanding genetic relationships between humans; projects exist to understand human genetic diversity. • Similarly, other eukaryotes are being sequenced, including plants, e.g. to understand plant diseases and their tolerance under stress conditions. • Prokaryotes, metagenome sequencing… (Image labels: Susceptible, Resistant, Abiotic stress)

  5. Bioinformatics: Why it’s useful… Complete Understanding of a System (Image courtesy: Center for Biological Sequence Analysis, DTU, Denmark)

  6. Why Bioinformatics CI is needed: The sequencing pace… • Nucleic acid sequences • GenBank (April 2011) http://www.ncbi.nlm.nih.gov/genbank/ • 126,551,501,141 bases in 135,440,924 sequence records in the traditional GenBank divisions • 191,401,393,188 bases in 62,715,288 sequence records in the Whole Genome Shotgun (WGS) division • Entire genomes • GOLD Release V.2 (Oct 2011) contains ~2000 completely sequenced genomes. • http://www.genomesonline.org/gold_statistics.htm • Protein sequences • Essentially obtained by translation of putative genes in nucleic acid sequences (almost no direct protein sequencing). • UniProtKB/TrEMBL (2011) contains 17 million protein sequences. • http://www.ebi.ac.uk/swissprot/sptr_stats/index.html
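
Slide 6 notes that protein sequences are essentially obtained by translating putative genes in nucleic acid sequences. As a minimal sketch of that step, the Python below translates a DNA coding sequence with the standard genetic code. The function name `translate`, the use of `X` for unrecognized codons, and the example sequences are illustrative assumptions; the slides do not describe any particular tool or pipeline.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence, reading frame 0, up to the first stop.

    Codons containing characters other than A, C, G, T are reported as 'X'
    (an assumption made for this sketch). Trailing bases that do not form a
    full codon are ignored.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[i:i + 3], "X")
        if amino_acid == "*":  # stop codon ends the protein
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("ATGGCCTAA"))
```

Real gene prediction also has to locate the putative genes and choose the reading frame and strand; this sketch covers only the final codon-to-amino-acid lookup.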

  7. Data Explosion! Biological data production is in the terabytes and increasing every day…

  8. Multidisciplinarity: Bioinformatics & Computational Biology • molecular biology • genomics • mathematics • genetics • statistics • biochemistry • numerical analysis • biophysics • algorithmics • evolution • image analysis • data management

  9. What is Cyberinfrastructure? Resources and capabilities that enable high-end computing & communications for science, engineering and technology that solve real-world problems. • Education: Postsecondary, K-12 • High Performance Computing • High Performance Networking • Computational Science & Engineering • Grid Computing • Scientific Visualization • Sensor Networks • Shared Instruments • Shared Databases

  10. What is CI for Bioinformatics? Resources and capabilities that enable high-end computing & communications for bioinformatics & computational biology that solve real biological problems. • Education: Postsecondary, K-12 • High Performance Computing • High Performance Networking • Computational Science & Engineering • Grid Computing (BioinfoGRID) • Scientific Visualization (gene regulatory networks, host-pathogen) • Sensor Networks (disease prediction) • Shared Instruments • Shared Databases

  11. BRICNET Collaborative Cyber-enabled Scientific Themes (relevant to Oklahoma and of national importance)

  12. Global Warming and Rhizosphere Metagenomics • The Role of Microbes in Maintaining Atmospheric Carbon Dioxide Balances

  13. Global Warming and the Rhizosphere Community • The rhizosphere, the region of soil immediately surrounding the plant root, is directly influenced by root exudates and associated microorganisms. • Because CO2 plays an important role in global climate change, understanding rhizosphere microbial community dynamics is fundamental to plant health and the fate of carbon.

  14. Switchgrass Rhizosphere Community Metagenome (leverage current EPSCoR Track 1 on Cellulosic Bioenergy) • We will grow switchgrass under varied CO2 levels in controlled environments to measure impacts on microbial diversity and functional capacity using metagenome analysis. • Massive sequence information will be generated. Bioinformatics tools will be developed; these are critical for analysis and comparison of sequence data.
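
Slide 14 proposes measuring impacts on microbial diversity from metagenome data. One common way to summarize diversity is the Shannon index, H = -Σ p_i ln p_i, computed over the relative abundance p_i of each taxon. The sketch below assumes that metric for illustration; the slides do not say which diversity measure the project would use, and the taxon names and read counts are made up.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over taxa with nonzero counts.

    `counts` maps a taxon label to its number of assigned reads.
    """
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("at least one read is required")
    h = 0.0
    for n in counts.values():
        if n > 0:
            p = n / total
            h -= p * math.log(p)
    return h


if __name__ == "__main__":
    # Hypothetical read counts per taxon under two CO2 treatments.
    ambient_co2 = {"taxon_A": 400, "taxon_B": 350, "taxon_C": 250}
    elevated_co2 = {"taxon_A": 800, "taxon_B": 150, "taxon_C": 50}
    print(shannon_index(ambient_co2), shannon_index(elevated_co2))
```

A community dominated by one taxon scores lower than an evenly mixed one, so comparing the index across CO2 treatments gives a single-number view of a diversity shift.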

  15. Bioinformatics for Gene – Phenotype: BRICNET for Bioenergy (Diagram labels: GENE LEVEL; Phenotype level; Process Level; Pathway Level; Quantitative Genetics; Marker-assisted selection; CROP PHENOTYPE; QTL; Marker Technology; MOLECULAR MARKERS; DNA SEQUENCES; G X E M X; Molecular Genetics; CROP MODEL; Functional Genomics; GENE FUNCTION AND NETWORK; CELL MODEL; TRANSCRIPTS, PROTEINS, METABOLITES; Phenomics; GENOME, CELL, CROP)

  16. Gaps in current CI (Example: current Track 1 on Bioenergy) DNA sequence → Phenotype • Outsourcing of Data Analysis • Sequence, Microarray, Genomic-SSRs, EST-SSRs, miRNA analysis • miRNA/siRNA: 20-30 million reads • Washington University, St. Louis; China • Sequence processing and analysis • Switchgrass ESTs: 1 million reads (454 sequencing) • Oregon University; Danforth Center, St. Louis (MO) • Assembly and annotation • Lack of infrastructure / trained bioinformatics personnel

  17. Gaps in current CI DNA sequence → Phenotype • Outsourcing of Data Analysis • Metagenomics • Data assembly, classification, annotation: • Pittsburgh Supercomputing Center • Current resources are overwhelmed • Lack of integration of data streams from several disciplines • Time lag: weeks to months, and queuing at peak demand

  18. Bioinformatics for Biosecurity: A bioinformatics approach to understand host-pathogen interactions • Effectors: Bacterial proteins that are injected into the host cell through a type III secretion system to manipulate host cells. • Ralstonia solanacearum is one of the world’s most important plant pathogens; it has a very wide host range and causes significant losses to agriculture. • Burkholderia pseudomallei is a broad-host-range bacterium that causes melioidosis in humans and various livestock. Interestingly, it can also infect a few plant species. • Goals: Use a bioinformatics (machine learning) approach to identify host proteins that could potentially interact with bacterial effectors. Experimentally validate the potential interactions using techniques like yeast two-hybrid analyses and bimolecular fluorescence complementation.
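
Slide 18 proposes a machine learning approach to identify host proteins that may interact with bacterial effectors, without naming features or a model. As a rough sketch of what the input to such a classifier could look like, the code below encodes a host/effector protein pair as amino acid composition vectors. The feature choice, the function names `aa_composition` and `pair_features`, and the example sequences are all assumptions for illustration, not the presenters' method.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so vectors are comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_composition(protein):
    """Return the fraction of each standard amino acid in `protein`.

    Characters outside the 20 standard residues are ignored.
    """
    counts = Counter(ch for ch in protein.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def pair_features(host_protein, effector_protein):
    """Concatenate host and effector compositions into one 40-value vector."""
    return aa_composition(host_protein) + aa_composition(effector_protein)


if __name__ == "__main__":
    # Hypothetical short sequences, for illustration only.
    features = pair_features("MKVLAAGIVG", "MSDNRRPLLE")
    print(len(features))
```

Vectors like these, labeled with known interacting and non-interacting pairs, could then be passed to any standard classifier; candidate interactions would still need the experimental validation the slide describes.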

  19. Integration of Scientific Themes into CI Framework: Development of BCB resources for Research, Education and OutREACH (Diagram labels: BRICNET Collaborative; BRICNET; OK BioGRID; OSUHPCC; OSCER; SRNF; cyberCommons; CYBER-ENABLED RESEARCH THEMES; Disease Forecasting; Bioenergy; Biosecurity; Biomedicine; Systems Biology; Sensor Networks (Micro Climate); Eddy Flux (Macro climate); Microbiome sequencing; Sequence Data (ACGT, OU); Sequence Data (OMRF); Sequence Data; DELIVERABLES; Decision Support Tools; Storage Server; Software; Algorithms; Visualization Tools; Servers; Databases; OUTREACH; Researchers; End Users; App Developers; Community; Higher Ed.)

  20. BRICNET Participants (Oklahoma) • University of Oklahoma (8): Botany: L.E. Bartley; Chemistry & Biochemistry: B.A. Roe, F. Najar, S.W. Clifton; Computer Information Sciences: H. Neeman; Microbiology: T. Conway, J. Grissom; Oklahoma Climatological Survey: J.B. Basara • Samuel Roberts Noble Foundation (2): Bioinformatics: P.X. Zhao; Plant Biology: K.S. Mysore • Oklahoma Medical Research Foundation (1): Clinical Immunology: J.D. Wren • Cameron University (1): Biology: L. Peal • Oklahoma State U, Stillwater (15): Biochemistry: R. Kaundal, U. Melcher, P. Hoyt, M. Mahalingam; Botany: M. Palmer; Computer Science: S. Kak; Industrial Engineering: B. Balasundaram, S. Bukkapatnam; Information Technology: D. Brunson; Microbiology: B. Fathepure; Plant Pathology: J. Fletcher, S. Marek; Plant & Soil Sciences: G. Kakani, M. Anderson; Statistics: M. Payton • Oklahoma State U, Tulsa (1): Center for Health Sciences: R. Kaul • Langston U (3): Biology: K.J. Abraham, G. Naidoo; Computer Information Sciences: P.F. Tiako • Oklahoma City U (1): Computer Science: K. Sha

  21. OUTREACH: Develop BioREACH program • O-U-T-R-E-A-C-H: Overcome, Understanding, Training, Researchers, End users, App developers, Community, Higher education

  22. Summer Schools • Summer 2013, 2014, 2015 • 1 week @ each participating institute (in rotation) • Lectures plus hands-on exercises for students • Students of differing backgrounds (Bio + CS), minorities • Reaching a wider audience • Lectures, exercises, video, on the web • More tutorials, 3-4/year • Students, postdocs, scientists • Agency-specific tutorials

  23. Summary: CI for Bioinformatics • Enables large, lasting improvements in education, research, intrastate collaboration and economic development across Oklahoma • Leverages existing resources across Oklahoma • Aligns with objectives of individual researchers, teams, the state, other OK RII themes, and the NSF • Spreads the capabilities without diluting the focus • Lots of NSF funding opportunities

  24. Thank You for your Attention! BRICNET?
