Meeting the Bioinformatics Challenges of Functional Genomics. VanBUG 11 September 2003. Acknowledgments. <johnq@tigr.org>. TIGR Human/Mouse/Arabidopsis Expression Team Emily Chen Bryan Frank Renee Gaspard Jeremy Hasseman Lara Linford Fenglong Liu Simon Kwong John Quackenbush
Meeting the Bioinformatics Challenges of Functional Genomics VanBUG 11 September 2003
Acknowledgments <johnq@tigr.org>
TIGR Human/Mouse/Arabidopsis Expression Team: Emily Chen, Bryan Frank, Renee Gaspard, Jeremy Hasseman, Lara Linford, Fenglong Liu, Simon Kwong, John Quackenbush, Shuibang Wang, Yonghong Wang, Ivana Yang, Yan Yu
Array Software Hit Team: Nirmal Bhagabati, John Braisted, Tracey Currier, Jerry Li, Wei Liang, John Quackenbush, Alexander I. Saeed, Vasily Sharov, Mathangi Thiagarajan, Joseph White
Assistant: Sue Mineo
The TIGR Gene Index Team: Foo Cheung, Svetlana Karamycheva, Yudan Lee, Babak Parvizi, Geo Pertea, Razvan Sultana, Jennifer Tsai, John Quackenbush, Joseph White
H. Lee Moffitt Center/USF: Timothy J. Yeatman, Greg Bloom
PGA Collaborators: Gary Churchill (TJL), Greg Evans (NHLBI), Harry Gavras (BU), Howard Jacob (MCW), Anne Kwitek (MCW), Allan Pack (Penn), Beverly Paigen (TJL), Luanne Peters (TJL), David Schwartz (Duke)
Emeritus: Jennifer Cho (TGI), Ingeborg Holt (TGI), Feng Liang (TGI), Kristie Abernathy (mA), Sonia Dharap (mA), Julie Earle-Hughes (mA), Cheryl Gay (mA), Priti Hegde (mA), Rong Qi (mA), Erik Snesrud (mA), Heenam Kim (mA)
TIGR PGA Collaborators: Norman Lee, Renae Malek, Hong-Ying Wang, Truong Luu, Bobby Behbahani
Funding provided by the Department of Energy and the National Science Foundation
Funding provided by the National Cancer Institute, the National Heart, Lung, and Blood Institute, and the National Science Foundation
TIGR Faculty, IT Group, and Staff
Acknowledgments <johnq@tigr.org> Thanks to Syntek, Inc. <http://www.syntek.com> for the GeneShaving MeV module and assistance with MyMADAM. Thanks to DataNaut, Inc. <http://www.datanaut.com> for the RelNet and Terrain Map modules and assistance with Client/Server MeV. <tm4@tigr.org>
Science is built with facts as a house is with stones – but a collection of facts is no more a science than a heap of stones is a house. – Jules Henri Poincaré
There are 10^11 stars in the galaxy. That used to be a huge number. But it's only a hundred billion. It's less than the national deficit! We used to call them astronomical numbers. Now we should call them economical numbers. - Richard Feynman, physicist, Nobel laureate (1918-1988)
Steps in the Process Select array elements and annotate them Build a database to manage stuff Print arrays and manage the lab Hybridize and analyze images; manage data Analyze hybridization data and get results
TIGR Gene Indices home page www.tigr.org/tdb/tgi ~60 species >16,000,000 sequences
TGICL Tools are available – with more coming Geo Pertea Razvan Sultana Valentin Antonescu Available with source
Gene Index Assembly process
• Inputs: ESTs from GenBank (dbEST), Expressed Transcripts (ET) from GenBank CDS, TIGR ESTs
• Remove vector, poly-A, adapter, mitochondrial and ribosomal sequence
• Reduce redundancy
• High-stringency pair-wise comparisons to build Clusters
• Each cluster is assembled to obtain Tentative Consensus sequences (TCs)
• Annotate TCs and release
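The clustering step can be pictured as single-linkage grouping: any two sequences that pass the high-stringency pair-wise comparison end up in the same cluster. The sketch below is only an illustration under that assumption, not the TGICL implementation. The function name `build_clusters`, the sequence IDs, and the idea that the matching pairs arrive precomputed are all hypothetical.

```python
def build_clusters(seq_ids, matching_pairs):
    """Group sequences into clusters with union-find (single linkage).

    seq_ids: iterable of sequence identifiers (hypothetical IDs).
    matching_pairs: pairs (a, b) that passed the pair-wise comparison;
                    how they are computed is outside this sketch.
    """
    parent = {s: s for s in seq_ids}

    def find(x):
        # Walk up to the cluster representative, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in matching_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    clusters = {}
    for s in seq_ids:
        clusters.setdefault(find(s), []).append(s)
    return sorted(sorted(members) for members in clusters.values())


ids = ["est1", "est2", "est3", "et1", "est4"]
pairs = [("est1", "est2"), ("est2", "et1")]
print(build_clusters(ids, pairs))
# [['est1', 'est2', 'et1'], ['est3'], ['est4']]
```

Each multi-member cluster would then go to an assembler to produce a TC; sequences left on their own remain singletons.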
GO Terms and EC Numbers Babak Parvizi
The TIGR Gene Indices <http://www.tigr.org/tdb/tgi> Dan Lee, Ingeborg Holt
Building TOGs: Reflexive, Transitive Closure And Paralogues Tentative Orthologues Thanks to Woytek Makałowski and Mark Boguski
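One way to read "reflexive, transitive closure" is: keep only reciprocal best hits between species, then merge every set of sequences linked through such hits into one group. The sketch below follows that reading, which is an assumption and may differ from how TOGs are actually built. The function name, the `best_hit` and `species_of` inputs, and the sequence IDs are hypothetical.

```python
from collections import defaultdict


def tentative_orthologue_groups(best_hit, species_of):
    """best_hit[seq][species] -> best-matching sequence of that species.
    species_of[seq] -> species the sequence belongs to.
    """
    # Reflexive step: keep a pair only if each is the other's best hit.
    neighbours = defaultdict(set)
    for a, hits in best_hit.items():
        for b in hits.values():
            if best_hit.get(b, {}).get(species_of[a]) == a:
                neighbours[a].add(b)
                neighbours[b].add(a)

    # Transitive closure: connected components of the reciprocal-hit graph.
    seen, groups = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


species_of = {"hs1": "human", "mm1": "mouse", "mm2": "mouse", "rn1": "rat"}
best_hit = {
    "hs1": {"mouse": "mm1", "rat": "rn1"},
    "mm1": {"human": "hs1", "rat": "rn1"},
    "rn1": {"human": "hs1", "mouse": "mm1"},
    "mm2": {"human": "hs1"},  # not reciprocal: hs1's best mouse hit is mm1
}
print(tentative_orthologue_groups(best_hit, species_of))
# [['hs1', 'mm1', 'rn1']]
```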
Gene Finding in Humans is easy! Razvan Sultana
Gene Finding in Humans is easy? Razvan Sultana
Gene Finding in Humans is difficult? Razvan Sultana
Gene Finding in Humans is difficult? A genome and its annotation is only a hypothesis that must be tested. Razvan Sultana
RESOURCERER Jennifer Tsai http://pga.tigr.org/tools.shtml
RESOURCERER: Using Genetic Markers Just added: Integrated QTLs
Steps in the Process Select array elements and annotate them Build a database to manage stuff Print arrays and manage the lab Hybridize and analyze images; manage data Analyze hybridization data and get results
SOPs are available PCR purification cDNA/template prep RNA labeling Printing Hybridization <http://pga.tigr.org/tools.shtml> Coming: Data QC SOP
What data should we collect? Nature Genetics 29, December 2001 MAGE-ML – XML-based data exchange format <http://www.mged.org> EVERYTHING
What’s Wrong with MIAME?
• MIAME was designed as a model for capturing information necessary to create public databases.
• MIAME-based databases lack LIMS capabilities, which are necessary for large-scale studies.
• We do not want to store images in our database for practical reasons – limited space.
• We needed to develop a variety of tools adapted to our existing infrastructure and legacy data and databases.
• Probes are labeled and applied to the arrays
• An “experiment” is a hybridization
• A “study” is a collection of hybridization experiments
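The terminology in the last three bullets maps naturally onto a small data model. The sketch below is illustrative only and is not the MAD schema: the class and field names are hypothetical, and keeping a file path in place of the image is one assumed way to honour the "no images in the database" constraint.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Probe:
    """A labeled sample applied to an array."""
    sample_name: str
    channel: str  # e.g. "red" or "green"


@dataclass
class Experiment:
    """One hybridization: labeled probes applied to one slide."""
    slide_id: str
    probes: List[Probe]
    image_path: str  # pointer to the scan on disk; the image itself is not stored


@dataclass
class Study:
    """A collection of hybridization experiments."""
    name: str
    experiments: List[Experiment] = field(default_factory=list)

    def add(self, experiment: Experiment) -> None:
        self.experiments.append(experiment)


study = Study("example study")
study.add(Experiment(
    slide_id="slide-001",
    probes=[Probe("control", "green"), Probe("test", "red")],
    image_path="scans/slide-001.tif",
))
print(len(study.experiments))  # 1
```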
Conceptual Schema: MAD (diagram; entities shown: Clone, Probe, Slide_type, Protocol, Primer_pair, Primer, New_plate, Slide, PCR, Study, Probe_source, Experiment, Expression, Expt_probe, Hyb, Spot, Gene, Scan, Analysis, Normalize)
MADAM: Microarray Data Manager Marie-Michelle Cordonnier-Pratt, UGA converted MySQL to Oracle and made MADAM work! Available with source and MySQL
Steps in the Process Select array elements and annotate them Build a database to manage stuff Print arrays and manage the lab Hybridize and analyze images; manage data Analyze hybridization data and get results
Microarray Overview I (diagram labels): Microbial ORFs + Design PCR Primers; Eukaryotic Genes, Select cDNA clones; PCR Products; Microtiter Plate; Many different plates containing different genes; For each plate set, many identical replicas; Microarray Slide (with 60,000 or more spotted genes)
PCR Scorer: Reads/loads primer data file to MAD and allows PCR data entry, and translation of 96 → 384. (Alex Saeed, developer and maintainer; enhancements: Wedge Smith) Selected Genes Primer Design Clone Selection Primer Synthesis PCR Amplification MAD Gel-based Scoring Microarray Overview
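The 96-to-384 translation can be illustrated with a common quadrant-interleaving convention, in which four 96-well plates fill alternating rows and columns of one 384-well plate. The slides do not say which mapping PCR Scorer uses, so the convention, the function name `well_96_to_384`, and the 0 to 3 quadrant numbering below are all assumptions.

```python
import string


def well_96_to_384(quadrant, well):
    """Map a 96-well position such as 'A1' onto a 384-well plate.

    quadrant: 0..3, which of the four source plates (assumed numbering).
    well: row letter A-H followed by column number 1-12.
    """
    if quadrant not in (0, 1, 2, 3):
        raise ValueError("quadrant must be 0, 1, 2 or 3")
    row = string.ascii_uppercase.index(well[0].upper())
    col = int(well[1:]) - 1
    if not (0 <= row < 8 and 0 <= col < 12):
        raise ValueError(f"not a 96-well position: {well}")

    # Interleave: each source well lands in a 2x2 block of the 384-well plate,
    # and the quadrant picks the position inside that block.
    row_384 = 2 * row + quadrant // 2
    col_384 = 2 * col + quadrant % 2
    return f"{string.ascii_uppercase[row_384]}{col_384 + 1}"


print(well_96_to_384(0, "A1"))   # A1
print(well_96_to_384(1, "A1"))   # A2
print(well_96_to_384(2, "A1"))   # B1
print(well_96_to_384(3, "H12"))  # P24
```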
The Beast: Microarray Robot from Intelligent Automation <http://www.ias.com>
Additional Software for Arrays: Scheduler. Microarray Scheduler: Allows scheduling of all instruments. Designed and maintained by Jerry Li. Available with source
SliTrack/Controller: Takes Slide Order and Run parameters, generates spot order, IAS control file, launches IAS run software, loads database. (J. Li, developer and maintainer) Amplified/Purified Genes Loaded in Arrayer Run Parameters Set Slides Printed MAD Microarray Overview
Steps in the Process Select array elements and annotate them Build a database to manage stuff Print arrays and manage the lab Hybridize and analyze images; manage data Analyze hybridization data and get results
Microarray Overview II Measure Fluorescence in 2 channels red/green Control Hybridize, Wash Analyze the data to identify patterns of gene expression Test Prepare Fluorescently Labeled Probes
Microarray Overview II Measure Fluorescence in 2 channels red/green Weed Control Hybridize, Wash Analyze the data to identify patterns of gene expression Test Bush Prepare Fluorescently Labeled Probes
Microarray Overview II Measure Fluorescence in 2 channels red/green Control Hybridize, Wash Obtain RNA Samples Analyze the data to identify differentially expressed genes Test Prepare Fluorescently Labeled Probes
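With two channels per spot, a simple first pass at "differentially expressed" is the log2 ratio of test to control intensity. The sketch below shows only that arithmetic. The function names, the sample intensities, and the two-fold cutoff are assumptions for illustration, and normalization, which a real analysis would need first, is left out.

```python
import math


def log2_ratios(spots):
    """spots: gene -> (test_intensity, control_intensity).

    Spots with a non-positive intensity in either channel are skipped,
    since their log ratio is undefined.
    """
    return {
        gene: math.log2(test / control)
        for gene, (test, control) in spots.items()
        if test > 0 and control > 0
    }


def differentially_expressed(spots, min_abs_log2=1.0):
    """Genes whose test/control ratio changes at least 2**min_abs_log2-fold."""
    ratios = log2_ratios(spots)
    return sorted(g for g, r in ratios.items() if abs(r) >= min_abs_log2)


spots = {
    "geneA": (4000.0, 1000.0),  # log2 ratio  2.0
    "geneB": (1000.0, 1000.0),  # log2 ratio  0.0
    "geneC": (500.0, 2000.0),   # log2 ratio -2.0
    "geneD": (0.0, 100.0),      # skipped: no signal in the test channel
}
print(differentially_expressed(spots))  # ['geneA', 'geneC']
```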
Microarray Overview MADAM: Allows data entry (J. Li & J. White, web prototype; A. Saeed, J. White, J. Li, & V. Sharov, developers) Control Test Hybridize, Wash Prepare Fluorescently Labeled Probes MAD Obtain RNA Samples