Unraveling the Holy Grail is Too Much for the Human Brain . . . We need Bioinformatics!! Judith A. Kjelstrom, PhD Associate Director Biotechnology Program University of California, Davis
UCD is a Leader in Biology Research • Member of the prestigious Association of American Universities (only 62 members) • Ranked in the top 11 research institutions in the U.S. • Ranked 7th for graduate programs among public universities • Nationally ranked programs in: 1) Agriculture & Plant Biology; 2) Veterinary Medicine & Animal Science; 3) Medicine; 4) Toxicology & Environmental Sciences; 5) Evolution & Ecology; 6) Microbiology; 7) Molecular & Cell Biology . . . & more
UC Davis Biotechnology Program (sponsored program of OVCR) est. 1986 Martina McGloughlin (Director) Judith A. Kjelstrom (Associate Director) • Web site: http://www.biotech.ucdavis.edu • DEB (Designated Emphasis in Biotechnology) - PhD program • ADP (Advanced Degree Program for Corporate Employees) - PhD program • Short Courses, Seminars and Conferences: Graduate level; “State of the Art” focus. Link Industry with Academia • Training Grants: NIH and NSF sponsored • Outreach Education: K-12 & Community Colleges, Industry Partners, International Visitors, News Service, etc.
Life Sciences Informatics Program University of California Systemwide Industry-University Cooperative Research Program Funding Opportunities! Bioinformatics, Medical Informatics, Food and Agricultural Informatics, & Environmental Informatics http://lsi.ucdavis.edu Martina McGloughlin, Director Gussie Curran, Associate Director
“The two technologies that will shape the next century are biotechnology and information technology” • Bill Gates (Microsoft) • “The two technologies that will have the greatest impact on each other in the new millennium are biotechnology and information technology” • Martina McGloughlin
The Human Genome Project is nearly complete! • What does it all mean? • How can I store all this genetic code (>3 billion bases)? • How can I access related databases? • What do all these genes do? • Are there related genes in other life forms? Biologists need Help from Computer Scientists and Mathematicians!
Good Times for DNA Sequencers • Over 30 genomes have been sequenced and over 100 are in progress • E. coli & other bacteria, Yeast, Fruit Fly, Worm (C. elegans), Arabidopsis (mustard family), mouse and man (Homo sapiens) • The Human Genome Project (headed by Francis Collins) and Celera (headed by Craig Venter) succeeded in getting a rough draft this summer. • >3 billion base pairs = ~40,000-100,000 genes! Lots of Data to Store and Analyze
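To give a feel for "Lots of Data to Store", here is a back-of-the-envelope sketch of the raw sequence size. It assumes exactly 3 billion bases (the lower bound quoted on the slide) and two illustrative encodings that are not from the talk: one ASCII character per base, and a packed 2-bit code for A, C, G and T. The function name `storage_bytes` is hypothetical.

```python
# Rough storage estimate for a human-sized genome.
# Assumption: 3 billion bases, the lower bound quoted on the slide.
GENOME_BASES = 3_000_000_000


def storage_bytes(num_bases: int, bits_per_base: int) -> int:
    """Return the bytes needed to store num_bases at a fixed bit width."""
    return (num_bases * bits_per_base + 7) // 8  # round up to whole bytes


ascii_bytes = storage_bytes(GENOME_BASES, 8)   # one ASCII character per base
packed_bytes = storage_bytes(GENOME_BASES, 2)  # A, C, G, T packed into 2 bits

print(f"ASCII text : {ascii_bytes / 1e9:.2f} GB")   # 3.00 GB
print(f"2-bit pack : {packed_bytes / 1e9:.2f} GB")  # 0.75 GB
```

The bare sequence is modest in size. Under these assumptions, the storage challenge the talk describes comes mostly from annotation, cross-references and experimental data layered on top of it.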
But, there are Problems with the HGP . . . • The actual sequence data makes up only 16% of the content; the other 84% is annotation. • This leads to a number of issues: • How to structure databases for mining • How to establish controlled vocabularies to ensure the integrity of searches • What new algorithms are needed to facilitate processing and correlating the petabytes (10^15 bytes) of information • How can protein function be extracted for the purposes of diagnostics, drug discovery and therapeutics
Systems Will Need a Scalable Infrastructure “Biology evolves faster than computer science or technology, Which is a Scary Truth Indeed!” Tom Slezak Bioinformaticist LLNL
Central Dogma of Watson & Crick • DNA → (Transcription) → RNA → (Translation) → Proteins → Circuits → Phenotypes → Populations • The fundamental dogma of molecular biology is that genes act to create phenotypes through a flow of information from DNA to RNA to proteins, to interactions among proteins (regulatory circuits and metabolic pathways), and ultimately to phenotypes (the living being). Groups of individual phenotypes constitute a population.
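The first two steps of this information flow, transcription and translation, can be sketched in a few lines. This is a simplified illustration, not a tool from the talk: it assumes the input is the coding (sense) strand, so transcription reduces to replacing T with U, and it uses the standard genetic code. The names `transcribe`, `translate` and `CODON_TABLE`, and the example sequence, are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G codon order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def transcribe(coding_dna: str) -> str:
    """DNA -> mRNA. Assumes the coding strand is given, so T simply becomes U."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """mRNA -> protein, reading codons from position 0 until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino = CODON_TABLE[mrna[i:i + 3]]
        if amino == "*":  # "*" marks a stop codon
            break
        protein.append(amino)
    return "".join(protein)


mrna = transcribe("ATGGCTTTTTAA")
print(mrna)             # AUGGCUUUUUAA
print(translate(mrna))  # MAF
```

The later steps on the slide (circuits, phenotypes, populations) have no comparably simple lookup table, which is part of why the talk argues for computational help.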
Fundamental Dogma for 2000 • DNA: GenBank, EMBL, DDBJ, Map Databases • RNA: Development? Gene Expression? • Proteins: SwissPROT, PIR, PDB • Circuits: Metabolism? Regulatory Pathways? • Phenotypes: Neuroanatomy? Clinical Data? • Populations: Biodiversity? Molecular Epidemiology? Comparative Genomics? • Although a few databases already exist to distribute molecular information, the post-genomic era will need many more to collect, manage, and publish the coming flood of new findings. If this extension covers functional genomics, then “functional genomics” is equivalent to biology.
Genomics, Proteomics and Bioinformatics • Genomics = investigations into the structure and function of large numbers of genes (the whole genome of an organism) undertaken in a simultaneous fashion. • Structural genomics = the genetic mapping, physical mapping and sequencing of entire genomes. • Comparative genomics = information gained in one organism can have application in other, even distantly related, organisms. This enables the application of information gained from model systems to agricultural and medical problems. • Functional genomics = Phenotype (the living being). The organization and control of genetic pathways that make up the physiology of the organism. • Genome sequencing for most organisms of interest will be complete within the near future, ushering in the so-called “Post-Genome era”
A Paradigm Shift in 2000 • OLD: One gene, one protein and one function • PROTEOMICS IS NOW THE IMPORTANT THING . . . The study of all temporal and spatial aspects of gene expression (proteins) • NEW: Proteome Approach for Functional Studies • database mining/bioinformatics • subtractive procedure • (1) the global protein expression patterns of a cell in different states are compared • (2) changes in the protein expression pattern are observed • (3) up-regulated or down-regulated proteins are quantitatively evaluated • (4) proteome analysis is a highly sensitive monitor for complex metabolic and regulatory relationships of proteins
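Steps (1) to (3) of the subtractive procedure can be sketched as a comparison of protein abundances between two cell states. This is a minimal illustration under stated assumptions, not the method used in any particular proteomics pipeline: abundances are positive numbers, change is measured as a log2 fold change, and a two-fold cutoff (`threshold=1.0`) decides what counts as up- or down-regulated. The protein names and values are invented.

```python
import math


def classify_changes(state_a, state_b, threshold=1.0):
    """Label each shared protein by its log2 fold change from state_a to state_b.

    Assumes all abundances are positive; threshold=1.0 means a two-fold change.
    """
    result = {}
    for protein in sorted(state_a.keys() & state_b.keys()):
        log2_fc = math.log2(state_b[protein] / state_a[protein])
        if log2_fc >= threshold:
            label = "up"
        elif log2_fc <= -threshold:
            label = "down"
        else:
            label = "unchanged"
        result[protein] = (round(log2_fc, 2), label)
    return result


# Hypothetical abundances for three proteins in two states of the same cell.
resting = {"P1": 100.0, "P2": 80.0, "P3": 50.0}
stimulated = {"P1": 400.0, "P2": 20.0, "P3": 55.0}

for protein, (fc, label) in classify_changes(resting, stimulated).items():
    print(protein, fc, label)
```

Real gel or mass spectrometry data would also need normalization and replicate statistics before a cutoff like this is meaningful.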
Genomics, Proteomics and Bioinformatics • Bioinformatics: “Hot Area” • Computational or algorithmic approaches to the production of information from large amounts of biological data. • Unquestionably, it will be an essential component of all research activities utilizing structural and functional genomics approaches used in molecular biology, agricultural & environmental sciences and medicine. • The emergence of “In Silico” Biology • Needs multi-disciplinary teams of: 1. Biologists and Chemists 2. Software & Hardware Engineers 3. Computer Scientists 4. Mathematicians 5. Laboratory Technicians
Bioinformatics - Two Views • USERS: of Information, of Tools, of Instrumentation; In-Silico Modeling • INTERPRETERS: of Information • DEVELOPERS*: of Information, of Tools, of Instrumentation, of Architecture/Storage; Algorithms, Modeling Strategies, Visualization • *These people are in highest demand (per Pete Smietana, VP, Lumicyte)
Biological Data comes in many Forms: • DNA sequence with SNPs • mRNA (expressed sequences as cDNA) • Proteins • electrophoretic gel patterns • mass spectrometry patterns • amino acid sequence • models of tertiary structures • Cells/Tissues • in situ hybridization • antibody staining • X-ray or MRI images • Microarray spots on DNA or protein chips
The Future of Biotechnology • Advances in sequencing and genome analysis and in the associated information technology will accelerate the discovery and characterization of genes having potential utility for: • crop and livestock improvement or enhancement, • medical applications and • improving human health. Microarrays (DNA chips) representing thousands of individual genes allow very high throughput analysis of genes and gene expression patterns.
TGT AAT AGT TAT ATT TTC ATT ATA AAT TGT GTT TGT AGA CAT CAT AAA TTT AAA ACA TGG CTT TTT AAC CTG ATA AAT CCT ACG AAT ATT TGT AAT AGT TAT GTT ATT GCA GTA AGT ACC GTT TGT ATT ATA AAT TGT GTT CTG TGT AAT AGT TAT ATT TTC ATT ATA AAT TGT GTT TGT AGA CAT CAT AAA TTT AAA ACA TGG CTT TTT AAC CTG ATA AAT CCT ACG AAT ATT TGT AAT AGT TAT GTT ATT GCA GTA AGT ACC GTT TGT ATT ATA AAT TGT GTT CTG Which genes are turned off then on ? Courtesy of Dr. Young Moo Lee
DNA Chip Technology • The color of the spot indicates which genes are being turned on. • Yellow = gene is on (in both conditions)
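The color rule can be sketched as a small classifier over the two fluorescence channels of a spot. The slide only states the yellow case. The red, green and dark cases, the background cutoff and the two-fold ratio below are assumptions based on the usual two-color microarray convention, and the intensity values are invented.

```python
def spot_color(red, green, background=200.0, ratio=2.0):
    """Classify one spot from its red and green channel intensities.

    Assumed convention: each channel is one condition; a channel counts as
    "on" above `background`; one color wins if it is `ratio` times brighter.
    """
    red_on = red > background
    green_on = green > background
    if not red_on and not green_on:
        return "dark"    # gene off in both conditions
    if red_on and (not green_on or red / green >= ratio):
        return "red"     # gene mainly on in the red-labelled condition
    if green_on and (not red_on or green / red >= ratio):
        return "green"   # gene mainly on in the green-labelled condition
    return "yellow"      # gene on in both conditions


print(spot_color(5000, 4800))  # yellow
print(spot_color(6000, 150))   # red
print(spot_color(100, 3000))   # green
print(spot_color(50, 60))      # dark
```

Applied to every spot on a chip, a rule like this turns an image into a table of gene states that can then be stored and mined.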
Comparative Genomics will speed the Discovery Process for New Drugs: Scientific American July 2000
Genomics Initiative • Many of the 25 new genomics faculty at UCD will belong to the UC Davis Genomics Center. • To be housed in a 6-story Genome and Biomedical Sciences Facility (2001 start date) • Goal: to establish the campus as an international leader in functional and comparative genomics. Robotics and high-throughput technologies will be used. • The center will include scientists from a multitude of disciplines: medicine, toxicology, pharmacology, biomedical engineering, agriculture, mathematics and the biological & physical sciences. • It will also include a group of bioinformatics faculty members who will provide the computational biology and informatics research.
Bioinformatics Courses at UCD (offered or in planning stages) • Bioinformatics, Functional Genomics and Proteomics - Summer Short Course (sponsored by the Biotechnology Program). Pete Smietana, Senior VP of Lumicyte, is Lead Instructor • Theory and Practice of Bioinformatics (ECS 124). Fundamental biological, mathematical and algorithmic models underlying bioinformatics. Dan Gusfield. Spring 2001 • String Algorithms and Computational Biology (ECS 224). Dan Gusfield. Fall 2001? • Introduction to Medical Informatics (MDI 210). Part of a graduate program. See catalog for curriculum. Fall 2000. Michael Hogarth.
Bioinformatics Courses at UCD (offered or in planning stages) • EVE/NEM 210 (New Course under review): will acquaint students with modern phylogenetic methods used to analyze sequence data. Mike Sanderson and Steve Nadler • Horizontal Gene Transfer (Genetics 210): In addition to biological mechanisms, the student may use computational analysis of sequence databases. Fall 2000. Mike Syvanen and Clarence Kado • Seminar in Molecular Genetics (Genetics 295). Graduate seminar in population and evolutionary biology. A guide to obtaining biological information off the web. Sequence analysis is a major focus. Fall 2000. Mike Syvanen and Craig Warden
Bioinformatics Courses at UCD (offered or in planning stages) • Bioinformatics: Nucleic Acid and Protein Sequence Analysis (MCB 298). Lab section was taught by Drs. David Deerfield and Hugh Nicholas from the Pittsburgh Supercomputing Center. Fall 1999. Clark Lagarias and Grace Rosenquist.
Job Opportunities in Bioinformatics are Numerous • Users & Developers are in high demand • The Major Players in Bioinformatics (Sci. Am., July 2000) • DoubleTwist (Oakland, CA) • InforMax (Bethesda, MD) • NetGenics (Cleveland, OH) • Lion Bioscience (Germany) • Oxford Molecular Group (England) • Compugen (Israel)
Readings on Genomics & Bioinformatics • Bioinformatics: A practical guide to the analysis of genes and proteins. Andreas D. Baxevanis and B.F. Francis Ouellette (eds.). Wiley Interscience. 1998. ISBN 0-471-19196-5 • It’s Sink or Swim as a Tidal Wave of Data Approaches. Nature. June 10, 1999. Pg 517- • The Human Genome Business (a special report). Scientific American. July 2000. • Interview with Stuart Kauffman on Bioinformatics. Scientific American. June 5, 2000. • Functional Genomics. Nature Insight. Reprinted from vol 405, no 6788. June 15, 2000.
Questions???? • Please feel free to contact me later • jakjelstrom@ucdavis.edu • (530) 752-8228 • THANK YOU FOR INVITING ME TO SPEAK