From the National Center for Biotechnology Information (NCBI), http://www.ncbi.nlm.nih.gov/ • "Understanding nature's mute but elegant language of living cells is the quest of modern molecular biology. • From an alphabet of only four letters representing the chemical subunits of DNA, emerges a syntax of life processes whose most complex expression is man. • The unraveling and use of this ‘alphabet’ to form new ‘words and phrases’ is a central focus of the field of molecular biology. The staggering volume of molecular data and its cryptic and subtle patterns have led to an absolute requirement for computerized databases and analysis tools. • The challenge is in finding new approaches to deal with the volume and complexity of data, and in providing researchers with better access to analysis and computing tools in order to advance understanding of our genetic legacy and its role in health and disease."
NCBI: Database Search and Retrieval Entrez • http://www.ncbi.nlm.nih.gov/Entrez/ It provides access to: • PubMed • Nucleotides: GenBank, dbEST • Proteins PDB • Structures • Genomes • PopSet • And more...
NCBI Tools for Data Mining: BLAST (most popular tool) • http://www.ncbi.nlm.nih.gov/BLAST/ • program for sequence similarity searching • instrumental in identifying genes and genetic features • BLAST can execute sequence searches against the entire DNA database in less than 15 seconds. • Good for local pairwise alignments but not multiple alignments or global pairwise alignments
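To make "local pairwise alignment" concrete, here is a minimal sketch of the Smith-Waterman algorithm, the exact dynamic-programming method for local alignment. This is not BLAST itself: BLAST uses faster heuristics to approximate this kind of search over a whole database. The scoring values (match = 2, mismatch = -1, gap = -2) and the two short test sequences are assumptions chosen only for illustration.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment.

    Scoring values are illustrative assumptions, not BLAST defaults.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap
            left = score[i][j - 1] + gap
            # The floor at 0 is what makes the alignment local rather than global
            score[i][j] = max(0, diag, up, left)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero-score cell is reached
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


# Two made-up sequences that share only the short region "ACGT"
print(smith_waterman("TTTACGTAAA", "GGGACGTCCC"))
```

The algorithm reports only the best-matching region and ignores the unrelated flanks, which is why this style of search is good at finding shared genes or domains but does not produce a global or multiple alignment.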
TIGR is a good public database for looking at gene sequences from a number of species. This allows scientists to do comparative genomics (look for similarities in the DNA of other species). www.tigr.org
Comparative Genomics will speed the Discovery Process for New Drugs: Scientific American July 2000
Get Schooled for Bioinformatics: • Biology • Know basics & have a sense of biological experimentation and public databases (NCBI, TIGR, etc.) • Computer Science • Programming (C/C++/Perl scripting) • Database construction (UNIX/LINUX) • Algorithm design • Math/Statistics • Probability, Experiment design, Machine learning • Ethics • “Core Bioinformatics” • LIMS (lab information management systems) • EST clustering • Sequence analysis & annotation, etc. • Per Russ Altman of Stanford’s Biomedical Informatics Training Program
Bioinformatics – Three Levels • 1. USERS (AS/BS level): of Information, of Tools, of Instrumentation, In-Silico (Computer) Modeling • 2. INTERPRETERS (BS/MS): of Information • 3. DEVELOPERS (PhD)*: of Information, of Tools, of Instrumentation, of Architecture/Storage, of Algorithms, of Modeling Strategies, of Visualization Methods • *These people are in highest demand • Per Pete Smietana, PhD (bioinformaticist)
Bioinformatics is a Hot Topic at UC Davis • The Biotechnology Program with ARC received an NSF grant to teach the basics to Community College instructors. • Dr. Craig Benham was hired in 2001 as the director of Bioinformatics for the new Genome Center. • CGF (Genome Facility for the College of Ag) is a cutting-edge facility: microarrays, high-throughput analysis, bioinformatics, etc. • UC Davis offers a Masters in Medical Informatics, with other degrees in the works.
Proteomics - What is it? • A proteome is the entire PROTein complement expressed by a genOME or a cell or tissue type. • There is only one genome for an organism, but the proteome changes under different expression conditions. • Proteomics is the study of the proteins expressed from a genome. Proteomic researchers determine variations in proteins expressed due to disease, drugs, environment, and time course. Per Tina Settineri, Applied Biosystems
The Challenge of Proteomics • Multiple Proteins for each Gene due to splicing • Varied and fragile nature of proteins • Quantitative and Qualitative changes of the proteome • Structural and Functional Proteomics Studies Complex Proteome(s) Per Tina Settineri, Ph.D., Applied Biosystems
The problem is that over 40% of human genes have unknown function! Of 26,383 gene predictions, 41.7% have unknown function (Science v. 291, Feb. 16, 2001). Per Dana Haley-Vicente. Protein Bioinformatics
TGT AAT AGT TAT ATT TTC ATT ATA AAT TGT GTT TGT AGA CAT CAT AAA TTT AAA ACA TGG CTT TTT AAC CTG ATA AAT CCT ACG AAT ATT TGT AAT AGT TAT GTT ATT GCA GTA AGT ACC GTT TGT ATT ATA AAT TGT GTT CTG TGT AAT AGT TAT ATT TTC ATT ATA AAT TGT GTT TGT AGA CAT CAT AAA TTT AAA ACA TGG CTT TTT AAC CTG ATA AAT CCT ACG AAT ATT TGT AAT AGT TAT GTT ATT GCA GTA AGT ACC GTT TGT ATT ATA AAT TGT GTT CTG Which genes are turned off then on ? Courtesy of Dr. Young Moo Lee
What is Protein Identification and how is it done by mass spectrometry? • Starts with proteins separated by 2-dimensional gel electrophoresis: IEF followed by SDS-PAGE • Each black spot is a protein • Spots will be excised and analyzed • Tina Settineri, Applied Biosystems
GATHER EVIDENCE • Police Officer: 1. Interview witnesses 2. Dust for fingerprints • Biospectrometrist: 1. Interview biologist who isolated the protein 2. Cleave protein with trypsin (enzyme) to obtain peptide mixture 3. Analyze peptide mixture by MALDI-TOF MS to get peptide molecular weights! • Tina Settineri, Applied Biosystems
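The "cleave with trypsin, then weigh the peptides" steps can be mimicked on a computer. The sketch below assumes the commonly used simplified trypsin rule (cut after K or R unless the next residue is P, no missed cleavages) and reports singly protonated monoisotopic masses, which is what a MALDI-TOF spectrum typically shows. The protein sequence is made up for illustration and is not from the slides.

```python
# Monoisotopic residue masses in Da (standard values for the 20 amino acids)
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide (free N- and C-termini)
PROTON = 1.00728   # charge carrier for the [M+H]+ ion


def tryptic_digest(protein):
    """Split after K or R unless followed by P (simplified rule, no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal peptide
    return peptides


def mh_plus(peptide):
    """Monoisotopic mass of the singly protonated peptide, [M+H]+."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


# Hypothetical protein sequence, for illustration only
protein = "AGKPLRSTKMEFWR"
for pep in tryptic_digest(protein):
    print(f"{pep:10s} {mh_plus(pep):10.4f}")
```

The resulting list of peptide masses is the protein's "fingerprint": a different protein sequence would give a different set of masses after the same digestion.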
DATABASE SEARCH • Police Officer: Height: 5’7”, Weight: 160 lbs, Gender: male, Age: 35-40, Fingerprints; search the DATABASE OF KNOWN FELONS • Biospectrometrist: molecular weight: 30,000, Origin: bovine liver, Peptide mass list from MALDI-TOF analysis: 975.4832, 1112.5368, 632.3147, 803.4134, 764.3892; search the PEPTIDE MASS DATABASE OF KNOWN PROTEINS
Mass accuracy tolerance = 15 ppm. This means that the mass is within 0.015 Da at m/z 1000. Per Tina Settineri, Ph.D., Applied Biosystems
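The ppm arithmetic and the database search can be sketched together. The tolerance in Da is m/z × ppm / 1,000,000, so 15 ppm at m/z 1000 is 0.015 Da. The observed masses below are the ones listed on the slide; the two database entries and their theoretical masses are invented for illustration, and counting matches is only the simplest possible scoring scheme (real search engines use more elaborate statistics).

```python
def ppm_window(mz, ppm):
    """Absolute mass tolerance in Da for a given m/z and ppm tolerance."""
    return mz * ppm / 1e6


def within_tolerance(observed, theoretical, ppm=15.0):
    """True if the observed mass lies within the ppm window of the theoretical mass."""
    return abs(observed - theoretical) <= ppm_window(theoretical, ppm)


def score_proteins(observed_masses, database, ppm=15.0):
    """Rank database proteins by how many observed masses they explain."""
    hits = {}
    for name, theoretical_masses in database.items():
        hits[name] = sum(
            any(within_tolerance(obs, theo, ppm) for theo in theoretical_masses)
            for obs in observed_masses
        )
    return sorted(hits.items(), key=lambda item: item[1], reverse=True)


# Peptide mass list from the slide
observed = [975.4832, 1112.5368, 632.3147, 803.4134, 764.3892]

# Hypothetical database: names and theoretical masses are made up for this sketch
database = {
    "protein_A": [975.4830, 1112.5370, 803.4140, 500.0000],
    "protein_B": [975.6000, 632.4000],
}

print(ppm_window(1000.0, 15.0))            # tolerance in Da at m/z 1000
print(score_proteins(observed, database))  # best-matching protein first
```

A tight tolerance matters: protein_B has a peptide near 975.6, which would look like a match at low mass accuracy but falls well outside a 15 ppm window.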
Protein Structure is … “Protein structure is the key to understanding disease, because proteins (not genes) are the end points of life sciences investigation since proteins ultimately regulate metabolism and disease.” Chemical & Engineering News, Feb. 19, 2001 p. 30. Protein Bioinformatics
Similar Function, Similar Structure, Low Sequence Homology • Methyltransferases: only 5.5% sequence identity • Need more than a BLAST search! • Molecular Simulations Inc., San Diego, CA, WebLab Viewer Pro • Protein Bioinformatics
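For readers unfamiliar with the term, "sequence identity" is the percentage of aligned positions where two sequences carry the same residue. The sketch below assumes the two sequences are already aligned (gaps written as "-") and that gapped columns are left out of the denominator; other conventions exist and give slightly different numbers. The example strings are made up.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of non-gap aligned columns with identical residues.

    Assumes equal-length, pre-aligned sequences with '-' marking gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = identical = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # skip gapped columns (one of several possible conventions)
        compared += 1
        if x == y:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Made-up aligned fragments: 4 identical residues out of 5 compared columns
print(percent_identity("ACD-EF", "ACEGEF"))
```

At values as low as the 5.5% quoted on the slide, identity is close to what unrelated sequences show by chance, which is why structure comparison is needed to recognize the relationship.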
Protein Structures – Predicted vs. Experimental • Methylglyoxal synthase: predicted (bottom) and experimental (top) structures agree exactly (red region) – NIH National Center for Biotechnology Information • In Silico Modeling is an important skill • C & E News Sept. 2000
Sequence to Structure to Function: Homology Implies Function • Levels, from Most to Least (databases): 3D Active Site, 3D Structure, Secondary Structure, Evolutionary Conservation, Protein Sequence, DNA Sequence • Molecular Simulations Inc., San Diego, CA
PubGene - example of use of bioinformatics for genomics & proteomics • PubGene Solution (pubgene@technologynetworks.net): PubGene searches for various relationships between subjects such as genes, RNA transcripts, proteins, mutations, chemical compounds, or diseases. PubGene includes a number of data layers to allow user-defined content display, and presents results as association networks. • PubGene Package: Text Mining Module: proprietary databases of human, mouse, and rat gene and protein name and alias associations, with an interface for literature search. Expression Analysis Module to look for gene and protein relationships in a microarray experiment. Sequence Homology Module to search for protein homology networks. • Future Products: Mutation/Gene/Protein/Disease relationships; MeltMap (human genome) project • Distributed by Technology Networks:
Metabolomics = metabolism + genomics • Systematic metabolic profiling may identify how genes regulate a metabolic process, such as cold tolerance or oil production in plants. • Metanomics in Berlin, Germany is creating the MetaMap database for Arabidopsis (see Working Weeds, Scientific Am., April 2003). • Involves genetic modification of organisms versus wild type, then rapidly screening them for chemical or physical changes after a change in experimental conditions. • Lipomics Technologies in West Sacramento is creating lipid profiles in humans and lab animals (health vs. disease or drug therapy)
Pharmacogenomics could lead to personalized medicines. Ref. Laboratory Medicine, Sept 2003. • It is the study of how an individual’s genetic inheritance affects the body’s response to drugs, with the goal of optimizing therapy and identifying better, safer drugs. • Single Nucleotide Polymorphisms (SNPs) will be the basis of classification (connect phenotype with genotype). • HapMap is an international project to look at patterns of inheritance of SNP haplotypes. • Although several pharmacogenetic syndromes are monogenic traits, most are polygenic. • Example: isoniazid is the therapy of choice for TB, but some patients develop neuropathy due to a single gene variation. • We need to couple the Human Genome Project with functional genomics and bioinformatics to unravel the puzzle of why some drugs work for some people but are toxic to others.
Nutritional genomics or “Nutrigenomics” • January 21, 2002 – UC Davis and the Children's Hospital Oakland Research Institute (CHORI) become a National Center of Excellence in Nutritional Genomics, funded by a five-year, $6.5 million grant from the National Center on Minority Health and Health Disparities, a division of the National Institutes of Health. • The Goal of the Center: Explore the links between diet, genes and diseases in minority populations, such as Type 2 diabetes, obesity, heart disease and some cancers. • "The research we'll be doing in the Nutrigenomics Center is one of the first examples of taking the benefits of human genome research from the lab to the home," said Ray Rodriguez, professor of molecular and cellular biology at UC Davis and director of the new center. • Good reference: “We are what we eat”, The Economist, Sept 4, 2003 • For more information: http://nutrigenomics.ucdavis.edu/ • Media contact: Raymond Rodriguez, (530) 752-3263, rlrodriguez@ucdavis.edu
Agricultural Biotechnology is a promising new area of crop science ... • Decrease use of chemicals on farmland. • Enhanced pest and disease resistance. • Increased stress tolerance: drought, salinity, cold, etc. • Improved vitamin and micronutrient composition. • “Pharming” to make vaccines and therapeutics.
Dr. Doug Cook and Dr. Richard Michelmore are the lead researchers for CGF. They are specialists in Plant Genomics.
“Golden Rice” • High Provitamin A (β-carotene) rice is a major advance for plant biotechnology and focuses international attention on the metabolic engineering of output traits. • Picture of Ingo Potrykus
Increased β-Carotene in Rice Grains • Figure labels: Normal rice vs. “Golden” rice; Introduced enzyme (source): (daffodil), (bacteria), (daffodil) • Over 120 million children worldwide are deficient in vitamin A. Rice has been engineered to accumulate β-carotene, which is converted to vitamin A in the body. Incorporation of this trait into rice cultivars and widespread distribution could prevent 1 to 2 million deaths each year. • Ye et al. (2000) Science 287: 303-305.
“Pharming” represents the Third Wave of Ag Biotech • Use of Genetically Modified Plants or Livestock as factories, rather than food • Plant-derived medicines are not new! Examples: Taxol, digitalis, quinine, etc. • A plant/seed can be a high protein expression system (ex: Large Scale Biology; Ventria Biosciences; Epicyte; Monsanto; Prodigene, etc.) • Proteins can also be expressed in milk, semen or urine of cows, pigs, rabbits, etc. Example: spider silk (Nexia) • Significantly less costly than stainless steel tanks due to low initial capital investment and scalability
Nature Biotechnology: “Uncorking the biomanufacturing bottleneck”, Alan Dove, Aug 2002, p. 777-9 • “As biomanufacturing capacity becomes strained, several new methods for producing biologics are being investigated by biotechnology companies” • Spokesperson from GTC Biotherapeutics (formerly Genzyme Transgenics) in Framingham, MA estimates a 200% increase in manufacturing capacity for Mabs alone over the next 10 years. Their solution is milking transgenic animals (“mammary bioreactors”). PPL Therapeutics (UK) and BioProtein (France) have similar projects. Goats, sheep and cows are the most common mammals. • Transgenic chicken eggs may also be feasible, but the research is still in early stages. Companies: Origen Therapeutics (Burlingame, CA); Avigenics (Athens, GA); TranXenGen (Shrewsbury, MA); GeneWorks (Ann Arbor, MI) and Vivalis (France).
Therapeutic Protein Production in Plants Important Considerations
Why Make Pharmaceuticals in Plants? • Supply the increasing demand for new biotech drugs (esp. antibodies) • 50 Mabs by 2008 • Significantly decrease unit costs • Improve patients’ access to biotech medicines • Plants are efficient producers of proteins • Plants are scalable bioreactors • Plants provide cost advantages over mammalian cell culture systems • 3-5 times faster than mammalian systems • Plant cells are similar to human cells • Similar protein synthesis machinery • Read the same genetic code • Assemble, fold and secrete complex proteins
Transgenic Maize Provides High-Capacity, Low Cost Option for Large Scale Mab Manufacturing • Significant cost associated with Mammalian Cell systems: 7 x 15,000 L Tanks, ~$250-450M Construction • Transgenics provide a cost-effective method at large scales. Example: Maize, ~200-2000 Acres, ~$80-120M Construction
Use Tobacco to make medicines • Large Scale Biology (aka Biosource) in Vacaville, CA is using the tobacco plant as a production system for human vaccines, cancer therapies (ex: lymphoma) and other pharmaceuticals. • RNA “Personalized medicine” • “Pharming in Plants” but no gene flow! (Transient expression of mRNA of vaccine gene) • www.lsbc.com
Engineers are involved in Biotech too! • Bioprocessing • Nanotechnology • Biophotonics • Biosensors • Biomedical Engineering • Tissue engineering
NSF Center for Biophotonics Science and Technology (CBST) • Motto: Shining Light on Life • Lead campus • Will be located across from the UCD Med Center • ~100 participants • ~20 research projects • $52M over 10 yrs • Dennis Matthews, Director • Emphasis on integrative research, education, and knowledge transfer in biophotonics