230 likes | 443 Views
Topics. The topics: basic concepts of molecular biology more on Perl overview of the field biological databases and database searching sequence alignments phylogenetic trees protein structure prediction microarray data analysis. The Human Genome Project.
E N D
Topics The topics: • basic concepts of molecular biology • more on Perl • overview of the field • biological databases and database searching • sequence alignments • phylogenetic trees • protein structure prediction • microarray data analysis
The Human Genome Project The human genome sequence is complete - almost - approximately 3 billion base pairs. Some of these slides are adapted from Lecture Notes of Stuart M. Brown at NYU
How does the human genome stack up? U.S. Department of Energy Genome Programs, Genomics and Its Impact on Science and Society, 2003
The Path Forward • How does DNA impact health? • Identify and understand the difference in DNA sequence (A,T,C,G) among human populations • What do all the genes do? • Discover the functions of human genes by experimentation and by finding genes with similar funcs in the model organisms • What are the functions of nongene areas? • Identify important elements in the nongene regions of DNA • How does info in the genome enable life? • Explore life at the ultimate level of the whole organism instead of single genes/proteins. U.S. Department of Energy, 2005
Diverse applications • Medicine – customized treatments, … • Microbes for energy and the environment – generate clean energy source, clean up toxic wastes,… • Bioanthropology – human lineage • Agriculture, livestock breeding, Bioprocessing – crops&animals more resistant to diseases, efficient industrial processes,… • DNA identification – implicate people accused of crimes, identify contaminants in air, water, … U.S. Department of Energy, 2005
Genomics: Journey to the Center of Biology Without doubt, the greatest achievement in biology over the past millennium has been the elucidation of the mechanism of heredity. The instructions for assembling every organism on the planet are all specified in DNA sequences that can be translated into digital information and stored in a computer for analysis. As a consequence of this revolution, biology in the 21st century is rapidly becoming an information science. Powerful new types of bioinformatics will clearly be required to assimilate and interpret the data that will issue from various types of genomics research. Eric Lander & Robert Weinberg, Science, 2000
Nucleic Acid Sequence Databases • the principal nucleic acid sequence databases are GeneBank, EMBL and DDBJ, which each collect a portion of the total sequence data reported world-wide, and exchange new and updated entries on a daily basis
GenBank • Once upon a time, GenBank sent out sequence updates on CD-ROM disks a few times per year.
Specialised Genomic Resources • In addition to the comprehensive DNA sequence DBs, there is a variety of more specialised genomic resources. • These so called boutique DBs bring focus to species-specific genomics and to particular sequencing techniques.
Levels of protein sequence and structural organisation: primary The primary structure of a protein is its amino acid sequence The second structure of a protein corresponds to regions of local regularity (e.g., α-helices and β-strands). secondary The tertiary structure of a protein arises from the packing of its secondary structure elements, which may form discrete domains within a fold. tertiary Protein Information Resources
Primary Protein Databases • The primary structure of a protein is its amino acid sequence. These are stored in primary databases as linear alphabets that denote the constituent residues.
Structure Classification DBs • Contain 3D structures available from crystallographic and spectroscopic studies
Databases concerning Mutations • dbSNPhttp://www.ncbi.nlm.nih.gov/SNP • HGBASE (Human Genome Variation Database) http://hgbase.cgr.ki.se • The SNP Consortium (TSC)http://snp.cshl.org
LiteratureDatabases • PubMedhttp://www.ncbi.nlm.nih.gov/entrez/query • Bioinformatics Onlinehttp://www.bioinformatics.oupjournals.org • Naturehttp://www.nature.com • Sciencehttp://www.sciencemag.org
Systems Biology • Integrate different levels of information to understand how biological systems function • Use computational and mathematical models to analyze, model and simulate cellular networks, interactions and pathways.
Microarray • DNA microarray is a new technology to measure the level of the mRNA gene products of a living cell.
* * * * * * Affymetrix GeneChip® Probe Arrays Hybridized Probe Cell GeneChip Probe Array Single stranded, fluorescently labeled cRNA target Oligonucleotide probe 24~50µm 1.28cm Each probe cell or feature contains millions of copies of a specific oligonucleotide probe Image of Hybridized Probe Array BGT108_DukeUniv
Bioinformatics Tools • Database & searching • Computational algorithms • Alignment • Similarity • Clustering • Pattern Searching • Structure predictions • Statistical methods • Data visualization
Bioinformatics Bioinformaticsis the research, development, or application of computational tools and approaches for expanding the use of biological, medical, behavioral or health data, including those to acquire, store, organize, archive, analyze, or visualize such data; Computational biologyis the development and application of data-analytical and theoretical methods, mathematical modeling and computational simulation techniques to the study of biological, behavioral, and social systems.