Bioinformatics Essentials Stephanie Tatem Murphy smurphy@bcc.ctc.edu
DNA Model organisms Protein PNSADADNDFEDRL RAGLCDHDKEVQGL QVRCAVUEEHMHK KQQEFENIRLDAQRL EFFAYIFQKEHMKR ATGCATTTCGGT TTACGCCATATA GCTCGGGAATCA TGCATCGATCGA GTAGCTAGCTAG
What is Bioinformatics? TGT AAT AGT TAT ATT TTC ATT ATA AAT TGT GTT TGT AGA CAT CAT AAA TTT AAA ACA TGG CTT TTT AAC CTG ATA AAT CCT ACG AAT ATT TGT AAT AGT TAT GTT ATT GCA GTA AGT ACC GTT TGT ATT ATA AAT TGT GTT CTG TGT AAT AGT TAT ATT TTC ATT ATA AAT TGT GTT TGT AGA CAT CAT AAA TTT AAA ACA TGG CTT TTT AAC CTG ATA AAT CCT ACG AAT ATT TGT AAT AGT TAT GTT ATT GCA GTA AGT ACC GTT TGT ATT ATA AAT TGT GTT CTG Which genes are turned off then on ? Courtesy of Dr. Young Moo Lee UC Davis
Human Genome Program, U.S. Department of Energy, Genomics and Its Impact on Medicine and Society: A 2001 Primer, 2001
Fundamental Dogma • DNA: GenBank, EMBL, DDBJ, Map Databases • RNA: Development? Gene Expression? • Proteins: SwissPROT, PIR, PDB • Pathways: Metabolism? Regulatory Pathways? • Phenotypes: Neuroanatomy? Clinical Data? • Populations: Biodiversity? Molecular Epidemiology? Comparative Genomics? Although a few databases already exist to distribute molecular information, the post-genomic era will need many more to collect, manage, and publish the coming flood of new findings. Bob Robbins http://www.esp.org/rjr/canberra.pdf
Gene (a b c d e): …ATGGCCCTGTGGATGCGCCTCCTGCCCCTG….. DNA base sequence = recipe for amino acids: Met-Ala-Leu-Trp-Met-Arg-Leu-Leu-Pro-Leu. Amino acid sequence = protein = trait. Art by Yelena Ponirovskaya
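The "recipe" idea on this slide can be sketched in a few lines of Python. This is a minimal illustration, not a full translation tool: the lookup table holds only the seven codons that occur in the slide's fragment (taken from the standard genetic code), and the names `CODON_TO_AA` and `translate` are made up for this example.

```python
# Partial codon table: only the codons present in the slide's fragment.
# A real tool would carry all 64 codons of the standard genetic code.
CODON_TO_AA = {
    "ATG": "Met", "GCC": "Ala", "CTG": "Leu", "TGG": "Trp",
    "CGC": "Arg", "CTC": "Leu", "CCC": "Pro",
}


def translate(dna):
    """Read the DNA three bases at a time and look up each codon."""
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    return [CODON_TO_AA[codon] for codon in codons]


gene_fragment = "ATGGCCCTGTGGATGCGCCTCCTGCCCCTG"  # sequence shown on the slide
protein = translate(gene_fragment)
print("-".join(protein))  # Met-Ala-Leu-Trp-Met-Arg-Leu-Leu-Pro-Leu
```

The printed chain matches the ten amino acids listed on the slide, one per three-base codon.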
The Biology Project University of Arizona http://www.biology.arizona.edu • DNA activity – RFLP, Inheritance http://www.biology.arizona.edu/human_bio/activities/blackett/introduction.html • DNA replication fork http://www.biology.arizona.edu/molecular_bio/problem_sets/nucleic_acids/03t.html • DNA base pairing http://www.biology.arizona.edu/molecular_bio/problem_sets/nucleic_acids/08t.html • DNA translation http://www.biology.arizona.edu/molecular_bio/problem_sets/nucleic_acids/10t.html • The Genetic Code http://www.biology.arizona.edu/molecular_bio/problem_sets/nucleic_acids/12t.html http://www.biology.arizona.edu/molecular_bio/problem_sets/nucleic_acids/13t.html • DNA transcription http://www.biology.arizona.edu/molecular_bio/problem_sets/nucleic_acids/15t.html
Bioinformatics – a Definition bio – informatics: bioinformatics is conceptualizing biology in terms of molecules and applying “informatics techniques” to understand and organise the information associated with these molecules, on a large scale. In short, bioinformatics is a management information system for molecular biology and has many practical applications. As submitted to the Oxford English Dictionary. What is Bioinformatics? N. M. Luscombe, et al. Yale University Method Inform Med 4/2001
Bioinformatics – a Definition The field of science in which biology, computer science, and information technology merge into a single discipline. NCBI, Aug 2001
What’s in a name? • Genome Mapping • Protein Analysis • Proteomics • Multiple Sequence Alignment • Database Homology Searching • 3D Modeling • Life Science Informatics • Homology Modeling • Docking • Sequence Analysis • Sample Registration & Tracking • Intellectual Property Auditing • Common Visual Interfaces • Integrated Data Repositories
Bioinformatics Needs • Multidisciplinary teams: biologists, mathematicians, computer scientists, laboratory technicians • Users and developers to use / create scalable database infrastructure • Standards to control vocabulary and annotation • New ways of visualizing, analyzing and searching data • New ways of delivering information, tools and results • Faster and larger computer systems
Demo Bioinformatics Company Onconomics Corporation http://www.bscs.org/onco/default.htm From nonprofit BSCS Biological Sciences Curriculum Study
Growth of Bioinformatics • 50 yrs ago: Computer Programming; DNA & Protein Structure • 20 yrs ago: Personal Computers / Internet; PCR • Last 10 yrs: w.w.w.; Human Genome Project • Now: All fields use computers (art, law, communication); Biological Research • Bioinformatics • Computer Skills www.oreilly.com
Why informatics? • Large size of data sets • Allow students to ask questions of data • Integrate current research into classroom http://www.ncbi.nlm.nih.gov/Genbank/genbankstats.html
>100,000 species are represented in GenBank • all species: 128,941 • viruses: 6,137 • bacteria: 31,262 • archaea: 2,100 • eukaryota: 87,147
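As a quick worked check of these counts, the sketch below computes each group's share of the "all species" figure. The numbers are copied from the slide; the variable names are illustrative only. Note that the four listed groups add up to 126,646, which is 2,295 fewer than the stated total; the slide does not say what the remainder covers.

```python
# Species counts as given on the slide.
all_species = 128_941
species_counts = {
    "viruses": 6_137,
    "bacteria": 31_262,
    "archaea": 2_100,
    "eukaryota": 87_147,
}

# Sum of the four listed groups and what is left over from the stated total.
listed_total = sum(species_counts.values())
not_broken_out = all_species - listed_total

# Each group's share of the stated total, as a percentage with one decimal.
shares = {
    group: round(100 * count / all_species, 1)
    for group, count in species_counts.items()
}

for group, share in shares.items():
    print(f"{group:<10} {share:>5}%")
print(f"listed groups: {listed_total}, not broken out: {not_broken_out}")
```

Eukaryotes account for roughly two thirds of the species represented, and bacteria for about a quarter.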
The most sequenced organisms in GenBank • Homo sapiens: 10.7 billion bases • Mus musculus: 6.5b • Rattus norvegicus: 5.6b • Danio rerio: 1.7b • Zea mays: 1.4b • Oryza sativa: 0.8b • Drosophila melanogaster: 0.7b • Gallus gallus: 0.5b • Arabidopsis thaliana: 0.5b Table 2-2 Page 18 Updated 8-12-04 GenBank release 142.0
Online datasets for all the Life Sciences • Environment and Ecology • Population http://www.prb.org • Water http://www.waterontheweb.org/ http://www.neptune.washington.edu/ • Geography http://nhd.usgs.gov/ http://data.geocomm.com/ • Chemistry • Physics • Biology • Anatomy & Physiology • Earth http://www.dlese.org/educators/usingdata.html • Agriculture • Nutrition • Plant http://allometra.com/ath_fasta_mpss.shtml
Why use Bioinformatics? • Data mining: a testable hypothesis about the function or structure of a gene or protein can be generated by identifying similar sequences in better-characterized organisms. • To help uncover phylogenetic relationships and evolutionary patterns. www.tigr.org
What is Bioinformatics? N. M. Luscombe, et al. Yale University Method Inform Med 4/2001
Biotechnology: Did You or Will You Ever? • Ride in a car? Genetically engineered micro-organisms will someday be used to extract oil from rocks. Micro-organisms that break down oil spills are already in use. • Drink tap water? Genetically engineered micro-organisms will someday be used to attract and filter out harmful substances from drinking water. • Have a dog or cat? Vaccines for a number of pet diseases such as rabies will be improved by genetic engineering. • Wear brightly colored clothes? Many clothing dyes can be made less expensively with biotechnology, and will last longer. • Take vitamins? Vitamins can be made more potent and less expensively with biotechnology. • Go to the bathroom? Micro-organisms are already an important part of sewage treatment; genetic engineering will produce bacteria that are more efficient at breaking down wastes.
What Good is Recombinant DNA? People with diabetes need to take a drug called insulin. In the past, this drug was extracted and purified from ground-up animal glands. It takes several pounds of cow or pig glands to produce a fraction of an ounce of insulin. Today, the DNA with the instructions for making insulin can be spliced into a plasmid, and the insulin is then produced by bacteria. It’s faster, easier, and cheaper this way. http://www.chourave.ch/init/kid/cartoon-00.html There are still many technical problems to be solved. Not all gene splices work, and some that do may fail over time. There are also social and environmental concerns about biotechnology. Some people fear we will upset the balance of nature if “genetically engineered” organisms escape. Others fear that recombinant DNA will be used to influence human size, race, or intelligence. The best way for people to enjoy the benefits and avoid the problems is to stay informed and up to date about what’s happening in biotechnology.
How Do You Make Recombinant DNA? First, you need to isolate a specific bit of DNA with the instructions you want. To do this, you use restriction enzymes that break up DNA strands in specific places. After you have DNA fragments, you sort them by size, using a gel. DNA is loaded onto the top of the gel, and then electricity is passed through it. This causes the DNA pieces to migrate down, and the small pieces travel further than the large pieces. Next, you need to add the DNA fragment into a host. In most research, the host is a plasmid, a ring of DNA found in some bacteria. The host DNA has to be exposed to restriction enzymes to make split ends that will attach to the fragment. After you mix the new and host DNA fragments, you need to add enzymes that will glue them together.
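The cutting and sorting steps described above can be mimicked on a toy sequence. This is only a sketch under stated assumptions: the slide does not name a particular enzyme, so the example assumes EcoRI, which recognizes GAATTC and cuts after the first G; the DNA string is invented for illustration; and `digest` and `run_gel` are hypothetical helper names, not functions from any library.

```python
def digest(dna, site="GAATTC", cut_offset=1):
    """Cut dna at every occurrence of site, cut_offset bases into the site.

    Defaults model EcoRI (assumed here), which cuts G^AATTC.
    """
    fragments = []
    start = 0
    pos = dna.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(dna[start:cut])
        start = cut
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])  # whatever remains after the last cut
    return fragments


def run_gel(fragments):
    """Order fragments as they appear on a gel, top to bottom.

    Small pieces travel farther, so the largest fragment stays nearest
    the top where the DNA was loaded.
    """
    return sorted(fragments, key=len, reverse=True)


toy_dna = "AAGAATTCTTTTTTGGGAATTCCC"  # made-up sequence with two EcoRI sites
fragments = digest(toy_dna)
for fragment in run_gel(fragments):
    print(len(fragment), fragment)
```

Two recognition sites give three fragments (14, 7, and 3 bases), and joining the fragments in their original order restores the starting sequence.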
How Do You Make Recombinant DNA? If you used a plasmid as a host, you need to put it back into a bacterium. When the bacterium replicates itself, it will copy the new DNA too. A small population of “gene-spliced” bacteria can develop into a large population in just a few days. http://www.gene.com/gene/research/biotechnology
What is an Enzyme? Enzymes are molecules that speed up biological reactions. For example, the enzyme carbonic anhydrase enables red blood cells to pick up and dump carbon dioxide 1 million times faster than they could without it. Some characteristics of enzymes: • Enzymes increase the rate of a chemical reaction. • Enzymes are highly specific. Like a wrench that will only fit a 5/16-inch bolt, each enzyme generally works with only a particular kind of molecule. • Enzymes don’t enter into the reaction themselves. They’re not physically changed as a result of the reaction. • A single enzyme can act thousands of times. • An enzyme increases the odds that two molecules will meet, so an enzyme is a “matchmaker”.
Why try to Design Better Enzymes? Enzymes are fragile: they lose their shape (denature) if the temperature or acidity goes up even a little. They also denature in alcohol or oils. This is a drag! If you’re adding an enzyme to a laundry detergent you’d like it to function in hot water, with bleach! As we understand more and more about DNA and how it is decoded, we can rewrite the instructions for making some enzymes. By altering their shapes, we may be able to make enzymes that are sturdier and able to function under harsher conditions. We may even be able to invent some completely new enzymes!
Examples of Enzymes • Subtilisin: This enzyme is added to laundry detergent. It breaks down proteins (like yucky egg yolk stains or gross dried blood) into tiny fragments that can be rinsed away from the fibers of the cloth. • Papain: This enzyme breaks up proteins, and is extracted from the papaya fruit. It’s now added to contact lens cleaner solution to help dissolve away gross crusty things from soft contact lenses. • Ceredase: Several thousand people in the United States have Gaucher disease (low levels of a crucial enzyme that dissolves fatty deposits in the liver, spleen and bone marrow). They suffer from bone pain, fractures, swelling and bleeding. Ceredase is a variation of the enzyme, produced in the laboratory, which can be used to treat the disease. • Vianain: Originally derived from pineapples, this enzyme offers hope to burn victims. It helps prepare burned areas for skin grafts by safely dissolving damaged skin layers that would otherwise have to be removed surgically.
Journals & Books • Public Library of Science - Open Access Journals http://www.plosbiology.org • International Society for Computational Biology – Book Reviews http://www.iscb.org/bioinformaticsBooks.shtml Free Journals: • Biotechniques http://www.BioTechniques.com • Genomeweb http://www.genomeweb.com Books: • The Cartoon Guide to Genetics, Larry Gonick & Mark Wheelis ISBN 0062730991 Harper 1983 • Introduction to Bioinformatics, Arthur Lesk http://www.oup.com/uk/lesk/bioinf ISBN 0199251967 Oxford 2002 • Fundamental Concepts of Bioinformatics, Dan Krane & Michael Raymer ISBN 0805346333 Benjamin Cummings 2003 • Discovering Genomics, Proteomics, & Bioinformatics, A. Campbell & L. Heyer ISBN 0805347224 Benjamin Cummings 2002 • Understanding Biotechnology, George Acquaah ISBN 0130945005 Pearson Prentice Hall 2004 • Understanding Biotechnology, A. Borem, F. Santos, D. Bowen ISBN 0131010115 Pearson Prentice Hall 2003
Human Genome Project http://www.ornl.gov/sci/techresources/Human_Genome/publicat/primer2001/index.shtml Genomics and Its Impact on Science and Society: The Human Genome Project and Beyond U.S. Department of Energy Genome Programs http://doegenomes.org
www.ncbi.nlm.nih.gov National Center for Biotechnology Information
A user’s guide to the human genome Nature Genetics www.nature.com/ng/ vol 32, pg 1-79, 01 Sep 2002 • Introduction: putting it together • Question 8: How can one find all the members of a human gene family? • Question 12: How does a user find characterized mouse mutants corresponding to human genes? • Web resources: Internet resources featured in this guide
Get Schooled for Bioinformatics • Biology • Know basics & Have sense of biological experimentation • Computer Science • Programming C, C++, Perl, JAVA, SAS, CGI • Database construction UNIX, LINUX • Algorithm design • Math/Statistics • Probability, Experimental design • Ethics • “Core Bioinformatics” • LIMS • EST clustering • Sequence analysis & annotation
Fundamental Dogma • DNA: GenBank, EMBL, DDBJ, Map Databases • RNA: Development? Gene Expression? • Proteins: SwissPROT, PIR, PDB • Circuits: Metabolism? Regulatory Pathways? • Phenotypes: Neuroanatomy? Clinical Data? • Populations: Biodiversity? Molecular Epidemiology? Comparative Genomics? Although a few databases already exist to distribute molecular information, the post-genomic era will need many more to collect, manage, and publish the coming flood of new findings. Biological Research = To enable the discovery of new biological insights as well as create a global perspective from which unifying principles in biology can be discerned. NCBI, Aug 2001
Ultra-conserved element • Only 6 SNPs • mouse, rat, human • TGATCCCGGACTCTATGAATTATTGATGAGATATGAGCGTTGATTTCCCCTTTCAG • GATGCAAACTCCATTATATTGTTAAAATGGCGATTTAATCGTTGAGAATAGCTTTG • GTGTGGGTTTTTTCCCCCAACTCATTTGCGCCTCCTTCCTTTTCATTTAACTCTCT • TAATTAAATCCTTTAACAGATTTTAATCACTTTTTGGAG
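Counting single-base differences between two aligned copies of a sequence can be sketched as below. The slide shows only one sequence, so the example compares the first 20 bases of it against a variant that is invented purely for illustration (one base changed by hand); `snp_positions` is a hypothetical helper name, and the approach assumes the two sequences are already aligned and of equal length.

```python
def snp_positions(seq_a, seq_b):
    """Return the 0-based positions where two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


# First 20 bases of the element shown on the slide.
reference = "TGATCCCGGACTCTATGAAT"
# Hypothetical variant, NOT real data: the G at position 7 replaced by A.
made_up_variant = "TGATCCCAGACTCTATGAAT"

differences = snp_positions(reference, made_up_variant)
print(len(differences), "difference(s) at position(s)", differences)
```

Applied to the real mouse, rat, and human copies, a count like this is what a statement such as "only 6 SNPs" summarizes.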