Machine Learning & Bioinformatics. Tien-Hao Chang (Darby Chang). Molecular biology. Nucleic acid DNA RNA Central dogma Transcription Translation. Protein Amino acid Primary structure Secondary structure Tertiary structure. Nucleic acid.
Machine Learning & Bioinformatics Tien-Hao Chang (Darby Chang)
Molecular biology
• Nucleic acid
  • DNA
  • RNA
• Central dogma
  • Transcription
  • Translation
• Protein
  • Amino acid
  • Primary structure
  • Secondary structure
  • Tertiary structure
Nucleic acid
• A nucleic acid is a macromolecule composed of chains of monomeric nucleotides
• In biochemistry, these molecules carry genetic information or form structures within cells
• The most common nucleic acids are deoxyribonucleic acid (DNA) and ribonucleic acid (RNA)
Nucleic acid components: Sugar http://www.mun.ca/biology/scarr/Fg10_09b_revised.gif
Nucleic acid components: Base
• Purine
  • Adenine (A) and guanine (G)
• Pyrimidine
  • Thymine (T), cytosine (C)
  • Uracil (U, only in RNA)
http://fig.cox.miami.edu/~cmallery/150/chemistry/sf3x14a.jpg
DNA
• Chemically, DNA is a long polymer of simple units called nucleotides, with a backbone made of sugars and phosphate groups joined by phosphodiester bonds
• Attached to each sugar is one of four types of molecules called bases
• It is the sequence of these four bases along the backbone that encodes information
http://upload.wikimedia.org/wikipedia/commons/8/87/DNA_orbit_animated_small.gif
DNA: Base pairing
• Each type of base on one strand forms a bond with just one type of base on the other strand
• Here, purines form hydrogen bonds to pyrimidines, with A bonding only to T, and C bonding only to G
• DNA sequence
  • 5’CpGpCpApApTpT3’TpTpApApCpGpC
  • CGCGAATT
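The pairing rule above (A only with T, C only with G) is enough to derive one strand of a duplex from the other. The following is a minimal Python sketch, assuming sequences are uppercase strings written 5'→3' and that the two strands run antiparallel; the names `PAIRS` and `reverse_complement` are illustrative and do not come from the slides.

```python
# Watson-Crick pairing for DNA: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(seq: str) -> str:
    """Return the partner strand of `seq`, read 5'->3'.

    The strands are antiparallel, so the sequence is reversed
    before each base is swapped for its pairing partner.
    """
    return "".join(PAIRS[base] for base in reversed(seq))

print(reverse_complement("CGCGAATT"))  # AATTCGCG
```

Applying the function twice returns the original sequence, which is a quick sanity check on the pairing table.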
Double helix http://www.coe.drexel.edu/ret/personalsites/2005/dayal/curriculum1_files/image001.jpg
Hydrogen bond
• A hydrogen bond exists between an electronegative atom and a hydrogen atom bonded to another electronegative atom
• This type of force always involves a hydrogen atom, hence the name hydrogen bonding; the strongest hydrogen bonds approach the energy of weak covalent bonds (155 kJ/mol), while those typical of biological molecules are considerably weaker
• Biological functions
  • DNA/RNA base pairing
  • protein secondary/tertiary structure formation
  • some properties of water molecules
  • antibody-antigen (and other protein-protein) binding
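To connect hydrogen bonding with base pairing, the sketch below counts the hydrogen bonds that hold a short DNA duplex together. It assumes the standard Watson-Crick geometry, in which an A·T pair forms two hydrogen bonds and a G·C pair forms three; these counts are not stated on the slide, and the names `H_BONDS` and `duplex_hydrogen_bonds` are hypothetical.

```python
# Hydrogen bonds per Watson-Crick base pair (assumed standard values):
# A-T pairs form 2, G-C pairs form 3.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}

def duplex_hydrogen_bonds(seq: str) -> int:
    """Count hydrogen bonds between `seq` and its complementary strand."""
    return sum(H_BONDS[base] for base in seq)

# Four G-C pairs (4 * 3) plus four A-T pairs (4 * 2).
print(duplex_hydrogen_bonds("CGCGAATT"))  # 20
```

This is one reason G·C-rich duplexes tend to be harder to separate than A·T-rich ones of the same length.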
Hydrogen bonds result from differences in electronegativity http://upload.wikimedia.org/wikipedia/commons/4/43/Liquid_water_hydrogen_bond.png
Grooves http://courses.biology.utah.edu/horvath/biol.3525/1_DNA/Fig2/marty_1.jpg
DNA structure http://www.youtube.com/watch?v=qy8dk5iS1f0&NR=1
About DNA
Central dogma http://fig.cox.miami.edu/~cmallery/255/255hist/mcb4.1.dogma.jpg
Central dogma
• The process by which information is extracted from the nucleotide sequence of a gene and then used to make a protein is essentially the same for all living things on Earth and is described by the grandly named central dogma of molecular biology
• Information in cells passes from DNA to RNA to proteins
http://upload.wikimedia.org/wikipedia/commons/3/3a/Crick's_1958_central_dogma.svg
RNA
• Information stored in DNA is used to make a more transient, single-stranded polynucleotide called RNA (ribonucleic acid)
• RNA is very similar to DNA, but differs in a few important structural details
  • in the cell, RNA is usually single-stranded, while DNA is usually double-stranded
  • RNA nucleotides contain ribose, while DNA nucleotides contain deoxyribose (a type of ribose that lacks one oxygen atom)
  • in RNA, the base uracil takes the place of thymine, which is present in DNA
Central dogma: Transcription
• Transcription is the synthesis of RNA under the direction of DNA
• Both nucleic acid sequences use the same language, and the information is simply transcribed, or copied
• The DNA sequence is copied by RNA polymerase to produce a complementary RNA strand, called messenger RNA (mRNA)
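Because transcription copies by complementarity, it can be sketched as a base-by-base lookup. The Python snippet below is an illustration only, assuming the template DNA strand is supplied in the 3'→5' direction so that the resulting mRNA reads 5'→3'; the names `DNA_TO_RNA` and `transcribe`, and the example template, are hypothetical.

```python
# Pairing used during transcription: each template DNA base is matched
# with its complementary RNA base (uracil is used where DNA has thymine).
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}

def transcribe(template_3to5: str) -> str:
    """Return the mRNA (5'->3') complementary to a template strand (3'->5')."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5)

print(transcribe("TACCGGATT"))  # AUGGCCUAA
```

The mRNA matches the non-template (coding) DNA strand with every T replaced by U, which is why the two nucleic acids are said to use the same language.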
DNA transcription http://www.youtube.com/watch?v=vJSmZ3DsntU
Transcription detail http://www-class.unl.edu/biochem/gp2/m_biology/animation/m_animations/gene2.swf
RNA: Various types
• mRNA
  • messenger RNA (mRNA) is the RNA that carries information from DNA to the ribosome
  • the coding sequence of the mRNA determines the amino acid sequence in the protein that is produced
• Non-coding RNA
Various RNA types: Non-coding RNA
• Many RNAs do not code for protein
• These ncRNAs are encoded by their own genes (RNA genes) or derived from mRNA introns
• The most common ncRNAs are transfer RNA (tRNA) and ribosomal RNA (rRNA)
• Other ncRNAs, such as microRNA (miRNA), are involved in post-transcriptional gene regulation
http://eurheartj.oxfordjournals.org/content/vol0/issue2010/images/large/ehp57301.jpeg
Central dogma: Translation
• Translation is the second stage of protein biosynthesis
• Translation occurs in the cytoplasm, where the ribosomes are located
• In translation, mRNA is decoded to produce a specific polypeptide according to the rules specified by the genetic code
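Decoding an mRNA according to the genetic code amounts to reading it three bases at a time and looking up each codon. The sketch below uses the standard genetic code, written in the compact UCAG ordering, and makes simplifying assumptions: reading starts at the first AUG and stops at the first stop codon, and the input contains only A, C, G, and U. The names `CODON_TABLE` and `translate` are illustrative, not from the slides.

```python
from itertools import product

# Standard genetic code, listed in UCAG order for the first, second,
# and third codon positions; "*" marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

def translate(mrna: str) -> str:
    """Translate from the first AUG up to (not including) the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon found
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

print(translate("AUGGCCUAA"))  # MA
```

Here AUG codes for methionine (M), GCC for alanine (A), and UAA is a stop codon, so the resulting polypeptide has two residues.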
From RNA to protein synthesis http://www.youtube.com/watch?v=NJxobgkPEAo
Protein translation http://www.youtube.com/watch?v=nl8pSlonmA0
About central dogma
Protein
Protein
• Proteins are large organic compounds made of amino acids arranged in a linear chain and joined together by peptide bonds between the carboxyl and amino groups of adjacent amino acid residues
• Proteins can also work together to achieve a particular function, and they often associate to form stable complexes
Protein: Amino acid
• In chemistry, an amino acid is a molecule that contains both amine and carboxyl functional groups
• In biochemistry, this term refers to alpha-amino acids with the general formula H2NCHRCOOH, where R is an organic substituent
http://upload.wikimedia.org/wikipedia/commons/thumb/c/ce/AminoAcidball.svg/702px-AminoAcidball.svg.png
Amino acid: Various side chains
• The various alpha-amino acids differ in which side chain (R group) is attached to their alpha carbon
• They can vary in size from just a hydrogen atom in glycine, through a methyl group in alanine, to a large heterocyclic group in tryptophan
http://upload.wikimedia.org/wikipedia/commons/thumb/3/37/Aa.svg/2000px-Aa.svg.png
http://juang.bst.ntu.edu.tw/BC2008/images/Amino%281%29%202007/A1-7.JPG
http://juang.bst.ntu.edu.tw/BC2008/images/Amino%281%29%202007/A1-9.JPG
http://www.russell.embl-heidelberg.de/aas/other_images/lb3.gif
Amino acid: The building blocks of proteins
• Amino acids combine in a condensation reaction, and the resulting amino acid residues are held together by peptide bonds
• Proteins are defined by their unique sequence of residues (primary structure)
• Just as letters form various words, amino acids form a vast variety of sequences/proteins
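Since each peptide bond forms by condensation, one water molecule is released per bond, and the mass of a peptide is the sum of its free amino acid masses minus one water per bond. The Python sketch below illustrates this bookkeeping for glycine and alanine only; the masses are approximate average values in g/mol supplied for illustration, and the names `FREE_MASS`, `WATER`, and `peptide_mass` are hypothetical.

```python
# Approximate average molar masses in g/mol (illustrative values only).
FREE_MASS = {"G": 75.07, "A": 89.09}  # free glycine and alanine
WATER = 18.02                         # one H2O is lost per peptide bond

def peptide_mass(seq: str) -> float:
    """Mass of a linear peptide: free amino acids minus one water per bond."""
    if not seq:
        return 0.0
    bonds = len(seq) - 1  # n residues are joined by n - 1 peptide bonds
    return sum(FREE_MASS[aa] for aa in seq) - WATER * bonds

print(round(peptide_mass("GA"), 2))  # 146.14
```

A single amino acid has no peptide bond, so its mass is returned unchanged; extending `FREE_MASS` to all twenty amino acids would be needed for real sequences.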
http://upload.wikimedia.org/wikipedia/commons/thumb/6/6d/Peptidformationball.svg/2000px-Peptidformationball.svg.png
http://juang.bst.ntu.edu.tw/BC2008/images/Amino(1)%202007/A1-11.JPG
http://juang.bst.ntu.edu.tw/BC2008/images/Amino(1)%202007/A1-13.JPG
Protein: After knowing amino acids
• Amino acids form short polymer chains called peptides, or longer chains called either polypeptides or proteins
• The formation of these chains from an mRNA template (obeying the genetic code) is known as translation, which is part of protein biosynthesis
Protein structure hierarchy
http://cropandsoil.oregonstate.edu/classes/css430/lecture%209-07/figure-09-03.JPG
http://juang.bst.ntu.edu.tw/BC2008/images/Protein(1)%202007/P1-4.JPG
http://juang.bst.ntu.edu.tw/BC2008/images/Protein(1)%202007/P1-8.JPG