A short introduction to biology
Life • Two categories: • Prokaryotes (e.g. bacteria) • Unicellular • No nucleus • Eukaryotes (e.g. fungi, plant, animal) • Unicellular or multicellular • Has nucleus
Prokaryote vs Eukaryote • Eukaryote has many membrane-bounded compartment inside the cell • Different biological processes occur at different cellular location
Organism, Organ, Cell
Chemical contents of cell • Water • Macromolecules (polymers) - “strings” made by linking monomers from a specified set (alphabet) • Protein • DNA • RNA • … • Small molecules • Sugar • Ions (Na+, K+, Ca2+, Cl-, …) • Hormones • …
DNA • DNA: forms the genetic material of all living organisms • Can be replicated and passed to descendants • Contains information to produce proteins • To computer scientists, DNA is a string over the alphabet {A, C, G, T} • e.g. ACAGAACGTAGTGCCGTGAGCG • Each letter is a nucleotide • Length varies from hundreds to billions
RNA • Historically thought to be an information carrier only • DNA => RNA => Protein • New roles have since been found for it • To computer scientists, RNA is a string over the alphabet {A, C, G, U} • e.g. ACAGAACGUAGUGCCGUGAGCG • Each letter is a nucleotide • Length varies from tens to thousands
Protein • Protein: the actual “worker” for almost all processes in the cell • Enzymes: speed up reactions • Signaling: information transduction • Structural support • Production of other macromolecules • Transport • To computer scientists, protein is a string made from 20 kinds of characters • E.g. MGDVEKGKKIFIMKCSQCHTVEKGGKHKTGP • Each letter is called an amino acid • Length varies from tens to thousands
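The "string over an alphabet" view of DNA, RNA, and protein can be sketched in a few lines of Python. This is only an illustration: the names `DNA`, `RNA`, `PROTEIN`, and `uses_alphabet` are assumptions made for the example, and the protein alphabet is taken to be the 20 standard one-letter amino acid codes.

```python
# Alphabets for the three kinds of macromolecule "strings".
DNA = set("ACGT")
RNA = set("ACGU")
PROTEIN = set("ACDEFGHIKLMNPQRSTVWY")  # 20 standard one-letter codes (assumed)


def uses_alphabet(seq, alphabet):
    """Return True if seq is non-empty and every letter is in alphabet."""
    return bool(seq) and set(seq) <= alphabet


# The example strings from the slides.
print(uses_alphabet("ACAGAACGTAGTGCCGTGAGCG", DNA))
print(uses_alphabet("ACAGAACGUAGUGCCGUGAGCG", RNA))
print(uses_alphabet("MGDVEKGKKIFIMKCSQCHTVEKGGKHKTGP", PROTEIN))
```

Note that the alphabets overlap: A, C, G, and T are also amino acid letters, so a check like this cannot tell a DNA string from a short protein string by letters alone.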
DNA/RNA zoom-in • Commonly referred to as Nucleic Acid • DNA: Deoxyribonucleic acid • RNA: Ribonucleic acid • Found mainly in the nucleus of a cell (hence “nucleic”) • Contain phosphoric acid as a component (hence “acid”) • They are made up of a string of nucleotides
Nucleotides • A nucleotide has 3 components • Sugar ring (ribose in RNA, deoxyribose in DNA) • Phosphoric acid • Nitrogen base • Adenine (A) • Guanine (G) • Cytosine (C) • Thymine (T) in DNA and Uracil (U) in RNA
DNA • 5’-AGCGACTG-3’, or simply AGCGACTG • Often recorded from 5’ to 3’, which is the direction of many biological processes, e.g. DNA replication, transcription, etc. • [Figure: nucleotide chain A-G-C-G-A-C-T-G with a free phosphate at the 5’ end; each unit is labeled base, phosphate, and sugar, with the sugar carbons numbered 1 to 5]
RNA • 5’-AGUGACUG-3’, or simply AGUGACUG • Often recorded from 5’ to 3’, which is the direction of many biological processes, e.g. translation • [Figure: nucleotide chain A-G-U-G-A-C-U-G with a free phosphate at the 5’ end; each unit is labeled base, phosphate, and sugar, with the sugar carbons numbered 1 to 5]
DNA usually exists in pairs • Base-pair: A = T, G = C • Forward (+) strand: 5’-AGCGACTG-3’ • Backward (-) strand: 3’-TCGCTGAC-5’ • One strand is said to be reverse-complementary to the other
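The reverse-complement relationship between the two strands can be sketched as follows. The function name `reverse_complement` is chosen for this example only; the input is the forward strand from the slide, read 5’ to 3’.

```python
# Base-pairing rules: A pairs with T, G pairs with C.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the opposite strand, also written 5' to 3'."""
    # Complement each base, then reverse so the result reads 5' to 3'.
    return dna.translate(COMPLEMENT)[::-1]


forward = "AGCGACTG"
backward = reverse_complement(forward)
print(backward)  # CAGTCGCT, i.e. 3'-TCGCTGAC-5' read from its 5' end
```

Applying the function twice returns the original strand, which is a quick sanity check on the pairing table.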
DNA double helix • A G-C pair is stronger than an A-T pair (three hydrogen bonds versus two)
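Because G-C pairs are stronger, the fraction of G and C letters in a sequence is often used as a rough indicator of how tightly a duplex holds together. The sketch below computes that fraction; the name `gc_content` is an assumption for this example, and the measure is a simplification rather than a physical model.

```python
def gc_content(dna):
    """Return the fraction of letters in dna that are G or C."""
    if not dna:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in dna) / len(dna)


print(gc_content("AGCGACTG"))  # 5 of the 8 letters are G or C
```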
RNA • RNAs are normally single-stranded • Form complex structures by self-base-pairing • A=U, C=G • Can also form RNA-DNA and RNA-RNA double strands • A=T/U, C=G
Protein zoom-in • Protein is the actual “worker” for almost all processes in the cell • A string built from 20 kinds of chars • E.g. MGDVEKGKKIFIMKCSQCHTVEKGGKH • Each letter is called an amino acid • Generic chemical form of amino acid: H2N-CH(R)-COOH, where H2N is the amino group, COOH is the carboxyl group, and R is the side chain
Units of Protein: Amino acid • 20 amino acids, differing only in their side chains • Each can be expressed by three letters • Or by a single letter: any letter of the alphabet except B, J, O, U, X, Z • Alanine = Ala = A • Histidine = His = H
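The three-letter and one-letter codes can be kept in a small lookup table. The sketch below lists the 20 standard amino acids; the dictionary name `THREE_TO_ONE` is an assumption for this example.

```python
import string

# Three-letter code -> one-letter code for the 20 standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Letters of the alphabet that no standard amino acid uses.
unused = sorted(set(string.ascii_uppercase) - set(THREE_TO_ONE.values()))
print(unused)  # ['B', 'J', 'O', 'U', 'X', 'Z']
```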
Amino acids => peptide • H2N-CH(R)-COOH + H2N-CH(R)-COOH => H2N-CH(R)-CO-NH-CH(R)-COOH + H2O • Peptide bond: the CO-NH link between the two amino acids
Protein • Has orientations • [Figure: chain of side chains R R R … running from H2N (N-terminal) to COOH (C-terminal)] • Usually recorded from N-terminal to C-terminal • Peptide vs protein: basically the same thing • Conventions • Peptide is shorter (< 50aa), while protein is longer • Peptide refers to the sequence, while protein has 2D/3D structure
Genome and chromosome • Genome: the complete DNA sequence in the cell of an organism • May contain one (in most prokaryotes) or more (in eukaryotes) chromosomes • Chromosome: a single large DNA molecule in the cell • May be circular or linear • Contains genes as well as “junk DNA” • Highly packed!
Formation of chromosome • 50,000 times shorter than extended DNA • The total length of DNA present in one adult human is the equivalent of nearly 70 round trips from the earth to the sun
Gene • Gene: unit of heredity in living organisms • A segment of DNA with information to make a protein or a functional RNA
Human genome • 46 chromosomes: 22 pairs of autosomes + 2 sex chromosomes (X, Y) • In each pair, 1 from mother, 1 from father • Female: X + X • Male: X + Y
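DNA Replication • The process of copying a double-stranded DNA molecule • Semi-conservative: each daughter molecule keeps one strand of the parent and gains one newly made strand • Parent: 5’-ACATGATAA-3’ / 3’-TGTACTATT-5’ • Daughter 1: 5’-ACATGATAA-3’ / 3’-TGTACTATT-5’ • Daughter 2: 5’-ACATGATAA-3’ / 3’-TGTACTATT-5’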
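Semi-conservative replication can be sketched as a function that takes a duplex and returns two duplexes, each holding one parental strand and one new strand. This is a toy model under stated assumptions: the name `replicate` is invented for the example, the bottom strand is written 3’ to 5’ aligned under the top strand (as on the slide), and copying is assumed to be error-free.

```python
# Base-pairing rules used to build each new strand.
PAIR = str.maketrans("ACGT", "TGCA")


def replicate(top, bottom):
    """Copy a duplex; top is 5'->3', bottom is 3'->5' and aligned to top.

    Returns two (top, bottom) duplexes, each with one old and one new strand.
    """
    new_bottom = top.translate(PAIR)  # new strand built on the old top strand
    new_top = bottom.translate(PAIR)  # new strand built on the old bottom strand
    return (top, new_bottom), (new_top, bottom)


daughter1, daughter2 = replicate("ACATGATAA", "TGTACTATT")
print(daughter1)
print(daughter2)
```

Both daughters carry the same sequence as the parent, which is why the three duplexes in the slide's example look identical.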
[Figure: deoxynucleoside triphosphate (dNTP), with its three phosphates labeled p p p] • Mutation: changes in DNA base-pairs • Proofreading and error-correcting mechanisms exist to ensure extremely high fidelity
Transcription • The process by which a DNA sequence is copied to produce a complementary RNA • Called messenger RNA (mRNA) if the RNA carries instructions on how to make a protein • Called non-coding RNA if the RNA does not carry instructions on how to make a protein • Only consider mRNA for now • Similar to replication, but • Only one strand is copied
Transcription • DNA-RNA pair: • A=U, C=G • T=A, G=C • Coding strand (where genetic information is stored): 5’-ACGTAGACGTATAGAGCCTAG-3’ • Template strand (for making mRNA): 3’-TGCATCTGCATATCTCGGATC-5’ • mRNA: 5’-ACGUAGACGUAUAGAGCCUAG-3’ • Coding strand and mRNA have the same sequence, except that T’s in DNA are replaced by U’s in mRNA.
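The two ways of reading off the mRNA, from the coding strand or from the template strand, can be sketched as below. The function names are assumptions for this example; the template strand is passed in 3’ to 5’ order, exactly as written on the slide.

```python
# DNA-to-RNA pairing on the template strand: A=U, T=A, C=G, G=C.
DNA_TO_RNA_PAIR = str.maketrans("ATCG", "UAGC")


def mrna_from_coding(coding):
    """Coding strand (5'->3') to mRNA (5'->3'): replace every T with U."""
    return coding.replace("T", "U")


def mrna_from_template(template_3to5):
    """Template strand written 3'->5' to mRNA (5'->3'): pair base by base."""
    return template_3to5.translate(DNA_TO_RNA_PAIR)


coding = "ACGTAGACGTATAGAGCCTAG"
template = "TGCATCTGCATATCTCGGATC"
print(mrna_from_coding(coding))
print(mrna_from_template(template))
```

Both calls give the same mRNA, matching the slide's statement that the coding strand and the mRNA differ only in T versus U.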
Translation • The process of making proteins from mRNA • A gene uniquely encodes a protein • There are four bases in DNA (A, C, G, T), and four in RNA (A, C, G, U), but 20 amino acids in protein • How many nucleotides are required to encode an amino acid in order to ensure correct translation? • 4^1 = 4 (too few for 20 amino acids) • 4^2 = 16 (still too few) • 4^3 = 64 (enough) • The actual genetic code used by the cell is a triplet code • Each triplet is called a codon
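The counting argument for the codon length can be checked directly: find the smallest k such that 4^k is at least 20. The function name `min_codon_length` is an assumption for this example.

```python
def min_codon_length(n_bases=4, n_amino_acids=20):
    """Smallest k with n_bases ** k >= n_amino_acids."""
    k = 1
    while n_bases ** k < n_amino_acids:
        k += 1
    return k


print(min_codon_length())  # 3, since 4**2 = 16 < 20 <= 64 = 4**3
```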
The Genetic Code • [Figure: codon table, with “Third letter” as one of its axis labels]
Translation • The sequence of codons is translated to a sequence of amino acids • Gene: -GCT TGT TTA CGA ATT- • mRNA: -GCU UGU UUA CGA AUU- • Peptide: -Ala-Cys-Leu-Arg-Ile- • Start codon: AUG • Also codes for Met • Stop codons: UGA, UAA, UAG
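Codon-by-codon translation can be sketched with a lookup table. The table below is the standard genetic code, built from a compact 64-letter string in U, C, A, G order; the names `CODON_TABLE` and `translate` are assumptions for this example, and the function reads from the first letter in frame and stops at the first stop codon, which is a simplification.

```python
from itertools import product

BASES = "UCAG"
# Amino acids for the 64 codons in UUU, UUC, UUA, UUG, UCU, ... order.
# "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate mRNA (5'->3') into one-letter amino acids, codon by codon."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":  # stop codon: end of the peptide
            break
        peptide.append(aa)
    return "".join(peptide)


print(translate("GCUUGUUUACGAAUU"))  # ACLRI = Ala-Cys-Leu-Arg-Ile
```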
Translation • Transfer RNA (tRNA) – a different type of RNA • Floats freely in the cell • Every amino acid has its own type of tRNA that binds to it alone • Anti-codon – codon binding is crucial • [Figure: tRNA-Pro and tRNA-Leu pairing their anti-codons with codons on the mRNA, with the nascent peptide attached]
Transcriptional regulation • Will talk more in later lectures • RNA polymerase binds to a certain location on the promoter to initiate transcription • Transcription factors bind to specific sequences on the promoter to regulate transcription • Recruit RNA polymerase: induce • Block RNA polymerase: repress • Multiple transcription factors may coordinate • [Figure: gene and its promoter, showing a transcription factor, RNA polymerase, and the transcription starting site]
Splicing • Pre-mRNA needs to be “edited” to form mature mRNA • Will talk more in later lectures • [Figure: a gene with its promoter and transcription starting site is transcribed into pre-mRNA containing a 5’ UTR, exons separated by introns, and a 3’ UTR; splicing removes the introns to give mature mRNA, whose open reading frame (ORF) runs from the start codon to the stop codon]
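Splicing and the open reading frame can both be sketched as string operations. This is a toy model under stated assumptions: the names `splice` and `find_orf` and the exon coordinates are hypothetical, exon positions are taken as given, and the ORF is assumed to start at the first AUG, which ignores how real cells choose a start site.

```python
STOP_CODONS = {"UGA", "UAA", "UAG"}


def splice(pre_mrna, exons):
    """Join the exon slices of pre_mrna; exons is a list of (start, end)."""
    return "".join(pre_mrna[start:end] for start, end in exons)


def find_orf(mrna):
    """Return the ORF from the first AUG through the first in-frame stop."""
    start = mrna.find("AUG")
    if start == -1:
        return None
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return mrna[start:i + 3]
    return None  # no in-frame stop codon found


# Made-up pre-mRNA: two exons around a six-letter intron "GUCCAG".
pre_mrna = "GGAUGGCU" + "GUCCAG" + "UGUUAACC"
mature = splice(pre_mrna, [(0, 8), (14, 22)])
print(mature)            # GGAUGGCUUGUUAACC
print(find_orf(mature))  # AUGGCUUGUUAA
```

In the mature mRNA, the letters before the ORF play the role of the 5’ UTR and the letters after it the 3’ UTR.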
Summary • DNA: a string made from {A, C, G, T} • Forms the basis of genes • Has 5’ and 3’ ends • Normally forms a double strand with its reverse complement • RNA: a string made from {A, C, G, U} • mRNA: messenger RNA • tRNA: transfer RNA • Other types of RNA: rRNA, miRNA, etc. • Has 5’ and 3’ ends • Normally single-stranded, but can form secondary structure • Protein: made from 20 kinds of amino acids • Actual worker in the cell • Has N-terminal and C-terminal • Sequence uniquely determined by its gene via the use of codons • Sequence determines structure, structure determines function • Central dogma: DNA is transcribed to RNA, RNA is translated to protein • Both steps are regulated