Last updated 11/03/10 1:00 AM. http://www.helicosbio.com/Technology/TrueSingleMoleculeSequencing/tabid/64/Default.aspx Like Illumina, but the immobilized templates are single-stranded (SS) DNA molecules (~200 nt). Each cycle adds one base, records the image, then cleaves the fluorescent group and washes it away. Several billion single-molecule “spots” per slide.
Helicos paired-end sequencing (figure: steps 1 to 7)
Helicos virtual terminator: inhibits DNA Pol once incorporated (so 1 base at a time). Cleavable via the S-S bond (reduce it). Free 3’ OH is never blocked. Figure labels: dU-3’P,5’P; dUTP.
Quantification of the yeast transcriptome by single-molecule sequencing. Lipson et al., Nature Biotechnology 27: 652, 2009
Make cDNA via oligo dT.
Tail 3’ end with A via terminal transferase, adding dT to terminate.
Hybridize to surface-linked oligo dTs.
Add Cy5-labeled special nucleotide tri-Ps + DNA Pol. Wash. Record image.
Cleave dye from incorporated nt. Wash.
Add next Cy5-labeled special nucleotide tri-Ps + DNA Pol. Wash. Record image.
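The add / wash / image / cleave cycle can be sketched as a toy simulation. This is a minimal sketch under stated assumptions: one nucleotide type is offered per cycle in a fixed repeating order (the order C, T, A, G is an illustrative assumption, not taken from the slides), and the virtual terminator limits each template to at most one incorporation per cycle. The function name and parameters are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def sequence_by_synthesis(template, n_cycles, order="CTAG"):
    """Toy model of one immobilized template molecule.

    template: bases in the order the polymerase copies them.
    order:    assumed order in which labeled nucleotides are offered.
    Returns the bases of the growing strand, in order of incorporation.
    """
    read = []
    pos = 0  # next template position to be copied
    for cycle in range(n_cycles):
        if pos >= len(template):
            break
        offered = order[cycle % len(order)]
        # Incorporation happens only if the offered base pairs with the template.
        if offered == COMPLEMENT[template[pos]]:
            read.append(offered)   # "record image": a spot lights up this cycle
            pos += 1               # terminator: one base only, even in a homopolymer
        # "cleave dye, wash" resets the molecule for the next cycle
    return "".join(read)

print(sequence_by_synthesis("GGA", 6))
```

Because only one base is added per cycle, a run of identical bases in the template takes one full round of the nucleotide order per base in this model.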
smsDGE = digital gene expression via Helicos sequencing and counting; MA = microarray data
QPCR (quantitative PCR); Q-RT-PCR (quantitative reverse transcription-PCR)
Distribution of yeast transcripts. Figure axes: estimated mRNA copies/cell (0.5, 5, 50, 500); TSS position relative to ATG. TSS = transcription start site; t.p.m. = transcripts per million
ZMW = zero-mode waveguide. 10 zl volume seen (1 zeptoliter = 10^-21 L). Add template and special phospho-linked nucleotides. One DNA Pol molecule per ZMW.
Phospho-linked fluorescently labeled nucleoside triphosphates: the label is cleaved when the nucleotide is incorporated. (Figure panels compare these with other technologies.)
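A quick calculation shows why so small an observation volume matters. This is a sketch only: the 1 µM nucleotide concentration is an assumed example value, not a figure from the slides, and the function name is hypothetical.

```python
AVOGADRO = 6.02214076e23   # molecules per mole
ZEPTOLITER = 1e-21         # liters

def mean_molecules_in_volume(conc_molar, volume_liters):
    """Average number of molecules found in a volume at a given molar concentration."""
    return conc_molar * volume_liters * AVOGADRO

zmw_volume = 10 * ZEPTOLITER      # the 10 zl observation volume from the slide
assumed_conc = 1e-6               # 1 micromolar: ASSUMED example concentration

n = mean_molecules_in_volume(assumed_conc, zmw_volume)
print(f"mean labeled nucleotides in view: {n:.4f}")   # about 0.006
```

At that assumed concentration a free labeled nucleotide is in view well under 1% of the time, so a nucleotide held by the polymerase stands out against the background.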
Emission Excitation
Use a circular template to get redundant reads and so more accuracy.
Pacific Biosciences
3000 ZMWs, but density expected to climb. Each ZMW capable of 400,000 bases per day.
6 days x 3000 x 400,000 = 7.2 x 10^9 bases (at 1X coverage)
Predicted: by 2014, will sequence a human genome in 15 min, for low hundreds of $.
Exact number of ZMWs per chip = “thousands,” perhaps 3000 as of 2010
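The throughput arithmetic on this slide can be checked, and rerun for other chip densities, with a few lines. The function names are hypothetical; the numbers are the slide's.

```python
def bases_sequenced(days, n_zmws, bases_per_zmw_per_day):
    """Total bases read by a chip running for the given number of days."""
    return days * n_zmws * bases_per_zmw_per_day

def days_needed(target_bases, n_zmws, bases_per_zmw_per_day):
    """Days required to read target_bases once (1X coverage)."""
    return target_bases / (n_zmws * bases_per_zmw_per_day)

total = bases_sequenced(days=6, n_zmws=3000, bases_per_zmw_per_day=400_000)
print(total)                                        # 7200000000, i.e. 7.2 x 10^9
print(days_needed(7_200_000_000, 3000, 400_000))    # 6.0
```

Run time scales inversely with the number of ZMWs, which is why the expected climb in density dominates the prediction.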
Applications of “deep” sequencing. Also: definition and discovery of cis-acting regulatory motifs in DNA and RNA
Detection of methylated C (~all in CpG dinucleotides)
DS DNA: ----CmpG---> / <---GpCm--- (Cm = methylated cytosine)
Na bisulfite + heat: methylated ----CmpG---> stays ----CmpG--->; unmethylated ----CpG---> becomes ----UpG---> (cytosine converted to uracil)
PCR: ----UpG---> is copied as ----TpG---> / <---ApC---; ----CmpG---> is copied as ----CpG---> / <---GpC---
All NON-methylated Cs changed to T
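The bisulfite logic can be sketched on one strand as follows. This is an illustrative model only: methylated positions are supplied as 0-based indices, conversion is assumed to be complete, and all function names are hypothetical.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Na bisulfite + heat: unmethylated C -> U; methylated C is protected."""
    return "".join(
        "U" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )

def pcr_readout(converted):
    """After PCR, U is copied as T; a protected methyl-C is copied as plain C."""
    return converted.replace("U", "T")

def call_methylated(reference, read):
    """Positions where the reference has C and the read still shows C."""
    return [i for i, (r, s) in enumerate(zip(reference, read)) if r == "C" and s == "C"]

reference = "ACGTCGCA"
treated = bisulfite_convert(reference, methylated_positions={1})
read = pcr_readout(treated)
print(treated, read, call_methylated(reference, read))
```

Comparing the sequenced read with the untreated reference is what identifies the methylated sites: any C that survives was protected.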
Definition and discovery of cis-acting regulatory motifs in DNA and RNA
Definition of sequences (6-mers) that affect pre-mRNA splicing (Ke and Chasin, unpublished). Order an equal mixture of all 4 bases at these positions
Rank / 6-mer / score (~ -1 to +1)
Best exonic splicing enhancers:
1 AGAAGA 1.0339
2 GAAGAT 0.9918
3 GACGTC 0.9836
4 GAAGAC 0.9642
5 TCGTCG 0.9517
6 TGAAGA 0.9434
7 CAAGAA 0.9219
8 CGTCGA 0.8853
::
Worst exonic splicing enhancers, = best exonic splicing silencers:
4086 TAGATA -0.8609
4087 AGGTAG -0.8713
4088 CGTCGC 0.8850
4089 CTTAAA -0.8786
4090 CCTTTA -0.8812
4091 GCAAGA 0.8911
4092 TAGTTA -0.8933
4093 TCGCCG 0.9113
4094 CCAGCA -0.8942
4093 CTAGTA -0.9251
4094 TAGTAG -0.9383
4095 TAGGTA -0.9965
4096 CTTTTA -1.0610
Constitutive exons; alternative exons; pseudo exons; composite exon (from ~100,000)
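The ranking runs to 4096 because there are 4^6 = 4096 possible 6-mers. A minimal sketch that enumerates them and lists the overlapping 6-mers in a sequence (function names are hypothetical):

```python
from itertools import product

def all_kmers(k=6, alphabet="ACGT"):
    """Every possible k-mer over the alphabet, in lexicographic order."""
    return ["".join(p) for p in product(alphabet, repeat=k)]

def kmers_in(seq, k=6):
    """Overlapping k-mers found by sliding a window along seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

hexamers = all_kmers()
print(len(hexamers))            # 4096
print(kmers_in("AGAAGAT"))      # ['AGAAGA', 'GAAGAT']
```

Each 6-mer from the window could then be looked up in a score table such as the one above.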
Sequence of 36 / Quality code
CGCACTGTGCTGGAGCTCCCGGGGTTAACTCTAGAA  abU^Vaa`a\aaa]aWaTNZ`aa`Q][TE[UaP_U]
TACACTGTGCTGGAGCTCCCAACGGCAACTCTAGAA  a`P^Wa`[`Wa^`X_X_XWVa^NSP]_]S^X_T\X^
CGCACTGTGCTGGAGCTCCCATGGAGAACTCTAGAA  aTa`^b``baaaa^aab^YaTQLOHIa`^a``TX]]
TACACTGTGCTGGAGCTCCCCTCCCAAACTCTAGAA  I_`aaaa`aaaaaaa_a_^[KZIGIGZ`U`\^P^^`
CGCACTGTGCTGGAGCTCCCAATAGTAACTTTAGAA  aY_\abb[T\abaaa`a`bZ[HXXIZa_`_LGMS[`
TATACTGTGCTGGAGCTCCCGACGTAAACTCTAGAA  aba]^aa_a]`aa]_]`XWSMFGGIPX[P]X`V_Y^
TACACTGTGCTGGAGCTCCCTGGTAAAACTCTAGAA  a_^a^aa`aYaaa_aY`Y_^[I]VY\`]V]R\W]VV
TACACTGTGCTGGAGCTCCCAATAAAAACTCTAGAA  XZababa`aZaaaaaYaYXX`baa``\\TaUa\aW`
Read layout: 2 nt barcode (TA or CG); constant regions (peculiar to our expt.); variable region
Experiment: 1 1 1 2 2 1+2 2 2 1 2
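A read of this kind can be split into its parts with a short parser. This is a sketch under assumptions: the layout (2 nt barcode + 18 nt constant + 6 nt variable + 10 nt constant = 36 nt) and the two constant sequences are inferred from the example reads above, and the mismatch tolerance is an illustrative choice. All names are hypothetical.

```python
# Layout inferred from the example reads: 2 + 18 + 6 + 10 = 36 nt.
BARCODES = {"TA", "CG"}
CONST_5 = "CACTGTGCTGGAGCTCCC"   # constant region after the barcode (inferred)
CONST_3 = "AACTCTAGAA"           # constant region after the variable 6-mer (inferred)
VAR_LEN = 6

def mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def parse_read(read, max_mismatches=2):
    """Return (barcode, variable 6-mer), or None if the read does not fit the layout."""
    if len(read) != 2 + len(CONST_5) + VAR_LEN + len(CONST_3):
        return None
    barcode = read[:2]
    c5_end = 2 + len(CONST_5)
    variable = read[c5_end:c5_end + VAR_LEN]
    if barcode not in BARCODES:
        return None
    # Tolerate a few sequencing errors in the constant regions (assumed threshold).
    errors = mismatches(read[2:c5_end], CONST_5) + mismatches(read[c5_end + VAR_LEN:], CONST_3)
    if errors > max_mismatches:
        return None
    return barcode, variable

print(parse_read("CGCACTGTGCTGGAGCTCCCGGGGTTAACTCTAGAA"))
```

Counting the returned 6-mers separately for each barcode gives the per-experiment tallies.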
OUTLINE OF NEXT LECTURE TOPICS
Expression and manipulation of transgenes in the laboratory
• In vitro mutagenesis to isolate variants of your protein/gene with desirable properties
  • Single base mutations
  • Deletions
  • Overlap extension PCR
  • Cassette mutagenesis
• To study the protein: Express your transgene
  • Usually in E. coli, for speed, economy
  • Expression in eukaryotic hosts
  • Drive it with a promoter/enhancer
  • Purify it via a protein tag
  • Cleave it to get the pure protein
• Explore protein-protein interaction
  • Co-immunoprecipitation (co-IP) from extracts
  • 2-hybrid formation
  • Surface plasmon resonance
  • FRET (fluorescence resonance energy transfer)
  • Complementation readout
Site-directed mutagenesis by overlap extension PCR. The PCR fragment carries restriction sites RS1 and RS2 for subsequent cloning in a plasmid: cut with RE 1 and 2, then ligate into similarly cut vector.
Cassette mutagenesis = random mutagenesis but in a limited region: 1) by error-prone PCR
Original sequence coding for, e.g., a transcription enhancer region:
----------------------------------------------------------------------------------------------------------------------
PCR with high Taq polymerase and Mn+2 instead of Mg+2 gives a fragment with errors (*):
------*--------*--*-**---------------*-----------*--*-------*------------------------*-*-*------------*------------*--
Cut in primer sites and clone upstream of a reporter protein sequence. Pick colonies. Analyze phenotypes. Sequence.
Cassette mutagenesis = random mutagenesis but in a limited region: 2) by “doped” synthesis
Target = e.g., an enhancer element. Original enhancer sequence:
----------------------------------------------------------------------------------------------------------------------
Buy 2 doped oligos; anneal:
-*------------------------*-*-*------------*------------*--
------*--------*--*-**---------------*-----------*--*------
OK for up to ~80 nt. Clone upstream of a reporter.
Doping = e.g., at each position where the original base is G: 90% G, 3.3% A, 3.3% C, 3.3% T
Pick colonies. Analyze phenotypes. Sequence.
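The doping level sets how many mutations each clone carries. A minimal sketch, assuming each position is doped independently with a 90% chance of keeping the original base (so the number of mutations per oligo is binomial); the function names are hypothetical.

```python
from math import comb

def expected_mutations(length, p_original):
    """Mean number of mutated positions in a doped oligo."""
    return length * (1 - p_original)

def prob_k_mutations(length, p_original, k):
    """Binomial probability that exactly k positions differ from the original."""
    p_mut = 1 - p_original
    return comb(length, k) * p_mut**k * p_original**(length - k)

length, p_original = 80, 0.90
print(expected_mutations(length, p_original))      # about 8 mutations per oligo
print(prob_k_mutations(length, p_original, 0))     # unmutated oligos are rare
print(prob_k_mutations(length, p_original, 1))     # so are single mutants
```

With 90% doping over ~80 nt, most clones carry several mutations at once; a higher fraction of the original base would be needed to favor single mutants.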
E. coli as a host
• PROs: Easy, flexible, high tech, fast, cheap; but problems
• CONs
  • Folding (can misfold)
  • Sorting -> can form inclusion bodies
  • Purification -- endotoxins
  • Modification -- not done (glycosylation, phosphorylation, etc.)
• Modifications:
  • Glycoproteins
  • Acylation: acetylation, myristoylation
  • Methylation (arg, lys)
  • Phosphorylation (ser, thr, tyr)
  • Sulfation (tyr)
  • Prenylation (farnesyl, geranylgeranyl on cys)
  • Vitamin C-dependent modifications (hydroxylation of proline and lysine)
  • Vitamin K-dependent modifications (gamma carboxylation of glu)
  • Selenoproteins (seleno-cys tRNA at UGA stop)
Some alternative hosts
• Yeasts (Saccharomyces, Pichia)
• Insect cells with baculovirus vectors
• Mammalian cells in culture (later)
• Whole organisms (mice, goats, corn) (not discussed)
• In vitro (cell-free), for analysis only (good for radiolabeled proteins)
Yeast expression vector (example): Saccharomyces cerevisiae (baker’s yeast)
Map labels: GAPDprom, your favorite gene (Yfg), GAPDterm, LEU2, 2 mu seq, Ampr, oriE
2 mu seq = yeast ori (from the 2 micron plasmid); oriE = bacterial ori; Ampr = bacterial selection; LEU2, e.g. = Leu biosynthesis, for yeast selection
Complementation of an auxotrophy can be used instead of drug resistance.
Auxotrophy = state of a mutant in a biosynthetic pathway resulting in a requirement for a nutrient
GAPD = the enzyme glyceraldehyde-3-phosphate dehydrogenase
Yeast: genomic integration via homologous recombination. Vector DNA: Yfg between a promoter (p) and terminator (t), plus a functional HIS4 gene. Genomic DNA: defective HIS4 gene (HIS4 mutation).
Yeast (integration in Pichia pastoris). Vector DNA: AOX1p, Yfg, AOX1t, HIS4, 3’AOX1. Genomic DNA: AOX1 gene = alcohol oxidase gene (~30% of total protein). Double recombination puts the cassette (AOX1p, Yfg, AOX1t, HIS4, 3’AOX1) into the genomic DNA. P. pastoris: tight control; methanol induced (AOX1); large-scale production (gram quantities).
PROTEIN-PROTEIN INTERACTIONS
Yeast 2-hybrid system: to discover proteins that interact with each other, or to test for interaction based on a hypothesis for a specific protein.
X = bait; Y = prey. Y = e.g., a candidate protein being tested for possible interaction with X. Or: Y = e.g., a cDNA library used to discover a protein that interacts with X.
BD = (DNA) binding domain; AD = activation domain http://www.mblab.gla.ac.uk/~maria/Y2H/Y2H.html
No interaction between X and Y: no reporter expression. Yes, interaction between X and Y: reporter protein is expressed. Y = e.g., a cDNA library used to discover a protein that interacts with X. Recover the Y sequence from reporter+ colonies by PCR to identify protein Y.
Bait protein is the known target protein for which partners are sought; the fusion library = “prey”. Two different assays (one and/or the other) help, as there are often many false positives. BD = DNA binding domain; TA = transactivating domain http://www.mblab.gla.ac.uk/~maria/Y2H/Y2H.html
3-HYBRID: select for protein domains that bind a particular RNA sequence (figure labels: prey, bait). Prey could be proteins from a cDNA library.
Yeast one-hybrid: Insert a DNA sequence upstream of the selectable marker or reporter gene. Transform with candidate DNA-binding proteins (e.g., cDNA library) fused to an activator domain. Each T = one copy of a DNA target sequence.
Indirect selection using a yeast 3-hybrid system: a more efficient glycosynthase enzyme. Directed Evolution of a Glycosynthase via Chemical Complementation. Hening Lin, Haiyan Tao, and Virginia W. Cornish, J. Am. Chem. Soc. 2004, 126, 15051-15059. Turning a glycosidase into a glycosynthase. Glycosidase: Glucose-Glucose (e.g., maltose) + H2O -> 2 Glucose
Indirect selection using the yeast 3-hybrid system (one of the hybrid molecules here is a small molecule). Enzyme: e.g., from a mutated library of glycosynthase genes. Reporter: Leu2 gene. Transform a yeast leucine auxotroph. Provide synthetic chimeric substrate molecules. Select in leucine-free medium.
DHFR = dihydrofolate reductase; GR = glucocorticoid receptor (transcription factor); MTX = methotrexate (enzyme inhibitor of DHFR); DEX = dexamethasone, a glucocorticoid agonist, binds to GR; AD = activation domain; DBD = DNA binding domain
Selection of improved cellulases via the yeast 2-hybrid system. Library of cellulase mutant genes (one per yeast cell). Figure labels: cellobiose (disaccharide), cellulase, URA-3 (toxic). Survivors are enriched for cellulase genes that will cleave cellulose with greater efficiency (kcat / Km). Directed Evolution of Cellulases via Chemical Complementation. P. Peralta-Yahya, B. T. Carter, H. Lin, H. Tao, V. W. Cornish. JACS 2008, 130, 17446–17452
How does the URA-3 system work?
URA-3 = gene for orotidine phosphate (OMP) decarboxylase, in the pathway to pyrimidine nucleotides.
The analog 5-fluoroorotic acid (FOA) is converted to 5-fluoro-OMP; URA-3 (pyr-4) decarboxylation then gives 5-fluoro-UMP.
Consequences of 5-fluoro-UMP: thymidylate synthetase inhibition, effects on RNA, death.
Exogenous uridine enters the pathway via uridine kinase.
Ura3+ is FOA sensitive; ura3- is FOA resistant
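The counter-selection logic reduces to a small truth table. This is a simplified sketch: the rule that a ura3- cell needs an exogenous pyrimidine source to grow follows from the pathway but is not stated on the slide, and the function and parameter names are hypothetical.

```python
def survives(ura3_functional, foa_present, uridine_supplied):
    """Toy growth rule for a yeast cell under URA-3 selection or counter-selection."""
    # Ura3+ cells turn FOA into toxic 5-fluoro-UMP and die.
    if ura3_functional and foa_present:
        return False
    # ura3- cells cannot make their own UMP (assumption from the pathway):
    # they need exogenous uridine, taken up via uridine kinase.
    if not ura3_functional and not uridine_supplied:
        return False
    return True

for ura3 in (True, False):
    for foa in (True, False):
        print(f"URA3 functional={ura3}, FOA={foa}, +uridine ->", survives(ura3, foa, True))
```

In a selection like the cellulase one, cells that lower URA-3 expression are the ones that survive on FOA.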
Measuring protein-protein interactions in vitro. X = one protein; Y = another protein.
Pull-downs: binding between defined proteins, at least one being purified. Tag each protein differently. Examples:
His6-X + HA-Y: bind to nickel ion column, elute (his), Western with HA Ab
GST-X + HA-Y: bind to glutathione column, elute (glutathione), Western with HA Ab
His6-X + 35S-Y (made in vitro): bind to Ni column, elute (his), gel + autoradiography. No antibody needed.
(HA = influenza virus hemagglutinin; glutathione = gamma-glutamyl-cysteinyl-glycine.)
Example of a result of a pull-down experiment. Compare the pulled-down fraction (eluted) with the loaded material. Total protein: no antibody or Western (stained with Coomassie blue or silver stain). Specific protein: antibody used in Western. Also identify by MW (or mass spec).
Western blotting. To detect the antibody, use a secondary antibody against the primary antibody. The secondary antibody is a fusion protein with an enzyme activity (e.g., alkaline phosphatase). The enzyme activity is detected by its catalysis of a reaction producing a luminescent compound. http://www.bio.davidson.edu/courses/genomics/method/Westernblot.html
Detection of antibody binding in western blots. Layers: protein band on membrane; antibody to the protein on the membrane; secondary antibody-enzyme fusion (e.g., goat anti-rabbit IgG fused to alkaline phosphatase). Reaction: non-luminescent substrate-PO4 -> luminescent product + PO4. Detect by exposing to film.
Far western blotting to detect specific protein-protein interactions. Use a specific purified protein as a probe instead of the primary antibody. To detect the protein probe, use an antibody against it, then a secondary antibody that is a fusion protein with an enzyme activity. The enzyme activity is detected by its catalysis of a reaction producing a luminescent compound. http://www.bio.davidson.edu/courses/genomics/method/Westernblot.html
Expression via in vitro transcription followed by in vitro translation
Vector: T7 RNA polymerase binding site (17-21 nt) upstream of the cDNA (….ACCATGG…..)
1. Transcription to mRNA via the T7 promoter + T7 polymerase
2. Add to translation system: rabbit reticulocyte lysate or wheat germ lysate
Or: E. coli lysate (combined transcription + translation)
All commercially available as kits
Add ATP, GTP, tRNAs, amino acids, label (35S-met)
May need to add RNase (Ca++-dependent) to remove endogenous mRNA in lysate
Product: radioactively labeled protein
NOTE: Protein is NOT at all pure (100s of lysate proteins present), just “radio-pure”
• Co-immunoprecipitation
• Most times not true precipitation, which requires about equivalent concentrations of antigen and antibody
• Use protein A immobilized on beads (e.g., agarose beads)
• Protein A is from Staphylococcus aureus: binds tightly to immunoglobulin G (IgG) from many species.
Does X interact with Y in the cell or in vitro?
Incubate X + Y (or cell extract) + anti-X IgG. Add Protein A beads. Wash by centrifugation (or magnet). Elute with SDS. Detect X, Y in eluate by Western blotting.