
Sequence Analysis & Gene Expression



Presentation Transcript


  1. Sequence Analysis & Gene Expression
     Organism selection: genome size – why – what is the benefit – politics
     Decisions: mapping first, “shotgun sequencing”, BAC alignment/sequencing [BAC – bacterial artificial chromosome; also YAC (yeast)]
     Genome sequence: raw sequence – confirmed sequence; gene models – verification
     Verification: is the gene model transcribed? Yes/no/perhaps. “Ubiquitous” gene, family specific, homolog – ortholog – paralog
     Transcript profiles: when – how much [abundant] – where; transcript “variants” – inducible by condition X?
     MUPGRET workshop, Columbia, MO, June 2005 (HJ Bohnert, UIUC) bohnerth@life.uiuc.edu

  2. Genomics … not just genes
     Databases, Integration & Intuition
     [Diagram labels: genome & transcriptome sequences; protein interaction maps; markers & QTLs; biochemical genetics; expression profiles; knock-out, sRNA & RNAi; dynamic metabolite catalogs; protein localization; structure analysis]
     How (much) will ‘encyclopedic’ approaches lead to better understanding?
     Information mining, hypotheses, experiment - insight, application, virtual life

  3. Arabidopsis – model plant
     Small, fast, prolific; mutants, lines, ecotypes; genome sequence
     Field on a dish!
     [Figure: Columbia grown in Soy-FACE; panel labels: O3, CO2, control]

  4. [Figure: map of aquaporin (AQP) genes on the Arabidopsis thaliana chromosomes, scale in Mb; the legend marks duplicated regions that include AQPs. Labels in extraction order: PIP1;3 TIP3;2 TIP2;x pseudo TIP3;1 NIP6;1 NIP3;1 Ch-1 (15) PIP2;8 NIP3;1 pseudo TIP4;1 NIP2;1 pseudo NIP2;1 PIP2;6 PIP1;2 Ch-2 (4) rDNA TIP1;1 PIP2;2 PIP2;3 SIP1;1 NIP7;1 TIP2;1 TIP1;2 TIP5;1 PIP2;1 SIP2;1 Ch-3 (14) PIP1;1 PIP2;5 PIP1;4 TIP1;3 NIP5;1 TIP2;2 NIP1;1 NIP1;2 PIP1;5 PIP2;7 Ch-4 (3) SIP1;2 NIP4;1 NIP4;2 TIP2;3 PIP2;4 Ch-5 (12)]
     AQPs are distributed over all chromosomes - a few clusters, many duplications
     Arabidopsis thaliana, AGI, 2000

  5. The Plant Genome
     Ecosystem – population – species – ecotype (– breeding line)
     Organism – organ – tissue – cell – compartment
     Nucleus – envelope & pore – nucleoplasm, nucleolus & chromosomes
     Euchromatin & heterochromatin – gene islands – gene
     Promoters – 5’-regulatory (untranslated = UTR) – introns & exons – mature coding region – 3’-regulatory (UTR) regions
     Plants in silico? Sure! And then: Plant Design from Scratch

  6. The Plant Genome
     Controls for Gene Expression – many Switchboards
     • Chromatin condensation state
     • Local chromatin environment
     • Transcription initiation
     • Transcript elongation
     • mRNA splicing
     • mRNA export
     • mRNA place in the cell
     • RNA half-life
     • Killer microRNAs
     • Ribosome loading
     • Protein transport/targeting
     • Protein modifications
     • Protein turnover
     Levels of regulation that affect what we call “gene expression”

  7. The Plant Transcriptome
     Killer RNAs (there are micro-genes): microRNAs
     Result: no protein - i.e., gene is essentially “silenced”
     5 years ago, we did not know that such a control system existed!

  8. The Plant Transcriptome
     How to sample the transcriptome?
     • Morphological dissection (root, leaf, flower - epidermis, guard cell, etc.)
     • Cell sorting: make single cells, send through cell sorter (size, color, reporter gene)
     • Laser ablation: micromanipulation of laser to cut individual cells
     • Biochemical dissection (compartment isolation): chloroplasts, mitochondria, ribosomes, other membranes
     Painting cells with a reporter gene - here, GFP (Green Fluorescent Protein)

  9. The Plant Transcriptome
     Painting tissues, then isolating desired cells
     Enzymatic staining
     The endodermis of the root tip is highlighted in transgenic plants using pSCR::mGFP5.
     Emerging lateral roots
     [requires plant transformation]

  10. The Plant Transcriptome
      cDNA – complementary DNA: cDNA synthesis converts messenger RNA into double-stranded DNA
      • cDNA libraries: “neat”, normalized, subtracted
      • SAGE libraries
      “Normalization” depletes cDNAs from mRNAs for which there are many copies in a cell, thus enriching for “rare mRNAs” (not so much sequencing to do)
      Subtraction removes cDNAs which you already know (less sequencing)

  11. The Plant Transcriptome: cDNA Libraries
      Primary cDNA Library: Total RNA → Poly(A)+ RNA → 1st strand cDNA → ds-cDNA → size-selected double-stranded cDNA (>500 bp) → ligate to EcoRI adapters, digest with NotI → clone adaptored cDNA into (EcoRI/NotI)-digested pBSII/SK+
      Primary (neat) library may be used for “normalization”
      Library Normalization: primary cDNA library → make ss-DNA out of primary library (ss-DNA = DNA “tracer”); PCR inserts by T7 and T3 standard primers (DNA “driver”) → tracer/driver hybridization → column chromatography (double strands stick) → non-hybridized DNA from flow-through = normalized clones
      Cloning of root RNAs from segments S1 – S4 of the root tip (Sharp lab): sequenced ~18,000 clones, found ~8,000 unique and ~130 novel genes
      How many genes make a root?

  12. Serial Analysis of Gene Expression (SAGE)
      Velculescu et al. 1995
      http://www.sagenet.org/

  13. [Diagram: coding region (known or expected) flanked by forward primer and reverse primer; amplicon (sequence, or clone + sequence)]
      [Gel image, results: lanes 1-10 and M]

  14. Real-time PCR (quantitative)
      RNA (DNA-free) to cDNA; use product in dilutions for amplification
      Assumption: each cycle increases the amount by a factor of 2 (or 1.8)
      Check by using a known amount of cloned control cDNA
      Serial dilution: 1x - 1/5x - 1/25x - 1/125x
      [Plot axis: cycle number]
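The per-cycle factor assumed on this slide (2, or 1.8) can be checked against a dilution series. The sketch below is a minimal illustration, not the workshop's own procedure: it assumes threshold-cycle (Ct) values have been read for each dilution, fits Ct against log10(dilution), and converts the slope into a per-cycle amplification factor. The Ct values are hypothetical, generated for the two factors named on the slide, and the function names are made up for this example.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def amplification_factor(dilutions, ct_values):
    """Per-cycle amplification factor E from a dilution series.

    If the amount grows by a factor E per cycle, then
    Ct = intercept - log10(amount) / log10(E),
    so the slope of Ct versus log10(amount) is -1 / log10(E).
    """
    log_amounts = [math.log10(d) for d in dilutions]
    slope, _ = fit_line(log_amounts, ct_values)
    return 10 ** (-1.0 / slope)

# Dilution series from the slide: 1x, 1/5x, 1/25x, 1/125x
dilutions = [1, 1 / 5, 1 / 25, 1 / 125]

# Hypothetical Ct values (assumed, not measured): each 5-fold dilution
# delays the threshold by log(5)/log(E) cycles.
ct_doubling = [20.0 + i * math.log(5) / math.log(2.0) for i in range(4)]
ct_factor_18 = [20.0 + i * math.log(5) / math.log(1.8) for i in range(4)]

factor_a = amplification_factor(dilutions, ct_doubling)
factor_b = amplification_factor(dilutions, ct_factor_18)
print(f"estimated factor, series A: {factor_a:.3f}")
print(f"estimated factor, series B: {factor_b:.3f}")
```

With real data, a fitted factor well below 1.8 would suggest that the doubling assumption does not hold for that primer pair.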

  15. Melting curves [single products]
      Single genes have been amplified here
      Two amplicons are shown; each shows a single melting curve

  16. Melting curves [multiple products]
      More than one gene has been amplified here
      Homologous genes [identity – similarity – divergence]; orthologous – paralogous relationships

  17. The Plant Transcriptome
      Quantitative PCR in 384-well plates (96 primer pairs, 3 repeats each)
      Taking SAGE & cDNA sequences together:
      - corn roots “express” 20,000-23,000 genes (i.e., mRNA is made)
      - the entire corn genome is expected to include ~50,000 genes

  18. Substrates for High Throughput Arrays
      Nylon membrane: single label, 33P
      Glass slides: dual label, Cy3 and Cy5
      GeneChip: single label, biotin-streptavidin

  19. TeleChem ChipMaker2 Pins
      Pin pick-up volume: 100-250 nl
      Spot diameter: 75-200 µm
      Spot volume: 0.2-1.0 nl
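The pin figures above imply a rough range for how many spots one pin load could deliver. The following is only back-of-the-envelope arithmetic under a stated assumption: the whole pick-up volume ends up in spots, with no dead volume or evaporation, so the numbers are upper bounds rather than a specification of the pins. The function name is made up for this example.

```python
def spots_per_load(pickup_nl, spot_nl):
    """Upper bound on spots per pin load (assumes no dead volume or evaporation)."""
    return pickup_nl / spot_nl

# Extremes taken from the slide: pick-up 100-250 nl, spot volume 0.2-1.0 nl
fewest = spots_per_load(pickup_nl=100, spot_nl=1.0)  # smallest load, largest spots
most = spots_per_load(pickup_nl=250, spot_nl=0.2)    # largest load, smallest spots

print(f"spots per load: between about {fewest:.0f} and {most:.0f}")
```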

  20. Creating cDNA Arrays
      cDNA cloned into vector and transformed to create cDNA library
      Clones sequenced, unique set chosen and reracked
      Unique set of clones in 384-well microtiter plates
      PCR on Tecan workstation
      Slides printed on Cartesian arrayer
      Final product
      [Instrument label: Q-Pix]

  21. NSF Soybean Functional Genomics Steve Clough / Vodkin Lab Printing Arrays on 50 slides

  22. Slide Chemistry: Glass Coatings
      Amine: poly-L-lysine, silanated
      Aldehyde: silylated
      We use SuperAmine and SuperAldehyde from TeleChem (arrayit.com)
      NSF Soybean Functional Genomics, Steve Clough / Vodkin Lab

  23. NSF Soybean Functional Genomics Steve Clough / Vodkin Lab GSI Lumonics

  24. GenePix Image Analysis Software
      Placenta vs. Brain – 3800 Cattle Placenta Array
      [Image channels: Cy3, Cy5]

  25. Troubleshooting NSF Soybean Functional Genomics Steve Clough / Vodkin Lab The Good The Ugly The Bad

  26. Post-Print Processing
      Printed slide
      Rehydrate spots (water), then snap dry (hot)
      UV light: fix DNA to coating
      Chemically block background. Denature to single strands.
      Hybridize & scan

  27. Cells from condition A: mRNA → cDNA, label with Dye 1
      Cells from condition B: mRNA → cDNA, label with Dye 2
      Mix; per-gene result: equal, over, under
      Ratio of expression of genes from two sources
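The "ratio of expression of genes from two sources" is usually handled on a log2 scale, so that over- and under-expression are symmetric around zero. The sketch below is one minimal way to do this. It assumes background-corrected spot intensities are available for each dye, and it assumes that most genes do not change, so the median log ratio can be shifted to zero. The gene names, intensities, and the two-fold threshold are all hypothetical.

```python
import math
from statistics import median

def log2_ratios(dye1, dye2):
    """log2(dye2 / dye1) per gene, skipping spots with non-positive signal."""
    return {
        gene: math.log2(dye2[gene] / dye1[gene])
        for gene in dye1
        if gene in dye2 and dye1[gene] > 0 and dye2[gene] > 0
    }

def median_center(ratios):
    """Shift log ratios so their median is 0 (assumes most genes are unchanged)."""
    shift = median(ratios.values())
    return {gene: value - shift for gene, value in ratios.items()}

def call(log_ratio, threshold=1.0):
    """Label a gene as over, under, or equal; threshold 1.0 means two-fold."""
    if log_ratio >= threshold:
        return "over"
    if log_ratio <= -threshold:
        return "under"
    return "equal"

# Hypothetical intensities: condition A labeled with dye 1, condition B with dye 2
condition_a = {"gene1": 1000, "gene2": 1000, "gene3": 1000, "gene4": 500, "gene5": 2000}
condition_b = {"gene1": 4000, "gene2": 1000, "gene3": 250, "gene4": 500, "gene5": 2000}

centered = median_center(log2_ratios(condition_a, condition_b))
calls = {gene: call(value) for gene, value in centered.items()}
for gene in sorted(calls):
    print(gene, f"{centered[gene]:+.2f}", calls[gene])
```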

  28. ScanArray 3000 Fluorescent Scanner

  29. Overlay Images: Reverse Labeling
      Slide 1: Cy3 over-expressed
      Slide 2: Cy5 over-expressed
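Reverse labeling (a dye swap) lets a dye-specific bias be separated from a real expression difference. The sketch below shows the usual arithmetic under stated assumptions: each slide yields a log2(Cy5/Cy3) value per gene, and the two samples are exchanged between the dyes on the second slide, so a true difference flips sign while a dye bias does not. The gene names and values are hypothetical.

```python
def combine_dye_swap(m_forward, m_reverse):
    """Combine per-gene log2(Cy5/Cy3) values from a dye-swap pair.

    Returns {gene: (expression_difference, dye_bias)}:
    half the difference keeps what flips sign between slides,
    half the sum keeps what stays the same on both slides.
    """
    combined = {}
    for gene, forward in m_forward.items():
        if gene not in m_reverse:
            continue  # spot missing on one slide
        reverse = m_reverse[gene]
        combined[gene] = ((forward - reverse) / 2, (forward + reverse) / 2)
    return combined

# Hypothetical log2(Cy5/Cy3) values
slide_1 = {"gene1": 1.2, "gene2": 0.5}    # sample A in Cy3, sample B in Cy5
slide_2 = {"gene1": -0.8, "gene2": 0.5}   # labels reversed

result = combine_dye_swap(slide_1, slide_2)
for gene, (difference, bias) in sorted(result.items()):
    print(gene, f"difference={difference:+.2f}", f"dye bias={bias:+.2f}")
```

In this made-up example, gene2 looks induced on each slide taken alone, but the signal does not flip with the labels, which points to a dye effect rather than a difference between the samples.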

  30. Universal vs. Universal (control vs. control)
      Problem area at low intensity readings

  31. Lung vs Control

  32. Clustered display of data from time course of serum stimulation of primary human fibroblasts.
      Cluster labels: Cholesterol Biosynthesis; Cell Cycle; Immediate Early Response; Signaling and Angiogenesis; Wound Healing and Tissue Remodeling
      Eisen et al., Proc. Natl. Acad. Sci. USA 95 (1998), p. 14865

  33. Hierarchical Clustering: 14 Tissues 7653 Genes
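Hierarchical clustering of expression profiles can be sketched in a few lines. The version below assumes average linkage on a Pearson-correlation distance (1 - r), which is a common choice for expression data. The slides do not say which settings were used here, so treat it as an illustration only. The four profiles are hypothetical, and real analyses with thousands of genes would use an optimized library rather than this loop.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def average_linkage(profiles):
    """Agglomerative clustering; returns merges as (members_a, members_b, distance)."""
    def distance(g, h):
        return 1 - pearson(profiles[g], profiles[h])

    clusters = [(name,) for name in profiles]  # each cluster is a tuple of gene names
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [(g, h) for g in clusters[i] for h in clusters[j]]
                d = sum(distance(g, h) for g, h in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

# Hypothetical time-course profiles (log ratios at five time points)
profiles = {
    "early_a": [0.0, 2.0, 1.0, 0.0, 0.0],
    "early_b": [0.0, 1.8, 1.1, 0.1, 0.0],
    "late_a": [0.0, 0.0, 0.5, 1.5, 2.0],
    "late_b": [0.0, 0.1, 0.4, 1.6, 2.1],
}

merges = average_linkage(profiles)
for left, right, d in merges:
    print(f"merge {left} + {right} at distance {d:.3f}")
```

The order of merges and their distances are what a dendrogram draws: genes with similar profiles join early at small distances, and unrelated groups join last.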

  34. Differences in Technology: Affymetrix GeneChips
      • One sample, one chip
      • Single color scans
      • Labeling by incorporating biotin into cRNA, not Cy3 or Cy5 dyes
      • Oligonucleotides instead of full-length cDNAs
      • Higher density arrays: feature sizes down to 18 µm instead of ~100 µm
      • Non-contact creation of arrays

  35. Affy Technology Overview
      • Photolithography and combinatorial chemistry
      • Technology from microchip industry: “GeneChip”
      • Coat slides
      • “Mask” to apply light to only desired features, de-protects feature

  36. Technology Overview (cont.)
      • Apply required nucleotide base to array
      • Apply new mask to de-protect different features
      • Stack nucleotides on top of one another
      • Repeat with bases and masks until 25-mer oligonucleotides are built directly onto array
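The mask-and-base cycle described on these slides can be mimicked with a toy simulation. The sketch below is a simplification under stated assumptions and does not describe the vendor's actual process control. Bases are offered in a fixed A, C, G, T order, and each step de-protects exactly those features whose next required base matches the base on offer. The three probes are hypothetical and much shorter than 25-mers.

```python
def synthesize(targets, base_order="ACGT"):
    """Grow all probes in parallel, one mask-and-couple step at a time.

    Returns (grown_probes, steps), where each step is (base, mask) and the
    mask lists the feature indices that were de-protected for that base.
    """
    for target in targets:
        if set(target) - set(base_order):
            raise ValueError(f"unexpected base in {target!r}")

    grown = ["" for _ in targets]
    steps = []
    while any(g != t for g, t in zip(grown, targets)):
        for base in base_order:
            # Mask: features that are unfinished and need this base next
            mask = [
                i for i, (g, t) in enumerate(zip(grown, targets))
                if len(g) < len(t) and t[len(g)] == base
            ]
            if not mask:
                continue  # nothing needs this base now, so skip the step
            for i in mask:
                grown[i] += base
            steps.append((base, mask))
    return grown, steps

# Hypothetical short probes standing in for 25-mers
targets = ["ACG", "CAG", "GGT"]
grown, steps = synthesize(targets)

for base, mask in steps:
    print(f"couple {base} on features {mask}")
print("built:", grown)
```

Because four bases are cycled per added position at most, building probes of length n this way needs no more than 4n mask steps, which is why all features of an array can be synthesized in parallel.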

  37. Technology Final Steps
      • Silicon “wafers” of 90 arrays are cut
      • Glass substrate is then added to plastic cartridge for:
        • Safe handling
        • Easy storage
        • Easy hybridization
        • Easy scanning
      Easy, convenient
      Expensive (very much so)
      No confirmation of quality
      Erroneous data when low intensity
      Problems with SNPs*
      *not with 70-mer oligo glass slides

  38. Questions?
      Give me a call or send a message: 217-265-5475, bohnerth@life.uiuc.edu
      http://www.life.uiuc.edu/bohnert/
      Remember: YOU CAN ALWAYS FIND EVERYTHING ON GOOGLE! (though not these slides)
