
Canadian Bioinformatics Workshops

Canadian Bioinformatics Workshops (www.bioinformatics.ca). Module 8: Gene Expression Profiling. Paul Boutros, Bioinformatics for Cancer Genomics, May 26-30, 2014. Course Overview: 08:30 – 10:45 Expression Profiling in Cancer Genomics.



Presentation Transcript


  1. Canadian Bioinformatics Workshops www.bioinformatics.ca

  2. Module #: Title of Module 2

  3. Module 8: Gene Expression Profiling. Paul Boutros, Bioinformatics for Cancer Genomics, May 26-30, 2014

  4. Course Overview: 08:30 – 10:45 Expression Profiling in Cancer Genomics; Microarray Pre-Processing Basics. 11:15 – 12:30 Guided Analysis of a Microarray Study.

  5. Learning Objectives of Module • Understand the types of microarrays that exist • Identify the sources of noise in a microarray experiment • Appreciate the complete microarray analysis pipeline • Learn to input raw microarray data into R/BioConductor • Learn to pre-process raw microarray data • Perform standard statistical analyses on microarray data

  6. Let’s start off with a question What do expression microarrays actually measure?

  7. Session Overview: What are microarrays? What are microarrays used for? (Molecular Aspects; Biological Aspects; Downstream Analyses) How is microarray data analyzed? (Workflow overview)

  8. What is a Microarray? “A DNA microarray is a multiplex technology consisting of thousands of oligonucleotide spots, each containing picomoles of a specific DNA sequence.” Used to quantitate mRNA or DNA. Many applications: mRNA or DNA levels; SNP identification; ChIP-on-Chip.

  9. Hypotheses: Microarrays are usually hypothesis-generating: they highlight specific genes or features that are particularly interesting for follow-up experiments. There are many interesting exceptions: biomarkers; pathway analyses. This does not reduce the importance of experimental design: the low statistical power of array studies makes good design even more important and very challenging.

  10. Input Samples The nature of the sample is critical: * Unfrozen vs. Frozen vs. FFPE * Total RNA vs. poly-A RNA vs. other subsets

  11. Microarray Basics: Imagine a one-spot microarray… Target DNA… is labeled… and hybridized… and washed. Finally, scan the chip. (Figure labels: Target, Chip, Feature, Probe)

  12. These Are Spotted Arrays: Robotically printed onto a series of glass slides using a robot with needle-heads. They produce a characteristic gridding pattern and almost always use two samples simultaneously (two-colour).

  13. Other Types of Arrays: Inkjet arrays; photolithographically generated arrays; bead arrays; protein/cell/lipid arrays (more “niche” applications, not discussed here).

  14. InkJet Arrays In 1999, HP spun off its life-science and measurement division into Agilent Technologies. The new company wanted to determine if printer technology could be harnessed to generate microarrays.

  15. Inkjet Array Manufacture Involves Sequential Nucleotide Addition

  16. Photolithographic Arrays: Produced using the techniques developed for manufacturing transistors. Mostly pioneered by the company Affymetrix, although other suppliers exist (e.g. Nimblegen). We will be working with Affymetrix data later, so we will walk through the platform in significant detail.

  17. The Glass Matrix: Silanization; addition of linker molecule

  18. Photolithographic Synthesis Photolithographic mask

  19. Deprotection

  20. Nucleotide Addition

  21. Nucleotide Addition

  22. Nucleotide Addition

  23. Capping Agents

  24. Final Chip (figure labels: Wafer, Chip, Feature)

  25. RNA Wash

  26. RNA Wash

  27. An Affymetrix Microarray

  28. Self-Assembling Bead-Arrays: Produced by Illumina. 3 μm silica beads, randomly placed, each coated with ~10^5 identical 25 bp probes; probes have identifying barcode (address) sequences. (Figure labels: labeled cDNA, bead, address, probe)

  29. Comparing Array Platforms
      Platform      Price  Oligos    Data Quality  Bioinformatics Research
      Spotted cDNA  $      variable  +             +++
      Affymetrix    $$$    25 bp     +++           +++
      Inkjet        $$     ~70 bp    ++            ++
      Bead Arrays   $$     ~25 bp    ++            +
      I do not endorse specific platforms; they all have their strengths and weaknesses.

  30. Session Overview: What are microarrays? What are microarrays used for? (Molecular Aspects; Biological Aspects; Downstream Analyses) How is microarray data analyzed? (Workflow overview)

  31. What Are Microarrays Used For? Molecular. RNA: mRNA abundances; splicing (quantitate different isoforms); mRNA degradation rates (half-life); mRNA translation rates; RNA capture (RIP). DNA: DNA sequence (SNPs); DNA copy-number; DNA capture (exome, ChIP); tag quantitation (genetic screening). Other: protein arrays; cell-based arrays; lipid arrays.

  32. What Are Microarrays Used For? Biological. RNA: mRNA abundances; splicing (quantitate different isoforms); mRNA degradation rates (half-life); mRNA translation rates; RNA capture (RIP). * Candidate Gene Identification * Pathway Analysis * Model Characterization * Classifiers/Predictive Models * Drug-Analysis (Dose/Time/Class) * Integration Analysis

  33. Session Overview: What are microarrays? What are microarrays used for? (Molecular Aspects; Biological Aspects; Downstream Analyses, covered in upcoming sessions: pathways & clinical integration) How is microarray data analyzed? (Workflow overview)

  34. Workflow diagram (each spot is a probe). A) Remove Noise; B) Extract Data. Labels: Quantitation (Spot, Cy3, Cy5), Background, Spot Quality, Intra-Array, Inter-Array, Significance Testing, Clustering, Integration, Spot List.

  35. Step #1: Image Quantitation. Why? Quantitative vs. qualitative. How? Image segmentation. Difficulty? +++ Research? +

  36. Image Segmentation 101: Find Grids. 1. Find Grids 2. Find Spots 3. Spot Outline

  37. Image Segmentation 101: Find Spots. Key Step: Integrate Signal Across Array
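To make the "integrate signal" step concrete, here is a minimal sketch in Python. It assumes the spot centre and radius have already been found, and that the spot is a simple circle. The function name `quantify_spot` and the choice of the median of nearby pixels as a local background are illustrative assumptions, not the method of any particular scanner software.

```python
from statistics import median


def quantify_spot(image, center_row, center_col, radius):
    """Integrate pixel intensities inside a circular spot.

    image is a list of rows of pixel intensities. Returns a tuple
    (foreground_sum, n_foreground_pixels, local_background), where the
    local background is the median of the pixels outside the circle but
    inside a square window around the spot (an assumed, simple choice).
    """
    window = 2 * radius  # half-width of the local window (assumption)
    fg_pixels, bg_pixels = [], []
    row_lo = max(0, center_row - window)
    row_hi = min(len(image), center_row + window + 1)
    col_lo = max(0, center_col - window)
    col_hi = min(len(image[0]), center_col + window + 1)
    for r in range(row_lo, row_hi):
        for c in range(col_lo, col_hi):
            inside = (r - center_row) ** 2 + (c - center_col) ** 2 <= radius ** 2
            (fg_pixels if inside else bg_pixels).append(image[r][c])
    return sum(fg_pixels), len(fg_pixels), median(bg_pixels)


# Toy 9x9 image: uniform background of 10 with one bright circular spot.
image = [[10] * 9 for _ in range(9)]
for r in range(9):
    for c in range(9):
        if (r - 4) ** 2 + (c - 4) ** 2 <= 4:
            image[r][c] = 100

total, n_pixels, local_bg = quantify_spot(image, 4, 4, 2)
print(total, n_pixels, local_bg)
```

Real images bring the problems listed on the next slide (stray signal, missing spots, deformed spots), which is why a fixed circle like this is only a starting point.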

  38. Image Segmentation 101: Challenges. Problems: stray signal; missing spots; gross deformities; manual validation.

  39. Research? Surprisingly, not much investigation. This is probably a source of error in all studies. Manual checking of spot-detection remains the norm. Problematic as studies & arrays get larger.

  40. Workflow diagram. Labels: Quantitation (Spot, Cy3, Cy5), Background, Spot Quality, Intra-Array, Inter-Array, Significance Testing, Clustering, Integration, Spot List.

  41. Step #2: Background Correction. Why? Remove stray signal. How? Model-based. Difficulty? ++++ Research? ++

  42. Spot Segmentation (figure labels: Signal, ???, Background)

  43. So what do we get? Foreground Intensity: FG. Background Intensity: BG. Isn’t it simple? Signal = FG - BG. NO! If BG > FG, then the signal is negative. This happens for 0.1-2% of spots.
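The problem with naive subtraction is easy to see in a few lines of Python. This is a minimal sketch with made-up intensities; the function name `subtract_background` is hypothetical.

```python
def subtract_background(foreground, background):
    """Naive background correction: Signal = FG - BG for each spot.

    Returns (signals, negative_indices), where negative_indices lists
    the spots whose background exceeds their foreground.
    """
    if len(foreground) != len(background):
        raise ValueError("foreground and background must have the same length")
    signals = [fg - bg for fg, bg in zip(foreground, background)]
    negative_indices = [i for i, s in enumerate(signals) if s < 0]
    return signals, negative_indices


# Made-up intensities for four spots; the third is a dim spot with BG > FG.
fg = [1200.0, 340.0, 95.0, 5100.0]
bg = [100.0, 120.0, 110.0, 130.0]

signals, negative = subtract_background(fg, bg)
print(signals)   # the third value is negative
print(negative)  # indices of spots with negative signal
```

A negative signal cannot be log-transformed, so these spots are either discarded or handled by a model-based correction.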

  44. Why Might This Happen? In 2001, two papers showed that empty spots have less signal than background. Unbound spots correspond to low-expression genes. Thus unbound spots are particularly prone to problems. (Figure labels: Foreground Intensity: FG; Background Intensity: BG)

  45. So What to Do? Heavy-duty mathematical tools employed. Three major models developed: Edwards (log-linear); Smyth (normexp); Kooperberg (Bayesian). The math is extremely advanced, so we’ll skip that for now. Let’s summarize the methods instead.

  46. Comparison
      Method      Accuracy  Speed
      Edwards     Good      Fast
      NormExp     Better    Slow
      Kooperberg  Best      Very Slow
      No strong criteria for selecting between these algorithms.
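For readers who want a taste of the model-based idea, here is a rough Python sketch of the core normexp calculation. It assumes the usual formulation of the model: observed intensity = true signal (exponential with mean `alpha`) + background noise (normal with mean `mu` and standard deviation `sigma`). The corrected value is the expected signal given the observation, which is the mean of a truncated normal distribution. The parameter values below are invented for illustration; in practice they are estimated from the data, and that estimation is the slow, difficult part this sketch leaves out.

```python
import math


def normexp_signal(x, mu, sigma, alpha):
    """Expected true signal E[S | X = x] under a normal+exponential model.

    x     : observed (foreground) intensity
    mu    : mean of the normal background noise      (assumed known here)
    sigma : std. dev. of the normal background noise (assumed known here)
    alpha : mean of the exponential signal           (assumed known here)
    """
    m = x - mu - sigma ** 2 / alpha  # mean of the untruncated posterior
    z = m / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * math.erfc(-z / math.sqrt(2.0))  # normal CDF, stable in the lower tail
    return m + sigma * pdf / cdf


# Invented parameters: background around 100 +/- 20, typical signal around 200.
dim_spot = normexp_signal(50.0, mu=100.0, sigma=20.0, alpha=200.0)
bright_spot = normexp_signal(10000.0, mu=100.0, sigma=20.0, alpha=200.0)
print(dim_spot, bright_spot)
```

The dim spot sits below the background mean, yet its corrected signal stays positive, which is exactly what simple subtraction cannot guarantee. For bright spots the result is close to plain subtraction. Observations extremely far below the background would need extra numerical care that this sketch omits.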

  47. Workflow diagram. Labels: Quantitation (Spot, Cy3, Cy5), Background, Spot Quality, Intra-Array, Inter-Array, Significance Testing, Clustering, Integration, Spot List.

  48. Step #3: Spot Quality. Why? Identify artefacts. How? Unknown. Difficulty? +++++ Research? +
