An Introduction to Microarrays Ellen Wisman Michigan State University AFGC http://AFGC.Stanford.edu/
Uses of Microarrays • RNA Expression profiles • DNA profiling • Comparative genomic hybridizations • Transcription binding mapping • SNP mapping • Mini-sequencing
Proposal Facts
Cycles 1 through 3:
• Number of customer and collaborator slides: 236
• Number of slides publicly available in SMD: 178
Through Cycle 5:
• Total number of proposals: 115
• Total number of customers: 111
• Projected total number of slides: 408
• AFGC experiments: 40
Types of experiments
Organisms studied:
• Arabidopsis thaliana: 88 experiments (95%)
• Brassica: 2 experiments (2%)
• Thlaspi caerulescens: 2 experiments (2%)
• Alyssum lesbiacum: 1 experiment (1%)
Types of experiments:
• 45% Genotype comparisons/transgene/antisense
• 34% Treated plants vs. wild type
• 8% Studying the effects of pathogens
• 7% Treatment of mutant
• 4% Studying development
• 2% Tissue comparisons
V.S.
Principles of array production • In situ oligonucleotide synthesis (= directly on the slide) • Photolithography: light-directed synthesis using a mask, 25-mers (Affymetrix) • Ink-jet printing process, 60-mers (Rosetta) • Others
Principles of array production • Spotted arrays • cDNA clones • ESTs • GSTs gene specific primers • Oligonucleotides • Genomic clones • Genomic DNA
cDNA microarrays • Array: spotted cDNA clones • Labeled RNA: 1) Reference, 2) Experimental (RNA-Cy5 and RNA-Cy3) • Hybridize the labeled RNAs to the array • Expression ratio: > 2, higher in 1 than 2; 1, same between 1 and 2; < 0.5, lower in 1 than 2
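The ratio thresholds on this slide can be written as a small classifier. This is a minimal Python sketch and not part of the original presentation: the function name `classify_ratio`, the `fold` parameter, and the decision to report ratios between 0.5 and 2 as "no change called" are illustrative assumptions.

```python
def classify_ratio(ratio, fold=2.0):
    """Classify an expression ratio (sample 1 / sample 2) with a symmetric fold cutoff.

    Assumption: ratios between 1/fold and fold are reported as "no change called".
    """
    if ratio <= 0:
        raise ValueError("an expression ratio must be positive")
    if ratio > fold:
        return "higher in 1 than 2"
    if ratio < 1.0 / fold:
        return "lower in 1 than 2"
    return "no change called"


if __name__ == "__main__":
    for r in (3.0, 1.0, 0.25):
        print(r, classify_ratio(r))
```

A symmetric cutoff (2 and 1/2) treats induction and repression equally, which is why the lower threshold is 0.5 rather than some other value.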
SMD Data Analysis & Presentation • Acquisition (e.g. ScanAlyze®, GenePix®) • Input/Storage/Retrieval (Stanford Microarray Database) • Analysis/Pattern Recognition/Visualization (TreeView, Clustering, Self-Organizing Maps, K-means, “R”, GeneSpring®) • Interpretation/Annotation • Publication/Repository (TAIR, GenBank) M. Cherry, Stanford University
The AFGC array • Microarray Design • >100,000 Arabidopsis ESTs compared to each other (BLASTN), then to all Arabidopsis proteins (BLASTX) • 9,200 EST clones from all tissues • 2,000 EST clones from developing seeds • 3,000 GSTs (gene-specific tags) • Former array: >11,000 clones • 99% of the re-sequenced clones were correct Analysis performed by Rob Ewing, Stanford University
PCR Products • Classified in: • No band • Low Concentration • Double band • Smear
Microarray Controls (available from NASC) • Negative/spiking controls • Heterologous sequences spotted either to determine background and non-specific hybridization or to serve as external controls by spiking the corresponding RNA into the labeling reaction (human clones) • Transgenes • For quantification of reporter constructs (can also serve as negative controls) (BAR, BT, BASTA, Luciferase…) • Positive controls • Dilutions of chromosomal DNA
Control spots: Genomic DNA • Arabidopsis genomic DNA digested with RsaI • Spotted at different concentrations [Figure labels: Carryover, Intensity]
Data manipulation • Remove flagged spots • Remove spots with bad quality, defined as % of pixels 1.5% above background • Tools: Excel, Access, other commercially available programs • Normalization = adjust the signal intensities for each channel to make the two channels comparable
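The filtering and normalization steps above can be sketched in Python. The slide does not say which normalization method was used, so this sketch assumes a simple global median scaling (rescale one channel so that the median log ratio is zero). The spot layout (dicts with `cy5`, `cy3`, and `flagged` keys) and the function names are hypothetical.

```python
import math
from statistics import median


def remove_flagged(spots):
    """Drop spots that were flagged during image analysis."""
    return [s for s in spots if not s["flagged"]]


def median_normalize(spots):
    """Rescale cy5 so that the median log2(cy5/cy3) over all spots is zero.

    Assumption: global median scaling; the original slides do not name a method.
    """
    log_ratios = [math.log2(s["cy5"] / s["cy3"]) for s in spots]
    factor = 2.0 ** (-median(log_ratios))
    # Return new dicts so the input list is left unchanged.
    return [{**s, "cy5": s["cy5"] * factor} for s in spots]


if __name__ == "__main__":
    spots = [
        {"cy5": 200.0, "cy3": 100.0, "flagged": False},
        {"cy5": 400.0, "cy3": 200.0, "flagged": True},
        {"cy5": 800.0, "cy3": 400.0, "flagged": False},
    ]
    for s in median_normalize(remove_flagged(spots)):
        print(s)
```

Global scaling assumes that most clones are not differentially expressed, so the bulk of the ratio distribution should be centered on 1.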
Data Distribution before and after Normalization [Figure: two histograms of number of clones vs. log of intensities for the cy3 and cy5 channels, before and after normalization]
Distribution of ratios [Figure: histogram of number of clones vs. ratios, with the 2x cutoffs marked]
Identification of false positives in slides hybridized with identical RNAs in both channels
Panels shown for Slide 4, class (a) and Slide 6, class (c): (I) Intensities (II) Ratios (M) vs. Intensities (A) (III) Spatial Ratios
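For readers unfamiliar with the M and A labels, the sketch below computes them for one spot. It assumes the usual convention for such plots, M = log2(R/G) and A = ½·log2(R·G); the slides do not state the exact definition, and the function name `ma_values` is illustrative.

```python
import math


def ma_values(red, green):
    """Return (M, A) for one spot.

    M is the log2 ratio of the two channels and A is the mean log2 intensity.
    Assumption: the common MA-plot convention; not confirmed by the slides.
    """
    log_r = math.log2(red)
    log_g = math.log2(green)
    return log_r - log_g, 0.5 * (log_r + log_g)


if __name__ == "__main__":
    m, a = ma_values(8.0, 2.0)
    print(m, a)
```

When identical RNAs are hybridized in both channels, M should scatter around zero at every A, so any systematic trend points to a technical artifact rather than biology.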
Frequency of false positives among 6 slides [Figure: number of clones (log scale) vs. ratio, with series labeled a, a+b, a+b+c, c, b+c, and 1(a)+1(c)]
A 2-fold cutoff yields 25-35% non-reproducible spots, which can be removed by using multiple replicates [Figure: percentage of non-reproducible spots for slide pairs A3/A4, A7/A8, and A5/A6; percentage of reproducible clones in the final set vs. number of slides, ordered worst pair to best pair and best pair to worst pair] R. Gutierrez
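One way to remove non-reproducible spots is to keep only clones that pass the cutoff in the same direction on every replicate slide. The sketch below illustrates that idea; it is an assumed filtering rule rather than the procedure used for this figure, and the function name and clone identifiers are hypothetical.

```python
def reproducible_calls(ratios_by_clone, fold=2.0):
    """Keep clones whose ratio passes the fold cutoff in the same direction on all slides.

    ratios_by_clone maps a clone ID to its list of ratios, one per replicate slide.
    """
    kept = {}
    for clone, ratios in ratios_by_clone.items():
        if not ratios:
            continue  # no measurements, nothing to call
        if all(r > fold for r in ratios):
            kept[clone] = "up"
        elif all(r < 1.0 / fold for r in ratios):
            kept[clone] = "down"
    return kept


if __name__ == "__main__":
    example = {
        "clone_A": [2.5, 3.1, 2.2],   # consistently induced
        "clone_B": [2.4, 1.1, 0.9],   # passes on one slide only
        "clone_C": [0.3, 0.4, 0.45],  # consistently repressed
    }
    print(reproducible_calls(example))
```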
Repetitions • How many repetitions? • Minimum is a technical repetition (dye swap) • Recommended at least one other biological repetition • Experimental considerations, small changes need more repetitions
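A dye swap repeats the hybridization with the two labels exchanged, so a true expression difference changes sign in the log ratio while a dye-specific bias does not. The sketch below combines a forward and a swapped measurement under that assumption; the function name and the additive-bias model are illustrative and not taken from the slides.

```python
def combine_dye_swap(forward_log_ratio, swapped_log_ratio):
    """Average a forward and a dye-swapped log ratio.

    Assumed model: forward = signal + bias, swapped = -signal + bias,
    so (forward - swapped) / 2 recovers the signal and cancels the bias.
    """
    return (forward_log_ratio - swapped_log_ratio) / 2.0


if __name__ == "__main__":
    # Hypothetical spot: true log ratio 1.0 with a dye bias of 0.2.
    print(combine_dye_swap(1.0 + 0.2, -1.0 + 0.2))
```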
Uses of Microarrays • RNA Expression profiles • Type 1: Direct comparison of two different RNAs • Type 2: Multiple comparisons of RNA, via a common reference
Type 2: Multiple comparisons • Use of a common reference allows experiments to be compared directly • Direct comparison of many different growth conditions or tissues • Time course • DNA as common reference • Hybridizes equally to each spot • Consistent between slides • Unlimited supply
Distribution of intensities of genomic DNA [Figure: histograms of number of clones vs. relative intensities, comparing RNA intensities and DNA intensities]
Identification of low intensity spots • Low intensity spots include: • defects in printing • poorly amplified clones [Figure: distribution of relative intensities, axis from 0 to 0.6]
Comparison of common reference vs. direct comparison • 6 slides: • Time 0 hr - Time 12 hrs • Time 0 hr - Common Reference • Time 12 hrs - Common Reference [Figure: direct ratio vs. ratio via common reference]
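The ratio via the common reference follows from dividing the two reference-based ratios, because the reference term cancels: (T12/Ref) / (T0/Ref) = T12/T0. The sketch below shows this arithmetic; the function names are illustrative, and the choice of the 12 hr sample as numerator is an assumption, since the slide does not state the orientation.

```python
import math


def ratio_via_reference(t0_over_ref, t12_over_ref):
    """Indirect T12/T0 ratio computed from two hybridizations against a common reference."""
    return t12_over_ref / t0_over_ref


def log_ratio_via_reference(t0_over_ref, t12_over_ref):
    """Same quantity on the log2 scale, where the division becomes a subtraction."""
    return math.log2(t12_over_ref) - math.log2(t0_over_ref)


if __name__ == "__main__":
    print(ratio_via_reference(2.0, 8.0))
    print(log_ratio_via_reference(2.0, 8.0))
```

Because each indirect ratio combines two measurements, it carries the noise of both, which is consistent with small differences being harder to detect in this design.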
Comparison of direct vs. DNA as reference • Ratios correlate well for higher values; smaller differences may not be detected in type II experiments [Figure: ratio via DNA (log) vs. direct ratio (log)]
Consistent trends between heterologous species Conserved Genes Expressed Preferentially in Leaves Conserved Genes Expressed Preferentially in Shoot Apices D. Horvath
Differential expression of the homologous genes confirmed in leafy spurge • Adenosylhomocysteinase (Cytokinin-binding) • Ubiquitin-conjugating Enzyme • Glyceraldehyde 3-Phosphate Dehydrogenase • ATP Synthase • Guanine Nucleotide Binding Protein • Elongation Factor 1B [Figure: lanes labeled L and M] D. Horvath
Summary AFGC contributions • Good source of clones • Spotting conditions • Hybridizing and labeling techniques • Comparison labeling from Total and PolyA+ RNA • Slide coatings • Hybridization solutions • Labeling from low amounts of RNA • Experimental set up • Type II experiments • Heterologous probes • Number of repetitions • Comparison between spotted arrays and Affymetrix chips • Data Analysis tools via SMD
Future perspective global gene expression studies • Gene discovery • Global view (circadian rhythm, epigenetic changes) • Approach old problems (hybrid vigor) • Diagnostic tool (transgenes) • New hypotheses (looking for patterns over many experiments)
Acknowledgements PRL-MSU Robert Schaffer Jeff Landgraf Matt Larson Dave Green Monica Accerbi Verna Simon Kim Trouten Sergei Mekhedov Ellen Wisman John Ohlrogge Ken Keegstra Pam Green Stanford University Shauna Somerville Mike Cherry