Plate Effects in cDNA Microarray Data Henrik Bengtsson hb@maths.lth.se Mathematical Statistics Centre for Mathematical Sciences Lund University
Outline • Intensity dependent effects • A new way of plotting microarray data • Plate effects • Plate normalization • Measure of Fitness • Results • Discussion
Data • Matt Callow’s ApoAI experiment (2000): • (8 ApoAI-KO mice vs. pool of 8 control mice), 8 control mice vs. pool of 8 control mice. • 5357 ESTs/genes (6 triplicates, 175 duplicates, 4989 single spotted) & 840 blanks => 6384 spots in all. • Labeled using Cy3-dUTP and Cy5-dUTP. • Signals extracted from images by Spot.
Intensity dependent effects The log-ratio, M, depends on the intensity of the spot, A.
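The slide does not spell out how M and A are computed. A minimal sketch, assuming the conventional definitions M = log2(R/G) and A = ½·log2(R·G) for red (R) and green (G) spot intensities; the function name `ma_transform` and the example values are hypothetical:

```python
import numpy as np

def ma_transform(red, green):
    """Return (M, A) for arrays of positive red/green spot intensities.

    Assumed convention: M = log2(R/G), A = 0.5 * log2(R*G).
    """
    red = np.asarray(red, dtype=float)
    green = np.asarray(green, dtype=float)
    m = np.log2(red) - np.log2(green)          # log-ratio
    a = 0.5 * (np.log2(red) + np.log2(green))  # average log-intensity
    return m, a

# Hypothetical intensities for two spots.
m, a = ma_transform([1024.0, 256.0], [256.0, 256.0])
```

Plotting `m` against `a` for all spots on a slide gives the scatter plot in which an intensity-dependent trend in the log-ratio becomes visible.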
Print-tip effects The log-ratio (and its variance) depends on the print-tip group. How are the spots printed…?
Print order plot The spots are ordered according to when they were spotted/dipped onto the glass slide(s).
Plate effects The log-ratios depend on the plate the spotted clone comes from. (384-well plates from 6 different labs were used)
Plate Normalization Assumption: The genes from one plate are on average non-differentially expressed. Correctness? Are clones on the plates selected randomly? Spots on plates are less random than, for instance, spots in print-tip groups. The ApoAI mouse experiment is a comparison between 8 control mice and the pool of them. Even if clones on plates were from different tissues, e.g. plates 9-12 from brain, in this setup it should not affect the ratios, just the strength of the signals.
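Under the assumption stated above, one way to remove a plate bias is to center the log-ratios of each plate at zero. The slide does not say which location estimate is used; the sketch below assumes the per-plate median, and `plate_normalize` and the toy data are hypothetical:

```python
import numpy as np

def plate_normalize(m, plate):
    """Subtract the per-plate median from the log-ratios M.

    m     : 1-D array of log-ratios, one per spot
    plate : 1-D array of plate labels, one per spot
    The median is an assumed choice of location estimate.
    """
    m = np.asarray(m, dtype=float)
    plate = np.asarray(plate)
    out = m.copy()
    for p in np.unique(plate):
        idx = plate == p
        out[idx] = m[idx] - np.nanmedian(m[idx])  # center this plate at 0
    return out

# Toy example: plate 1 is shifted upwards by 0.7 relative to plate 2.
m = np.array([0.5, 0.7, 0.9, -0.2, 0.0, 0.2])
plate = np.array([1, 1, 1, 2, 2, 2])
m_norm = plate_normalize(m, plate)
```

After centering, both plates have median log-ratio zero, so the offset between them is gone while the spread within each plate is unchanged.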
Intensity normalization • Intensities (A) also have plate effects. • Intensity normalization => plate biases again! Should we normalize A for plate? Probably not! Blanks and ”brain” spots have lower intensities, whereas the ”liver” spots have higher...
Sources of Artifacts [Figure: flow diagram of a cDNA microarray experiment. Labels: cDNA clones; PCR product amplification, purification; printing; test sample and reference sample (RNA, cDNA); hybridize; scanning (excitation by red laser and green laser, emission); overlay images; production data: (R, G, ...). Annotated artifacts: plate effects(?); intensity effects (quenching); intensity effects (labelling efficiency).]
Several possible approaches ;( • Decisions to make: • Background correction? • Plate normalization? • Intensity (slide, print-tip or scaled print-tip) normalization? • Platewise-intensity normalization? • If both plate and intensity normalization, in what order? Maybe plate-intensity-plate-intensity-plate-... and so on? • Need a way to compare different approaches...
Measure of Fitness Important: compare on the same scale! The median absolute deviation (MAD) for gene i is $d_i = 1.4826 \cdot \operatorname{median}_j |r_{ij}|$, where $r_{ij} = M_{ij} - \operatorname{median}_j M_{ij}$ is residual j for gene i. The measure of fitness is defined as the mean of the genewise MADs: $\text{m.o.f.} = \frac{1}{N} \sum_{i=1}^{N} d_i$, where N is the number of genes. (...or look at the density of the $d_i$’s.)
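The definition can be sketched in a few lines. This assumes the log-ratios are arranged as a genes × replicates matrix with no missing values; the function name `measure_of_fitness` and the toy matrix are hypothetical:

```python
import numpy as np

def measure_of_fitness(m):
    """Mean of the genewise MADs for a (genes x replicates) matrix M_ij.

    Returns (m.o.f., d), where d holds the MAD d_i of each gene.
    """
    m = np.asarray(m, dtype=float)
    # Residuals r_ij = M_ij - median_j(M_ij), computed per gene (row).
    resid = m - np.median(m, axis=1, keepdims=True)
    # Genewise MAD, scaled by 1.4826 as in the definition above.
    d = 1.4826 * np.median(np.abs(resid), axis=1)
    return d.mean(), d

# Toy data: gene 0 varies across replicates, gene 1 does not.
m = np.array([[0.0, 1.0, 2.0],
              [0.5, 0.5, 0.5]])
mof, d = measure_of_fitness(m)
```

Here gene 0 has MAD 1.4826 and gene 1 has MAD 0, so the m.o.f. is their mean, 0.7413. A lower value means less variability between replicates of the same gene.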
Visual comparison between the ”best” • Slidewise intensity normalization (m.o.f. = 0.228) • Plate + print-tip intensity + plate normalization (m.o.f. = 0.188)
Results • Removing plate biases first significantly lowers the gene variabilities (15-20% lower than intensity normalization only). • It is critical not to do background correction. • Using the measure of fitness is helpful in deciding what to do. Key to the m.o.f. results: bg – background corrected, P – plate biases removed, S – slide-intensity normalized, B – print-tip-intensity normalized, sB – scaled print-tip-intensity normalized.
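As a rough check of the quoted range, the two m.o.f. values given for the visual comparison (0.228 for slidewise intensity normalization, 0.188 for plate + print-tip intensity + plate normalization) correspond to a relative reduction of:

```latex
\[
\frac{0.228 - 0.188}{0.228} = \frac{0.040}{0.228} \approx 0.175
\]
```

That is about 17.5% lower, which falls inside the stated 15-20%.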
Discussion • What are the reasons for plate effects and where do they actually occur? i) On the plates, ii) during printing or iii) at hybridization? • How should one best standardize the measure of fitness? i) Based on all spots, ii) on a subset (blanks?), or iii) ?
Acknowledgements Statistics Dept, UC Berkeley: * Sandrine Dudoit * Terry Speed * Yee Hwa Yang Lawrence Berkeley National Laboratory: * Matt Callow Ernest Gallo Research Center, UCSF: * Karen Berger Mathematical Statistics, Lund University: * Ola Hössjer com.braju.sma - object oriented extension to sma (free): http://www.braju.com/R/ [R] Software (free): http://www.r-project.org/ The Statistical Microarray Analysis (sma) library (free): http://www.stat.berkeley.edu/users/terry/zarray/Software/smacode.html