Filtering and Normalization of Microarray Gene Expression Data Waclaw Kusnierczyk, Norwegian University of Science and Technology, Trondheim, Norway
Outline • Filtering: spots • removal of spots based on quality measures • Normalization • compensation for measurement errors • Filtering: genes (significance analysis) • identification of significantly expressed genes
Filtering: Spots • Criteria used to remove spots • spot area [pixels] • signal/noise ratio (spot intensity vs. background intensity) • other quality measures (e.g. based on quality scores from image analysis software) • morphological criteria • pixel-level variability
Filtering: Spots • Spot area based filtering • remove spots with area < threshold in both channels • problem: setting an appropriate threshold • dependent on the definition of a spot (image analysis software) and the distribution of the spot area • typical value: 10 pixels
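The area rule above can be sketched in a few lines. This is a minimal illustration, not the author's implementation: the function name `area_filter` and the sample areas are hypothetical, and the rule is read literally, so a spot is removed only when its area falls below the threshold in both channels.

```python
import numpy as np

def area_filter(area_ch1, area_ch2, threshold=10):
    """Return a boolean mask that is True for spots to keep.

    threshold defaults to the "typical value" of 10 pixels from the slide.
    """
    area_ch1 = np.asarray(area_ch1)
    area_ch2 = np.asarray(area_ch2)
    # Literal reading of the slide: remove a spot only when its area
    # is below the threshold in BOTH channels.
    remove = (area_ch1 < threshold) & (area_ch2 < threshold)
    return ~remove

# Hypothetical spot areas (pixels) for four spots in two channels.
keep = area_filter([25, 8, 9, 40], [30, 7, 15, 5])
print(keep)
```

A stricter variant would remove a spot that is too small in either channel; that only requires replacing `&` with `|`.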
Filtering: Spots • Signal/noise based filtering • keep spots with signal / background > threshold in both channels • problem: setting an appropriate threshold • dependent on the spot and background definition (image analysis software) • typical value: sgn/bkg > 2 or, equivalently, sgn - bkg > bkg
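A small sketch of the signal/background rule, assuming per-spot signal and background intensities are available as arrays for each channel. The function name `snr_filter` and the sample intensities are hypothetical. The ratio test is written as `sgn > threshold * bkg`, which matches `sgn / bkg > threshold` for positive backgrounds and avoids dividing by zero.

```python
import numpy as np

def snr_filter(sgn_ch1, bkg_ch1, sgn_ch2, bkg_ch2, threshold=2.0):
    """Return a boolean mask that is True for spots to keep."""
    sgn1, bkg1, sgn2, bkg2 = (
        np.asarray(x, dtype=float) for x in (sgn_ch1, bkg_ch1, sgn_ch2, bkg_ch2)
    )
    # Keep a spot only if signal exceeds threshold * background in both channels.
    return (sgn1 > threshold * bkg1) & (sgn2 > threshold * bkg2)

# Hypothetical intensities for three spots.
keep = snr_filter(
    sgn_ch1=[500, 150, 900], bkg_ch1=[100, 100, 100],
    sgn_ch2=[400, 300, 180], bkg_ch2=[100, 100, 100],
)
print(keep)
```

With the default threshold of 2 the test reduces to `sgn - bkg > bkg`, the equivalent form given on the slide.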
Filtering: Spots • Other criteria • Intensity threshold on background corrected intensity • Spot quality measures (pixelwise distributional properties of spot and background intensities, manual morphology-based spot flagging etc.) • Replicate-based spot filtering (adaptive threshold selection based on a repeatability coefficient, coefficient of variation etc.)
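Replicate-based filtering can be illustrated with the coefficient of variation (CV) across replicate spots. The sketch below is an assumption-laden simplification: it uses a fixed, hypothetical cutoff `max_cv` rather than the adaptive threshold selection mentioned on the slide, and the function name and data are made up for illustration.

```python
import numpy as np

def cv_filter(intensities, max_cv=0.3):
    """Keep genes whose replicate spots agree.

    intensities: 2-D array, rows are genes, columns are replicate spots.
    max_cv: hypothetical fixed cutoff (the slide describes adaptive selection).
    Returns (keep_mask, cv).
    """
    x = np.asarray(intensities, dtype=float)
    mean = x.mean(axis=1)
    sd = x.std(axis=1, ddof=1)  # sample standard deviation across replicates
    # CV = sd / mean; genes with a non-positive mean get CV = inf and are dropped.
    cv = np.divide(sd, mean, out=np.full_like(mean, np.inf), where=mean > 0)
    return cv <= max_cv, cv

# Hypothetical background-corrected intensities, three replicates per gene.
reps = [[100, 110, 90], [100, 300, 20], [0, 0, 0]]
keep, cv = cv_filter(reps)
print(keep, cv)
```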
Normalization • Analysis of systematic errors • adjustment for bias coming from variation in the technology rather than from biology • Different sources of non-linearity • Efficiency of dye incorporation (labelling) • Print-tip differences • Non-uniformity in hybridisation • Scanning • Between slide variation (print quality, ambient conditions)
Normalization • Selection of elements • Housekeeping genes, spike controls, tip-dependence, raw data, between array normalization • Method • Constant subtraction normalization (mean/median log2 ratio, iterative c estimation, ANOVA) • Locally weighted mean normalization (intensity or location dependent) • Other recently proposed methods
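The simplest of the methods listed, constant subtraction with the median log2 ratio, might look as follows. This is a sketch under stated assumptions: the constant c is estimated from all spots (rather than from housekeeping genes or spike controls), and the names `median_normalize`, `red`, and `green` are hypothetical.

```python
import numpy as np

def median_normalize(red, green):
    """Subtract a single constant c from all log2 ratios.

    Returns (normalized log2 ratios, c), where c is the median log2(red/green).
    """
    red = np.asarray(red, dtype=float)
    green = np.asarray(green, dtype=float)
    m = np.log2(red / green)   # log2 ratio per spot
    c = np.median(m)           # the constant to subtract
    return m - c, c

# Hypothetical background-corrected intensities for four spots.
red = [200, 400, 800, 1600]
green = [100, 200, 400, 400]
m_norm, c = median_normalize(red, green)
print(c, m_norm)
```

Restricting the estimate of c to a chosen subset of elements would only change which spots enter `np.median`.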
Normalization (example 1) • Intensity dependent normalization with locally weighted mean, global
Normalization (example 1) • Intensity dependent normalization with locally weighted mean, print-tip dependent
Normalization (example 1) • Intensity dependent normalization with locally weighted mean, global vs. print-tip dependent
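The difference between the global and the print-tip dependent variant can be sketched with a plain tricube-weighted local mean. Everything here is an assumption made for illustration: the inputs follow the common convention M = log2(R/G) and A = 0.5 * log2(R * G), the smoother is a simple kernel-weighted mean rather than any particular package's loess, and the function names, the `frac` parameter, and the simulated data are hypothetical.

```python
import numpy as np

def local_weighted_mean(a, m, frac=0.4):
    """Tricube-weighted mean of m over the nearest neighbours in a.

    frac is the fraction of spots used in each local window.
    Builds an n x n distance matrix, so it is only meant for small examples.
    """
    a = np.asarray(a, dtype=float)
    m = np.asarray(m, dtype=float)
    n = len(a)
    k = min(n, max(2, int(np.ceil(frac * n))))
    dist = np.abs(a[:, None] - a[None, :])
    # Bandwidth per spot: distance to its k-th nearest neighbour (self included).
    h = np.maximum(np.sort(dist, axis=1)[:, k - 1], np.finfo(float).eps)
    u = np.clip(dist / h[:, None], 0.0, 1.0)
    w = (1.0 - u**3) ** 3
    return (w @ m) / w.sum(axis=1)

def normalize_global(a, m, frac=0.4):
    """Subtract one intensity-dependent curve fitted to all spots."""
    return np.asarray(m, dtype=float) - local_weighted_mean(a, m, frac)

def normalize_print_tip(a, m, tip, frac=0.4):
    """Subtract a separate intensity-dependent curve for each print-tip group."""
    a = np.asarray(a, dtype=float)
    m = np.asarray(m, dtype=float)
    tip = np.asarray(tip)
    out = np.empty_like(m)
    for t in np.unique(tip):
        idx = tip == t
        out[idx] = m[idx] - local_weighted_mean(a[idx], m[idx], frac)
    return out

# Simulated data: an intensity-dependent trend plus a different offset per tip.
rng = np.random.default_rng(0)
n = 400
a = rng.uniform(6, 14, n)
tip = np.arange(n) % 2
m = 0.2 * (a - 10) + np.where(tip == 0, 0.5, -0.5) + rng.normal(0, 0.05, n)

m_global = normalize_global(a, m)
m_tip = normalize_print_tip(a, m, tip)
print(m_global[tip == 0].mean(), m_tip[tip == 0].mean())
```

In this simulation the global fit removes the shared trend but leaves the per-tip offsets in place, while the print-tip fit removes both.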
Normalization (example 2) • Intensity dependent normalization with locally weighted mean, print-tip dependent
Normalization • Location dependent normalization with locally weighted mean (from SNOMAD web page)
Normalization • Local variance correction across element signal intensity (from SNOMAD web page)
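One way to read local variance correction is as a rescaling of already-centred log ratios by a spread estimate taken from spots of similar intensity. The sketch below is not the SNOMAD procedure: it swaps in a running root-mean-square over a rank-ordered window, and the function name, the `window` parameter, and the simulated data are all hypothetical.

```python
import numpy as np

def local_variance_correct(a, m_norm, window=51):
    """Divide each centred log ratio by a local spread estimate.

    a: intensity measure per spot; m_norm: log ratios already centred on zero.
    window: number of intensity-ranked neighbours used for the spread estimate.
    """
    a = np.asarray(a, dtype=float)
    m = np.asarray(m_norm, dtype=float)
    if window % 2 == 0:
        window += 1                      # force an odd window
    half = window // 2
    order = np.argsort(a)                # spots ranked by intensity
    sq = m[order] ** 2
    # Reflect at both ends so edge spots also get a full window.
    padded = np.pad(sq, half, mode="reflect")
    local_ms = np.convolve(padded, np.ones(window) / window, mode="valid")
    local_sd = np.maximum(np.sqrt(local_ms), np.finfo(float).eps)
    z = np.empty_like(m)
    z[order] = m[order] / local_sd       # map back to the original spot order
    return z

# Simulated data: low-intensity spots are five times noisier.
rng = np.random.default_rng(0)
a = rng.uniform(6, 14, 1000)
sd = np.where(a < 10, 1.0, 0.2)
m_norm = rng.normal(0.0, sd)

z = local_variance_correct(a, m_norm)
print(z[a < 10].std(), z[a >= 10].std())
```

After the correction the spread of the rescaled ratios should be roughly the same at low and high intensity, which is what makes a single cutoff usable across the whole intensity range.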
Acknowledgments • Mette Langaas, Department of Mathematical Sciences, Norwegian University of Science and Technology • Astrid Lægreid, Department of Physiology and Biomedical Engineering, Norwegian University of Science and Technology • Per Kristian Lehre, Department of Computer and Information Science, Norwegian University of Science and Technology