Genetical Genomics in the Mouse Finding Genes with Microarray Expression Data
Genetical Genomics Jansen, R.C. and J.P. Nap (2001). Genetical genomics: the added value from segregation. Trends Genet 17(7): 388-91.
Mouse Genetical Genomics • BXD recombinant inbred lines • 21 strains + parents and F1 • genotypes • 508 markers • traits • forebrain RNA assayed by Affymetrix U74Av2 • PM probe sequences • MM probe sequences • 1 to 4 microarrays per RI line (average 2.5)
QTL mapping by regression • Trait vs genotype association • Genetically determined difference • in expressed RNA level • in hybridization of probe sequence • in competing hybridization • Measured by LRS (likelihood ratio statistic)
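The trait-vs-genotype regression above can be sketched as follows. This is a minimal illustration, not the authors' software: it assumes a simple linear regression at one marker, genotypes coded -1/+1 for the two parental alleles (a hypothetical coding), and the usual normal-model form LRS = n ln(RSS_null / RSS_full). The function name `lrs_single_marker` is made up for this example.

```python
import math

def lrs_single_marker(trait, genotype):
    """Likelihood ratio statistic for regressing a trait on one marker.

    trait: one value per strain (e.g. a mean expression level).
    genotype: one code per strain, assumed here to be -1 or +1.
    """
    n = len(trait)
    mean_y = sum(trait) / n
    mean_x = sum(genotype) / n
    sxx = sum((x - mean_x) ** 2 for x in genotype)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(genotype, trait))
    # Residual sum of squares with no marker effect (intercept only).
    rss_null = sum((y - mean_y) ** 2 for y in trait)
    if sxx == 0 or rss_null == 0:
        return 0.0  # marker or trait does not vary: no evidence either way
    # Residual sum of squares after fitting the marker effect.
    rss_full = max(rss_null - sxy ** 2 / sxx, 1e-300)  # guard a perfect fit
    return n * math.log(rss_null / rss_full)
```

A genome scan would call this once per marker (508 times per trait in the data set described here) and keep the largest value.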
Trait Data Preparation • 12,422 probesets (traits) • 16 PM & 16 MM probes (oligonucleotides) • average PM-MM difference • log2-transform average difference • normalize data of each microarray to common mean and standard deviation • average replicate microarrays • 400,000 PM & MM probes (cells) • log2-transform cell intensity • normalize and average replicate arrays
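The preparation steps above (log2-transform, rescale each microarray to a common mean and standard deviation, average replicates) might look like this in outline. The target mean of 8 and standard deviation of 2 are placeholder values chosen for the example, not figures from the slides, and the inputs are assumed to be positive so that the logarithm is defined.

```python
import math
import statistics

def normalize_array(values, target_mean=8.0, target_sd=2.0):
    """Log2-transform one microarray and rescale it to a common mean and SD.

    target_mean and target_sd are illustrative assumptions.
    """
    logged = [math.log2(v) for v in values]
    mean = statistics.fmean(logged)
    sd = statistics.pstdev(logged)
    return [target_mean + target_sd * (x - mean) / sd for x in logged]

def average_replicates(arrays):
    """Average replicate arrays element by element (same probe order assumed)."""
    return [statistics.fmean(column) for column in zip(*arrays)]
```

For a strain with two replicate arrays, `average_replicates([normalize_array(a1), normalize_array(a2)])` would give one value per probeset or cell.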
Multiple testing problem • Two levels of multiple testing • Each trait or probe vs 508 loci • 12,422 traits or 400,000 probes • Strategy • Empirical p-value for multiple loci • measures significance of single best association • Benjamini-Hochberg procedure for multiple traits or probes • may declare many significant associations • assumes at least one significant association
Empirical p-value • Measures genome-wide significance • converts multiple test into single test • significance of best association among all loci • Permutation test for distribution under null • up to 10^6 scans with permuted trait values • record largest LRS for each permutation • Find p-value of original regression from its rank in the null distribution
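The permutation test above can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: the single-marker LRS uses the normal-model form n ln(RSS_null / RSS_full), genotypes are assumed to be coded numerically, and the p-value uses the common (count + 1) / (permutations + 1) convention, which the slides do not specify. All function names are hypothetical.

```python
import math
import random

def lrs(trait, genotype):
    """Single-marker likelihood ratio statistic (simple linear regression)."""
    n = len(trait)
    mean_y = sum(trait) / n
    mean_x = sum(genotype) / n
    sxx = sum((x - mean_x) ** 2 for x in genotype)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(genotype, trait))
    rss_null = sum((y - mean_y) ** 2 for y in trait)
    if sxx == 0 or rss_null == 0:
        return 0.0
    rss_full = max(rss_null - sxy ** 2 / sxx, 1e-300)
    return n * math.log(rss_null / rss_full)

def max_lrs(trait, markers):
    """Largest LRS over all markers: the 'single best association'."""
    return max(lrs(trait, genotype) for genotype in markers)

def empirical_p_value(trait, markers, n_perm=1000, seed=0):
    """Genome-wide p-value from the rank of the observed maximum LRS."""
    rng = random.Random(seed)
    observed = max_lrs(trait, markers)
    shuffled = list(trait)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the trait-genotype link
        if max_lrs(shuffled, markers) >= observed:
            exceed += 1
    # +1 in numerator and denominator is an assumed convention.
    return (exceed + 1) / (n_perm + 1)
```

Because each permutation records only the largest LRS across all markers, the resulting p-value already accounts for testing every locus.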
Outliers • Examine permutation test distribution for bimodality • Compare 37th and 95th percentile values • Find outlier and assign next most extreme value • Redo permutation test and regression
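The "assign next most extreme value" step above could be sketched like this. The slides do not say how the outlier is identified once bimodality is seen, so this example simply assumes the outlier is the single value farthest from the median; the function name `clip_most_extreme` is hypothetical.

```python
import statistics

def clip_most_extreme(values):
    """Replace the value farthest from the median with the next most extreme
    value on the same side (an assumed reading of the outlier rule)."""
    med = statistics.median(values)
    idx = max(range(len(values)), key=lambda i: abs(values[i] - med))
    others = [v for i, v in enumerate(values) if i != idx]
    out = list(values)
    # High outlier takes the next highest value, low outlier the next lowest.
    out[idx] = max(others) if values[idx] >= med else min(others)
    return out
```

After this adjustment the permutation test and regression would be rerun on the modified trait values.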
Benjamini-Hochberg test • Test of 100 uniformly distributed p-values (p-values from non-significant results) • P-values as blue dots • Significance threshold for FDR = 0.2 as red line
Benjamini-Hochberg test • Test of 10 low p-values (significant results) mixed with 90 p-values from non-significant results • P-values as blue dots • Significance threshold for FDR = 0.2 as red line • Eleven cases declared significant (figure label: "Declare significant")
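The red threshold line in these two slides corresponds to the standard Benjamini-Hochberg step-up rule: sort the m p-values, find the largest rank k with p(k) <= FDR * k / m, and declare the k smallest p-values significant. A minimal sketch (the function name is hypothetical; FDR = 0.2 matches the slides):

```python
def benjamini_hochberg(p_values, fdr=0.2):
    """Return the indices of p-values declared significant at the given FDR."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        # The threshold line rises linearly with rank: fdr * rank / m.
        if p_values[i] <= fdr * rank / m:
            cutoff_rank = rank  # keep the largest rank under the line
    return sorted(order[:cutoff_rank])
```

Because the rule is step-up, a p-value above its own threshold is still declared significant when a larger p-value falls below the line, which is how a non-significant case can be swept in alongside the true ones.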
Empirical P-value Calculation • Flowchart: marker regression mapping gives the maximum genome-wide LRS • Permutation test run in stages: 500x, 5,000x, 50,000x, 1,000,000x • p-value checked after each stage ("?" in the flowchart)
Trait-locus associations • Ranked P-values as blue dots (90 smallest from 12,422) • Significance threshold as red line • Cases below red line are significant for FDR = 0.2 • 75 significant trait-locus associations
Probe-locus associations • Ranked P-values as blue dots (600 smallest from ~400,000) • Significance threshold as red line • Cases below red line are significant for FDR = 0.2 • 576 significant probe-locus associations
QTLs from MM probes • 576 QTLs defined by single microarray probes • 454 (79%) by PM probes • 122 (21%) by MM probes • Proportion of PM-probe QTLs declines as p-value increases (figure panels A, B, C)
QTLs from cell-level mapping • 576 cell-marker associations (QTLs) • 339 traits (probesets) represented • most probesets represented by a single probe • rarely, two or more significant probes from same probeset • all probes from one probeset identify same locus • 79% of probes are PM
QTLs from PM cells only • 454 PM cells defining QTLs • 288 traits (probesets) represented • 184 controlled by location on the same chr • 88 controlled by location on different chr • 16 unknown location for probeset • 147 locations (marker loci) with nearby QTLs, distributed on all chromosomes
Probe-locus associations among traits • 339 traits (probesets) with probes identifying significant QTLs • 186 traits represented by a single probe • 2 traits represented by 10 probes
QTL distribution among marker loci • 147 loci identified by at least one significant probe-locus association • multiple associations to one locus • multiple probes from one probeset • multiple QTL near locus
Profiles of probe sensitivity Li, C & Wong, WH (2001) Model-based analysis of oligonucleotide arrays: Expression index computation and outlier detection. PNAS 98: 31-36
Probe profiles (best) • LRS vs probe number • Probesets with highest significance in probeset-level mapping • Figure legend: PM, MM
Probe profiles (worst) • LRS vs probe number • Probesets with lowest significant association in probeset-level mapping • Figure legend: PM, MM
Chr 9 QTLs • Unusual number of chr 9 QTLs (22) controlling sequences on other chrs • Normalized frequency 3-fold greater than average chr • Many of these QTLs cluster near 2 loci on chr 9
Acknowledgments • Robert W Williams • Lu Lu • S Shou • Yanhua Qu • Elissa Chesler • John D Mountz • Hui Chen Hsu • David Threadgill • Gene Hwang • Dan Nettleton • Jintao Wang • Ram Varma • Jianxin Wang • Mark Brady • Gene Sobel • Affiliations: U Tennessee, Memphis • Gene Expression Core • Bioinformatics • U Alabama, Birmingham • GOG • U North Carolina • Cornell U • Iowa State U