Modifying the Schwarz Bayesian Information Criterion to locate multiple interacting Quantitative Trait Loci
1. M. Bogdan, J.K. Ghosh and R.W. Doerge, Genetics 2004, 167: 989-999.
2. M. Bogdan and R.W. Doerge, "Mapping multiple interacting QTL by multidimensional genome searches".
Data for QTL mapping
Y1,...,Yn - vector of trait values for n backcross individuals
X = [Xij], 1 ≤ i ≤ n, 1 ≤ j ≤ m - genotypes of m markers
Xia - genotype of i-th individual at locus a
Xia = 1/2 - individual is heterozygous at locus a
Xia = -1/2 - individual is homozygous at locus a
dab = 10 cM (distance between loci a and b) gives ρ(Xia, Xib) = 0.81
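The correlation quoted for markers 10 cM apart can be checked numerically. The sketch below assumes the Haldane map function (no crossover interference), which the slides do not name; the function names are illustrative. For backcross genotypes coded ±1/2, the correlation between two loci equals 1 - 2r, where r is the recombination fraction.

```python
import math

def haldane_recombination(distance_cm):
    """Recombination fraction for a map distance in cM (Haldane map function, assumed)."""
    d_morgans = distance_cm / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * d_morgans))

def backcross_marker_correlation(distance_cm):
    """Correlation of two backcross genotypes coded +1/2 and -1/2: rho = 1 - 2r."""
    return 1.0 - 2.0 * haldane_recombination(distance_cm)

rho = backcross_marker_correlation(10.0)
print(round(rho, 3))
```

Under this assumption the value is exp(-0.2), about 0.82, which agrees with the slide's 0.81 up to rounding.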
Standard methods of QTL mapping
One QTL model
1. Search over markers - fit model (1) at each marker and choose as candidate QTL locations the markers for which the likelihood exceeds a pre-established threshold value.
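The search over markers can be sketched as follows. Model (1) is not shown in the text, so this sketch assumes it is the single-marker regression Yi = μ + β Xij + εi with normal errors (consistent with the parameters μ, β, σ named for interval mapping). The threshold value, the simulated data, and the use of independent markers are hypothetical simplifications for illustration only.

```python
import math
import random

def lr_statistic(y, x):
    """2*log likelihood ratio of Y = mu + beta*x + eps versus Y = mu + eps (Gaussian errors)."""
    n = len(y)
    mean_y = sum(y) / n
    mean_x = sum(x) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    rss_null = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0.0:
        return 0.0  # marker with no genotype variation carries no information
    rss_marker = rss_null - sxy ** 2 / sxx
    return n * math.log(rss_null / rss_marker)

def marker_scan(y, genotypes, threshold):
    """Return indices of markers whose statistic exceeds the (hypothetical) threshold."""
    n_markers = len(genotypes[0])
    hits = []
    for j in range(n_markers):
        column = [row[j] for row in genotypes]
        if lr_statistic(y, column) > threshold:
            hits.append(j)
    return hits

# Simulated backcross: 200 individuals, 5 unlinked markers, one QTL at marker index 2.
random.seed(1)
n, m = 200, 5
genotypes = [[random.choice((-0.5, 0.5)) for _ in range(m)] for _ in range(n)]
y = [1.0 * row[2] + random.gauss(0.0, 1.0) for row in genotypes]
candidates = marker_scan(y, genotypes, threshold=10.0)
print(candidates)
```

In real data neighbouring markers are correlated, so several adjacent markers typically exceed the threshold around a single QTL.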
Interval mapping (Lander and Botstein, 1989)
• Consider a fixed position between markers.
• Estimate μ, β, and σ by the EM algorithm and compute the corresponding likelihood.
• Repeat this procedure for a new possible QTL location.
• Plot the resulting likelihoods as a function of the assumed QTL position.
Problems with interval mapping
a) Not able to distinguish closely linked QTL
b) Not able to detect epistatic QTL (involved only in interactions)
Solution: estimate the locations of several QTL at once using a multiple regression model (Kao et al. 1999)
Problem: estimating the number of additive and interaction terms
Xij - genotype of j-th marker
average number of markers: in the range (200, 400)
Bayesian Information Criterion
• Choose the model which maximizes $\log L - \frac{1}{2} k \log n$
L – likelihood of the data for a given model
k – number of parameters in the model
n – sample size
Broman (1997) and Broman and Speed (2002): BIC overestimates the number of QTL
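As a small illustration of the criterion, the sketch below evaluates log L - (1/2) k log n for a Gaussian linear model, where the maximized log-likelihood depends on the data only through the residual sum of squares. The function name and the two residual sums of squares are hypothetical values chosen for the example.

```python
import math

def bic_score(rss, n, k):
    """log L - 0.5 * k * log(n) for a Gaussian linear model with ML-estimated variance."""
    sigma2_hat = rss / n  # maximum likelihood estimate of the error variance
    log_likelihood = -0.5 * n * (math.log(2.0 * math.pi * sigma2_hat) + 1.0)
    return log_likelihood - 0.5 * k * math.log(n)

n = 200
# Hypothetical fits: one marker (mu, beta, sigma) versus two markers (mu, beta1, beta2, sigma).
small_model = bic_score(rss=180.0, n=n, k=3)
large_model = bic_score(rss=178.0, n=n, k=4)
print(small_model > large_model)
```

Here the second marker reduces the residual sum of squares only slightly, so the gain in log-likelihood (about 1.1) does not cover the extra penalty of (1/2) log 200 (about 2.6), and the smaller model is preferred.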
How to modify BIC?
Mi – i-th linear model (specifies which markers are included in the regression)
θ = (μ, β1,..., βp, γ1,..., γr, σ) – vector of parameters for Mi
fi(θ) – density of the prior distribution for θ
π(i) – prior probability of Mi
L(Y|θ) – likelihood of the data given the vector of parameters θ
$m_i(Y) = \int L(Y \mid \theta)\, f_i(\theta)\, d\theta$ – likelihood of the data given the model Mi
$P(M_i \mid Y) \propto \pi(i)\, m_i(Y)$
BIC neglects π(i) and uses an asymptotic approximation of mi(Y)
Neglecting π(i) = assigning the same prior probability to all models = assigning high prior probability to the event that there are many regressors
Example: 200 markers
200 models with one additive term
$\binom{200}{2} = 19\,900$ models with one interaction or with two additive terms
$\binom{200}{100} \approx 9.05 \times 10^{58}$ models with 100 additive terms
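The model counts in this example are binomial coefficients and can be reproduced directly. A minimal check, using only the standard library (`math.comb` requires Python 3.8 or later):

```python
import math

n_markers = 200

one_term = math.comb(n_markers, 1)         # models with a single additive term
two_terms = math.comb(n_markers, 2)        # pairs of markers: one interaction, or two additive terms
hundred_terms = math.comb(n_markers, 100)  # models with 100 additive terms

print(one_term, two_terms, f"{hundred_terms:.2e}")
```

Under a uniform prior over models, the class of 100-term models therefore carries vastly more prior mass than the class of one-term models, which is the imbalance the slide points to.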
Idea: supplement BIC with a more realistic prior distribution π
Choice of π (George and McCulloch, 1993)
M – number of markers
N – number of potential interactions (N = M(M-1)/2 when every pair of markers is considered)
α – the probability that the i-th additive term appears in the model
ν – the probability that the j-th interaction term appears in the model
M – model with p additive terms and r interactions
$\pi(M) = \alpha^{p}\, \nu^{r}\, (1-\alpha)^{M-p}\, (1-\nu)^{N-r}$
Prior distribution on the number of additive terms: p ~ Binomial(M, α)
Prior distribution on the number of interactions: r ~ Binomial(N, ν)
Writing α = 1/l and ν = 1/u, we choose $\log \pi(M) = C(M,N,l,u) - p \log(l-1) - r \log(u-1)$
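The step from the product form of π(M) to the log-linear penalty can be written out. This derivation assumes the parametrization α = 1/l and ν = 1/u, which is the substitution under which the stated formula follows; the explicit form of the constant C is derived here and is not given in the slides.

```latex
\begin{aligned}
\log \pi(M) &= p \log\alpha + r \log\nu + (M-p)\log(1-\alpha) + (N-r)\log(1-\nu) \\
            &= \underbrace{M\log(1-\alpha) + N\log(1-\nu)}_{C(M,N,l,u)}
               \;-\; p \log\frac{1-\alpha}{\alpha} \;-\; r \log\frac{1-\nu}{\nu} \\
            &= C(M,N,l,u) - p\log(l-1) - r\log(u-1),
\qquad \text{since } \frac{1-1/l}{1/l} = l-1,\quad \frac{1-1/u}{1/u} = u-1 .
\end{aligned}
```

The constant C does not depend on p or r, so it is the same for every model and plays no role when models are compared.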
The choice of l and u should depend on prior knowledge of the number of QTL.
Our choice: for sample size 200, the probability of wrongly detecting a QTL (when there are none) is ≈ 0.05.
We keep E(p) and E(r) equal to 2.2.
The choice is supported by a theoretical bound on the type I error based on the Bonferroni inequality.
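The extra penalty implied by this choice can be computed for a given number of markers. The sketch assumes α = 1/l and ν = 1/u, so that the binomial means E(p) = M/l and E(r) = N/u equal 2.2 when l = M/2.2 and u = N/2.2, and it assumes N = M(M-1)/2 potential interactions. The function name is illustrative.

```python
import math

def prior_penalty(p, r, n_markers, expected_terms=2.2):
    """Penalty p*log(l-1) + r*log(u-1), with l and u set so that E(p) = E(r) = expected_terms."""
    n_interactions = n_markers * (n_markers - 1) // 2  # assumed: all marker pairs
    l = n_markers / expected_terms       # from E(p) = M / l
    u = n_interactions / expected_terms  # from E(r) = N / u
    return p * math.log(l - 1.0) + r * math.log(u - 1.0)

per_additive = prior_penalty(1, 0, n_markers=200)
per_interaction = prior_penalty(0, 1, n_markers=200)
print(round(per_additive, 2), round(per_interaction, 2))
```

Because there are far more candidate interactions than candidate additive terms, each interaction term is penalized roughly twice as heavily as an additive term on the log scale, and both penalties grow as markers are added.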
Additional penalty similar to Risk Inflation Criterion of Foster and George (2k log t , where t is the total number of available regressors) and to the modification of BIC proposed by Siegmund (2004).
• The criterion adjusts well to the number of available markers.
• For n = 200 the criterion detects almost all additive QTL with individual h² = 0.13 and interactions with h² = 0.2.
• For n = 500 the criterion detects almost all additive QTL with individual h² = 0.06 and interactions with h² = 0.12.
For n=200 and typical values of M this yields values in the range between 0.057 and 0.08.