A General Modeling Framework for Studying Candidate Genes
Copy files from f:\edwin\example
Why a general modeling framework?
• Candidate genes for quantitative traits are usually studied only as a “main effect” on the mean.
• Genetic advantages of a more extensive modeling framework:
  - Some candidate genes may be more likely to be detected.
  - One reason is power: e.g. pleiotropic genes are easier to detect in a multivariate study.
  - Some genes may not work in a simple “main effect” fashion: e.g. they exert their effects in severely deprived environments only, or influence the sensitivity to environmental fluctuations (variance).
  - Correct tests? E.g. genotypic variances may differ in selected samples.
Substantive advantages of a general modeling framework
• More extensive picture of genetic effects
• Sheds new light on traditional research questions:
  - Continuity, change, and heterotypy
  - Comorbidity/pleiotropy
  - Complex traits: causal mechanisms involving multiple factors
• New issues: the interplay between genotypes and environment
  - Vulnerability, resilience, and protective factors
  - Risk behavior and the construction of favorable environments
  - Sensitivity to environmental fluctuations
• Instrumental function due to unique properties
Requirements for the modeling framework
• Genetic effects on the means, variances, and relations between variables
• Stratification effects on all these components
• Nuclear families of various sizes
• Interpretable parameterization
• Di- and multi-allelic loci, marker haplotypes, multiple loci simultaneously, and parental genotypes
• Easy to fit in existing (Mx) software
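LISREL-based model
$\eta_{(s)} = \alpha_{jk(s)} + B_{jk(s)}\,\eta_{(s)} + \Gamma_{jk(s)}\,\xi + \zeta_{jk(s)}$
$y_{(s)} = \nu_{y,jk(s)} + \Lambda_{y,jk(s)}\,\eta_{(s)} + \varepsilon_{y,jk(s)}$
$x = \nu_{x,k} + \Lambda_{x,k}\,\xi + \varepsilon_{x,k}$
• y: subject variables
• x: family variables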
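As a hedged aside, the structural equation can be solved for the latent subject variables, which shows how genotype- and stratum-specific parameters reach the observed means. The sketch writes $\eta$ for the latent subject variables, $\alpha$ for their intercepts, $B$ for their causal effects on each other, $\Gamma$ for the effects of the latent family variables $\xi$, and $\zeta$, $\varepsilon$ for residuals. It assumes that $I - B_{jk(s)}$ is invertible and that the residuals have mean zero; neither assumption is stated on the slide.

```latex
\begin{align*}
\eta_{(s)} &= \left(I - B_{jk(s)}\right)^{-1}
              \left(\alpha_{jk(s)} + \Gamma_{jk(s)}\,\xi + \zeta_{jk(s)}\right) \\
E\!\left[y_{(s)}\right] &= \nu_{y,jk(s)} + \Lambda_{y,jk(s)}
              \left(I - B_{jk(s)}\right)^{-1}
              \left(\alpha_{jk(s)} + \Gamma_{jk(s)}\,E[\xi]\right)
\end{align*}
```

Under these assumptions a genetic deviation in $\alpha$, $B$, or $\Gamma$ changes the expected value of every indicator that loads on the affected latent variable.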
Alternative Models
• Conditional model
$\eta_{(s)} = \alpha_{jk(s)} + B_{jk(s)}\,\eta_{(s)} + \Gamma_{jk(s)}\,x_s + \zeta_{jk(s)}$
$y_{(s)} = \nu_{jk(s)} + \Lambda_{jk(s)}\,\eta_{(s)} + K_{jk(s)}\,x_s + \varepsilon_{jk(s)}$
• The x-variables are independent variables: subject plus family variables
• Relaxes the assumption of full multivariate normality
• Allows curvilinear or non-linear effects of the x-variables
• Disadvantages:
  - Optimization
  - Measurement model for the x-variables
• Other modeling frameworks
Partitioning parameter matrices
• Most matrices are partitioned into:
  a) general matrices, which are not subscripted and represent the overall model in all genotype groups and population strata
  b) genetic matrices, which are subscripted j and represent deviations from the general model caused by locus effects
  c) stratification matrices, which are subscripted k and represent deviations from the general model caused by population stratification
How?
• Example: matrix Beta, the causal effects of the subject variables on each other
$B_{jk(s)} = B + B_j\,(g_s \otimes I) + B_k\,(f \otimes I)$
• Main effects are in $B$, which has dimension $n_\eta \times n_\eta$
Genetic effects in the term $B_j\,(g_s \otimes I)$
• The $n_g \times 1$ vector $g_s$ contains $n_g$ dummy variables coding the genotype (haplotype) of subject s
• The maximum number of deviations from $B$ is thus #genotypes - 1
• Sets of dummy variables can be used to study multiple loci simultaneously or the effects of parental genotypes
• $B_j = [\,B_1 \mid B_2 \mid \dots \mid B_{n_g}\,]$, with dimension $n_\eta \times (n_g\,n_\eta)$,
• where $B_1$ is the $n_\eta \times n_\eta$ submatrix containing the effects of the first dummy variable, etc.
Stratification effects in the term $B_k\,(f \otimes I)$
• The $n_f \times 1$ vector $f$ contains the $n_f$ dummy variables used to code family types
• The maximum number of deviations is thus #family types - 1
• $B_k = [\,B_1 \mid B_2 \mid \dots \mid B_{n_f}\,]$, with dimension $n_\eta \times (n_f\,n_\eta)$,
• where $B_1$ is the $n_\eta \times n_\eta$ submatrix containing the effects of the first dummy variable, etc.
• $\otimes$ and $I$ select the proper submatrix for each dummy variable
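The partitioning can be checked numerically. The following is a minimal sketch in plain Python, assuming the lost operator between the dummy vector and I is a Kronecker product (the stated dimensions only fit that reading). All numbers and the helper names `kron`, `matmul`, `matadd`, `identity`, and `effective_matrix` are hypothetical and chosen for illustration only.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]


def matmul(a, b):
    """Ordinary matrix product of two lists-of-rows matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def matadd(*mats):
    """Elementwise sum of equally sized matrices."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*mats)]


def identity(n):
    """n x n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def effective_matrix(B, B_j, g, B_k, f):
    """Return B + B_j (g kron I) + B_k (f kron I) for one subject."""
    eye = identity(len(B))
    g_col = [[v] for v in g]  # n_g x 1 genotype dummy vector
    f_col = [[v] for v in f]  # n_f x 1 family-type dummy vector
    genetic = matmul(B_j, kron(g_col, eye))
    stratification = matmul(B_k, kron(f_col, eye))
    return matadd(B, genetic, stratification)


# Hypothetical example: two subject variables, three genotype groups
# (two dummies), two family types (one dummy).
B = [[0.0, 0.5], [0.0, 0.0]]                             # general effects
B_j = [[0.0, 0.125, 0.0, 0.25], [0.0, 0.0, 0.0, 0.0]]    # [B_1 | B_2]
B_k = [[0.0, -0.125], [0.0, 0.0]]                        # [B_1]
g = [0, 1]   # subject carries the genotype coded by the second dummy
f = [1]      # family is of the type coded by the first dummy

result = effective_matrix(B, B_j, g, B_k, f)
print(result)
```

Because the dummy vectors contain only zeros and ones, the product simply adds the submatrices whose dummy variable equals one to the general matrix.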
Sibling pairs

                                    Subject A  Subject B   F1  F2  F3  F4  F5
Not informative of stratification       2          2        1   0   0   0   0
                                        1          1        0   1   0   0   0
                                        0          0        0   0   1   0   0
Informative of stratification           2          1        0   0   0   1   0
                                        2          0        0   0   0   0   1
                                        1          0        0   0   0   0   0
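The sibling-pair coding can be sketched as a simple lookup. This is an illustration only: it assumes the genotype codes 0, 1, 2 count copies of one allele and that the order of the two siblings does not matter, and the names `SIB_PAIR_DUMMIES`, `family_dummies`, and `is_informative` are hypothetical.

```python
# Dummy variables F1..F5 for each sibling-pair family type, keyed by the
# pair of genotype codes with the larger code first.
SIB_PAIR_DUMMIES = {
    (2, 2): [1, 0, 0, 0, 0],
    (1, 1): [0, 1, 0, 0, 0],
    (0, 0): [0, 0, 1, 0, 0],
    (2, 1): [0, 0, 0, 1, 0],
    (2, 0): [0, 0, 0, 0, 1],
    (1, 0): [0, 0, 0, 0, 0],  # reference family type: all dummies zero
}


def family_dummies(geno_a, geno_b):
    """Return the family-type dummy vector f for a sibling pair."""
    key = (max(geno_a, geno_b), min(geno_a, geno_b))
    return list(SIB_PAIR_DUMMIES[key])


def is_informative(geno_a, geno_b):
    """In the table, only pairs with different genotypes are informative."""
    return geno_a != geno_b


print(family_dummies(1, 2), is_informative(1, 2))
```

Six family types need only five dummy variables because one type serves as the reference category.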
Two parents, one “child”

                                    Parent A  Parent B  Subject A   F1  F2  F3  F4  F5
Not informative of stratification      2         2          2        1   0   0   0   0
                                       2         0          1        0   1   0   0   0
                                       0         0          0        0   0   1   0   0
Informative of stratification          2         1          2        0   0   0   1   0
                                       2         1          1        0   0   0   1   0
                                       1         1          2        0   0   0   0   1
                                       1         1          1        0   0   0   0   1
                                       1         1          0        0   0   0   0   1
                                       1         0          1        0   0   0   0   0
                                       1         0          0        0   0   0   0   0
General interpretation
• Genetic effects on:
  - means are “main” effects
  - relations between variables are interaction effects
  - residuals are variance effects
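A small worked case may make this interpretation concrete. The sketch assumes a single genotype dummy $g$ and two variables, with $y_1$ affecting $y_2$; the symbols $\alpha_j$, $\beta_j$, and $\psi_j$ are hypothetical names for the genetic deviations in the intercept, the regression weight, and the residual variance.

```latex
\begin{align*}
y_2 &= (\alpha + \alpha_j g) + (\beta + \beta_j g)\,y_1 + \zeta,
       \qquad \operatorname{Var}(\zeta) = \psi + \psi_j g \\
    &= \alpha + \alpha_j\,g + \beta\,y_1 + \beta_j\,(g \cdot y_1) + \zeta
\end{align*}
```

Here $\alpha_j g$ shifts the mean (main effect), $\beta_j\,(g \cdot y_1)$ is a product term (interaction effect), and $\psi_j g$ changes the residual variance (variance effect).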
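[Path diagram: two variables y1 and y2 with means α1 and α2 and genotype-specific deviations α1(1), α1(2), α2(1), α2(2); reciprocal effects β12 and β21; residuals ζ1 and ζ2]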
The model is fitted by raw maximum likelihood: the log-likelihood function, which sums the individual log-likelihoods, is maximized given the observed data. Minus two times the difference between the log-likelihoods of two nested models is (asymptotically) chi-square distributed, with the difference in the number of estimated parameters as the degrees of freedom.
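The nested-model comparison can be sketched in a few lines. This is a minimal stdlib-only illustration, not the Mx implementation: the function names `chi2_sf` and `likelihood_ratio_test` and the log-likelihood values in the example are hypothetical, and the p-value relies on the asymptotic chi-square approximation.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) of a chi-square variable with df > 0."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # Power series for the regularized lower incomplete gamma function P(a, z).
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    # 1 - lower loses precision for extremely small p-values.
    return max(0.0, 1.0 - lower)


def likelihood_ratio_test(loglik_restricted, loglik_full,
                          n_par_restricted, n_par_full):
    """Return (statistic, degrees of freedom, p-value) for two nested models."""
    statistic = -2.0 * (loglik_restricted - loglik_full)
    df = n_par_full - n_par_restricted
    return statistic, df, chi2_sf(statistic, df)


# Hypothetical fit results: dropping two genetic parameters from the full model.
stat, df, p_value = likelihood_ratio_test(-1012.0, -1010.0, 10, 12)
print(stat, df, p_value)
```

The restricted model is the one with the effects of interest fixed to zero, so a small p-value indicates that those effects improve the fit.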
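Specification
• In most instances, specification comes down to a selection of matrices
• Working out the matrix dimensions by hand is boring and error-prone
• To get started, there is therefore a simple program
• It runs in batch mode or by asking questions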
MxScript
• Data structure
  - Number of (latent) subject variables?
  - Number of subjects in largest family?
  - Number of dummy variables for genotypes?
• Matrices to be used
  - Do the subject variables have causal effects on each other? BETA?
  - GENETIC: causal relations between subject variables? BETA?
  - STRATIFICATION: means of subject variables? ALPHA?
• File names
  - Name of file with your data? (DOS name)
  - Name of the file for the Mx script? (DOS name)
Structure of the Mx script
• In most instances, four groups:

Group  Function                Free parameters  Starting values
1      General part            yes              yes
2      Genetic effects         yes
3      Stratification effects  yes
4      Fit model to data

• To answer the questions interactively, type from the DOS prompt: MxScript <ENTER>
• To run in batch with an input file, type from the DOS prompt: MxScript input.dat <ENTER>
Example
• Name of the data file: example.dat
• Sibling pairs, no parents
• Three genotype groups
• Family variables are in the data file (indicate that you want to specify admixture effects)
• Starting values: sample drawn from a multivariate distribution with means 0 and variances 1.5
General part
[Path diagram: latent variable exercise (indicators Intensity, Duration) and latent variable BMD (indicators Arm, Spine, Hip)]
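Genetic + stratification effects
[Path diagram: exercise (indicators Intensity, Duration) and BMD (indicators Arm, Spine, Hip). Common pathway: effects on the latent variable BMD? Independent pathway: effects on the indicators Arm, Spine, and Hip?]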
Tests
• Common pathway: estimate a model with genetic and stratification effects on the mean of the second latent variable and test for significance of:
  - Genetic effects
  - Stratification effects
  - Genetic + stratification effects
• Independent pathway: estimate a model with genetic and stratification effects on the means of the indicators of the second latent variable and test for significance of:
  - Genetic effects
  - Stratification effects
  - Genetic + stratification effects
Free elements
  a Full 2 1 Free          [in the Matrices ... End matrices section]
  Free a 1 1 a 2 1         [after End matrices: free individual elements]
  Free a 1 1 to a 2 1      [after End matrices: free a range of elements]
Solution Copy files from f:\edwin\solution