
Associations to Quantitative Trait Network and Analysis of Asthma Data

Associations to Quantitative Trait Network and Analysis of Asthma Data. Seyoung Kim and Eric P. Xing {sssykim, epxing}@cs.cmu.edu Machine Learning Dept. Carnegie Mellon University. 10/30/2009. Genome Informatics 2009 @ Cold Spring Harbor Lab. Association Analysis of Single Trait.


Presentation Transcript


  1. Associations to Quantitative Trait Network and Analysis of Asthma Data Seyoung Kim and Eric P. Xing {sssykim, epxing}@cs.cmu.edu Machine Learning Dept. Carnegie Mellon University 10/30/2009 Genome Informatics 2009 @ Cold Spring Harbor Lab

  2. Association Analysis of Single Trait a univariate phenotype: i.e., disease/control, gene expression level causal SNP

  3. Association Analysis of Quantitative Trait Network a univariate phenotype: i.e., disease/control, gene expression level causal SNP

  4. TCGACGTTTTACTGTACAATT Genetic Association for Asthma Clinical Traits Subnetworks for lung physiology Subnetwork for quality of life

  5. TCGACGTTTTACTGTACAATT Expression QTL Mapping Microarray experiments Gene correlation network with gene modules

  6. Motivation : Multiple-trait Association • Traditional approach: analyze one phenotype at a time • Our approach: consider multiple related phenotypes jointly and incorporate correlation structure in the phenotypes • Graph-guided fused lasso (Kim & Xing, PLoS Genetics, 2009)

  7. Multivariate Regression for Single-Trait Association Analysis. Model: $y = X\beta$, where $y$ is the allergy symptom (e.g., 2.1), $X$ is the genotype (e.g., TGAACCATGAAGTA), and $\beta$ is the association strength.

  8. Multivariate Regression for Single-Trait Association Analysis. Least-squares estimate: $\hat{\beta} = \arg\min_{\beta} (y - X\beta)^T (y - X\beta)$ ($y$: allergy symptom, e.g., 2.1; $X$: genotype, e.g., TGAACCATGAAGTA; $\beta$: association strength). Many non-zero associations: how to pick the threshold?

  9. Lasso for Reducing False Positives (Tibshirani, 1996). Lasso penalty for sparsity: $\hat{\beta} = \arg\min_{\beta} (y - X\beta)^T (y - X\beta) + \lambda \sum_j |\beta_j|$ ($y$: allergy symptom, e.g., 2.1; $X$: genotype, e.g., TGAACCATGAAGTA; $\beta$: association strength). Many zero associations (sparse results), but what if there are multiple related traits?
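The lasso objective on slide 9 (squared error plus λ times the sum of |β_j|) can be illustrated with a small solver. The following is a minimal sketch using cyclic coordinate descent in plain Python; the solver choice, the function names, and the toy genotype data are assumptions made for illustration and are not taken from the talk or from the GFlasso software.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0 inside [-threshold, threshold]."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=200):
    """Minimize (y - X b)^T (y - X b) + lam * sum_j |b_j| by cyclic coordinate descent.

    X is a list of rows (one per individual); y is a list of trait values.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = [float(v) for v in y]  # equals y - X beta while beta is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0:
                continue  # an all-zero SNP column carries no information
            # Inner product of column j with the residual that excludes SNP j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n))
            # With an unscaled squared-error loss, the threshold is lam / 2.
            new_bj = soft_threshold(rho, lam / 2.0) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


# Hypothetical toy data: 6 individuals, 3 SNPs coded 0/1/2; only SNP 0 drives the trait.
X = [[0, 1, 2], [1, 0, 1], [2, 1, 0], [0, 2, 1], [1, 1, 1], [2, 0, 2]]
y = [0.0, 1.0, 2.0, 0.0, 1.0, 2.0]
beta_hat = lasso_coordinate_descent(X, y, lam=1.0)
```

On this toy input the estimate for SNP 0 is shrunk slightly below 1 (to about 0.95) and the other two coefficients are exactly zero, which is the sparsity effect the slide describes.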

  10. Multivariate Regression for Multiple-Trait Association Analysis. Traits: lung physiology (FEV, FEF) and allergy (allergy for roaches, allergy for cats, allergy in spring), with example values (3.4, 1.5, 2.1, 0.9, 1.8); genotype: TGAACCATGAAGTA; association strength: unknown (?). Objective: $\arg\min_{\beta} (y - X\beta)^T (y - X\beta) + \lambda \sum_j |\beta_j|$. How to combine information across multiple traits to increase the power?

  11. Multivariate Regression for Multiple-Trait Association Analysis. Traits: lung physiology (FEV, FEF) and allergy (allergy for roaches, allergy for cats, allergy in spring), with example values (3.4, 1.5, 2.1, 0.9, 1.8); genotype: TGAACCATGAAGTA. Objective: $\arg\min_{\beta} (y - X\beta)^T (y - X\beta) + \lambda \sum_j |\beta_j|$ + (graph-guided fusion penalty). We introduce a graph-guided fusion penalty.

  12. Multivariate Regression for Multiple-Trait Association Analysis. Traits: lung physiology (FEV, FEF) and allergy (allergy for roaches, allergy for cats, allergy in spring), with example values (3.4, 1.5, 2.1, 0.9, 1.8); genotype: TGAACCATGAAGTA. Objective: $\arg\min_{\beta} (y - X\beta)^T (y - X\beta) + \lambda \sum_j |\beta_j|$ + (graph-guided fusion penalty).

  13. Fusion Penalty • Fusion penalty: $|\beta_{jk} - \beta_{jm}|$, where $\beta_{jk}$ is the association strength between SNP $j$ and trait $k$, and $\beta_{jm}$ is the association strength between SNP $j$ and trait $m$ • If two traits are correlated (connected in the trait network), they are likely to share a similar association strength

  14. Graph-Constrained Fused Lasso • Fusion effect propagates to the entire network • Association between SNPs and subnetworks of traits Overall effect ACGTTTTACTGTACAATT

  15. Graph-Weighted Fused Lasso • Subnetwork structure is embedded as densely connected nodes with large edge weights • Edges with small weights are effectively ignored Overall effect ACGTTTTACTGTACAATT
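To make the fusion penalty of slides 13 to 15 concrete, the sketch below evaluates it for a small coefficient matrix. It assumes the penalty is summed over the edges of the trait network and over all SNPs, with an edge weight of 1 for the graph-constrained variant and a user-supplied weight for the graph-weighted variant. It uses the slide's form |β_jk − β_jm| and does not treat negatively correlated traits specially. The function name, the edge format, and the numbers are hypothetical.

```python
def fusion_penalty(B, edges):
    """Return the sum over edges (k, m, w) of w * sum_j |B[j][k] - B[j][m]|.

    B[j][k] is the association strength between SNP j and trait k.
    Use w = 1.0 on every edge for the unweighted (graph-constrained) variant,
    or a per-edge weight for the graph-weighted variant.
    """
    total = 0.0
    for k, m, w in edges:
        total += w * sum(abs(row[k] - row[m]) for row in B)
    return total


# Hypothetical example: 2 SNPs (rows) and 3 traits (columns).
B = [[1.0, 0.5, 0.0],
     [0.0, 0.0, 2.0]]

# Unweighted: both edges count equally.
unweighted = fusion_penalty(B, [(0, 1, 1.0), (1, 2, 1.0)])

# Weighted: the weakly weighted edge (1, 2) contributes little to the penalty.
weighted = fusion_penalty(B, [(0, 1, 0.9), (1, 2, 0.1)])
```

With a small weight on edge (1, 2), the large coefficient difference between traits 1 and 2 is barely penalized, which matches the slide's remark that edges with small weights are effectively ignored.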

  16. Previous Works vs. Our Approach

  17. Asthma Dataset • 543 severe asthma patients from the Severe Asthma Research Program (SARP) • Genotypes: 34 SNPs in the IL-4R gene • 40 kb region of chromosome 16 • Missing genotypes imputed with PHASE (Li and Stephens, 2003) • Traits: 53 asthma-related clinical traits • Quality of life: emotion, environment, activity, symptom • Family history: number of siblings with allergy, does the father have asthma? • Asthma symptoms: chest tightness, wheeziness

  18. Asthma Trait Network. Figure panels: trait correlation structure; trait network (threshold at 0.7). Traits are reordered according to hierarchical clustering results.
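The thresholding step on slide 18 can be sketched as follows. This is an illustration only: it assumes Pearson correlation and that the 0.7 cutoff is applied to the absolute correlation, neither of which the transcript states. The trait names and values are invented.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def trait_network(traits, threshold=0.7):
    """Return edges (name_a, name_b, r) for trait pairs with |r| >= threshold."""
    edges = []
    for name_a, name_b in combinations(traits, 2):
        r = pearson(traits[name_a], traits[name_b])
        if abs(r) >= threshold:
            edges.append((name_a, name_b, r))
    return edges


# Hypothetical measurements for five patients.
traits = {
    "FEV1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "FEF": [1.1, 1.9, 3.2, 3.9, 5.1],
    "QoL": [2.0, 1.0, 2.0, 1.0, 2.0],
}
edges = trait_network(traits)
```

Here only the FEV1 and FEF pair clears the threshold, so the resulting network has a single edge; the correlation stored on that edge could serve as the edge weight in a weighted variant.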

  19. Asthma Trait Network Subnetwork for Asthma symptoms Phenotype Correlation Structure Subnetwork for lung physiology Subnetwork for quality of life

  20. Results from Single-SNP/Trait Test • Lung physiology-related traits I • Baseline FEV1 predicted value: MPVLung • Pre FEF 25-75 predicted value • Average nitric oxide value: online • Body Mass Index • Postbronchodilation FEV1, liters: Spirometry • Baseline FEV1 % predicted: Spirometry • Baseline predrug FEV1, % predicted • Baseline predrug FEV1, % predicted • Q551R SNP • Codes for amino-acid changes in the intracellular signaling portion of the receptor • Exon 12. Figure panels (phenotypes by SNPs): trait correlation matrix; trait network; single-marker single-trait test with permutation test at α = 0.05 and at α = 0.01.
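Slide 20 reports single-marker single-trait tests assessed by permutation at α = 0.05 and α = 0.01. The transcript does not say which test statistic was used, so the sketch below assumes the absolute Pearson correlation between genotype and trait and shuffles the trait values to build the null distribution. All names and data are hypothetical.

```python
import math
import random


def abs_correlation(x, y):
    """Absolute Pearson correlation (0.0 if either sequence is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0
    return abs(cov) / math.sqrt(var_x * var_y)


def permutation_p_value(genotype, trait, n_perm=999, seed=0):
    """Estimate a p-value by shuffling trait values across individuals."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs_correlation(genotype, trait)
    shuffled = list(trait)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs_correlation(genotype, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Hypothetical SNP (coded 0/1/2) with a strong effect on a quantitative trait.
genotype = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
trait = [1.0, 1.2, 0.9, 1.1, 2.0, 2.1, 1.9, 2.2, 3.0, 3.1, 2.9, 3.2]
p_value = permutation_p_value(genotype, trait)
significant_at_005 = p_value <= 0.05
```

In a full analysis this would be repeated for every SNP and trait pair, which is the one-phenotype-at-a-time baseline the talk contrasts with the joint approach.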

  21. Comparison of Gflasso with Others • Lung physiology-related traits I • Baseline FEV1 predicted value: MPVLung • Pre FEF 25-75 predicted value • Average nitric oxide value: online • Body Mass Index • Postbronchodilation FEV1, liters: Spirometry • Baseline FEV1 % predicted: Spirometry • Baseline predrug FEV1, % predicted • Baseline predrug FEV1, % predicted Phenotypes Phenotypes • Q551R SNP • Codes for amino-acid changes in the intracellular signaling portion of the receptor • Exon 12 Trait Correlation Matrix Trait Network SNPs ? ? Single-Marker Single-Trait Test Graph-weighted Fused Lasso Graph-constrained Fused Lasso Lasso

  22. Software for Genome-Phenome Association. Screenshot labels: SNP file; gene expression file; gene module 1; chromosome 12, positions 10600000 to 10700000.

  23. Future Work: Correlated Genome-Transcriptome-Phenome Association Analysis • GFlasso • Tree lasso • Population lasso Phenome Structure Genome Structure Linkage Disequilibrium Three-way Association! Population Structure Clinical Traits Transcriptome Structure • Bi-clustering • GFlasso • Tree lasso • Population lasso Gene Modules

  24. Thanks! • Software is available at http://sailing.cs.cmu.edu/gflasso • Acknowledgements: Ross Curtis, Kyung-Ah Sohn, Sally Wenzel Funding:

  25. Reference • Tibshirani R (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B 58:267–288. • Weller J, Wiggans G, Vanraden P, Ron M (1996) Application of a canonical transformation to detection of quantitative trait loci with the aid of genetic markers in a multi-trait experiment. Theoretical and Applied Genetics 92:998–1002. • Mangin B, Thoquet B, Grimsley N (1998) Pleiotropic QTL analysis. Biometrics 54:89–99. • Chen Y, Zhu J, Lum P, Yang X, Pinto S, et al. (2008) Variations in DNA elucidate molecular networks that cause disease. Nature 452:429–35. • Lee SI, Dudley A, Drubin D, Silver P, Krogan N, et al. (2009) Learning a prior on regulatory potential from eQTL data. PLoS Genetics 5:e1000358. • Emilsson V, Thorleifsson G, Zhang B, Leonardson A, Zink F, et al. (2008) Genetics of gene expression and its effect on disease. Nature 452:423–28.

  26. Multiple-Trait Association: Dependencies in Phenome Association with Phenome Traditional Approach causal SNP ACGTTTTACTGTACAATT ACGTTTTACTGTACAATT a univariate phenotype: i.e., disease/control, gene expression level Multivariate complex syndrome (e.g., asthma) age at onset, history of eczema genome-wide expression profile

  27. Multiple-trait Association: Graph-Constrained Fused Lasso Step 1: Thresholded correlation graph of phenotypes Step 2: Graph-constrained fused lasso ACGTTTTACTGTACAATT Fusion Lasso Penalty Graph-constrained fusion penalty

  28. Multiple-trait Association: Graph-Weighted Fused Lasso Step 1: Thresholded correlation graph of phenotypes with weights Step 2: Graph-weighted fused lasso ACGTTTTACTGTACAATT Weighted Fusion Lasso Penalty Graph-constrained fusion penalty

  29. Estimating Parameters (Association Strength) • Quadratic programming formulation • Graph-constrained fused lasso • Graph-weighted fused lasso • Many publicly available software packages for solving convex optimization problems can be used
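Slide 29 mentions a quadratic programming formulation without showing it. One standard way to turn an objective with absolute-value penalties into a quadratic program is to introduce auxiliary variables that bound each absolute value. The block below is a sketch of that generic reformulation, not necessarily the formulation used by the authors. The symbols $\gamma$ (fusion weight), $E$ (edge set of the trait network), $y_k$ and $\beta_k$ (trait $k$ and its coefficient vector), and the auxiliary variables $u_{jk}$ and $v_{jkm}$ are assumptions introduced here.

```latex
\begin{aligned}
\min_{\beta,\,u,\,v}\quad
  & \sum_{k} (y_k - X\beta_k)^T (y_k - X\beta_k)
    + \lambda \sum_{k}\sum_{j} u_{jk}
    + \gamma \sum_{(k,m)\in E}\sum_{j} v_{jkm} \\
\text{subject to}\quad
  & -u_{jk} \le \beta_{jk} \le u_{jk}
    && \text{for all } j,\ k, \\
  & -v_{jkm} \le \beta_{jk} - \beta_{jm} \le v_{jkm}
    && \text{for all } j,\ (k,m)\in E .
\end{aligned}
```

At the optimum each $u_{jk}$ equals $|\beta_{jk}|$ and each $v_{jkm}$ equals $|\beta_{jk} - \beta_{jm}|$, so the problem has a quadratic objective with linear constraints and can be passed to a general-purpose convex solver, as the slide notes. For a graph-weighted variant, each fusion term would additionally be multiplied by its edge weight.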

  30. Simulation Results Phenotypes Trait Correlation Matrix Thresholded Trait Correlation Network • 50 SNPs taken from HapMap chromosome 7, CEU population • 10 traits SNPs True Regression Coefficients Single SNP-Single Trait Test Graph-constrained Fused Lasso Graph-weighted Fused Lasso Ridge Regression Lasso

  31. Results from Association Phenotypes Phenotypes • Lung physiology-related traits II • Percent difference in FEV1: Spirometry • Post FEF 25-75 value • Postbronchodilation FEV1, % pred: Spirometry • Baseline FEV1, liters: Spirometry • Baseline predrug FEV1, liters • Maximum FEV1, liters: MPVLung • Baseline predrug FEV1, liters Trait Correlation Matrix Trait Network SNPs ? ? Single-Marker Single-Trait Test Graph-weighted Fused Lasso Graph-constrained Fused Lasso Lasso

  32. Linkage Disequilibrium Structure in IL-4R gene SNP rs3024622 SNP rs3024660 $r^2 = 0.07$ $r^2 = 0.64$ SNP Q551R

  33. Computation Time

  34. Conclusions • Summary • Dependencies in phenome: Graph-guided fused lasso framework incorporates correlation information among traits to detect pleiotropic effect of genotypic variations. • Analysis of the asthma dataset suggests the effectiveness of the method • Future Work • Dependencies in genome?: Poster Q06 (This evening) • Dependencies in both genome and phenome • Learn the trait correlation network and association strengths jointly • Availability: http://www.sailing.cs.cmu.edu/
