
Sylvia Richardson sylvia.richardson@mrc-bsu.cam.ac.uk MRC Biostatistics Unit and University of Cambridge

Application of Bayesian methods in genomics. Sylvia Richardson sylvia.richardson@mrc-bsu.cam.ac.uk MRC Biostatistics Unit and University of Cambridge. MRC Biostatistics Unit Research Themes. Statistical Genomics. Background.


Presentation Transcript


  1. Application of Bayesian methods in genomics Sylvia Richardson sylvia.richardson@mrc-bsu.cam.ac.uk MRC Biostatistics Unit and University of Cambridge

  2. MRC Biostatistics Unit Research Themes

  3. Statistical Genomics

  4. Background • In integrative genomics, many questions of interest involve linking a large set of p predictors, e.g. SNPs or gene expression levels, to q responses, e.g. disease characteristics or biological phenotypes, using a moderate number of samples n • Examples include finding genetic markers associated with lipid metabolism, or genetic control points regulating the process of transcription • Statistical framework: sparse Bayesian regression with a model selection component
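
To make the framework on this slide concrete, a generic single-response formulation of sparse Bayesian regression with selection indicators is sketched below. The slide gives no equations, so the notation (γ, ω, α, σ²) and the independent Bernoulli prior are assumptions of this sketch, not necessarily the exact specification used in the talk.

```latex
\begin{aligned}
y &= \alpha 1_n + X_\gamma \beta_\gamma + \varepsilon,
  \qquad \varepsilon \sim \mathcal{N}(0, \sigma^2 I_n), \\
\gamma &= (\gamma_1, \dots, \gamma_p) \in \{0, 1\}^p,
  \qquad \beta_j = 0 \ \text{whenever}\ \gamma_j = 0, \\
p(\gamma \mid \omega) &= \prod_{j=1}^{p} \omega^{\gamma_j} (1 - \omega)^{1 - \gamma_j}, \\
p(\gamma \mid y) &\propto p(y \mid \gamma)\, p(\gamma).
\end{aligned}
```

Here X_γ contains only the columns of X with γ_j = 1, so model selection amounts to exploring the posterior over the 2^p possible values of γ.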

  5. Sparse regression computations • When p is large, model exploration has to search a vast space of possible models • Our implementation, GUESS, uses Evolutionary Monte Carlo techniques (Evolutionary Stochastic Search, ESS, which runs several MCMC chains in parallel) and GPU computing • Subset selection for single and multiple response phenotypes • An R package, R2GUESS, which calls C++ code, is soon to be released. R2GUESS runs the ESS algorithm and performs the complex post-processing of the output.
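
The idea of running several MCMC chains in parallel over the model space can be illustrated with a much simplified Python sketch. It is not the GUESS or R2GUESS implementation: the scoring function `toy_log_score`, the temperature ladder, and the two move types (single-indicator flips and swaps between adjacent chains) are all assumptions made for illustration, and the slides do not describe the moves actually used.

```python
import math
import random


def toy_log_score(gamma, signal=(2, 7), reward=4.0, penalty=1.0):
    """Hypothetical log model score: rewards 'true' predictors, penalises size."""
    hits = sum(gamma[j] for j in signal)
    return reward * hits - penalty * sum(gamma)


def accept(log_ratio, rng):
    """Metropolis acceptance test on the log scale."""
    return rng.random() < math.exp(min(0.0, log_ratio))


def tempered_search(p=10, temps=(1.0, 2.0, 4.0), sweeps=5000, seed=0):
    rng = random.Random(seed)
    chains = [[0] * p for _ in temps]      # one indicator vector per chain
    scores = [toy_log_score(g) for g in chains]
    counts = [0] * p                       # inclusion counts, temperature-1 chain
    for _ in range(sweeps):
        # Local move: propose flipping one indicator in every chain.
        for c, t in enumerate(temps):
            proposal = chains[c][:]
            j = rng.randrange(p)
            proposal[j] = 1 - proposal[j]
            new_score = toy_log_score(proposal)
            if accept((new_score - scores[c]) / t, rng):
                chains[c], scores[c] = proposal, new_score
        # Exchange move: propose swapping the states of two adjacent chains.
        c = rng.randrange(len(temps) - 1)
        log_ratio = (scores[c + 1] - scores[c]) * (1 / temps[c] - 1 / temps[c + 1])
        if accept(log_ratio, rng):
            chains[c], chains[c + 1] = chains[c + 1], chains[c]
            scores[c], scores[c + 1] = scores[c + 1], scores[c]
        for j in range(p):
            counts[j] += chains[0][j]
    return [n / sweeps for n in counts]


freqs = tempered_search()
print([round(f, 2) for f in freqs])
```

The returned values are estimated inclusion frequencies from the temperature-1 chain. The hotter chains accept more moves and feed new configurations to the cold chain through the exchange step, which is the basic reason for running chains in parallel when the model space is vast.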

  6. Application of GUESS to the genetic association analysis of lipid phenotypes

  7. Analysis strategy

  8. Analysis strategy and results for groups of correlated phenotypes • Altogether 16 markers were found associated with different groups of phenotypes, in good correspondence with large GWAS analyses

  9. Comparison with other methods

  10. Extension of sparse framework to hierarchically related regressions • Linking a large number q of responses to a large set of p predictors • Motivation: genetic regulation of expression

  11. Hierarchical structure • We now have a (p × q) matrix Γ of selection indicators • We want to borrow information across the responses to highlight predictors common to several responses • Adopt a parametrisation of Ω involving a parameter common to each column, while still controlling sparsity in each regression.
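
A small Python sketch of one way such a parametrisation could work is given below. It assumes that Ω holds prior inclusion probabilities and that each entry is the product of a per-response sparsity level and a per-predictor multiplier, capped at 1. The multiplicative form, the names `omega` and `rho`, the numerical values, and the choice of predictors as rows are all assumptions of this sketch rather than details stated on the slide.

```python
import random


def inclusion_probs(omega, rho):
    """Prior inclusion probabilities: one row per predictor, one column per response."""
    return [[min(1.0, r * w) for w in omega] for r in rho]


def sample_indicators(probs, rng):
    """Draw a 0/1 selection matrix with the given entry-wise probabilities."""
    return [[1 if rng.random() < pr else 0 for pr in row] for row in probs]


rng = random.Random(1)
p, q = 5, 200
omega = [0.02] * q   # hypothetical sparsity level of each regression
rho = [1.0] * p      # hypothetical multiplier shared across regressions
rho[3] = 20.0        # predictor 3 is given a high shared multiplier

probs = inclusion_probs(omega, rho)
gamma = sample_indicators(probs, rng)
row_totals = [sum(row) for row in gamma]
print(row_totals)
```

Under these assumed values, the predictor with the large shared multiplier is selected in many more regressions than the others, while each regression still includes only a few predictors on average. This is the sense in which a shared parameter lets information be borrowed across responses without giving up sparsity.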

  12. Mouse gene expression case study • Discovery of 6 “hot spots”, i.e. genetic markers associated with a substantial proportion of transcripts.

  13. Summary • Focus on modelling strategies where multidimensional and multivariate aspects are fully exploited • Perform information synthesis: different sources of data, hierarchical structures, prior models informed by external information, … • Allow for model uncertainty and compare with alternative analytical strategies • Embed models and methods within state-of-the-art Bayesian computations.
