
Measuring Isoform Expression from RNA-Seq data Based on LDA



Presentation Transcript


  1. Measuring Isoform Expression from RNA-Seq data Based on LDA 刘学军 2012.9.21

  2. Outline • Background • Modeling RNA-Seq data • Results

  3. Alternatively spliced isoforms

  4. RNA-Seq data – an example • Reference: ACGTCCCC • 12 ACGTC reads • 8 CGTCC reads • 9 GTCCC reads • 5 TCCCC reads • This gene can be summarized by the sequence of counts 12, 8, 9, 5.
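
The counts on this slide can be reproduced with a short sketch. It assumes that every read is an exact copy of a 5-base window of the reference and groups reads by the position where they start. The names `reads` and `start_counts` are illustrative and do not come from the slides.

```python
from collections import Counter

reference = "ACGTCCCC"

# Read multiset from the slide: sequence -> number of reads observed.
reads = {"ACGTC": 12, "CGTCC": 8, "GTCCC": 9, "TCCCC": 5}


def start_counts(reference, reads):
    """Count reads by the reference position at which they start.

    Assumes every read matches the reference exactly and at a unique
    position, which holds for this toy example but not for real data.
    """
    read_len = len(next(iter(reads)))
    counts = Counter()
    for seq, n in reads.items():
        pos = reference.find(seq)
        if pos < 0:
            raise ValueError(f"read {seq!r} does not align to the reference")
        counts[pos] += n
    # One count per possible start position, in reference order.
    n_positions = len(reference) - read_len + 1
    return [counts[p] for p in range(n_positions)]


print(start_counts(reference, reads))
```

Real aligners also have to handle mismatches, reads that map to several places, and reads that span splice junctions, none of which this toy example attempts.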

  5. Structure of RNA-seq data

  6. LDA: Latent Dirichlet Allocation

  7. LDAseq - probe

  8. LDAseq

  9. Convert \theta to expression level • Obtain P(\theta|D) • Normalize counts to sequencing depth and isoform length:
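
The normalization formula on this slide is not present in the transcript. As a minimal sketch, the code below assumes an RPKM-style normalization (reads per kilobase of isoform per million mapped reads), which is a common convention but is not confirmed to be the one LDAseq uses. It also assumes that \theta gives each isoform's share of the gene's reads. All function names and numbers are hypothetical.

```python
def rpkm(isoform_reads, isoform_length_bp, total_mapped_reads):
    """Reads per kilobase of isoform per million mapped reads."""
    if isoform_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and depth must be positive")
    per_kilobase = isoform_reads / (isoform_length_bp / 1_000)
    return per_kilobase / (total_mapped_reads / 1_000_000)


def isoform_expression(theta, gene_reads, lengths_bp, total_mapped_reads):
    """Turn isoform proportions into normalized expression values.

    theta: estimated share of the gene's reads from each isoform
           (assumed here to sum to 1, e.g. a posterior mean).
    """
    if abs(sum(theta) - 1.0) > 1e-9:
        raise ValueError("theta must sum to 1")
    return [rpkm(t * gene_reads, length, total_mapped_reads)
            for t, length in zip(theta, lengths_bp)]


# Hypothetical numbers, for illustration only.
values = isoform_expression(
    theta=[0.75, 0.25],
    gene_reads=400,
    lengths_bp=[2_000, 1_000],
    total_mapped_reads=10_000_000,
)
print(values)
```

Dividing by length keeps a long isoform from looking more expressed merely because it yields more reads, and dividing by depth makes values comparable across sequencing runs.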

  10. Workflow of LDAseq

  11. Data set 1 • 3 conditions, each with 2 technical replicates • 9370 genes that contain multiple isoforms

  12. Data set 1 • Histogram of probe number per gene • 72.43 on average

  13. Data set 2 • Two conditions, 8 qRT-PCR validated isoforms

  14. Data set 2 • Histogram of probe number per gene • 72.57 on average

  15. Modelling Multi-response Surfaces for Airfoil Design with Multiple Output Gaussian Process Regression

  16. Gaussian processes • Multiple output GP • MGP in airfoil design

  17. Gaussian Processes • A Gaussian process (GP) is used to describe a distribution over functions. • A GP is a collection of random variables, any finite number of which have a joint Gaussian distribution.

  18. Gaussian Processes The mean function and the covariance function are defined, and the GP can be written as

  19. Gaussian Processes The mean function and the covariance function are defined, and the GP can be written as
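
The equations for slides 18 and 19 are not present in the transcript. For reference, the standard textbook definitions of the mean function, the covariance function, and the GP notation are given below. They are assumed here, not recovered from the slides.

```latex
\begin{align}
m(\mathbf{x}) &= \mathbb{E}\big[f(\mathbf{x})\big], \\
k(\mathbf{x}, \mathbf{x}') &= \mathbb{E}\big[(f(\mathbf{x}) - m(\mathbf{x}))\,(f(\mathbf{x}') - m(\mathbf{x}'))\big], \\
f(\mathbf{x}) &\sim \mathcal{GP}\big(m(\mathbf{x}),\, k(\mathbf{x}, \mathbf{x}')\big).
\end{align}
```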

  20. Gaussian Processes The covariance function implies the prior distribution over functions.

  21. Gaussian Processes Prediction with noise-free observations,
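
As a minimal sketch of noise-free GP prediction, the code below assumes a zero-mean prior, a squared-exponential covariance, and scalar inputs. None of these choices is confirmed by the slides. The helper names (`sq_exp`, `solve`, `gp_predict`) and the training data are hypothetical, and only the Python standard library is used.

```python
import math


def sq_exp(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    return variance * math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def gp_predict(x_train, f_train, x_star, jitter=1e-10):
    """Posterior mean and variance at x_star for a zero-mean GP prior.

    mean = k_*^T K^{-1} f,  var = k(x_*, x_*) - k_*^T K^{-1} k_*
    The jitter on the diagonal is only for numerical stability.
    """
    n = len(x_train)
    K = [[sq_exp(x_train[i], x_train[j]) + (jitter if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k_star = [sq_exp(x, x_star) for x in x_train]
    alpha = solve(K, f_train)   # K^{-1} f
    v = solve(K, k_star)        # K^{-1} k_*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = sq_exp(x_star, x_star) - sum(k * w for k, w in zip(k_star, v))
    return mean, max(var, 0.0)


# Hypothetical training data: three noise-free observations of sin(x).
x_train = [-2.0, 0.0, 2.0]
f_train = [math.sin(x) for x in x_train]

print(gp_predict(x_train, f_train, 2.0))  # at a training input
print(gp_predict(x_train, f_train, 1.0))  # between training inputs
```

At a training input the posterior mean reproduces the observed value and the variance collapses to nearly zero. Between training inputs the variance is positive but smaller than the prior variance.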

  22. Gaussian Processes

  23. Multiple Outputs

  24. Convolution processes for multiple outputs • Consider a set of D output functions where is the input domain. is expressed as

  25. Convolution processes for multiple outputs Consider more than one latent function. The latent functions are taken to be drawn from a zero-mean GP with
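
The symbols on slides 24 and 25 are not present in the transcript. One common way to write the convolution-process construction is shown below as an assumption about what the slides contained. The notation is chosen here for illustration: $G_d$ and $G_{dq}$ are smoothing kernels, $u$ and $u_q$ are latent functions, and $k_q$ is the covariance function of the $q$-th latent process.

```latex
\begin{align}
f_d(\mathbf{x}) &= \int G_d(\mathbf{x} - \mathbf{z})\, u(\mathbf{z})\, d\mathbf{z},
  \qquad d = 1, \dots, D, \\
f_d(\mathbf{x}) &= \sum_{q=1}^{Q} \int G_{dq}(\mathbf{x} - \mathbf{z})\, u_q(\mathbf{z})\, d\mathbf{z},
  \qquad u_q \sim \mathcal{GP}\big(0,\, k_q(\mathbf{z}, \mathbf{z}')\big).
\end{align}
```

The first line uses a single shared latent function; the second generalizes it to $Q$ latent functions. Because the outputs share latent functions, they are correlated with one another.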

  26. Convolution processes for multiple outputs If

  27. Convolution processes for multiple outputs If the kernel smoothing function is and the covariance for the latent process is the covariance for the multiple responses is

  28. Convolution processes for multiple outputs Each of the outputs can be corrupted with an independent process. The likelihood is and the prediction is

  29. Correlation between Cl and Cd R^2=0.8525

  30. Pdf of predictive joint distribution • A12

  31. Inverse design • Pressure distribution -> airfoil shape

  32. Acknowledgment • 李蒙 • 闫国启 • 张礼 • 祝青雷
