RIP – Transcript Expression Levels

thanh

Presentation Transcript


  1. RIP – Transcript Expression Levels

  2. Outline • RNA Immuno-Precipitation (RIP) • NGS on RIP & its alternatives • Alternative splicing • Transcription as a graph • Distribution of tags in exons • Pipeline on RIP-seq dataset

  3. RNA Immuno-Precipitation (RIP) • Global identification of multiple RNA targets of RNA-Binding Proteins (RBPs) • Identify proteins associated with RNAs in RNP complexes • Identify subsets of RNAs that are functionally related and potentially co-regulated

  4. How is RIP performed?

  5. Sequencing on RIP • RIP-Chip: noisy; may miss rare transcripts • RIP-RT-PCR: PCR introduces mutations • RIP tiling arrays: very expensive; too sensitive to ‘transcriptional noise’

  6. NGS on RIP • RIP-Seq • A more complete and unbiased assessment of the global population of RNAs associated with an RNP complex • Minimizes the sequencing bias and high background that affect the previously mentioned methods

  7. Alternative Splicing: a simple example • Regions with their numbers of reads • Exon1: chr1:13113087-13113138(5,1); • Exon2: chr1:13113270-13113299(2,0); • Exon3: chr1:13113312-13113343(3,0); • Splice reads • chr1,13113107,13113138,chr1,13113312,13113343,3.0; • chr1,13113087,13113116,chr1,13113270,13113299,2.0; • [Figure: exon diagram labelled Exon_Num(Tags): Exon1(5), Exon2(2), Exon3(3)]

  8. Alternative Splicing: a less ideal example • Regions with their numbers of reads • Exon1: chr4:145149018-145149181(29,0); • Exon2: chr4:145149265-145149402(8,0); • Exon3: chr4:146893298-146895275(116,1); • Splice reads • chr4,145149059,145149088,chr4,146894246,146894276,3.0; • chr4,145149374,145149402,chr4,146894470,146894498,2.0; • [Figure: exon diagram labelled Exon_Num(Tags): Exon1(29), Exon2(8), Exon3(116)]
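
The region and splice-read records on the two slides above can be read mechanically. The following is a minimal sketch, assuming the record layout is exactly as printed on the slides and that the first number in parentheses is the tag count (it matches the Exon_Num(Tags) labels); the function names are hypothetical and the meaning of the second number is not given in the slides, so it is ignored.

```python
import re

# Matches records such as "Exon1: chr1:13113087-13113138(5,1)".
REGION_RE = re.compile(r"(\w+):\s*(\w+):(\d+)-(\d+)\((\d+),(\d+)\)")

def parse_region(record):
    """Return (name, chrom, start, end, tags) for one region record."""
    match = REGION_RE.fullmatch(record.strip().rstrip(";"))
    if match is None:
        raise ValueError(f"unrecognised region record: {record!r}")
    name, chrom, start, end, tags, _second = match.groups()
    # Assumption: the first number is the tag count; the second is ignored.
    return name, chrom, int(start), int(end), int(tags)

def parse_splice(record):
    """Return (left_segment, right_segment, count) for one splice-read record."""
    fields = record.strip().rstrip(";").split(",")
    left = (fields[0], int(fields[1]), int(fields[2]))
    right = (fields[3], int(fields[4]), int(fields[5]))
    return left, right, float(fields[6])

def containing_region(regions, segment):
    """Name of the region that fully contains the segment, or None."""
    chrom, start, end = segment
    for name, r_chrom, r_start, r_end, _tags in regions:
        if r_chrom == chrom and r_start <= start and end <= r_end:
            return name
    return None

# Records copied from the simple example (slide 7).
regions = [parse_region(r) for r in (
    "Exon1: chr1:13113087-13113138(5,1);",
    "Exon2: chr1:13113270-13113299(2,0);",
    "Exon3: chr1:13113312-13113343(3,0);",
)]
splices = [parse_splice(s) for s in (
    "chr1,13113107,13113138,chr1,13113312,13113343,3.0;",
    "chr1,13113087,13113116,chr1,13113270,13113299,2.0;",
)]

# Each splice read becomes a link between the two exons that contain its halves.
edges = [
    (containing_region(regions, left), containing_region(regions, right), count)
    for left, right, count in splices
]
print(edges)
```

For the simple example this links Exon1 to Exon3 with 3 splice reads and Exon1 to Exon2 with 2 splice reads.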

  9. Transcription as a Graph • From RNA-seq data, check the overlap of the tags • A region covered by more than one tag is called an enriched region; the enriched regions are the nodes • Splice reads connect the enriched regions; these connections are the edges
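
One way to read the node definition above is to merge overlapping tags into clusters and keep the clusters that hold more than one tag. The sketch below takes that reading as an assumption; the tag coordinates are hypothetical, all tags are taken to lie on one chromosome, and intervals are treated as closed.

```python
def enriched_regions(tags, min_tags=2):
    """Merge overlapping (start, end) tags; keep clusters with >= min_tags tags."""
    clusters = []
    for start, end in sorted(tags):
        if clusters and start <= clusters[-1][1]:
            # The tag overlaps the current cluster: extend it and count the tag.
            clusters[-1][1] = max(clusters[-1][1], end)
            clusters[-1][2] += 1
        else:
            clusters.append([start, end, 1])
    return [(s, e, n) for s, e, n in clusters if n >= min_tags]

# Hypothetical tag positions, for illustration only.
tags = [(100, 129), (110, 139), (135, 164), (300, 329), (500, 529), (520, 549)]
nodes = enriched_regions(tags)
print(nodes)
```

The isolated tag at 300-329 is dropped because its cluster contains a single tag.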

  10. Transcription as a Graph • Represent the transcriptome as a topologically sorted directed acyclic graph • Some errors observed in dataset RME005: out-of-range edges in graphs; self-looping nodes • Default action: ignore them
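
The default action of ignoring bad edges, followed by a topological sort, might look like the sketch below. It assumes nodes are numbered 0 to n-1 and that an "out-of-range edge" means an edge whose endpoint is not a valid node index; the slides do not define the term, so this is only one possible interpretation. The sort is Kahn's algorithm, chosen here for illustration.

```python
from collections import deque

def clean_edges(n_nodes, edges):
    """Drop self-loops and edges that point outside 0..n_nodes-1."""
    kept = []
    for u, v in edges:
        if not (0 <= u < n_nodes and 0 <= v < n_nodes):
            continue  # out-of-range edge (assumed meaning): ignore
        if u == v:
            continue  # self-looping node: ignore
        kept.append((u, v))
    return kept

def topological_order(n_nodes, edges):
    """Kahn's algorithm; raises ValueError if the graph contains a cycle."""
    successors = [[] for _ in range(n_nodes)]
    indegree = [0] * n_nodes
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1
    queue = deque(i for i in range(n_nodes) if indegree[i] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if len(order) != n_nodes:
        raise ValueError("graph contains a cycle")
    return order

# Hypothetical graph with one self-loop (2, 2) and one out-of-range edge (3, 7).
raw_edges = [(0, 1), (0, 2), (1, 3), (2, 3), (2, 2), (3, 7)]
good_edges = clean_edges(4, raw_edges)
order = topological_order(4, good_edges)
print(good_edges, order)
```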

  11. Distribution of Tags in Exons • rQuant – Courtesy of Regina Bohnert (FML, Tübingen)

  12. From RNA-seq to RIP-seq • The previous results are from RNA-seq • Will we make similar observations on RIP-seq datasets? • And can we link the observations to transcript expression levels in the transcriptome?

  13. Pipeline on RIP-seq dataset • Dataset RME005 is used • Use TopHat / ELAND to map the reads back to the genome • Generate a transcription graph for each transcript with alternative splicing • Express the paths of all transcripts in the graph as a set of linear equations • Use R to solve the linear equations

  14. An example from RME005 • There are two transcripts • Path1: Exon1 -> Exon2 -> Exon4 • Path2: Exon1 -> Exon3 -> Exon4 • Exon1 to Exon4 have lengths L1 to L4 and read counts N1 to N4 • S1 to S4 are the numbers of splice reads • [Figure: graph of Exon1 to Exon4 labelled with N1 to N4 and S1 to S4]
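
The two paths in this example can be enumerated directly from the graph, and the same enumeration tells which transcripts pass through each exon. This is a minimal sketch; the dictionary layout and function name are illustrative choices, not part of the original pipeline.

```python
def enumerate_paths(successors, node, sink, prefix=()):
    """Return every path from node to sink in an acyclic graph, as tuples."""
    prefix = prefix + (node,)
    if node == sink:
        return [prefix]
    paths = []
    for nxt in successors.get(node, []):
        paths.extend(enumerate_paths(successors, nxt, sink, prefix))
    return paths

# Edges of the example: Exon1 -> Exon2 -> Exon4 and Exon1 -> Exon3 -> Exon4.
successors = {
    "Exon1": ["Exon2", "Exon3"],
    "Exon2": ["Exon4"],
    "Exon3": ["Exon4"],
}
paths = enumerate_paths(successors, "Exon1", "Exon4")

# membership[exon][k] is 1 when path k passes through that exon.
exons = ["Exon1", "Exon2", "Exon3", "Exon4"]
membership = {e: [1 if e in p else 0 for p in paths] for e in exons}
print(paths)
print(membership)
```

Exon1 and Exon4 belong to both paths, while Exon2 belongs only to Path1 and Exon3 only to Path2.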

  15. Assumptions • The transcript expression levels are: • Path1: x1 • Path2: x2 • The read length is constant • The reads are uniformly sampled from the transcripts • Use the density of reads instead of the read coverage • Differentiate reads on both long & short exons

  16. Equations for linear programming • Objective function: minimize the sum of d_i • Constraints • N1/L1 = x1 + x2 + d1 - d2 • S1/R = x1 + d3 - d4 • N2/L2 = x1 + d5 - d6 • S2/R = x1 + d7 - d8 • S3/R = x2 + d9 - d10 • N3/L3 = x2 + d11 - d12 • S4/R = x2 + d13 - d14 • N4/L4 = x1 + x2 + d15 - d16 • x1 , x2 >= 0 • d_i >= 0 • The solution should be the values of x1, x2 and all d_i • [Figure: exon graph labelled with N1 to N4 and S1 to S4]
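
Because each pair of d_i enters one constraint with opposite signs, minimizing their sum amounts to minimizing the total absolute deviation between the observed densities and x1, x2. The slides solve this with R; the sketch below is a substitute technique in plain Python that works only because there are two unknowns. An optimal vertex of this linear program lies where two of the following lines meet: a constraint with zero deviation, x1 = 0, or x2 = 0, so all such intersections are tried. R is assumed to be the constant read length, and all counts and lengths are hypothetical.

```python
from itertools import combinations

# Coefficients of (x1, x2) in the eight constraints, in the slide's order:
# N1/L1, S1/R, N2/L2, S2/R, S3/R, N3/L3, S4/R, N4/L4.
ROWS = [(1, 1), (1, 0), (1, 0), (1, 0), (0, 1), (0, 1), (0, 1), (1, 1)]

def total_deviation(x1, x2, rhs):
    """Sum of |observed density - predicted density| over all constraints."""
    return sum(abs(b - (a1 * x1 + a2 * x2)) for (a1, a2), b in zip(ROWS, rhs))

def solve_two_paths(rhs):
    """Return (cost, x1, x2) minimizing the total deviation with x1, x2 >= 0."""
    # The axes x1 = 0 and x2 = 0 are added so boundary optima are candidates too.
    lines = list(zip(ROWS, rhs)) + [((1, 0), 0.0), ((0, 1), 0.0)]
    best = None
    for ((a, b), p), ((c, d), q) in combinations(lines, 2):
        det = a * d - b * c
        if det == 0:
            continue  # parallel lines have no unique intersection
        x1 = (p * d - b * q) / det  # Cramer's rule
        x2 = (a * q - c * p) / det
        if x1 < 0 or x2 < 0:
            continue  # violates the non-negativity constraints
        cost = total_deviation(x1, x2, rhs)
        if best is None or cost < best[0]:
            best = (cost, x1, x2)
    return best

# Hypothetical inputs, for illustration only.
R = 36                      # assumed constant read length
L = [200, 120, 80, 160]     # exon lengths L1..L4
N = [150, 60, 20, 120]      # exon read counts N1..N4
S = [18, 18, 9, 9]          # splice-read counts S1..S4

rhs = [N[0] / L[0], S[0] / R, N[1] / L[1], S[1] / R,
       S[2] / R, N[2] / L[2], S[3] / R, N[3] / L[3]]
cost, x1, x2 = solve_two_paths(rhs)
print(cost, x1, x2)
```

With these perfectly consistent inputs every deviation is zero. With noisy counts the same search returns the least-absolute-deviation estimate, and a general LP solver would be the natural choice once there are more than a few paths.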

  17. Another problem • An implicit assumption on enriched regions in RME005 • RIP is known to be ~10% efficient • Noise will overwhelm true RNP targets • Should use total RNA as a control dataset • True-positive regions from RIP should be more enriched with tags than in the total-RNA control

  18. Handling the assumption • Obtain RNA-seq data from the same source transcriptome • Directly compare the RNA-seq and RIP-seq data • RIP-chip calls a region enriched when its signal is more than 4-fold higher than in RNA-chip data • Maybe 4-fold is the magic number? • Current tag distribution observed by Dr Li Guoliang: non-uniform, as opposed to what rQuant has observed on RNA-seq
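
A direct comparison of RIP-seq against RNA-seq could be sketched as a per-region fold enrichment. The example below assumes tag counts are normalized by the total number of mapped tags in each library and adds a pseudocount to avoid division by zero; both choices, the region names and all numbers are hypothetical, and the 4-fold cut-off is only the value the slide borrows from RIP-chip.

```python
def fold_enrichment(rip_tags, rna_tags, rip_total, rna_total, pseudocount=1.0):
    """Library-size-normalized ratio of RIP-seq tags to RNA-seq (control) tags."""
    rip_rate = (rip_tags + pseudocount) / rip_total
    rna_rate = (rna_tags + pseudocount) / rna_total
    return rip_rate / rna_rate

# Hypothetical per-region tag counts: (RIP-seq, RNA-seq control).
counts = {
    "regionA": (160, 10),
    "regionB": (60, 20),
    "regionC": (12, 2),
}
RIP_TOTAL = 2_000_000   # hypothetical number of mapped RIP-seq tags
RNA_TOTAL = 1_000_000   # hypothetical number of mapped RNA-seq tags
THRESHOLD = 4.0         # the ">4-fold" value quoted for RIP-chip

folds = {
    name: fold_enrichment(rip, rna, RIP_TOTAL, RNA_TOTAL)
    for name, (rip, rna) in counts.items()
}
enriched = [name for name, fold in folds.items() if fold > THRESHOLD]
print(folds)
print(enriched)
```

Only regionA passes the cut-off here; the other two regions have more RIP-seq tags than control tags but fall below 4-fold once library size is taken into account.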

  19. Q&A
