ALEXA-Seq analysis reveals breast cell type specific mRNA isoforms. Malachi Griffith, Genome Sciences Centre, BC Cancer Agency, Vancouver, BC, Canada. 30 Sept. 2010. www.AlexaPlatform.org
In most genes, transcript diversity is generated by alternative expression. Figure panels: gene expression; types of alternative expression.
Transcript variation is important to the study of human disease
• Alternative expression generates multiple distinct transcript variants from most human loci
• Specific transcript variants may represent useful therapeutic targets or diagnostic markers (Venables, 2006)
Massively parallel RNA sequencing
• Tissues/cell lines: Luminal, Myoepithelial, vHMECs, hESCs
• Isolate RNAs
• Generate cDNA, fragment, size select, add linkers
• Sequence ends: 263 million paired reads, 21 billion bases of sequence
• Map to genome, transcriptome, and predicted exon junctions
• Discover isoforms and measure abundance
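As a quick sanity check on the sequencing totals, the two figures on this slide imply an average read length. This is a minimal sketch that assumes "263 million paired reads" counts read pairs (two reads each); if it counts individual reads instead, the result doubles. The function name is illustrative only.

```python
# Figures quoted on the slide.
PAIRED_READS = 263_000_000      # "263 million paired reads"
TOTAL_BASES = 21_000_000_000    # "21 billion bases of sequence"


def mean_read_length(total_bases: int, read_pairs: int) -> float:
    """Mean bases per read, ASSUMING each counted item is a pair of two reads."""
    return total_bases / (2 * read_pairs)


avg_len = mean_read_length(TOTAL_BASES, PAIRED_READS)
print(f"{avg_len:.1f} bases per read")
```

Under that assumption the average comes out at roughly 40 bases per read.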
What is an ALEXA-Seq sequence ‘feature’? Summary of features for human: ~4 million total (14% ‘known’)
• 37k genes
• 62k transcripts
• 278k exons
• 2,210k exon junctions
• 407k alternative exon boundaries
• 560k intron regions
• 227k intergenic regions
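The per-type counts can be added up to see how they relate to the "~4 million total" quoted here and the "3.8 million features" on the output slide. This is a small arithmetic sketch; the dictionary keys are just labels taken from the slide, and the counts are in thousands as quoted.

```python
# Feature counts in thousands, as listed on the slide.
feature_counts_k = {
    "genes": 37,
    "transcripts": 62,
    "exons": 278,
    "exon junctions": 2210,
    "alternative exon boundaries": 407,
    "intron regions": 560,
    "intergenic regions": 227,
}

total_k = sum(feature_counts_k.values())
junction_share = feature_counts_k["exon junctions"] / total_k

print(f"total features: {total_k:,}k")
print(f"exon junctions as a share of all features: {junction_share:.1%}")
```

The sum is 3,781k, i.e. about 3.8 million, and exon junctions make up more than half of all features.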
Data analyzed to date. ALEXA-Seq processing: 19 projects (REMC + 18 others); 105 libraries (200+ lanes); 3.9 billion paired-end reads; 36-mers to 75-mers.
Output. Expression, differential expression and alternative expression values for 3.8 million features for each library processed
• Library quality analysis
• Number of features expressed (above background): genes, transcripts, exon regions, junctions, etc.
• Differential gene expression: ranked lists
• Alternative expression: ranked lists of alternative isoforms involving exon skipping, alternative transcript initiation sites, etc.; known or predicted novel isoforms
• Candidate peptides: ranked lists
ALEXA-Seq data browser (using REMC analysis as an example). Goals: visualization, interpretation, design of validation experiments, and distribution of results to internal/external collaborators. What kinds of questions does ALEXA-Seq allow us to ask/answer? http://www.alexaplatform.org/alexa_seq/Breast/Summary.htm
Is the RNA-Seq library suitable for alternative expression analysis? Library summary: read quality, tag redundancy, end bias, mapping rates, signal-to-noise, hnRNA & gDNA contamination, features detected.
What are the most highly expressed genes, exons, etc. in each library? Ranked lists of events for expression, differential expression, and alternative expression, provided for each feature type (gene, exon, junction, etc.).
What are the top DE and AE genes for each tissue comparison? Candidate genes for each comparison: DE or AE events, gains or losses.
Candidate features gained in vHMECs (vHMECs vs. Luminal); CD10
Which exons/junctions and corresponding peptides might be suitable for antibody design?
Candidate peptides gained in vHMECs (vHMECs vs. Luminal)
CD10 (used to sort myoepithelial cells): 422-fold higher in Myoepithelial than Luminal. Panels: Myoepithelial & vHMECs; Luminal.
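A fold difference such as the 422-fold figure is a ratio of two expression values, and is often reported on a log2 scale. The sketch below is illustrative only: the pseudocount, the function name, and the example expression values are assumptions chosen to reproduce a 422-fold ratio, not the values or the exact formula used by ALEXA-Seq.

```python
import math


def fold_change(expr_a: float, expr_b: float, pseudocount: float = 1.0) -> float:
    """Ratio of two expression values.

    The pseudocount (an assumption here) avoids division by zero when a
    feature is not detected in one library.
    """
    return (expr_a + pseudocount) / (expr_b + pseudocount)


# Hypothetical normalized expression values, NOT taken from the slides.
myoepithelial_expr = 843.0
luminal_expr = 1.0

fc = fold_change(myoepithelial_expr, luminal_expr)
print(f"fold change: {fc:.0f}")
print(f"log2 fold change: {math.log2(fc):.2f}")
```

On a log2 scale, a 422-fold difference corresponds to a value of about 8.7.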
CD227 (used to sort luminal epithelial cells). Panels: Luminal; Myoepithelial.
Differential gene expression of CASP14 (Caspase 14 gained in vHMECs)
Tissue-specific isoforms of CA12. Panels: vHMECs; Myoepithelial; Luminal.
FERM domain-containing proteins are alternatively expressed (FRMD6, FRMD4A, FRMD4B are AE; FRMD3, FRMD8 are DE)
Novel isoforms observed only in vHMECs: E7-E10, E6-E10
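Labels such as E7-E10 can be read as exon-exon junctions that join a donor exon directly to a downstream acceptor exon. The sketch below lists the exons such a junction would skip. It assumes exons are numbered consecutively within the gene and that the label format is `E<donor>-E<acceptor>`; the function name is hypothetical and not part of ALEXA-Seq.

```python
import re
from typing import List


def skipped_exons(junction: str) -> List[int]:
    """Return the exon numbers skipped by a junction label like 'E7-E10'.

    ASSUMES consecutive exon numbering and a donor-before-acceptor label.
    """
    match = re.fullmatch(r"E(\d+)-E(\d+)", junction)
    if match is None:
        raise ValueError(f"unrecognized junction label: {junction!r}")
    donor, acceptor = int(match.group(1)), int(match.group(2))
    if acceptor <= donor:
        raise ValueError("acceptor exon must follow donor exon")
    return list(range(donor + 1, acceptor))


for label in ("E7-E10", "E6-E10"):
    print(label, "skips exons", skipped_exons(label))
```

Under these assumptions, E7-E10 skips exons 8 and 9, and E6-E10 skips exons 7 through 9. A junction between adjacent exons (for example E7-E8) skips none.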
How reliable are predictions from ALEXA-Seq?
• Are novel junctions real? What proportion validate by RT-PCR and Sanger sequencing?
• Are differential/alternative expression changes observed between tissues accurate? How well do DE values correlate with qPCR?
To answer these questions we performed ~400 validations of ALEXA-Seq predictions from a comparison of two cell lines.
Validation (qualitative) 33 of 189 assays shown. Overall validation rate = 85%
Validation (quantitative): qPCR of 192 exons identified as alternatively expressed by ALEXA-Seq. Validation rate = 88%
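The two validation slides can be pooled to see how they relate to the "~400 validations" mentioned earlier and the 86% rate quoted in the conclusions. This is a rough sketch: the slides give only rounded percentages, not per-assay pass counts, so the pooled value is an estimate weighted by assay count.

```python
# (number of assays, reported validation rate) from the two validation slides.
validation_sets = {
    "qualitative (RT-PCR / Sanger)": (189, 0.85),
    "quantitative (qPCR)": (192, 0.88),
}

total_assays = sum(n for n, _ in validation_sets.values())
# Expected number of validated assays, using the rounded rates as given.
expected_validated = sum(n * rate for n, rate in validation_sets.values())
pooled_rate = expected_validated / total_assays

print(f"total assays: {total_assays}")
print(f"pooled validation rate: {pooled_rate:.1%}")
```

The total of 381 assays matches the "~400 validations", and the pooled estimate of roughly 86.5% is close to the overall figure in the conclusions. The small gap is within what rounding of the two input rates can explain.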
Conclusions. The ALEXA-Seq approach provides a comprehensive global transcriptome profile
• Input: paired-end RNA sequence data
• Output: expression, differential expression, alternative expression, candidate peptides, etc.
• Detection of both known and novel isoforms, including the subset that differ between conditions
• Predictions are highly accurate: 86% validation rate by RT-PCR, qPCR and Sanger sequencing
www.AlexaPlatform.org
Acknowledgements
Griffith M, Griffith OL, Morin RD, Tang MJ, Pugh TJ, Ally A, Asano JK, Chan SY, Li I, McDonald H, Teague K, Zhao Y, Zeng T, Delaney AD, Hirst M, Morin GB, Jones SJM, Tai IT, Marra MA. Alternative expression analysis by RNA sequencing. In review (Nature Methods).
• Supervisor: Marco Marra
• Committee: Joseph Connors, Stephane Flibotte, Steve Jones, Gregg Morin
• Bioinformatics: Obi Griffith, Ryan Morin, Rodrigo Goya, Allen Delaney, Gordon Robertson, Richard Corbett
• Sequencing: Martin Hirst, Thomas Zeng, Yongjun Zhao, Helen McDonald
• Laboratory: Trevor Pugh, Tesa Severson
• 5-FU resistance: Michelle Tang, Isabella Tai, Marco Marra
• Multiple Myeloma: Rodrigo Goya, Marco Marra
• Neuroblastoma: Olena Morozova, Marco Marra
• Morgen: Pamela Hoodless, Jacquie Schein, Inanc Birol, Gordon Robertson, Shaun Jackman
• Iressa and Sutent: Obi Griffith, Steven Jones
• Lymphoma: Ryan Morin, Marco Marra