Analysis of Experiment E-GEOD-29989: Alternative Splicing of Exons in Human Hematopoietic Stem Cells
The Johns Hopkins University
Advanced Genomics & Genetics Analysis AS410.713.81.SU12
Phillip Woolwine
Outline
• Alternative splicing of exons (ASE)
• ASE probed with the Affymetrix GeneChip Exon array
• Data analyzed with Affymetrix Power Tools (APT), R & Bioconductor
• ASE during lineage-specific hematopoietic differentiation
Background
• ASE produces mRNAs that can have similar or different functions/products and is a basis for functional diversity in gene expression
• Includes exon skipping, mutually exclusive exons, alternative 5' donor sites, alternative 3' acceptor sites, and/or intron retention
• ASE is believed to be a major player in lineage-specific differentiation of blood cells
• Aberrant ASE can lead to leukemias, lymphomas, etc.
• Understanding ASE events and the exome profile will benefit understanding of disease pathogenesis and aid in therapies
Methods
• Data retrieved from Experiment E-GEOD-29989 by Liu et al (2011)
• Affymetrix GeneChip Human Exon 1.0 ST array
• Transcriptional profile of lineage-specific differentiation of CD34 cells into Erythropoietic (E), Granulopoietic (G), and Megakaryopoietic (M) cells
• Data normalized with APT 1.14.2
• PCA outlier analysis in R and Bioconductor
• Probes filtered out when DABG p-val > 0.05 in >50% of classes
• T-test used to determine exon enrichment/depletion by p-val & fold change (FC)
• Evidence for alternative splicing verified in Ensembl
• Top exon probes mapped to genes and differential expression plotted (green plots are CD34, red plots are lineage-specific)
• DAVID pathway analysis
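The DABG filtering step can be illustrated with a small Python sketch. This is not the R/APT code used in the analysis: the function name `dabg_keep`, the probe IDs, the p-values, and the choice to summarize each class by its mean DABG p-value are all assumptions made for illustration.

```python
# Hypothetical input: probe ID -> {class name -> per-array DABG p-values}.
def dabg_keep(pvals_by_class, alpha=0.05, max_fail_fraction=0.5):
    """Return True if a probe survives the DABG filter.

    Assumption: a class "fails" when its mean DABG p-value exceeds alpha.
    The probe is dropped when more than max_fail_fraction of classes fail.
    """
    failing = sum(
        1 for pvals in pvals_by_class.values()
        if sum(pvals) / len(pvals) > alpha
    )
    return failing / len(pvals_by_class) <= max_fail_fraction


# Made-up p-values for two probes across the four classes in the study.
probes = {
    "probe_A": {"CD34": [0.01, 0.02], "E": [0.01, 0.03],
                "G": [0.02, 0.01], "M": [0.04, 0.01]},
    "probe_B": {"CD34": [0.40, 0.30], "E": [0.20, 0.50],
                "G": [0.60, 0.10], "M": [0.01, 0.02]},
}

# probe_B is undetected in three of four classes, so it is removed.
kept = [pid for pid, by_class in probes.items() if dabg_keep(by_class)]
```

A probe that fails in exactly half of the classes is kept here, because the slide's criterion is strictly more than 50%.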
Results: PCA of exon probes
• Exon arrays cluster along cell lineages; no outliers
• G & E lineages are more similar to each other than to the M lineage
Results: Scree plot of exon probes
• Approx. 70% of variability is explained by the first two eigenvectors (principal components)
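A scree plot shows the share of total variance carried by each component, which is each eigenvalue divided by the sum of all eigenvalues. The sketch below shows that arithmetic in Python; the eigenvalues are invented so that the first two components sum to about 70%, and they are not the values from this dataset.

```python
def explained_variance(eigenvalues):
    """Return (per-component fractions, cumulative fractions) of variance."""
    total = sum(eigenvalues)
    fractions = [ev / total for ev in eigenvalues]
    cumulative = []
    running = 0.0
    for fraction in fractions:
        running += fraction
        cumulative.append(running)
    return fractions, cumulative


# Made-up eigenvalues, sorted from largest to smallest.
eigenvalues = [45.0, 25.0, 12.0, 8.0, 6.0, 4.0]
fractions, cumulative = explained_variance(eigenvalues)

# cumulative[1] is the variance explained by the first two components.
first_two = cumulative[1]
```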
Results: Filtering and Testing Correction
• Filter on DABG of all class types removed ~83,000 low-intensity probes
• mRNA filters may be more sensitive in some cases (Table 1, Della Beffa et al, 2008)
• Other filtering included using core probesets in APT (core.ps, core.mps)
• 197,245 probes remained after statistical tests
• Potential for false positives (FP), since multiple testing correction was not performed
• At p-val < 0.01 & |FC| > 1.5 there were 6,413 hits in E, 3,316 in G, and 9,638 in M; possibly many FP
• It can be argued that FWER control is too conservative for high-dimensional exon data; the tests may not be independent, nor their non-significant p-values uniformly distributed, so FDR may not be appropriate either (Della Beffa et al, 2008)
• Proper pre-filtering of probes and true splice sites is a better strategy to limit FP
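To give a sense of scale for the false-positive concern, the Python sketch below computes the number of hits expected by chance at p < 0.01 across 197,245 tests if every null hypothesis were true (roughly 1,970, before any fold-change filter), plus the Bonferroni per-test threshold. A Benjamini-Hochberg step-up procedure is included for reference only, since the slide cautions that its assumptions may not hold for exon data. The function names and the toy p-values are illustrative assumptions, not part of the original analysis.

```python
def expected_false_positives(n_tests, alpha):
    """Expected chance hits if all nulls are true and p-values are uniform."""
    return n_tests * alpha


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test p-value cutoff that controls FWER at alpha."""
    return alpha / n_tests


def benjamini_hochberg(pvalues, q=0.05):
    """Return sorted indices of hypotheses rejected at FDR level q."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    largest_rank = 0
    # Step-up rule: find the largest rank k with p_(k) <= k * q / m.
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * q / m:
            largest_rank = rank
    return sorted(order[:largest_rank])


n_probes = 197_245  # probe count reported on the slide
chance_hits = expected_false_positives(n_probes, 0.01)
fwer_cutoff = bonferroni_threshold(n_probes)

# Made-up p-values to show the BH procedure on a small example.
toy_pvalues = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
rejected = benjamini_hochberg(toy_pvalues, q=0.05)
```

The Bonferroni cutoff for this many tests is on the order of 1e-7, which illustrates why the slide calls FWER control conservative for exon-level data.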
Results: One Significant ASE Transcript Cluster Common to All 3 Lineages
• Dimensionality reduction at p-val < 0.00001 & |FC| > 2
• IGHA2 (immunoglobulin heavy constant alpha 2)
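Finding transcript clusters common to all three lineages amounts to thresholding each lineage's results and intersecting the surviving cluster IDs. The Python sketch below assumes fold change is a linear lineage/CD34 ratio, so that a two-fold change in either direction means |log2 FC| > 1. The cluster IDs and statistics are invented for illustration.

```python
import math


def significant_clusters(results, p_cut=1e-5, fc_cut=2.0):
    """Return cluster IDs passing both cutoffs.

    results: iterable of (cluster_id, p_value, fold_change), where
    fold_change is assumed to be a linear lineage/CD34 ratio.
    """
    return {
        cluster_id
        for cluster_id, p_value, fold_change in results
        if p_value < p_cut and abs(math.log2(fold_change)) > math.log2(fc_cut)
    }


# Hypothetical per-lineage results (cluster ID, p-value, fold change).
erythro = [("TC_1", 1e-7, 3.1), ("TC_2", 1e-6, 0.3), ("TC_3", 1e-3, 4.0)]
granulo = [("TC_1", 2e-6, 0.4), ("TC_2", 1e-4, 2.5)]
megakaryo = [("TC_1", 5e-8, 2.6), ("TC_2", 1e-6, 0.2), ("TC_4", 1e-6, 5.0)]

# Keep only clusters that are significant in every lineage.
common = (
    significant_clusters(erythro)
    & significant_clusters(granulo)
    & significant_clusters(megakaryo)
)
```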
Results: Significant ASE in Erythropoietic vs CD34
• Dimensionality reduction at p-val < 0.00001 & |FC| > 2
• 20 unique transcript clusters ordered by p-val
Results: Significant ASE in Erythropoietic vs CD34, Genes and Pathways
• DAVID Functional Annotation reveals enrichment for alternative splicing
Results: Significant ASE in Granulopoietic vs CD34
• Dimensionality reduction at p-val < 0.00001 & |FC| > 2
• 9 unique transcript clusters ordered by p-val
Results: Significant ASE in Granulopoietic vs CD34, Genes & Pathways
• DAVID Functional Annotation reveals significant enrichment for signaling
• Clusters include alternative splicing
Results: Significant ASE in Megakaryopoietic vs CD34
• Dimensionality reduction at p-val < 0.00001 & |FC| > 2
• 37 unique transcript clusters ordered by p-val
Results: Significant ASE in Megakaryopoietic vs CD34, Genes & Pathways
• DAVID Functional Annotation reveals enrichment for alternative splicing & signaling (several categories are not shown but include those for immune system development)
Results: Top Significantly Upregulated ASE in Erythropoietic
• P-val < 0.01 & FC > 2 in Erythropoietic; P-val < 0.01 & FC < 1.5 in G & M lineages
• Top upregulated exon probe 2527682; cluster id 2527672; gene PKND
• Significantly downregulated in Megakaryopoietic lineage
Results: Top Significantly Upregulated ASE in Granulopoietic
• P-val < 0.01 & FC > 2 in Granulopoietic; P-val < 0.01 & FC < 1.5 in E & M lineages
• Top upregulated exon probes 4016430, 4016431; cluster id 4016428; gene BEX2
• Significantly reduced expression versus CD34 in Erythropoietic lineage
• Significantly reduced expression versus CD34 in Megakaryopoietic lineage
Results: Top Significantly Upregulated ASE in Megakaryopoietic
• P-val < 0.01 & FC > 2 in Megakaryopoietic; P-val < 0.01 & FC < 1.5 in E & G lineages
• Top upregulated exon probe 3275248; cluster id 3275132; gene GDI2
(Plot panel labels: downregulated, downregulated, upregulated)
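The lineage-specific selection rule used on the three "Top Significantly Upregulated" slides (p-val < 0.01 and FC > 2 in the target lineage; p-val < 0.01 and FC < 1.5 in the other two) can be written as a simple predicate. The Python sketch below is an illustration only: the function name, the data layout, and the example statistics are assumptions, and fold change is taken to be a linear lineage/CD34 ratio.

```python
def lineage_specific_up(stats, target, p_cut=0.01, up_cut=2.0, other_cut=1.5):
    """Return True if a probe is upregulated only in the target lineage.

    stats: {lineage: (p_value, fold_change)} for a single exon probe,
    with fold_change assumed to be a linear lineage/CD34 ratio.
    """
    if target not in stats:
        return False
    for lineage, (p_value, fold_change) in stats.items():
        if p_value >= p_cut:
            return False  # every comparison must be significant
        if lineage == target and fold_change <= up_cut:
            return False  # target lineage must exceed the upper cutoff
        if lineage != target and fold_change >= other_cut:
            return False  # other lineages must stay below the lower cutoff
    return True


# Hypothetical probes: the first passes for E, the second fails on G's p-value.
probe_x = {"E": (0.001, 3.2), "G": (0.004, 0.9), "M": (0.002, 0.5)}
probe_y = {"E": (0.001, 3.2), "G": (0.200, 0.9), "M": (0.002, 0.5)}

x_is_e_specific = lineage_specific_up(probe_x, target="E")
y_is_e_specific = lineage_specific_up(probe_y, target="E")
```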
Discussion
• ASE occurs during lineage-specific hematopoietic differentiation of CD34 cells into Erythropoietic, Granulopoietic, and Megakaryopoietic cells
• Pathway terms are significantly enriched in alternative splicing and signaling, including those for immune system development, consistent with known biology
• Relatively increased ASE in megakaryopoietic differentiation may suggest increased transcriptional complexity during development
• These results share a few top hits and similar pathway enrichment with the original research by Liu et al (2011)
• However, most top genes were not identical, which is probably due to their use of the ExonSVD model for statistical assessment of exon enrichment/depletion
• The high number of significant hits at p < 0.01 may indicate a high FDR and may warrant further filtering and dimensionality reduction
• It may be interesting to combine MiDAS and Rank Product for testing and correction
References
• Della Beffa et al (2008) Dissecting an alternative splicing analysis workflow for GeneChip Exon 1.0 ST Affymetrix arrays. BMC Genomics 9:571, PMID: 19040723
• EBI (2012) Ensembl Genome Browser, release 68. <online> Available at: <http://useast.ensembl.org/index.html> [Accessed 8/20/12]
• Higgs, B (2012) Advanced Genomics & Genetics Analysis, Lecture 2: Analysis and interpretation of splice variants. Johns Hopkins University, unpublished
• Liu et al (2011) Transcriptome Profiling and Sequencing of Differentiated Human Hematopoietic Stem Cells Reveal Lineage Specific Expression and Alternative Splicing of Genes. Physiol Genomics 43(20):1117-34, PMID: 21828245
• NIAID/NIH (2012) DAVID Bioinformatics Resources 6.7: Functional Annotation Tool. <online> Available at: <http://david.abcc.ncifcrf.gov/tools.jsp> [Accessed 8/20/12]