Alternative splicing: A playground of evolution

Alternative splicing: A playground of evolution. Mikhail Gelfand Institute for Information Transmission Problems, RAS. May 2004. Alternative splicing of human (and mouse) genes. Alternative splicing of orthologous human and mouse genes

Presentation Transcript


  1. Alternative splicing: A playground of evolution. Mikhail Gelfand, Institute for Information Transmission Problems, RAS. May 2004

  2. Alternative splicing of human (and mouse) genes

  3. Alternative splicing of orthologous human and mouse genes • Sequence divergence in alternative and constitutive regions • Evolution of splicing sites • Alternative splicing and protein structure

  4. Data • known alternative splicing • HASDB (human, ESTs+mRNAs) • ASMamDB (mouse, mRNAs+genes) • additional variants • UniGene (human and mouse EST clusters) • complete genes and genomic DNA • GenBank (full-length mouse genes) • human genome

  5. Methods • Direct comparison of EST-derived alternatives is difficult because of uneven coverage. • Instead, align alternative isoforms from one species to the genomic DNA of the other species. • If alignable (complete exon or part of exon, no significant loss of similarity, no in-frame stops, conserved splicing sites), then conserved. • This is an upper estimate of conservation: an isoform may be non-functional for other reasons (e.g. disruption of regulatory sites). • Cannot analyze skipped exons.
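
Two of the conservation criteria listed on this slide (no in-frame stops, conserved splicing sites) can be illustrated with a minimal Python sketch. The function names, the 0-based half-open coordinates, the plus-strand orientation, and the restriction to canonical AG/GT dinucleotides are all assumptions made for illustration; the slides do not say how these checks were implemented.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_in_frame_stop(seq, frame=0):
    """Return True if a stop codon occurs in the given reading frame of seq."""
    seq = seq.upper()
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return True
    return False


def has_canonical_sites(genomic, exon_start, exon_end):
    """Check an internal plus-strand exon for a flanking AG acceptor and GT donor.

    exon_start and exon_end are 0-based, half-open coordinates in genomic
    (a hypothetical convention chosen for this sketch).
    """
    genomic = genomic.upper()
    acceptor = genomic[exon_start - 2:exon_start] if exon_start >= 2 else ""
    donor = genomic[exon_end:exon_end + 2]
    return acceptor == "AG" and donor == "GT"


def passes_basic_checks(genomic, exon_start, exon_end, frame=0):
    """Combine the two checks; similarity of the alignment is not assessed here."""
    exon = genomic[exon_start:exon_end]
    return has_canonical_sites(genomic, exon_start, exon_end) and not has_in_frame_stop(exon, frame)


if __name__ == "__main__":
    region = "ccAG" + "ATGGCCAAA" + "GTaa"
    print(passes_basic_checks(region, 4, 13))
```

The sketch deliberately leaves out the similarity criterion, because the slides give no threshold for "no significant loss of similarity".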

  6. Tools • TBLASTN (initial identification of orthologs: mRNAs against genomic DNA) • BLASTN (human mRNAs against genome) • Pro-EST (spliced alignment, ESTs and mRNA against genomic DNA) • Pro-Frame (spliced alignment, proteins against genomic DNA) • confirmation of orthology • same exon-intron structure • >70% identity over the entire protein length • analysis of conservation of alternative splicing • conservation of exons or parts of exons • conservation of sites
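
The ">70% identity over the entire protein length" criterion for orthology can be sketched as a simple filter on a pairwise protein alignment. This is a hedged illustration only: the function names are hypothetical, the input is assumed to be two already-aligned strings with "-" for gaps, and counting gap columns in the denominator is one possible reading of "entire protein length", not something the slides specify.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns where both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        return 0.0
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


def passes_identity_filter(aligned_a, aligned_b, threshold=70.0):
    """Apply the >70% identity cutoff quoted on the slide."""
    return percent_identity(aligned_a, aligned_b) > threshold


if __name__ == "__main__":
    print(percent_identity("MKT-A", "MKSLA"))
    print(passes_identity_filter("MKTLA", "MKTLA"))
```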

  7. 166 gene pairs with known alternative splicing: human only 42; both human and mouse 84; mouse only 40 (human total 126, mouse total 124)

  8. Elementary alternatives: cassette exon, alternative donor site, alternative acceptor site, retained intron
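
The four elementary alternatives can be told apart by comparing the coordinates of one exon with the exons of another isoform. The following Python sketch is a simplified illustration under stated assumptions: exons are (start, end) pairs in 0-based half-open genomic coordinates on the plus strand, the function name is hypothetical, and cases such as alternative first or last exons are not distinguished from cassette exons.

```python
def classify_exon(exon, other_exons):
    """Classify one exon relative to the exons of another isoform (plus strand)."""
    start, end = exon
    overlapping = [(s, e) for s, e in other_exons if s < end and start < e]
    if not overlapping:
        # The other isoform skips this region entirely.
        return "cassette exon"
    if len(overlapping) >= 2:
        # One exon here spans two or more exons there: the intron is retained.
        return "retained intron"
    s, e = overlapping[0]
    if (s, e) == (start, end):
        return "shared exon"
    if s == start:
        # Same acceptor (5' boundary), different donor (3' boundary).
        return "alternative donor site"
    if e == end:
        # Same donor (3' boundary), different acceptor (5' boundary).
        return "alternative acceptor site"
    return "complex"


if __name__ == "__main__":
    reference = [(100, 200), (300, 400), (500, 600)]
    for exon in [(300, 420), (280, 400), (220, 260), (100, 400)]:
        print(exon, classify_exon(exon, reference))
```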

  9. Human genes • Conserved elementary alternatives: 69% (EST) - 76% (mRNA) • Genes with all isoforms conserved: 57 (45%)

  10. Mouse genes • Conserved elementary alternatives: 75% (EST) - 83% (mRNA) • Genes with all isoforms conserved: 79 (64%)

  11. Real or aberrant non-conserved AS? • 24-31% human vs. 17-25% mouse elementary alternatives are not conserved • 55% human vs 36% mouse genes have at least one non-conserved variant • denser coverage of human genes by ESTs: • pick up rare (tissue- and stage-specific) => younger variants • pick up aberrant (non-functional) variants • 17-24% mRNA-derived elementary alternatives are non-conserved (compared to 25-32% EST-derived ones)
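
The non-conserved percentages on this slide follow from the conserved percentages on slides 9 and 10 by taking complements. The short Python check below reproduces that arithmetic. It assumes that the totals of 126 human and 124 mouse genes are the right denominators, which is a reading of the numbers on slide 7 rather than something the slides state explicitly; the variable names are illustrative.

```python
# Conserved elementary alternatives, in percent (slides 9 and 10).
conserved = {
    "human": {"EST": 69, "mRNA": 76},
    "mouse": {"EST": 75, "mRNA": 83},
}
non_conserved = {
    species: {source: 100 - pct for source, pct in by_source.items()}
    for species, by_source in conserved.items()
}

# Genes with all isoforms conserved (slides 9 and 10).
genes_all_conserved = {"human": 57, "mouse": 79}
# Assumption: gene totals taken from the numbers on slide 7.
genes_total = {"human": 126, "mouse": 124}

pct_all_conserved = {
    species: round(100 * genes_all_conserved[species] / genes_total[species])
    for species in genes_total
}
pct_with_non_conserved = {s: 100 - p for s, p in pct_all_conserved.items()}

if __name__ == "__main__":
    print(non_conserved)
    print(pct_all_conserved)
    print(pct_with_non_conserved)
```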

  12. smoothelin: human-specific donor site; mouse-specific cassette exon (figure legend: human, common, mouse)

  13. autoimmune regulator: retained intron; downstream exons read in two frames (figure legend: human, common, mouse)

  14. Na/K-ATPase gamma subunit (Fxyd2): alternative acceptor site within (inserted) intron; (deleted) intron (figure legend: human, common, mouse)

  15. MutS homolog (DNA mismatch repair): dual donor/acceptor site (figure legend: human, common)

  16. Modrek and Lee, 2003: • conserved skipped exons: • 98% constitutive • 98% major form • 28% minor form • inclusion level: • highly correlated • good predictor of conservation • Minor non-conserved form exons are not aberrant: • minor form exons are supported by multiple ESTs • 28% of minor form exons are upregulated in one specific tissue • 70% of tissue-specific exons are not conserved • Thanaraj et al., 2003: 61% (47-86%) of alternative splice junctions are conserved

  17. Alternative splicing of orthologous human and mouse genes • Sequence divergence in alternative and constitutive regions • Evolution of splicing sites • Alternative splicing and protein structure

  18. Our preliminary observations: less synonymous, more non-synonymous divergence in alternative exons (human/mouse) => positive selection towards variability. “Contrary to our prediction, synonymous divergence between humans and non-human mammals was significantly higher in constitutive exons … Intriguingly, non-synonymous divergence was marginally significantly higher in alternative exons” (Iida and Akashi, 2000)

  19. 279 proteins from SwissProt+TREMBL with “varsplic” features. Again, there is some evidence of positive selection towards diversity. This is not due to aberrant ESTs (only protein data are considered).
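
The distinction between synonymous and non-synonymous divergence can be made concrete with a naive codon-by-codon count. The Python sketch below is an illustration only, not the method behind the observations on this slide: it assumes two aligned, in-frame coding sequences of equal length, uses the standard genetic code, and reports raw counts of differing codons without the per-site normalization that a proper dN/dS estimate would need. All names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_substitutions(cds_a, cds_b):
    """Return (synonymous, non_synonymous) counts of differing codons."""
    if len(cds_a) != len(cds_b):
        raise ValueError("sequences must be aligned and of equal length")
    synonymous = non_synonymous = 0
    for i in range(0, len(cds_a) - 2, 3):
        codon_a = cds_a[i:i + 3].upper()
        codon_b = cds_b[i:i + 3].upper()
        aa_a, aa_b = CODON_TABLE.get(codon_a), CODON_TABLE.get(codon_b)
        if codon_a == codon_b or aa_a is None or aa_b is None:
            continue  # identical codon, or a gap/ambiguous base: skip
        if aa_a == aa_b:
            synonymous += 1
        else:
            non_synonymous += 1
    return synonymous, non_synonymous


if __name__ == "__main__":
    # GCT -> GCC keeps alanine; AAA -> AGA changes lysine to arginine.
    print(count_substitutions("ATGGCTAAA", "ATGGCCAGA"))
```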

  20. Alternative splicing of orthologous human and mouse genes • Sequence divergence in alternative and constitutive regions • Evolution of splicing sites • Alternative splicing and protein structure

  21. Alternative splicing in a multigene family: the MAGEA family of cancer/testis-specific antigens • A locus on the X chromosome containing eleven recently duplicated genes: two subfamilies of four genes each and three single genes • One protein-coding exon, multiple different 5’-UTR exons • Originates from retroposed spliced mRNA • Mutations create new splicing sites or disrupt existing sites

  22. Phylogenetic trees (protein-coding and upstream regions)

  23. Expression data • pooled by organ/tissue; maximum recorded expression level retained • no data for MAGEA10; MAGEA3 and MAGEA6 likely non-distinguishable • green: normal; brown: cancer

  24. Simple genes with alternatives in exon 1 (MAGEA1, MAGEA5, MAGEA3/6). Figure: exon diagrams for MAGEA1, MAGEA5 (normal placenta), MAGEA3, and MAGEA6 (testis, brain/medulla, cancer)

  25. Two more genes of subfamily B: multiple isoforms of MAGEA2 and a deletion in MAGEA12. Figure: exon diagrams for MAGEA2 and MAGEA12

  26. Isoforms of subfamily A. Figure: exon diagrams for MAGEA8, MAGEA9 (testis, no cancers), MAGEA10, and MAGEA11

  27. Multiple duplications of the initial exon in MAGEA4 (testis and cancers; brain/medulla; also common 3’ ESTs in placenta)

  28. Chimaeric mRNAs (splicing of readthrough transcripts). Figure labels: exon in intergenic space; initial exon of MAGEA10; exons of MAGEA5; exons of BC013171; exon in intergenic space; initial exon of MAGEA12

  29. Other examples: • galactose-1-phosphate uridylyltransferase + interleukin-11 receptor alpha chain (Magrangeas et al., 1998) • P2Y11 [receptor] + SSF1 [nuclear protein] (Communi et al., 2001) • PrP [Prion protein] + Dpl [prion-like protein Doppel] (Moore et al., 1999) • cytochrome P450 3A: CYP3A7 + two exons of a downstream pseudogene read in a different frame (Finta & Zaphiropoulos, 2000) • HHLA1 + OC90 [otoconin-90] (Kowalski et al., 1999) • TRAX [translin-associated factor X] + DISC1 [candidate schizophrenia gene] (Millar et al., 2000) • Kua + UEV1 [polyubiquitination coeffector] (Thomson et al., 2000) • FR + GAP [Rho GTPase activating protein] (Romani et al., 2003) - ? • methionyl tRNA synthetase + advillin (Romani et al., 2003) - ?

  30. Birth of donor sites (new GT in alternative initial exon 5)

  31. Birth of an acceptor site (new AG and polyY tract in MAGEA8-specific cassette exon 3)

  32. Birth of an alternative donor site (enhanced match to the consensus (AG) in cassette exon 2)

  33. Birth of an alternative acceptor site (enhanced polyY tract in cassette exon 4)

  34. Inactivation of a donor site and birth of a new site (non-consensus G and new GT in major-isoform cassette exon 4)

  35. Series of mutations sequentially activating downstream acceptor sites (mutated AG in exon 4)

  36. Alternative splicing of orthologous human and mouse genes • Sequence divergence in alternative and constitutive regions • Evolution of splicing sites • Alternative splicing and protein structure

  37. Data • Alternatively spliced genes (proteins) from SwissProt • human • mouse • Protein structures from PDB • Domains from InterPro • SMART • Pfam • Prosite • etc.

  38. Alternative splicing avoids disrupting domains (and non-domain units). Control: fix the domain structure; randomly place alternative regions
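
The control described on this slide (keep the domains where they are, place the alternative regions at random) amounts to a randomization test. A minimal Python sketch follows, under several assumptions that are not in the slides: a region "disrupts" a domain when it overlaps the domain without containing it entirely, positions are 0-based half-open intervals, placement is uniform along the protein, and the function names, trial count, and seed are illustrative.

```python
import random


def disrupts(region, domains):
    """True if region overlaps some domain without containing it entirely."""
    r_start, r_end = region
    for d_start, d_end in domains:
        overlaps = r_start < d_end and d_start < r_end
        contains = r_start <= d_start and d_end <= r_end
        if overlaps and not contains:
            return True
    return False


def expected_disruption(protein_len, domains, region_lengths, trials=1000, seed=0):
    """Fraction of randomly placed regions that disrupt a domain.

    Each region length must not exceed protein_len.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        for length in region_lengths:
            start = rng.randint(0, protein_len - length)
            if disrupts((start, start + length), domains):
                hits += 1
    return hits / (trials * len(region_lengths))


if __name__ == "__main__":
    domains = [(20, 120), (200, 260)]
    observed_regions = [(130, 160), (260, 300)]
    observed = sum(disrupts(r, domains) for r in observed_regions) / len(observed_regions)
    lengths = [end - start for start, end in observed_regions]
    print("observed:", observed)
    print("expected:", expected_disruption(320, domains, lengths))
```

Comparing the observed fraction with the expected one over many proteins would indicate whether alternative regions fall on domains less often than chance predicts.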

  39. … and this is not simply a consequence of the (disputed) exon-domain correlation

  40. Positive selection towards domain shuffling (not simply avoidance of disrupting domains)

  41. Short (<50 aa) alternative splicing events within domains target protein functional sites. Figure: expected vs. observed counts of FT positions (affected, unaffected) and Prosite patterns (affected, unaffected)

  42. An attempt at integration • AS is often young (as opposed to degenerating) • young AS isoforms are often minor and tissue-specific • … but still functional • although unique isoforms may be the result of aberrant splicing • AS regions show evidence for positive selection • excess damaging SNPs • excess non-synonymous codon substitutions • MAGEA - not aberrant, because explainable by effects of mutations

  43. What to do • Each isoform (alternative region) can be characterized: • by conservation (between genomes) • if conserved, by selection (positive vs negative) • human-mouse, also add rat • pattern of SNPs (synonymous, benign, damaging) • tissue-specificity • in particular, whether it is cancer-specific • degree of inclusion (major/minor) • functionality (for isoforms) • whether it generates a frameshift • how bad it is (the distance between the stop-codon and the last exon-exon junction)
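
The last measure on this slide, the distance between the stop codon and the last exon-exon junction, is straightforward to compute once exon lengths are known in transcript order. The Python sketch below is a hedged illustration: the function name, the use of mRNA coordinates, and the sign convention (positive when the stop codon lies upstream of the last junction) are assumptions, and no cutoff for "how bad" is applied because the slides give none.

```python
def stop_to_last_junction(exon_lengths, stop_end):
    """Nucleotides from the end of the stop codon to the last exon-exon junction.

    exon_lengths: exon lengths in transcript (5' to 3') order.
    stop_end: 0-based exclusive mRNA coordinate of the end of the stop codon.
    Returns a positive number when the stop codon lies upstream of the last
    junction, zero or negative when it lies in the last exon, and None for a
    single-exon transcript, which has no junction.
    """
    if len(exon_lengths) < 2:
        return None
    last_junction = sum(exon_lengths[:-1])
    return last_junction - stop_end


if __name__ == "__main__":
    exons = [100, 150, 200]  # last junction at mRNA position 250
    print(stop_to_last_junction(exons, 180))
    print(stop_to_last_junction(exons, 300))
```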

  44. What to expect (hypotheses) • Cancer-specific isoforms will be less functional and more often non-conserved • Non-conserved isoforms will contain a larger fraction of non-functional isoforms; and this may influence evolutionary conclusions • Still, after removal of non-functional isoforms, one should see positive selection in alternative regions (more non-synonymous substitutions compared to constant regions etc.); especially in tissue-specific ones.

  45. Plans • careful and detailed analysis of human-mouse-(rat)-((dog)) AS isoforms (human and mouse ESTs) • conservation of AS regulatory sites • mosquito-drosophila • more families of paralogs; add mouse data • AS of transcription factors and receptors

  46. Acknowledgements • Discussions • Vsevolod Makeev (GosNIIGenetika) • Eugene Koonin (NCBI) • Igor Rogozin (NCBI) • Dmitry Petrov (Stanford) • Support • Ludwig Institute of Cancer Research • Howard Hughes Medical Institute

  47. Authors • Andrei Mironov (GosNIIGenetika) – spliced alignment • Shamil Sunyaev (EMBL, now Harvard University Medical School) – protein structure • Vasily Ramensky (Institute of Molecular Biology) – SNPs • Irena Artamonova (Institute of Bioorganic Chemistry) – human/mouse comparison, MAGEA family • Dmitry Malko (GosNIIGenetika) – mosquito/drosophila comparison • Eugenia Kriventseva (EBI, now BASF) – protein structure • Ramil Nurtdinov (Moscow State University) – human/mouse comparison • Ekaterina Ermakova (Moscow State University) – evolution of alternative/constitutive regions

  48. References • Nurtdinov RN, Artamonova II, Mironov AA, Gelfand MS (2003) Low conservation of alternative splicing patterns in the human and mouse genomes. Human Molecular Genetics 12: 1313-1320. • Kriventseva EV, Koch I, Apweiler R, Vingron M, Bork P, Gelfand MS, Sunyaev S (2003) Increase of functional diversity by alternative splicing. Trends in Genetics 19: 124-128. • Brudno M, Gelfand MS, Spengler S, Zorn M, Dubchak I, Conboy JG (2001) Computational analysis of candidate intron regulatory elements for tissue-specific alternative pre-mRNA splicing. Nucleic Acids Research 29: 2338-2348. • Dralyuk I, Brudno M, Gelfand MS, Zorn M, Dubchak I (2000) ASDB: database of alternatively spliced genes. Nucleic Acids Research 28: 296-297. • Mironov AA, Fickett JW, Gelfand MS (1999) Frequent alternative splicing of human genes. Genome Research 9: 1288-1293.
