Alternative Splicing
  [Figure labels: genomic DNA sequence -> transcription -> pre-mRNA (exons and introns) -> RNA processing -> two alternative mRNAs, each with a Gm cap and an AAAAA poly(A) tail]
Alternative Splicing Data Sources are Large and Growing
  Curated databases
    SWISS-PROT and RefSeq both support annotation of experimentally supported alternative splicing
  cDNA sequencing projects
    RIKEN sequenced >21000 full-length mouse cDNAs
    Many other projects underway (human, fly, plants,…)
    Shinagawa et al. (2001) Nature 409:685-90
  Microarray detection
    Direct or indirect alternative splicing detection
    Hu et al. (2001) Genome Res 11:1237-45
    Yeakley et al. (2002) Nat Biotech 20:353-9
  Public EST data sources (dbEST)
    >4.5 million human EST sequences
    >12 million total EST sequences
    About 1000 new sequences per day
    Boguski et al. (1993) Nat Genet 4:332-3
Nonsense-Mediated mRNA Decay
  [Figure labels: genomic DNA; pre-mRNA (exons and introns); exon junction complex; mRNA with Gm cap and poly(A) tail]
  Termination codon is on the last exon (not premature)
  Leeds et al. (1991) Genes Dev 5:2303-14
  Nagy and Maquat (1998) TIBS 23:198-9
  Le Hir et al. (2000) Genes Dev 14:1098-1108
  Mitchell and Tollervey (2001) Curr Opin Cell Biol 13:320-5
  Ishigaki et al. (2001) Cell 106:607-17
  Lykke-Andersen et al. (2001) Science 293:1836-9
  Kim et al. (2001) EMBO J 20:2062-68
Nonsense-Mediated mRNA Decay
  Termination codon >50 nt before last exon junction (premature termination codon)
  Interaction between EJC and release factors triggers NMD
  Decapping and degradation
  [Figure labels: mRNA with Gm cap and poly(A) tail]
Nonsense-Mediated mRNA Decay
  [Figure: two mRNAs, each with a Gm cap, an ORF, and a poly(A) tail. One is labeled "Translated normally"; the other, marked ">50 nt", is labeled "Degraded by NMD".]
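The 50 nt rule on these slides can be sketched as a small check. This is a minimal illustration, not the authors' code: the function name `is_ptc`, the exon-length input, and the choice to measure the distance from the end of the stop codon to the last exon-exon junction are all assumptions made for the example.

```python
from typing import List


def is_ptc(exon_lengths: List[int], stop_end: int, threshold: int = 50) -> bool:
    """Return True if the stop codon ends more than `threshold` nt upstream
    of the last exon-exon junction in the spliced mRNA.

    exon_lengths: exon lengths in transcript order (hypothetical input).
    stop_end: 0-based mRNA coordinate just past the last base of the stop codon.
    """
    if len(exon_lengths) < 2:
        # A single-exon transcript has no exon-exon junction.
        return False
    # mRNA coordinate at which the last exon begins.
    last_junction = sum(exon_lengths[:-1])
    return last_junction - stop_end > threshold


# Three exons of 200, 150 and 300 nt: the last junction is at position 350.
exons = [200, 150, 300]
print(is_ptc(exons, stop_end=600))  # stop in the last exon
print(is_ptc(exons, stop_end=250))  # stop 100 nt upstream of the last junction
```

A stop codon in the last exon gives a negative distance and is never flagged, which matches the "not premature" case on the earlier slide.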
NMD is Pervasive
  1498 of 1500 genes surveyed from fungi, plants, insects and vertebrates obey the PTC rule
  4.3% of reviewed RefSeqs have PTCs
  34% have start codon after first exon
    Nagy and Maquat (1998) TIBS 23:198-9
  "NMD is a critical process in normal cellular development"
    Wagner and Lykke-Andersen (2002) J Cell Sci 115:3033-8
  V(D)J recombination
    Wang et al. (2002) J Biol Chem 277:18489-93
  Renders recessive many otherwise dominant mutations
    Cali and Anderson (1998) Mol Gen Genet 260:176-84
Transcriptional Regulation
  [Figure labels: gene locus -> transcription -> pre-mRNA -> productive splicing -> productive mRNA -> translation -> protein; RUST]
Transcriptional Regulation
  [Figure: two panels, labeled "Transcriptional Regulation" and "RUST", each showing gene locus -> pre-mRNA -> productive mRNA, with transcription and productive splicing labeled]
Alternative Splicing Can Yield Isoforms Differentially Subjected to NMD
  [Figure: two panels, each showing DNA and pre-mRNA in the nucleus and the resulting mRNA; labels include "Premature termination codon" and "NMD"]
SC35 Auto-regulation
  [Figure labels: SC35 locus -> transcription -> SC35 pre-mRNA -> splicing / alternative splicing -> productive SC35 mRNA -> translation -> SC35 protein]
  Sureau et al. (2001) EMBO J 20:1785-96
SC35 Auto-regulation
  Alternative splicing coupled with nonsense-mediated decay
  [Figure labels: SC35 locus; SC35 pre-mRNA; productive SC35 mRNA (Gm cap, ORF, poly(A) tail); SC35 protein; SC35 mRNA with premature termination codon (Gm cap, poly(A) tail)]
  Sureau et al. (2001) EMBO J 20:1785-96
EST-inferred human isoforms
  All isoforms, including canonical: 8820
  Alternative isoforms: 5693
  NMD candidates: 1989 (35% of 5693)
  [Bar chart; axis from 0 to 10000]
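The percentage on this slide can be reproduced directly from the three counts. The short sketch below only restates that arithmetic; the variable names are illustrative, and the second ratio (NMD candidates as a share of all isoforms) is derived here and does not appear on the slide.

```python
# Counts taken from the slide.
nmd_candidates = 1989
alternative_isoforms = 5693
all_isoforms = 8820

# Share of alternative isoforms that are NMD candidates (the slide's 35%).
frac_of_alternative = nmd_candidates / alternative_isoforms

# Derived for comparison only; this figure is not stated on the slide.
frac_of_all = nmd_candidates / all_isoforms

print(f"{frac_of_alternative:.1%} of alternative isoforms")  # 34.9%
print(f"{frac_of_all:.1%} of all isoforms")                  # 22.6%
```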
Canonical Splice Forms
  Inputs
    RefSeq mRNAs. Pruitt, K.D. et al. (2001) NAR 29:137-40
    Genomic contigs (genomic DNA sequence). Lander et al. (2001) Nature 409:860-921
  Steps
    1. Extract coding regions from RefSeq mRNAs -> coding RefSeqs
    2. Associate coding RefSeqs with genomic contigs via LocusLink -> RefSeq-contig pairs
    3. Align with Spidey, requiring ≥98% identity and no gaps. Wheelan et al. (2001) Genome Res 11:1952-7
    4. Construct genes from aligned RefSeq exons and intervening genomic introns (on overlap, choose the mRNA with the largest number of exons) -> RefSeq-coding genes
  [Figure: RefSeq-coding gene with Exon 1, Exon 2, Exon 3, Exon 4 and the corresponding mRNA]
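The filtering and overlap-resolution steps on this slide can be sketched as follows. This is a hedged illustration under stated assumptions, not the pipeline actually used: the `Alignment` record, its field names, and the greedy tie-breaking are hypothetical, strand is ignored, and the identity and gap values are assumed to have been parsed from alignment output beforehand.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Alignment:
    """Hypothetical record for one RefSeq-contig alignment."""
    refseq_id: str
    contig_id: str
    identity: float                # percent identity, 0-100
    gaps: int                      # number of alignment gaps
    exons: List[Tuple[int, int]]   # (start, end) on the contig, end exclusive


def passes_filter(a: Alignment, min_identity: float = 98.0) -> bool:
    # Slide criterion: at least 98% identity and no gaps.
    return a.identity >= min_identity and a.gaps == 0


def span(a: Alignment) -> Tuple[int, int]:
    # Genomic extent covered by the aligned exons.
    return min(s for s, _ in a.exons), max(e for _, e in a.exons)


def overlaps(a: Alignment, b: Alignment) -> bool:
    if a.contig_id != b.contig_id:
        return False
    (s1, e1), (s2, e2) = span(a), span(b)
    return s1 < e2 and s2 < e1


def canonical_forms(alignments: List[Alignment]) -> List[Alignment]:
    """Keep filtered alignments; where loci overlap, keep the one with the most exons."""
    kept: List[Alignment] = []
    candidates = [a for a in alignments if passes_filter(a)]
    # Visit candidates with the most exons first so they win any overlap.
    for a in sorted(candidates, key=lambda x: len(x.exons), reverse=True):
        if not any(overlaps(a, k) for k in kept):
            kept.append(a)
    return kept


demo = [
    Alignment("A", "c1", 99.0, 0, [(100, 200), (400, 500), (800, 900)]),
    Alignment("B", "c1", 98.5, 0, [(150, 200), (300, 350), (400, 500), (600, 650), (800, 950)]),
    Alignment("C", "c1", 97.0, 0, [(2000, 2100), (2200, 2300), (2400, 2500), (2600, 2700)]),
    Alignment("D", "c1", 100.0, 0, [(5000, 5100), (5300, 5400)]),
]
print([a.refseq_id for a in canonical_forms(demo)])  # ['B', 'D']
```

In the demo, C fails the identity threshold, and A is dropped because it overlaps B, which has more exons.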
Identification of Alternative Isoforms
  Inputs
    ESTs from dbEST. Boguski et al. (1993) Nat Genet 4:332-3
    RefSeq-coding genes
  Steps
    1. Cluster ESTs with WU-BLAST2 (≥92% identity, gaps allowed). Gish (2002) (Wash. Univ.)
    2. Align ESTs with sim4. Florea et al. (1998) Genome Res 8:967-74
    3. Use TAP to infer alternative mRNAs. Kan et al. (2001) Genome Res 11:889-900
  Output: alternative isoforms of RefSeq-coding genes
  >92% identity, gaps allowed
  Aligned EST 5' end does not indicate reading frame
Alternative Splicing
  Recruitment of sequence
  Deletion of sequence
  *Frameshift and truncation
  * when the change is not an integer number of codons
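The "not an integer number of codons" condition is a divisibility-by-three test on the length of the recruited or deleted sequence. A minimal sketch, assuming the length change is already known in nucleotides; the function name and return labels are illustrative only.

```python
def splicing_effect(length_change: int) -> str:
    """Classify a recruited (+) or deleted (-) coding segment by its length in nt."""
    if length_change == 0:
        return "no change"
    if length_change % 3 != 0:
        # Not a whole number of codons: the downstream reading frame shifts.
        return "frameshift"
    return "in-frame recruitment" if length_change > 0 else "in-frame deletion"


print(splicing_effect(96))   # in-frame recruitment
print(splicing_effect(-45))  # in-frame deletion
print(splicing_effect(77))   # frameshift
```

A frameshift usually runs into a stop codon soon after the altered junction, which is the truncation the slide refers to. An in-frame recruited segment can still carry its own stop codon, so this test alone does not rule out a premature stop.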
EST Limitations
  Single-pass sequencing errors
  Incompletely processed transcripts
  3' end bias
  Library contamination
  Thanaraj (1999) NAR 27:2627-37
Alternative Splicing EST Analysis
  [Figure: from data in Brett et al. (2000) FEBS Lett 474:83-6]
EST coverage and premature stops
  For 76% of isoforms with premature stops, ESTs cover a PTC and a splice junction downstream of it
  In 80% of these isoforms, there is a PTC in every reading frame
  [Figure: RefSeq mRNA aligned with an alternatively spliced EST shown in reading frames 0, 1 and 2]
  Alternative polyadenylation signals are biased against recovery
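Because an EST's 5' end does not fix the reading frame, one way to test the "PTC in every reading frame" condition is to scan all three forward frames for a stop codon. The sketch below assumes a plain DNA string on the sense strand; the function names are hypothetical and the toy sequence is invented for illustration. It does not check whether a splice junction lies downstream of the stop.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop(seq: str, frame: int) -> Optional[int]:
    """Return the 0-based index of the first in-frame stop codon, or None."""
    seq = seq.upper()
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


def stop_in_every_frame(seq: str) -> bool:
    """True if each of the three forward reading frames contains a stop codon."""
    return all(first_stop(seq, frame) is not None for frame in range(3))


toy_est = "ATGTAAATAGCTGACC"  # invented sequence, not real EST data
print([first_stop(toy_est, f) for f in range(3)])  # [3, 7, 11]
print(stop_in_every_frame(toy_est))                # True
print(stop_in_every_frame("ATGAAACCC"))            # False
```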