Bioinformatics Lecture 13 • Alternative splicing • Multiple isoforms • Exonic Splicing Enhancers (ESE) and Silencers (ESS) • SpliceNest
ALTERNATIVE SPLICING • Two or more mRNA molecules can be produced from the same gene (diagram: Gene → mRNA 1, mRNA 2) • The Dscam gene in Drosophila melanogaster can potentially produce 38,016 different mature transcripts, while the entire Drosophila genome contains only ~14,000 genes.
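The figure of 38,016 follows from simple multiplication. As a minimal sketch, assuming the commonly cited numbers of mutually exclusive alternatives in the four variable exon clusters of Dscam (12, 48, 33 and 2; these values come from the literature, not from the slide), and assuming the clusters are chosen independently:

```python
# Assumed numbers of mutually exclusive alternatives per variable exon
# cluster of Dscam (literature values, not stated on the slide).
DSCAM_EXON_VARIANTS = {"exon 4": 12, "exon 6": 48, "exon 9": 33, "exon 17": 2}


def count_isoforms(variants_per_cluster):
    """Multiply the number of choices in each independent exon cluster."""
    total = 1
    for n_choices in variants_per_cluster.values():
        total *= n_choices
    return total


total_isoforms = count_isoforms(DSCAM_EXON_VARIANTS)
print(total_isoforms)  # 38016
```

The name `count_isoforms` is illustrative only. The result is a combinatorial upper bound, not a count of transcripts that have each been observed.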
One Gene, Many Proteins • The classical view, ONE GENE → ONE PROTEIN, is not correct for at least 40-60% of studied mammalian genes • Data show that many variants of mRNA and proteins can be produced from the same gene (diagram: Gene → mRNA1 → Protein1; mRNA2 → Protein2; mRNA3 → Protein3)
Gene prediction/identification and alternative splicing • While gene prediction can be done relatively precisely, this may not be sufficient to predict the structure of the mature mRNA • Different alternative mRNA isoforms can be produced from the same gene in different tissues and at different times • This means that numerous factors can enhance or silence certain splice sites • Identification of these factors is essential for improving the predictive power of computer programs • It is particularly important to combine experimental and computational studies in order to make progress in this field
Five common models of mRNA alternative splicing • Exon skipping/inclusion • Alternative 3’ splice sites • Alternative 5’ splice sites • Mutually exclusive exons • Intron retention (figure legend: constitutive exon; alternatively spliced exon)
Models of serine/arginine-rich (SR) protein action in Exonic Splicing Enhancer (ESE) dependent splicing • U2 snRNP: U2 small nuclear ribonucleoprotein; RRM: RNA recognition motif; RS: Arg/Ser-enriched domain; ESS: Exonic Splicing Silencer • THE MODELS ARE NOT MUTUALLY EXCLUSIVE AND MAY HAVE NUMEROUS VARIATIONS
Predictive identification of exonic splicing enhancers (ESE) in human genes • ESEs play important roles in constitutive and alternative splicing. • A computational method, RESCUE-ESE, was developed that predicts which sequences have ESE activity by statistical analysis of exon-intron and splice site composition. • When large data sets of human gene sequences were used, this method identified 10 predicted ESE motifs. Representatives of all 10 motifs were found to display enhancer activity in vivo, whereas point mutants of these sequences exhibited sharply reduced activity. • The motifs identified enable prediction of the splicing phenotypes of exonic mutations in human genes
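The idea of finding candidate motifs by comparing sequence composition can be illustrated with a small sketch. This is not the RESCUE-ESE implementation. It only ranks hexamers by how much more frequent they are in exons than in introns, and it omits the splice-site comparison and any statistical testing. The function names and toy sequences are hypothetical.

```python
from collections import Counter


def kmer_frequencies(sequences, k=6):
    """Return the relative frequency of every k-mer across the sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


def enriched_kmers(exons, introns, k=6, min_diff=0.0):
    """Rank k-mers by (exon frequency - intron frequency), largest first."""
    exon_freq = kmer_frequencies(exons, k)
    intron_freq = kmer_frequencies(introns, k)
    diffs = {
        kmer: freq - intron_freq.get(kmer, 0.0)
        for kmer, freq in exon_freq.items()
    }
    hits = [(kmer, d) for kmer, d in diffs.items() if d > min_diff]
    # Sort by decreasing enrichment, then alphabetically for stable output.
    return sorted(hits, key=lambda kd: (-kd[1], kd[0]))


# Toy data only; real input would be large sets of annotated exons and introns.
toy_exons = ["GAAGAAGAAG"]
toy_introns = ["TTTTTTTTTT"]
candidates = enriched_kmers(toy_exons, toy_introns)
print(candidates[0][0])  # AAGAAG
```

On real data, a frequency difference alone would produce many false positives, which is why a statistical criterion and experimental validation, as described above, are needed.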
Consensus RNA motifs for the binding sites of four serine/arginine-rich (SR) proteins acting at exonic splicing enhancers (ESE)
Expressed Sequence Tags and splice sites • An expressed sequence tag (EST) is a short sequence from the expressed part of a gene, made from cDNA, which can be used to fish the rest of the gene out of the chromosome by matching base pairs with part of the gene. • ESTs, and particularly consensus sequences of clustered ESTs, provide useful information about splice variants of genes. • Predicted human mRNA sequences were mapped onto human genomic DNA to compute gene structure and splice variants. The results have been collected in a public database, SpliceNest, with a web-based interactive graphical user interface. Similar computations can be done for several other species.
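Once transcripts have been mapped to the genome, splice variants can be found by comparing their exon coordinates. The sketch below detects only one event type, exon skipping. It assumes each transcript is given as a sorted list of (start, end) genomic intervals and that shared exons have identical coordinates. The function name and coordinates are hypothetical, and this is not how SpliceNest itself is implemented.

```python
def skipped_exons(reference_exons, variant_exons):
    """Return reference exons missing from the variant but flanked by shared exons.

    Both arguments are lists of (start, end) tuples sorted by start position.
    """
    variant = set(variant_exons)
    shared_idx = [i for i, exon in enumerate(reference_exons) if exon in variant]
    if len(shared_idx) < 2:
        # Without two shared exons we cannot say an internal exon was skipped.
        return []
    first, last = shared_idx[0], shared_idx[-1]
    return [exon for exon in reference_exons[first:last + 1] if exon not in variant]


# Toy coordinates: the variant lacks the middle exon of the reference.
reference = [(100, 200), (300, 400), (500, 600)]
variant = [(100, 200), (500, 600)]
print(skipped_exons(reference, variant))  # [(300, 400)]
```

Requiring exact coordinate matches keeps the example short. With real EST data the boundaries are noisy, so a tolerance or a splice-site-aware aligner would be needed.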
SpliceNest: visualizing gene structure and alternative splicing based on EST clusters • SpliceNest is a tool to explore gene structure, including alternative splicing, based on a mapping of the EST consensus sequences (contigs) from GeneNest to the complete human genome. • SpliceNest is integrated with GeneNest and the SYSTERS protein sequence cluster set in one framework, permitting an overall exploration of the whole sequence space covering protein, mRNA and EST sequences, as well as genomic DNA.
Cluster: A group of ESTs and/or mRNAs that are sufficiently similar to assume that they constitute transcripts from the same gene. Contig: A representation of a (partial) transcript summarized by a consensus sequence, created by multiple alignment of overlapping sequences.
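The contig definition can be made concrete with a minimal sketch of a consensus call. It assumes the overlapping sequences of a cluster have already been multiply aligned and padded with "-" to equal length. It takes a simple majority vote per column, which is an illustrative choice and not necessarily the rule used by GeneNest. The function name and toy alignment are hypothetical.

```python
from collections import Counter


def consensus(aligned_sequences, gap="-"):
    """Majority-vote consensus of equal-length, gap-padded aligned sequences."""
    if not aligned_sequences:
        return ""
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("all aligned sequences must have the same length")
    result = []
    for column in zip(*aligned_sequences):
        counts = Counter(base for base in column if base != gap)
        if counts:
            # Take the most frequent base; columns with only gaps are dropped.
            result.append(counts.most_common(1)[0][0])
    return "".join(result)


# Toy alignment of three overlapping ESTs from one cluster.
toy_alignment = [
    "ACGT--",
    "-CGTTA",
    "--GATA",
]
contig = consensus(toy_alignment)
print(contig)  # ACGTTA
```

Each EST covers only part of the transcript, and the consensus spans their union, which is why a contig can represent a longer (partial) transcript than any single EST.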