Finding genes and introns: tRNAscan-SE, fasta, tfasta, sequence motifs, secondary structure …
Finding genes and introns: tRNAscan-SE, fasta, tfasta …
• which organism group, which genome, and which expected gene and intron models?
• gene models
  • protein genes, translation code, SD motifs, promoters, terminators …
  • structured RNAs, helices and conserved sequence motifs
  • genes in pieces, introns, trans-splicing
  • genes in introns
  • alternative splicing
• intron models, how to find and distinguish them
  • tRNA and spliceosomal introns (nuclear, eukaryotic)
    • splice rules and motifs
  • group I and group II catalytic introns (organelle, bacteria …)
    • splice rules, motifs, and secondary structure conservation
• find exons (best fit of sequence similarity, RNA structure in exons)
Which organism group, which genome, which gene and intron models?
Examples:
Eukaryotic nuclear:
• standard translation code with some exceptions (ciliates …)
• spliceosomal and tRNA introns
• group I introns in ribosomal RNA genes
• trans- and alternative splicing
Eukaryotic organelles:
• various non-standard translation codes
• frequent group I and II introns
• genes in introns
• genes in pieces, trans-splicing …
Bacteria:
• SD motifs, bacterial promoters and terminators
• standard translation code with some exceptions (e.g. UGA read as Trp)
• few if any group I and II introns
Mitochondrial example
• structured RNAs (tRNAs, rRNAs …) are often poorly conserved at the primary sequence level
• know, or find, the translation code for protein genes
• don’t expect conserved SD motifs, standard promoters, terminators …
• genes in pieces
• group I and II introns, and genes in introns
Genes (mitochondrial example)
Find genes for structured RNAs
• similarity search finds primary sequence conservation (fasta)
• search for secondary structure + primary sequence conservation (tRNAscan-SE, RNAmotif, Erpin …)
Demo fasta with rRNAs
Demo tRNAscan-SE
Protein genes (mitochondrial example)
Know, or find, the translation code for protein genes
• translate open reading frames, check for adjacent pieces of conserved protein sequence (translate and fasta or blast, or tfasta)
• build a multiple alignment of proteins, infer changes of codon meaning from the most conserved positions in the alignment (clustalw, muscle, …)
Demo translation and fasta
Demo tfasta
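The "translate open reading frames" step can be sketched in a few lines of Python. This is only an illustration, not the tool used in the demos: the helper names (`make_code`, `translate`, `six_frames`) are hypothetical, and the only code change shown is UGA read as Trp, which is an assumption that has to be checked for each genome.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def make_code(overrides=None):
    """Return a codon -> amino acid table, with optional reassigned codons."""
    codons = ("".join(c) for c in product(BASES, repeat=3))
    table = dict(zip(codons, STANDARD_AAS))
    table.update(overrides or {})
    return table


def translate(dna, table):
    """Translate frame 1 of dna; unknown codons (e.g. with N) become X."""
    dna = dna.upper().replace("U", "T")
    return "".join(table.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3))


def reverse_complement(dna):
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def six_frames(dna, table):
    """Translate all six reading frames; keys are '+1'..'+3' and '-1'..'-3'."""
    frames = {}
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:], table)
    return frames


standard = make_code()
# Assumption for illustration: a mitochondrial code in which TGA encodes Trp.
mito = make_code({"TGA": "W"})

print(translate("ATGTGATGGTAA", standard))  # TGA is read as a stop
print(translate("ATGTGATGGTAA", mito))      # TGA is read as Trp
```

A reading frame that is interrupted by stops under the standard code but stays open under a modified code is a hint that the modified code is the right one. The translated frames can then be compared with known proteins using fasta or blast.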
Introns (mitochondrial example)
• translate open reading frames, check for non-adjacent pieces of conserved protein sequence (translate and fasta or blast, or tfasta); these indicate introns (or genes in pieces)
• identify intron type
Demo translation and fasta (demo1.seq)
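The idea of spotting introns from non-adjacent pieces of conserved protein can be sketched as follows. The sketch assumes that similarity hits have already been extracted from fasta, blast or tfasta output into simple coordinate records. The `Hit` record, the function name and the two thresholds are hypothetical choices for illustration, and only forward-strand hits are handled.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One similarity hit; all coordinates are 1-based and inclusive."""
    prot_start: int  # position in the reference protein
    prot_end: int
    dna_start: int   # position in the genome (forward strand assumed)
    dna_end: int


def candidate_introns(hits, min_gap=100, max_prot_jump=10):
    """Report genomic gaps between hits that are consecutive in the protein.

    A gap is a candidate intron when two hits continue the same protein
    (small jump in protein coordinates) but lie far apart in the DNA.
    Both thresholds are arbitrary illustration values.
    """
    ordered = sorted(hits, key=lambda h: h.dna_start)
    found = []
    for left, right in zip(ordered, ordered[1:]):
        dna_gap = right.dna_start - left.dna_end - 1
        prot_jump = right.prot_start - left.prot_end - 1
        if dna_gap >= min_gap and abs(prot_jump) <= max_prot_jump:
            found.append((left.dna_end + 1, right.dna_start - 1))
    return found


hits = [
    Hit(1, 50, 101, 250),
    Hit(51, 120, 1251, 1460),   # continues the protein after a 1000 nt gap
    Hit(121, 150, 1461, 1550),  # directly adjacent, no gap
]
print(candidate_introns(hits))
```

The reported intervals are only approximate. The exact exon-intron boundaries still have to be fitted using the splice rules of the intron type.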
Introns (mitochondrial example)
Identification of intron type
• group I introns:
  • check for 5’ GT... end of exon and ...G intron end
  • check for P1 pairing, and other structural features
  • fit predicted exon-intron boundaries
• group II introns:
  • check for potential intron start with GNGNG
  • check for domain V pairing, and other structural features
  • fit predicted exon-intron boundaries
Demo group I intron
Demo group II intron
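The sequence-level checks above can be sketched in Python. The function names are hypothetical. The group I check is one interpretation of the slide: it tests for a T (U in RNA) as the last exon base and a G as the last intron base. The structural checks (P1 pairing, domain V) are not covered and need dedicated RNA structure tools.

```python
import re

# Lookahead so that overlapping GNGNG matches are all reported.
GROUP_II_START = re.compile(r"(?=G[ACGT]G[ACGT]G)")


def group_ii_start_candidates(dna):
    """Return 0-based positions where a GNGNG motif begins."""
    dna = dna.upper().replace("U", "T")
    return [m.start() for m in GROUP_II_START.finditer(dna)]


def group_i_ends_ok(dna, intron_start, intron_end):
    """Check the ends of a proposed group I intron.

    intron_start is the 0-based index of the first intron base and
    intron_end is the index just past the last intron base.
    Assumed rule: upstream exon ends in T, intron ends in G.
    """
    dna = dna.upper().replace("U", "T")
    if intron_start < 1 or intron_end > len(dna) or intron_end <= intron_start:
        return False
    return dna[intron_start - 1] == "T" and dna[intron_end - 1] == "G"


print(group_ii_start_candidates("AAGTGCGAA"))
print(group_i_ends_ok("AAATCCCCGAAA", 4, 9))
```

A short motif like GNGNG occurs often by chance, so a match is only a candidate position. It becomes convincing when it also agrees with the predicted exon boundaries and with the expected secondary structure.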