Yan Zhou & Mohamed Tikah M. BIN6002 Summer, 2005. Genome annotation searching for group I and group II introns. Assembling a mitochondrial Genome. Assembled 1002 reads using phred/phrap/consed into one contiguous sequence of 69586 base pairs. Sequencing output was of high quality.
Assembling a mitochondrial Genome • Assembled 1002 reads using phred/phrap/consed into one contiguous sequence of 69586 base pairs. • Sequencing output was of high quality. • Software tools did 99% of the job.
Genome annotation • A BLAST search with the assembled sequence showed high similarity to the previously annotated mitochondrial genome of Reclinomonas americana nz, strain ATCC 50395. • Further (exact) alignments allowed the identification of at least 95 genes, 26 of which encode tRNAs.
Genome annotation • R. americana is a freshwater protozoan. • It belongs to the protozoan group called ‘Jakobids’. • Its mitochondrial genome is gene-rich and is thought to most closely resemble the ancestral mitochondrion. GOBASE (2005)
Genome annotation • The nad and cox genes (nadn, coxn), rnl, rns, rrn5 … were all identified. • In most cases, the coding sequences from the two genomes were highly similar (over 85% identity). • Gene order is (so far) conserved. • In one case, the overlap between adjacent genes is also conserved.
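The identity figure quoted above can be illustrated with a minimal sketch. It assumes the two coding sequences have already been aligned (gaps written as `-`). The function name `percent_identity` and the toy fragments are hypothetical and are not taken from the slides.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are skipped (an assumption of this sketch)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy aligned fragments, invented for illustration (not real data).
new_seq = "ATGGCT-TTACGA"
ref_seq = "ATGGCTATTGCGA"
identity = percent_identity(new_seq, ref_seq)
print(f"{identity:.1f}% identity")
```

Skipping gap columns is only one convention. BLAST reports identity over the full alignment length, gaps included, so its figures can be slightly lower for the same alignment.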
Genome annotation • tRNAscan-SE identified 25 Cove-confirmed tRNA candidates. • Folding some of these candidates with mfold gave the expected tRNA secondary structure. • One tRNA, tRNAW, was not detected by tRNAscan-SE.
Genome annotation • Is the group II intron of R. americana nz also present? • Probably yes. • Locating it allowed the identification of the 26th tRNA, tRNAW(cca).
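As a quick check on the tRNAW(cca) label, the codon read by an anticodon is its reverse complement. The sketch below assumes strict Watson-Crick pairing and ignores wobble; the helper name `codon_for_anticodon` is hypothetical.

```python
# RNA base-pairing table (Watson-Crick only; wobble pairing is ignored here).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def codon_for_anticodon(anticodon: str) -> str:
    """Return the codon (5'->3') that pairs with an anticodon given 5'->3'."""
    rna = anticodon.upper().replace("T", "U")
    return "".join(COMPLEMENT[base] for base in reversed(rna))


codon = codon_for_anticodon("cca")
print(codon)  # UGG, the tryptophan (W) codon
```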
Genome annotation • 70% similar to the group II intron in R. americana nz. • Identified ‘GUGCG’ at the 5’ end and other consensus sequences. • Does not contain an ORF. N. Toor et al. (2001)
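A search for the 5’ motif can be sketched as a plain string scan. This is only an illustration: it assumes the genome is stored as DNA (so U is written as T), it looks at the forward strand only, and the function name `find_motif` and the toy sequence are hypothetical.

```python
import re
from typing import List


def find_motif(dna: str, motif_rna: str = "GUGCG") -> List[int]:
    """Return 0-based start positions of an RNA motif in a DNA sequence."""
    pattern = re.escape(motif_rna.upper().replace("U", "T"))
    seq = dna.upper()
    # A lookahead lets overlapping occurrences be reported as well.
    return [m.start() for m in re.finditer(f"(?={pattern})", seq)]


# Toy sequence, invented for illustration (not real data).
toy = "AAGTGCGTTTGTGCGA"
hits = find_motif(toy)
print(hits)
```

A motif this short occurs by chance many times in a 69586 bp genome, so a hit is only a candidate 5’ end and still needs the other consensus elements to support it.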
Genome annotation • RNase P is a ribonuclease that cleaves the extra 5’ leader sequence from precursor tRNAs (its RNA subunit can act alone as a catalyst in vitro). • rnpB, which encodes the RNase P RNA, is found in R. americana nz and also in the new genome. • The secondary structure has been resolved; the new RNase P RNA probably has a similar secondary structure.
Genome annotation [Figure not available. Source: GOBASE (2005)]
Whole sequence → Blastn / GeneScan → mRNA for each gene → blastx → Exon + Intron
Group I intron Self-splicing intron that requires an external guanine-containing nucleotide for splicing; releases the intron in a linear form.
Finding group I introns • Locate the exons, then identify the splice sites (U|G, N’N’G). • Use citron to locate P7, P4, P5 and P6, then P3. • Draw the structure manually and verify it.
Group II intron Self-splicing intron that does not require an external nucleotide for splicing; releases the intron in a lariat form.
Group II intron Group II introns fold into a conserved secondary structure consisting of six domains arranged around a central wheel. Domain 1 is the largest domain by far, while domain 5 is most conserved in sequence and is considered the active site of the ribozyme.
The ORF contains the seven domains common to all reverse transcriptases (RTs), as well as domain X, the D domain and the En domain. Domain X is probably analogous to the thumb domain of other polymerases, and is associated with the maturase (splicing) activity of the protein. The D domain is a DNA-binding region, and the En domain has a nuclease activity used in the mobility reaction.
• Does the unknown sequence contain an ORF closely related to group II intron RTs? (BLAST search) • Is the unknown sequence >80% identical to another group II intron DNA sequence? (alignment) • What is the closest relative of the intron? • Can intron domains 5 and 6 be located? • Identification of the 5’ end.
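The first question needs candidate ORFs before any BLAST search can be run. The sketch below shows one simple way to find the longest ORF on the forward strand. It is not the procedure used in the slides: the ATG start, the TAA/TAG/TGA stop codons, and the name `longest_orf` are all assumptions for illustration, and the reverse strand would need the same scan on the reverse complement.

```python
# Assumption: standard stop codons; adjust if the genome uses another code.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(dna: str, start_codon: str = "ATG"):
    """Return (start, end) of the longest forward-strand ORF, end exclusive.

    The span runs from the start codon through the stop codon.
    Returns (0, 0) when no complete ORF is found.
    """
    seq = dna.upper()
    best = (0, 0)
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == start_codon:
                start = i  # first start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start > best[1] - best[0]:
                    best = (start, i + 3)
                start = None  # look for the next ORF in this frame
    return best


# Toy sequence, invented for illustration (not real data).
toy = "CCATGAAATTTGGGTAACC"
orf_start, orf_end = longest_orf(toy)
print(toy[orf_start:orf_end])
```

The nucleotide span returned here could then be submitted to blastx to see whether it resembles a group II intron RT.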
Conclusion • We have almost finished annotating both sequences. • We have identified some of the group I and group II introns in the two sequences and predicted their secondary structures.
Future work • Try to obtain all introns’ secondary structure in sequence 2. • Convert the annotation to a master file format.
Thanks to everyone. QUESTIONS ?