Genes Determining Plant-Cyanobacterial Symbioses and Consideration of Blast. Thursday, 5 June 2008. Problems in sequence analysis. Identification by sequence similarity.
Genes Determining Plant-Cyanobacterial Symbioses and Consideration of Blast Thursday, 5 June 2008 • Problems in sequence analysis • Identification by sequence similarity This demonstration is best viewed as a slide show, enabling you to simulate a session and make changes in cursor position more obvious. To do this, click Slide Show on the top tool bar, then View show. Click anywhere to go on to the next slide.
Gland development is stimulated by N-limitation. Gland suppressed by presence of fixed N. Plant starved for N makes gland to house cyanobacteria. What's special about the gland? What genes are specifically expressed in glands? [Figure panels: 10 mM nitrate; 0.1 mM nitrate]
Construction of a cDNA library from Gunnera gland. mRNA ends with polyA tails. Use modified polyT to direct synthesis of DNA copy of mRNA. Reverse Transcriptase (RT) adds CCC to end. Add 2nd adapter, using GGG to attach to CCC. Extend cDNA.
3'-TTTTTCTTTTTTCATGGCTGACGCTGAGACGCAACTATGGTGACGAA-5' Construction of a cDNA library from Gunnera gland (Same protocol, but with real sequences) 5'-NNNNNNNNNN ... NNNNNNNNNNAAAAAAAAAAAAAAAAA...-3' Use modified polyT adapter to direct synthesis of DNA copy of mRNA
3'-TTTTTCTTTTTTCATGGCTGACGCTGAGACGCAACTATGGTGACGAA-5' Construction of a cDNA library from Gunnera gland 5'-NNNNNNNNNN ... NNNNNNNNNNAAAAAAAAAAAAAAAAA...-3' Use modified polyT adapter to direct synthesis of DNA copy of mRNA The adapter can bind to many positions in polyA tail, resulting in variation in number of T's in cDNA sequence.
TTTTTCTTTTTTCATGGCTGACGCTGAGACGCAACTATGGTGACGAA-5' Construction of a cDNA library from Gunnera gland 5'-NNNNNNNNNN ... NNNNNNNNNNAAAAAAAAAAAAAAAAA...-3' 3'-CCCNNNNNNNNNN ... NNNNNNNNNN Reverse Transcriptase (RT) extends the adapter to the end of the mRNA and adds CCC to the 3' end.
TTTTTCTTTTTTCATGGCTGACGCTGAGACGCAACTATGGTGACGAA-5' 5'-AAGCAGTGGTATCAACGCAGAGTGGCCATTACGGCCGGG Construction of a cDNA library from Gunnera gland 5'-NNNNNNNNNN ... NNNNNNNNNNAAAAAAAAAAAAAAAAA...-3' 3'-CCCNNNNNNNNNN ... NNNNNNNNNN 3'-CCCNNNNNNNNNN ... A second adapter is added which (with the help of antibodies to) uses three G's to bind to the three C's.
TTTTTCTTTTTTCATGGCTGACGCTGAGACGCAACTATGGTGACGAA-5' 5'-AAGCAGTGGTATCAACGCAGAGTGGCCATTACGGCCGGG Construction of a cDNA library from Gunnera gland 5'-NNNNNNNNNN ... NNNNNNNNNNAAAAAAAAAAAAAAAAA...-3' 3'-CCCNNNNNNNNNN ... NNNNNNNNNN TTCGTCACCATAGTTGCGTCTCACCGGTAATGCCGG CCCNNNNNNNNNN ... The cDNA sequence is extended to the left, using the second adapter as a template.
TTTTTCTTTTTTCATGGCTGACGCTGAGACGCAACTATGGTGACGAA-5' 5'-AAGCAGTGGTATCAACGCAGAGTGGCCATTACGGCCGGG Construction of a cDNA library from Gunnera gland 5'-NNNNNNNNNN ... NNNNNNNNNNAAAAAAAAAAAAAAAAA...-3' 3'-CCCNNNNNNNNNN ... NNNNNNNNNN NNNNNNNNNN ... TTCGTCACCATAGTTGCGTCTCACCGGTAATGCCGG CCCNNNNNNNNNN ... The cDNA sequence is extended to the left, using the second adapter as a template… …and then the second cDNA strand is synthesized left-to-right, using the first cDNA strand as the template.
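The strand relationships on this slide can be checked with a few lines of Python. This is only an illustrative sketch: the function names are invented for the example (they are not BioBIKE functions), and the only real data used is the second adapter sequence shown on the slides. It shows that the strand drawn underneath the second adapter is its base-by-base complement, and what that same strand looks like when written in the usual 5'-to-3' direction.

```python
# Illustrative sketch; function names are hypothetical, not part of BioBIKE.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def complement(seq):
    """Return the base-by-base complement, in the same left-to-right order."""
    return seq.upper().translate(COMPLEMENT)

def reverse_complement(seq):
    """Return the opposite strand, written 5'->3'."""
    return complement(seq)[::-1]

# Second adapter, written 5'->3' as on the slides.
SECOND_ADAPTER = "AAGCAGTGGTATCAACGCAGAGTGGCCATTACGGCCGGG"

# Drawn left-to-right underneath the adapter (so 3'->5'), the extended first
# cDNA strand is simply the complement of the adapter.
print(complement(SECOND_ADAPTER))

# The same strand written in the conventional 5'->3' direction.
print(reverse_complement(SECOND_ADAPTER))
```

The first printed line should match the sequence drawn beneath the second adapter on the slide; the second is what a sequencing read of that strand would report.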
TTTTTCTTTTTTCATGGCTGACGCTGAGACGCAACTATGGTGACGAA-5' Hundreds to thousands of nucleotides 5'-AAGCAGTGGTATCAACGCAGAGTGGCCATTACGGCCGGG Construction of a cDNA library from Gunnera gland 5'-NNNNNNNNNN ... NNNNNNNNNNAAAAAAAAAAAAAAAAA...-3' 3'-CCCNNNNNNNNNN ... NNNNNNNNNN NNNNNNNNNN ... TTCGTCACCATAGTTGCGTCTCACCGGTAATGCCGG CCCNNNNNNNNNN ... To give some perspective, the adapters are about 50 nucleotides, while the mRNA itself can be as large as a couple of thousand nucleotides.
Construction of a cDNA library from Gunnera gland Of course there are thousands of different mRNA's in a cell, leading to thousands of cDNA's in the library, all in multiple copies.
Sequencing of cDNA library Limitations: It would be nice to be able to sequence the cDNA's from end to end, but that's not presently possible. Sequencing has its limitations. - Only from ends - Only ~400 nt
Sequencing of cDNA library Limitations: - Only from ends - Only ~400 nt The solution is to break up the cDNA so that there are multiple, overlapping ends from which to sequence. In this way, the full length of the cDNA can be sequenced. Solution: - Break the cDNA
Sequencing of cDNA library (1000's of cDNA's) The broken fragments are read from either end (at random). If there are enough reads, it is possible to use overlaps to reassemble the original sequence. Unfortunately, the adapters are also sequenced, and these complicate the assembly process, as they're interpreted as overlapping sequences, leading to misassembly. They need to be removed.
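To make the overlap idea concrete, here is a toy sketch in Python. It is not how a real assembler works (real assemblers tolerate sequencing errors and compare both strands); the function names and the short example reads are invented for illustration. Only the adapter sequence is taken from the slides. The second half shows why leftover adapters are a problem: two reads from unrelated cDNAs appear to overlap by the full length of the adapter.

```python
# Toy sketch of overlap-based joining; names and example reads are hypothetical.
ADAPTER = "AAGCAGTGGTATCAACGCAGAGTGGCCATTACGGCCGGG"  # second adapter from the slides

def longest_overlap(left, right, min_len=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for k in range(min(len(left), len(right)), min_len - 1, -1):
        if left[-k:] == right[:k]:
            return k
    return 0

def merge(left, right, min_len=3):
    """Join two reads through their overlap, or return None if they don't overlap."""
    k = longest_overlap(left, right, min_len)
    return left + right[k:] if k else None

# Two reads that genuinely overlap (made-up sequences).
read_1 = "ATGGCGTACGTTAGC"
read_2 = "CGTTAGCAATGCC"
print(longest_overlap(read_1, read_2), merge(read_1, read_2))

# Two reads from unrelated cDNAs that both still carry the adapter.
read_x = "ACGTTGCA" + ADAPTER
read_y = ADAPTER + "GGATCCTA"
print(longest_overlap(read_x, read_y))  # a spurious overlap as long as the adapter
```

A 39-nucleotide match looks far more convincing to an assembler than the genuine 7-nucleotide overlap, which is why the adapters have to go before assembly.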
Sequencing of cDNA library (1000's of cDNA's) Given the number of sequences, the removal process obviously must be automated, but automated processes, while fast, are often stupid. We need to check to make sure they worked.
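One way to picture "remove, then check" is the sketch below. It assumes exact, full-length copies of the adapter at the ends of reads, which is a simplification: real trimming tools also handle partial and mismatched adapters. The function names, the `seed_len` parameter, and the three example reads are all invented for illustration; only the adapter sequence comes from the slides.

```python
# Minimal sketch of adapter removal plus a sanity check; names are hypothetical.
ADAPTER = "AAGCAGTGGTATCAACGCAGAGTGGCCATTACGGCCGGG"  # second adapter from the slides
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq):
    return seq.upper().translate(_COMPLEMENT)[::-1]

def trim_adapter(read, adapter=ADAPTER):
    """Remove an exact copy of the adapter (either strand) from either end."""
    read = read.upper()
    for pattern in (adapter, reverse_complement(adapter)):
        if read.startswith(pattern):
            read = read[len(pattern):]
        if read.endswith(pattern):
            read = read[:-len(pattern)]
    return read

def reads_with_adapter(reads, adapter=ADAPTER, seed_len=15):
    """Return names of reads that still contain the insert-proximal end of the adapter."""
    seed = adapter[-seed_len:]
    seeds = (seed, reverse_complement(seed))
    return [name for name, seq in reads.items()
            if any(s in seq.upper() for s in seeds)]

# Made-up reads: two carry the adapter, one is clean.
reads = {
    "read-1": ADAPTER + "ATGGCGTACGTTAGC",
    "read-2": "CGTTAGCAATGCC",
    "read-3": "GGATCCTAAGC" + reverse_complement(ADAPTER),
}
print(reads_with_adapter(reads))      # reads flagged before trimming
trimmed = {name: trim_adapter(seq) for name, seq in reads.items()}
print(reads_with_adapter(trimmed))    # ideally empty after trimming
```

The check deliberately looks for only part of the adapter, so it can also flag reads where the automated step left a truncated copy behind.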
Identifying elements of cDNA library The assembly process should, in theory, also remove duplicate sequences. In practice, partial duplicates may remain, and it is necessary to keep an eye out for them.
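A simple way to keep an eye out for partial duplicates is to ask whether any contig is wholly contained in a longer one, on either strand. The Python sketch below does this with exact substring matching only; it will miss duplicates that differ by even one nucleotide, so treat it as a first pass. The function name and the four example contigs are invented for illustration.

```python
# Sketch of a containment check for partial duplicates; names and data are hypothetical.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq):
    return seq.upper().translate(_COMPLEMENT)[::-1]

def contained_contigs(contigs):
    """Return (short_name, long_name) pairs where the shorter contig occurs
    exactly within the longer one, on either strand."""
    hits = []
    names = sorted(contigs, key=lambda n: len(contigs[n]))  # shortest first
    for i, short_name in enumerate(names):
        short_seq = contigs[short_name].upper()
        for long_name in names[i + 1:]:
            long_seq = contigs[long_name].upper()
            if short_seq in long_seq or reverse_complement(short_seq) in long_seq:
                hits.append((short_name, long_name))
    return hits

contigs = {
    "contig-A": "ATGGCGTACGTTAGCAATGCCGGATCC",
    "contig-B": "GTACGTTAGCAATG",                       # internal piece of contig-A
    "contig-C": reverse_complement("GCAATGCCGGATCC"),  # piece of contig-A, other strand
    "contig-D": "TTTTCCCCGGGGAAAA",                     # unrelated
}
print(contained_contigs(contigs))
```

With the number of contigs in a real library, an all-against-all comparison like this gets slow; a similarity search with a mismatch allowance is the more practical tool.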
Identifying elements of cDNA library Predict function directly from sequence How to go from cDNA sequence to predicted function for the sequences? You might think that since we can readily predict a protein sequence from a DNA sequence, it should be possible to predict function as well.
Identifying elements of cDNA library Predict function directly from sequence Predict function from sequence similarity Nope. At present that's impossible. The best we can do is to compare sequences with sequences from other organisms where there is experimental evidence as to function.
Identifying elements of cDNA library Predict function directly from sequence Predict function from sequence similarity Blast is a tool to do just that, comparing a given sequence against a database of known sequences. It is important to understand the mind of Blast. But that is a subject for another time.
Genes Determining Plant-Cyanobacterial Symbioses and Consideration of Blast We've identified many things that need to be done: 1. Determine if primers have been removed from sequences. 2. Determine if the library contains duplicates. 3. Identify protein sequences similar to those encoded by cDNAs. 4. (plus one extra) Find where in the cDNAs genes begin and end.
Genes Determining Plant-Cyanobacterial Symbioses and Consideration of Blast These questions are ordinarily answered by high-powered computer types. But you can answer them yourself. First you need to read in the data. Go into StaphyloBIKE through the BioBIKE portal (Gunnera isn't a member of the Staphylococcus, of course, but I put the cDNA sequences in that instance of BioBIKE). RUN-FILE "contig-resources.bike" SHARED (this makes the cDNA sequences available to you as a variable called gunnera-contigs and also provides you with a possibly useful tool, READ-NAMED, to extract specific sequences).
Genes Determining Plant-Cyanobacterial Symbioses and Consideration of Blast Possibly useful functions: SEQUENCE-SIMILAR-TO: Accesses BLAST, using as targets either internal data (i.e. gunnera-contigs) or external data (i.e. *GENBANK*). Also used to look for nearly identical sequences, using the MISMATCHES option. READING-FRAMES-OF: Translates the sequence in all six possible reading frames.
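For readers who want to see what "all six possible reading frames" means outside BioBIKE, here is a minimal Python sketch. It is not the implementation of READING-FRAMES-OF; the function names are invented, and it assumes the standard genetic code and plain DNA input. Incomplete codons at the end of a frame are dropped, and codons containing anything other than A, C, G, or T are shown as X.

```python
# Sketch of six-frame translation with the standard genetic code; names are hypothetical.
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Codons enumerated TTT, TTC, TTA, TTG, TCT, ... to line up with AMINO_ACIDS.
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq):
    return seq.upper().translate(_COMPLEMENT)[::-1]

def translate(seq):
    """Translate from the first base; '*' marks a stop, 'X' an unrecognized codon."""
    seq = seq.upper()
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))

def six_frames(seq):
    """Return translations for frames +1, +2, +3 and -1, -2, -3."""
    frames = {}
    for sign, strand in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{sign}{offset + 1}"] = translate(strand[offset:])
    return frames

for name, protein in six_frames("ATGGCCTAA").items():
    print(name, protein)
```

In a real cDNA, the frame of interest is usually the one with a long stretch uninterrupted by `*`, which is one way to approach the question of where a gene begins and ends.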