Baseline: Are we at the same stage?
  Cygwin installed
  BLAST installed
  Data files: TA496Seq1.txt, PhytophSeq1.txt, TomatoSequence.txt

Were the files completely downloaded? In Cygwin:
  Try: grep -c ">" PhytophSeq1.txt    (returns 3,921)
  Try: grep -c ">" TA496Seq1.txt      (returns 116,711)
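The download check above can be wrapped in a small helper so that a mismatch is reported instead of being compared by eye. This is a minimal sketch, not part of the original exercise: the function names `count_fasta_records` and `check_fasta_count` are made up for illustration, and the pattern is anchored with `^` on the assumption that every FASTA header starts its line with `>`.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (names are illustrative, not from the slides).

# count_fasta_records FILE
# Print the number of FASTA header lines, i.e. lines that start with '>'.
count_fasta_records() {
    [ -r "$1" ] || { echo "cannot read $1" >&2; return 2; }
    # grep -c exits with status 1 when the count is 0; keep the printed 0.
    grep -c '^>' "$1" || true
}

# check_fasta_count FILE EXPECTED
# Succeed only if FILE contains exactly EXPECTED sequences.
check_fasta_count() {
    local file=$1 expected=$2 actual
    actual=$(count_fasta_records "$file") || return 2
    if [ "$actual" -eq "$expected" ]; then
        echo "OK: $file has $actual sequences"
    else
        echo "MISMATCH: $file has $actual sequences, expected $expected" >&2
        return 1
    fi
}

# Example calls with the counts given on the slide (commas removed):
#   check_fasta_count PhytophSeq1.txt 3921
#   check_fasta_count TA496Seq1.txt 116711
```

A truncated download would show up as a MISMATCH line and a non-zero exit status.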
Format the database:
  /cygdrive/c/Blast/bin/formatdb -i ./TA496Seq1.txt -p F

Run nucleotide BLAST (blastn):
  /cygdrive/c/Blast/bin/blastall -p blastn -d ./TA496Seq1.txt -i ./TomatoSequence.txt -o TomatoSeqOut.txt
  /cygdrive/c/Blast/bin/blastall -p blastn -d ./TA496Seq1.txt -i ./PhytophSeq1.txt -o PhytOut.txt

NOTE: The second BLAST, which compares 3,921 sequences against a database of 116,711 sequences, will take some time (15 minutes on my laptop).
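Because the larger search takes a while, it can help to confirm that formatdb actually produced its index files before starting blastall. The sketch below assumes a single-volume nucleotide database, for which formatdb -p F writes files with the extensions .nhr, .nin and .nsq next to the input file; a very large database may be split into volumes, which this check does not handle. The function name `nucleotide_db_ready` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name is illustrative, not from the slides).

# nucleotide_db_ready DB
# Succeed if the three index files expected for a single-volume
# nucleotide database (DB.nhr, DB.nin, DB.nsq) exist and are non-empty.
nucleotide_db_ready() {
    local db=$1 ext
    for ext in nhr nin nsq; do
        if [ ! -s "$db.$ext" ]; then
            echo "missing or empty: $db.$ext" >&2
            return 1
        fi
    done
    return 0
}

# Example: only start the search when the database looks complete.
#   nucleotide_db_ready ./TA496Seq1.txt &&
#       /cygdrive/c/Blast/bin/blastall -p blastn -d ./TA496Seq1.txt \
#           -i ./PhytophSeq1.txt -o PhytOut.txt
```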
OUTPUT of BLAST of TA496Seq1.txt with TomatoSequence.txt

                                                                    Score    E
Sequences producing significant alignments:                         (bits) Value

gi|9292199|gb|BE354223.1|BE354223 EST355566 tomato flower buds, ...   1237   0.0
gi|16248018|gb|BI933546.1|BI933546 EST553435 tomato flower, anth...   1017   0.0
gi|4384985|gb|AI489614.1|AI489614 EST247953 tomato ovary, TAMU S...    908   0.0
gi|7410529|gb|AW649291.1|AW649291 EST327745 tomato germinating s...     40   0.12
gi|8105118|gb|AW929717.1|AW929717 EST353987 tomato flower buds 8...     40   0.12
. . .
gi|16248689|gb|BI934217.1|BI934217 EST554106 tomato flower, anth...     34   7.2
gi|16248853|gb|BI934381.1|BI934381 EST554270 tomato flower, anth...     34   7.2
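The one-line summaries above are regular enough to pull apart with awk. The following is a rough sketch under two assumptions: every summary line starts with `gi|` (the detailed alignments start with `>gi|` instead, so they are skipped), and the score and E value are the last two whitespace-separated fields. The function name `hit_table` is made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name is illustrative, not from the slides).

# hit_table FILE
# Print "identifier<TAB>score<TAB>evalue" for each one-line hit summary.
hit_table() {
    awk '
        /^gi[|]/ {
            sub(/\r$/, "")          # tolerate Windows line endings
            if (NF >= 3)
                printf "%s\t%s\t%s\n", $1, $(NF - 1), $NF
        }
    ' "$1"
}

# Example:
#   hit_table TomatoSeqOut.txt | head
```

The tab-separated result can then be sorted or counted with the usual tools, for example `hit_table TomatoSeqOut.txt | sort -k2,2nr`.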
OUTPUT of BLAST of TA496Seq1.txt with TomatoSequence.txt

>gi|9292199|gb|BE354223.1|BE354223 EST355566 tomato flower buds, anthesis, Cornell University Solanum lycopersicum cDNA clone cTOD9L3, mRNA sequence
          Length = 632

 Score = 1237 bits (624), Expect = 0.0
 Identities = 630/632 (99%)
 Strand = Plus / Plus

Query: 1504 gactggctagaatggctgcaatcatggcatctacttacaaggcttatcttggcgtcggac 1563
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 1    gactggctagaatggctgcaatcatggcatctacttacaaggcttatcttggcgtcggac 60

Query: 1564 ttggtccactatcatttttgacgcagtatagaataccacatcctggaagagttggtggaa 1623
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 61   ttggtccactatcatttttgacgcagtatagaataccacatcctggaagagttggtggaa 120
In Cygwin:
  Try: grep -c "Strand =" ./TomatoSeqOut.txt       (returns 82)
  Try: grep -c "Strand =" ./PhytOut.txt            (returns 292,568)
  Try: grep -c "Expect = 0.0" ./TomatoSeqOut.txt   (returns 3)
  Try: grep -c "Expect = 0.0" ./PhytOut.txt        (returns 54,643)
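One detail worth knowing about these counts: grep treats `.` as "any character" and matches anywhere in the line, so the pattern "Expect = 0.0" also matches a line such as "Expect = 0.012". The sketch below is one possible stricter variant, assuming that in blastn output the E value is the last item on its line. It escapes the dot and anchors the match at the end of the line, allowing for a trailing carriage return. The function names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (names are illustrative, not from the slides).

# count_alignment_blocks FILE
# Count lines containing "Strand =" (one per reported alignment block).
count_alignment_blocks() {
    grep -c 'Strand =' "$1" || true
}

# count_zero_evalue FILE
# Count alignment blocks whose E value is exactly 0.0. The dot is escaped
# and the match must reach the end of the line, so 0.012 is not counted.
count_zero_evalue() {
    grep -c 'Expect = 0\.0[[:space:]]*$' "$1" || true
}

# Example:
#   count_alignment_blocks ./PhytOut.txt
#   count_zero_evalue ./PhytOut.txt
```

If the stricter count differs from the plain `grep -c "Expect = 0.0"` result, the difference is the number of small but non-zero E values that begin with 0.0.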
When we have a large output file from BLAST, how can we find out what is inside? How can we organize and interpret this output when the file is too large to open in a text editor?
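As a first step toward answering that question, a single streaming pass with awk can give an overview without ever opening the file in an editor. This is only a sketch under stated assumptions: it expects the plain-text report written by blastall, in which each query begins with a line starting "Query= ", each aligned database sequence begins with ">", and a query without matches is marked by a line containing "No hits found". The function name `blast_summary` is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name is illustrative, not from the slides).

# blast_summary FILE
# Read the report once and print four tab-separated tallies.
blast_summary() {
    awk '
        /^Query= /       { queries++ }    # one per query sequence
        /No hits found/  { nohits++ }     # queries with no match
        /^>/             { subjects++ }   # database sequences aligned
        /Strand =/       { blocks++ }     # individual alignment blocks
        END {
            printf "queries\t%d\n", queries
            printf "queries_without_hits\t%d\n", nohits
            printf "subject_sequences_aligned\t%d\n", subjects
            printf "alignment_blocks\t%d\n", blocks
        }
    ' "$1"
}

# Example:
#   blast_summary ./PhytOut.txt
```

Because awk reads the file line by line, memory use stays small even when the report is far too large for a text editor.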