Practical BLASTing & BLAST Outputs

Practical BLASTing & BLAST outputs. ATGCTG TGGCAG CGTGCA GTCCAG TCTCGT ACTGCAT. A practical example: 2,869,704 annotated proteins, 1,506 mapped barley genes, BlastX. Result: 905 annotations; runtime: 17.5 h. Solution: distributing the analyses. IPK cluster BROCKEN.

jovan

Presentation Transcript


  1. Practical BLASTing & BLAST Outputs

  2. A practical example
     ATGCTG TGGCAG CGTGCA GTCCAG TCTCGT ACTGCAT
     2,869,704 annotated proteins
     1,506 mapped barley genes
     BlastX
     Result: 905 annotations
     Runtime: 17.5 h

  3. Solution: distributing the analyses

  4. IPK cluster BROCKEN
     Result: 905 annotations
     72 nodes -> runtime: 16 min
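The two runtimes on the slides allow a quick estimate of how well the distributed run scales. The following is a minimal sketch of that arithmetic, assuming that all 72 nodes took part in the 16-minute run and that both runs processed the same queries. The function and variable names are illustrative and do not come from the slides.

```python
def speedup_and_efficiency(serial_minutes, parallel_minutes, nodes):
    """Return (speedup, parallel efficiency) for a distributed run."""
    speedup = serial_minutes / parallel_minutes
    efficiency = speedup / nodes
    return speedup, efficiency


serial = 17.5 * 60  # single-machine BlastX runtime from slide 2, in minutes
parallel = 16       # cluster runtime from slide 4, in minutes
speedup, efficiency = speedup_and_efficiency(serial, parallel, nodes=72)
print(f"speedup: {speedup:.1f}x, efficiency: {efficiency:.0%}")
```

Under these assumptions the cluster run is roughly 66 times faster, which corresponds to about 91% parallel efficiency on 72 nodes.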

  5. CEF GUI
     CEF SOAP Web Services
     file server /data/pdw-20/
     file server /data/pdw-16/
     • Metadata about
       • Tools (NCBI BLAST, Spidey, …)
       • Tool parameters (-i FASTA-query, …)
       • Files (FASTA, blastable, …)
       • Jobs/sub jobs (progress, finished, …)
     master/head node pdw-22 … 22 nodes
     CEF: Cluster Execution Framework

     #!/bin/bash
     projdir=/data/pdw-16/agbi/projects/
     # split query file
     python2.3 /data/pdw-20/python_scripts/splitFas2.py -i Clones.fasta -o $projdir -n 500
     blast_db=$projdir/wheat_consensus.txt
     # the merge script concatenates all partial results once the jobs have finished
     mergescript=$projdir/domerge.sh
     echo "#!/bin/sh" > $mergescript
     echo "cat \\" >> $mergescript
     z=0
     # one BLAST job script per split file; split/* is relative to the current directory
     for i in split/*
     do
       script_file=$projdir/script/blastjob_$$_$z.sh
       result_file=$projdir/result/blastresult_$$_$z.txt
       log_file=$projdir/log/joblog_$$_$z
       echo "#!/bin/sh" > $script_file
       #echo "cd $projdir" >> $script_file
       echo "/usr/bin/blastall -i $projdir/$i -p blastn -d $blast_db -m0 -e 1E-10 -v 10 -b 10 -o $result_file" >> $script_file
       echo "$result_file \\" >> $mergescript
       # submit the job to the "long" queue and echo the command for the record
       qsub -o $log_file.out -e $log_file.err -q long $script_file
       echo "qsub -o $log_file.out -e $log_file.err -q long $script_file"
       z=`expr $z + 1`
     done
     echo ">final_result.txt" >> $mergescript
     echo "rm log/* script/* " >> $mergescript
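The first step of the script hands the query file to splitFas2.py, whose source is not part of the slides. The sketch below shows one way such a splitter could group FASTA records into chunks. It is a hypothetical stand-in, not the original script, and it assumes that `-n 500` means "at most 500 sequences per chunk", which the slides do not state. The names `read_fasta` and `split_records` are invented for illustration.

```python
from itertools import islice


def read_fasta(lines):
    """Yield (header, sequence) tuples from an iterable of FASTA lines."""
    header, parts = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(parts)
            header, parts = line[1:], []
        elif header is not None:
            parts.append(line)
    if header is not None:
        yield header, "".join(parts)


def split_records(records, per_chunk):
    """Group records into lists holding at most per_chunk entries each."""
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, per_chunk))
        if not chunk:
            return
        yield chunk


# Tiny made-up input; a real run would read the lines of Clones.fasta instead.
example = [">seq1", "ATGC", "TTAA", ">seq2", "GGCC", ">seq3", "ATAT"]
chunks = list(split_records(read_fasta(example), per_chunk=2))
for number, chunk in enumerate(chunks):
    print(number, [header for header, _ in chunk])
```

Each chunk would then be written to its own file so that one BLAST job per file can be submitted, as the loop in the script does.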

  6. CEF: APEX GUI

  7. Input EST sequence
     >HY01A03T
     GAATTCGGCACCAGAGTGAGCACGCAAGCCAGTGTTTGTAGCCAGCAGCCACAATGGCCGGGAACATGCT
     AGCCAACTATGTCCAAGTCTACGTCATGCTCCCGCTGGATGTCGTGAGCGTCGACAACAAGTTCGAGAAG
     GGCGACGAGATCAGGGCGCAGCTGAAGAAGCTGACGGAGGCTGGCGTGGACGGCGTCATGATAGACGTCT
     GGTGGGGGCTGGTGGAGGGCAAGGGCCCCAAGGCCTACGACTGGAGCGCCTACAAGCAGGTCTTCGACCT
     GGTGCACGAGGCCAGGCTCAAGCTGCAGGCCATCATGTCGTTCCACCAGTGCGGTGGCAACGTCGGCGAC
     GTAGTCAACATCCCCATCCCACAGTGGGTGCGGGATGTCGGCGCTACCGACCCCGACATTTTCTACACGA
     ACCGCAGAGGGACGAGGAACATCGAGTACCTCACCCTTGGAGTGGATGACCAACCTCTCTTCCATGGAAG
     AACTGCCGTCCAGATGTATCATGATTACATGGCGAGCTTCAGGGAAAACATGAAAAAGTTCTTGGATGCC
     GGTACCATCGTGGACATTGAAGTGGGACTTGGCCCGGCTGGAGAGATGAGGTACCCATCCTATCCTCAGA
     GCCAGGGATGGGTCTTCCCAGGCATCGGAGAATTCATCTGCTATGATAAGTACCTGGAAGCAGACTTCAA

  8. BlastN result
     >HY01A03T
     Length = 700
     Plus Strand HSPs:
     Score = 2595 (395.4 bits), Expect = 3.0e-112, P = 3.0e-112
     Identities = 573/618 (92%), Positives = 573/618 (92%), Strand = Plus / Plus

     Query:  77 CTATGTCCAAGTCTACGTCATGCTCCCGCTGGATGTCGTGAGC--GT-CGACAACAAGTT 133
                ||| ||| | | || |  |  | |  || ||   ||||  | |  || ||| ||
     Sbjct:  89 CTACGTC-ATG-CTCCCGCTGGATGTCG-TGAGCGTCGACAACAAGTTCGAGAAGGGCGA 145

     Query: 134 CGAGA--AGGGCGACGAGATCAGGAAGCTGACGGAGGCTGGCGTGGACGGCGTCATGATA 191
                |||||  |||||| | || | | |||||||||||||||||||||||||||||||||||||
     Sbjct: 146 CGAGATCAGGGCG-C-AGCTGAAGAAGCTGACGGAGGCTGGCGTGGACGGCGTCATGATA 203

     Query: 192 GACGTCTGGTGGGGGCTGGTGGAGGGCAAGGGCCCCAAGGCCTACGACTGGAGCGCCTAC 251
                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
     Sbjct: 204 GACGTCTGGTGGGGGCTGGTGGAGGGCAAGGGCCCCAAGGCCTACGACTGGAGCGCCTAC 263

     Query: 252 AAGCAGGTCTTCGACCTGGTACACGAGGCCAGGCTCAAGCTGCAGGCCATCATGTCGTTC 311
                |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
     Sbjct: 264 AAGCAGGTCTTCGACCTGGTGCACGAGGCCAGGCTCAAGCTGCAGGCCATCATGTCGTTC 323

     Query: 312 CACCCCGTGCGGTGGCAACGTCGGCGACGTAGTCAACATCCCCATCCCACAGTGGGTGCG 371
                ||||  ||||||||||||||||||||||||||||||||||||||||||||||||||||||
     Sbjct: 324 CACCA-GTGCGGTGGCAACGTCGGCGACGTAGTCAACATCCCCATCCCACAGTGGGTGCG 382

     Query: 372 GGATGTCGGCGCTACCGACCCCGACATTTTCCACACGAACCTCAGAGGGACGAGGAACAT 431
                ||||||||||||||||||||||||||||||| ||||||||| ||||||||||||||||||
     Sbjct: 383 GGATGTCGGCGCTACCGACCCCGACATTTTCTACACGAACCGCAGAGGGACGAGGAACAT 442

     Query: 432 CGAGTACCTCACCCTTGGAGTGGATGACCAACCTCTCTTCCATGGAAGAACTGCCGTCCA 491
                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
     Sbjct: 443 CGAGTACCTCACCCTTGGAGTGGATGACCAACCTCTCTTCCATGGAAGAACTGCCGTCCA 502

     Query: 492 GATGTATCATGATTACATGGCGAGCTTCAGGGAAAACATGAAAAAGTTCTTGGATGCCGG 551
                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
     Sbjct: 503 GATGTATCATGATTACATGGCGAGCTTCAGGGAAAACATGAAAAAGTTCTTGGATGCCGG 562

     Query: 552 TACCATCGTGGACA---A-GTGGGACTTGGCCCGGCTGGAGAGATGAGGTACCCATCCTA 607
                ||||||||||||||   | |||||||||||||||||||||||||||||||||||||||||
     Sbjct: 563 TACCATCGTGGACATTGAAGTGGGACTTGGCCCGGCTGGAGAGATGAGGTACCCATCCTA 622

     Query: 608 TCCTCAGAGCCAGGGATGGGTCTTCCCAGGCATCGGAGAATTCATCTGCTATGATAAGTA 667
                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
     Sbjct: 623 TCCTCAGAGCCAGGGATGGGTCTTCCCAGGCATCGGAGAATTCATCTGCTATGATAAGTA 682

     Query: 668 CCTGGAAGCAGACTTCAA 685
                ||||||||||||||||||
     Sbjct: 683 CCTGGAAGCAGACTTCAA 700
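The "Identities" figure in the BlastN report counts alignment columns in which query and subject carry the same base. The sketch below recomputes that count for the first row of the alignment (query positions 77 to 133). It illustrates the counting only and is not how BLAST itself produces the report; the function names are illustrative.

```python
def alignment_stats(query_row, sbjct_row):
    """Return (identities, gap columns, total columns) for two aligned rows."""
    if len(query_row) != len(sbjct_row):
        raise ValueError("aligned rows must have the same length")
    pairs = list(zip(query_row, sbjct_row))
    identities = sum(1 for q, s in pairs if q == s and q != "-")
    gaps = sum(1 for q, s in pairs if q == "-" or s == "-")
    return identities, gaps, len(pairs)


def match_line(query_row, sbjct_row):
    """Build the middle line of a nucleotide alignment: '|' marks identical bases."""
    return "".join(
        "|" if q == s and q != "-" else " " for q, s in zip(query_row, sbjct_row)
    )


# First alignment row of the BlastN result
query = "CTATGTCCAAGTCTACGTCATGCTCCCGCTGGATGTCGTGAGC--GT-CGACAACAAGTT"
sbjct = "CTACGTC-ATG-CTCCCGCTGGATGTCG-TGAGCGTCGACAACAAGTTCGAGAAGGGCGA"

identities, gaps, columns = alignment_stats(query, sbjct)
print(identities, gaps, columns)
print(match_line(query, sbjct))
```

Summed over all rows, the report gives 573 identical columns out of 618, that is 573/618 ≈ 92.7%, shown in the header as 92%.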

  9. BlastX result
     >dbj|BAC83773.1| putative beta-amylase [Oryza sativa (japonica cultivar-group)]
      gb|EAZ40178.1| hypothetical protein OsJ_023661 [Oryza sativa (japonica cultivar-group)]
     Length=488
     Score = 403 bits (1036), Expect = 4e-111
     Identities = 191/215 (88%), Positives = 200/215 (93%), Gaps = 0/215 (0%)
     Frame = +3

     Query  54   MAGNMLANYVQVYVMLPLDVVSVDNKFEKGDEIRAQLKKLTEAGVDGVMIDVWWGLVEGK  233
                 MAGN+LANYVQV VMLPLDVV+VDNKFEK DE RAQLKKLTEAGVDGVM+DVWWGLVEGK
     Sbjct  1    MAGNLLANYVQVNVMLPLDVVTVDNKFEKVDETRAQLKKLTEAGVDGVMVDVWWGLVEGK  60

     Query  234  GPKAYDWSAYKQVFDLVHEARLKLQAIMSFHQCGGNVGDVVNIPIPQWVRDVGATDPDIF  413
                 GP +YDW AYKQ+F LV EA LKLQAIMSFHQCGGNVGD+VNIPIPQWVRDVGA+DPDIF
     Sbjct  61   GPGSYDWEAYKQLFRLVQEAGLKLQAIMSFHQCGGNVGDIVNIPIPQWVRDVGASDPDIF  120

     Query  414  YTNRRGTRNIEYLTLGVDDQPLFHGRTAVQMYHDYMASFRENMKKFLDAGTIVDIEVGLG  593
                 YTNR G RNIEYLTLGVDDQPLFHGRTA+QMY DYM SFRENM +FLD G IVDIEVGLG
     Sbjct  121  YTNRGGARNIEYLTLGVDDQPLFHGRTAIQMYADYMKSFRENMAEFLDTGVIVDIEVGLG  180

     Query  594  PAGEMRYPSYPQSQGWVFPGIGEFICYDKYLEADF  698
                 PAGEMRYPSYPQSQGWVFPGIGEFICYDKYLEADF
     Sbjct  181  PAGEMRYPSYPQSQGWVFPGIGEFICYDKYLEADF  215
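In the BlastX report the query coordinates (54 to 698) are nucleotide positions on the EST, while the subject coordinates are amino-acid positions, and "Frame = +3" names the forward reading frame in which the EST was translated. The sketch below shows how the frame follows from the start coordinate and translates the first bases of the EST with the standard genetic code. It is an illustration under those assumptions, not BLAST's own implementation, and the helper names are invented.

```python
from itertools import product

# Standard genetic code, codons enumerated with the base order T, C, A, G
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def forward_frame(query_start):
    """Forward reading frame (1, 2 or 3) for a 1-based start coordinate."""
    return (query_start - 1) % 3 + 1


def translate(dna, query_start=1):
    """Translate dna from the 1-based position query_start; unknown codons give X."""
    protein = []
    for i in range(query_start - 1, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3].upper(), "X"))
    return "".join(protein)


# First 70 bases of the EST HY01A03T from slide 7
est_start = "GAATTCGGCACCAGAGTGAGCACGCAAGCCAGTGTTTGTAGCCAGCAGCCACAATGGCCGGGAACATGCT"

frame = forward_frame(54)
peptide = translate(est_start, query_start=54)
aligned_residues = (698 - 54 + 1) // 3
print(frame, peptide, aligned_residues)
```

The start coordinate 54 falls into frame 3, the translated fragment begins with MAGNM like the query line of the alignment, and the 645 aligned bases correspond to the 215 residues listed under "Identities".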
