Department of Computer Science, College of Mathematics & Science
Compile and Run BLAST Locally from Source Code
Preston Cofield. Advisor: Gang Qian.
ABSTRACT: BLAST is a widely used search tool for homology detection in large biological sequence databases. In this presentation, we provide guidance on locating the BLAST source code on the NCBI site and downloading it to a local computer. We then show how to compile and run the BLAST programs on both Linux and Windows platforms. Running BLAST locally makes it possible to study the structure and algorithms of the BLAST source programs, so that comparative research on improving search performance on biological sequence databases can be conducted.

Linux Platform
• Download the source tarball from the BLAST source code link
  • ncbi-blast-2.2.24+-src.tar.gz
• Compilation
  • cd /BLASTdirectory/c++
  • ./configure --without-debug --with-mt --with-build-root=ReleaseMT
  • cd ReleaseMT/build
  • make all_r
• After compilation:
  • Run perl update_blastdb.pl database_name to download a selected database (e.g., htgs, refseq_rna)
• Test the BLAST installation with a standard nucleotide similarity search
  • Type ./blastdbcmd -db database_name -entry nm_000249 -outfmt "%f" -out test_query.txt
    • blastdbcmd takes a selected database (-db), a search string (-entry), an output format (-outfmt), and an output file (-out)
    • It finds the sequence in -db that matches the search string and writes it to the output file in the given format
  • Type ./blastn -query test_query.txt -db refseq_rna -out output.txt
    • blastn takes a sequence input file (-query), a selected database (-db), and an output file (-out)
    • It runs a nucleotide query search against the given -db and saves the results in the output file

Windows Platform
• Download an MSI from the download link
  • Windows (32-bit x86, MSI installer)
• After installation:
  • Windows needs the ability to run Perl scripts
  • Run perl update_blastdb.pl database_name to download a selected database (e.g., htgs, refseq_rna)
• All BLAST programs are run from the command prompt
• Test the BLAST installation with a standard nucleotide similarity search
  • Create a new OS environment variable holding the full path to BLAST’s bin directory
    • This makes it easier to enter BLAST commands
  • In an open command prompt, enter the BLAST directory:
  • Type blastdbcmd -db database_name -entry nm_000249 -outfmt "%f" -out test_query.txt
    • blastdbcmd takes a selected database (-db), a search string (-entry), an output format (-outfmt), and an output file (-out)
    • It finds the sequence in -db that matches the search string and writes it to the output file in the given format
  • Type blastn -query test_query.txt -db refseq_rna -out output.txt
    • blastn takes a sequence input file (-query), a selected database (-db), and an output file (-out)
    • It runs a nucleotide query search against the given -db and saves the results in the output file

Results and Conclusion
• Linux Platform
  • An output file was created containing the search results from the blastn program
• Windows Platform
  • An output file was created containing the search results from the blastn program
• Because BLAST can be compiled and run locally, the user gains the ability to study and improve upon BLAST’s heuristic algorithms

References
• [1] BLAST main Web site: http://blast.ncbi.nlm.nih.gov/Blast.cgi
• [2] Altschul S, Gish W, Miller W, Myers E and Lipman D. Basic local alignment search tool. Journal of Molecular Biology 1990; 215(3):403-410.
• [3] Altschul S, Madden T, Schäffer A, Zhang J, Zhang Z, Miller W and Lipman D. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research 1997; 25(17):3389-3402.
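The Linux build and test steps listed on this poster can be collected into one small script. The following is a minimal sketch, not the authors' script: the helper names (run, build_blast, test_blast) and the variables SRC_DIR, BIN_DIR, UPDATE_SCRIPT, DB and ACCESSION are made up for illustration, the location of the built binaries under ReleaseMT/bin is an assumption, and the query file written by blastdbcmd is assumed to be the one passed to blastn. By default it only prints the commands.

```shell
#!/bin/sh
# Dry-run sketch of the poster's Linux steps. Each command is only printed
# unless DRY_RUN=0 is set. All variable names below are placeholders.
SRC_DIR="${SRC_DIR:-/BLASTdirectory}"              # unpacked source tree (placeholder path from the poster)
BIN_DIR="${BIN_DIR:-$SRC_DIR/c++/ReleaseMT/bin}"   # assumed location of the built programs
UPDATE_SCRIPT="${UPDATE_SCRIPT:-update_blastdb.pl}" # adjust to where the script lives
DB="${DB:-refseq_rna}"
ACCESSION="${ACCESSION:-nm_000249}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Configure and compile a multi-threaded release build.
build_blast() {
    run cd "$SRC_DIR/c++" || return 1
    run ./configure --without-debug --with-mt --with-build-root=ReleaseMT || return 1
    run cd ReleaseMT/build || return 1
    run make all_r
}

# Download a database, extract one sequence, and search with it.
test_blast() {
    run cd "$BIN_DIR" || return 1
    run perl "$UPDATE_SCRIPT" "$DB" || return 1
    run ./blastdbcmd -db "$DB" -entry "$ACCESSION" -outfmt "%f" -out test_query.txt || return 1
    run ./blastn -query test_query.txt -db "$DB" -out output.txt
}

# Show the planned commands (dry run is the default).
build_blast
test_blast
```

Running the script as is lists the commands in order; setting DRY_RUN=0 executes them instead. Depending on the BLAST release, the archives fetched by update_blastdb.pl may need to be unpacked before the search step will find the database.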
Introduction
• Basic Local Alignment Search Tool (BLAST) [1,2,3] is a popular search algorithm in bioinformatics, useful for detecting homology between biological sequences
• BLAST can be run in two ways:
  • Through a Web interface provided by the National Center for Biotechnology Information (NCBI)
  • On a local computer
• Running BLAST locally offers great flexibility to its users
• BLAST’s source code link:
  • ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/
• Databases download link:
  • ftp://ftp.ncbi.nlm.nih.gov/blast/db/
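As a small illustration of how the source code link is used, the sketch below builds the tarball URL from a version string and, on request, downloads and unpacks it. The helper names src_tarball_url and fetch_source are hypothetical, the file naming follows the ncbi-blast-2.2.24+-src.tar.gz example on this poster, and it is an assumption that the requested version is the one currently held in the LATEST directory.

```shell
#!/bin/sh
# Base locations as given on the poster.
BLAST_SRC_BASE="ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/"
BLAST_DB_BASE="ftp://ftp.ncbi.nlm.nih.gov/blast/db/"

# Hypothetical helper: build the source tarball URL for a version string,
# following the ncbi-blast-<version>+-src.tar.gz naming used on the poster.
src_tarball_url() {
    printf '%sncbi-blast-%s+-src.tar.gz\n' "$BLAST_SRC_BASE" "$1"
}

# Hypothetical helper: download and unpack the tarball (needs curl and tar).
# Not called automatically, so running this file performs no network access.
fetch_source() {
    url=$(src_tarball_url "$1") || return 1
    curl -O "$url" && tar -xzf "ncbi-blast-$1+-src.tar.gz"
}

# Print the URL for the version named on the poster.
src_tarball_url "2.2.24"
```

Calling fetch_source with the version that LATEST currently contains would leave an unpacked source tree in the working directory, ready for the compilation steps.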