1 / 27

Four components to a BLAST search

Four components to a BLAST search. (1) Choose the sequence ( query ) (2) Select the BLAST program (3) Choose the database to search (4) Choose optional parameters Then click “ BLAST ”. Step 1: Choose your sequence. Sequence can be input in FASTA format or as accession number.

maya-petty
Download Presentation

Four components to a BLAST search

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Four components to a BLAST search (1) Choose the sequence (query) (2) Select the BLAST program (3) Choose the database to search (4) Choose optional parameters Then click “BLAST”

  2. Step 1: Choose your sequence Sequence can be input in FASTA format or as accession number

  3. Example of the FASTA format for a BLAST query

  4. Step 2: Choose the BLAST program

  5. Step 2: Choose the BLAST program blastn (nucleotide BLAST) blastp (protein BLAST) tblastn (translated BLAST) blastx (translated BLAST) tblastx (translated BLAST)

  6. Choose the BLAST program ProgramInputDatabase 1 blastnDNADNA 1 blastpproteinprotein 6 blastxDNAprotein 6 tblastnproteinDNA 36 tblastxDNADNA

  7. DNA potentially encodes six proteins 5’ CAT CAA 5’ ATC AAC 5’ TCA ACT 5’ CATCAACTACAACTCCAAAGACACCCTTACACATCAACAAACCTACCCAC 3’ 3’ GTAGTTGATGTTGAGGTTTCTGTGGGAATGTGTAGTTGTTTGGATGGGTG 5’ 5’ GTG GGT 5’ TGG GTA 5’ GGG TAG

  8. Step 3: choose the database nr = non-redundant (most general database) dbest = database of expressed sequence tags dbsts = database of sequence tag sites gss = genomic survey sequences htgs = high throughput genomic sequence

  9. CD search Step 4a: Select optional search parameters Compares protein sequences to the Conserved Domain Database. The CDD is a database containing a collection of functional and/or structural domains derived from two popular collections, Smart and Pfam, plus contributions from colleagues at NCBI. For more information please see the CDD homepage.

  10. Step 4a: Select optional search parameters Entrez! Filter Expect Word size organism Scoring matrix

  11. BLAST: optional parameters You can... • choose the organism to search • turn filtering on/off • change the substitution matrix • change the expect (e) value • change the word size • change the output format

  12. filtering

  13. Step 4b: optional formatting parameters Alignment view Descriptions Alignments

  14. program query database taxonomy

  15. taxonomy

  16. High scores low e values Cut-off: .05? 10-10?

  17. BLAST format options

  18. BLAST format options: multiple sequence alignment

  19. We will get to the bottom of a BLAST search in a few minutes…

More Related