How BLAST Works: Role of Word Size, T, & Scoring Matrix in Seeding

Query: METGAAAALGMALAAGLGALGAAIGDGICTSKLLEGVARQPEARGQLMTLMFISVGLIESIPIIAVVVAFMLMGKIA
Database entry #1: MEVGAAAAIATGLAVGLGALGAAVGDGICTGKAIESIARQPEAKGTIQTTMFISVGLIESIPIIAVVLAFMLFGKLG
Database entry #2: MEIVLGMTAIAVALLIGMGALGTAIGFGLLGGKFLEGAARQPEMAPMLQVKMFIVAGLLDAVTMIGVGIALFMLFTNPLGAML
Database entry #3: MDMSLQVLGNLNGLTAVAVALLISLPALGTAIGFGVLGGKYLEGVARQPELGGMLLGRMFIVAAFVDAFAAISIAIGFLVLYANPLAIPGLAETAQKVIGS

• Assume a word size of 3, T = 999, & the BLOSUM62 matrix.
• (1) How many word hits to database entry #1, #2, & #3?
• (2) Does entry #3 look better if we lower word size to 2?
• (3) Does entry #3 look better if T = 10?
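A minimal sketch for checking questions (1) and (2) by machine. It assumes that T = 999 is meant to rule out every neighborhood word, so that only words identical to a query word count as hits; that reading, and the helper name `word_hits`, are assumptions and not part of the slides. Question (3) is not covered here, because lowering T also requires BLOSUM62 word scores.

```python
QUERY = "METGAAAALGMALAAGLGALGAAIGDGICTSKLLEGVARQPEARGQLMTLMFISVGLIESIPIIAVVVAFMLMGKIA"
DATABASE = {
    "entry #1": "MEVGAAAAIATGLAVGLGALGAAVGDGICTGKAIESIARQPEAKGTIQTTMFISVGLIESIPIIAVVLAFMLFGKLG",
    "entry #2": "MEIVLGMTAIAVALLIGMGALGTAIGFGLLGGKFLEGAARQPEMAPMLQVKMFIVAGLLDAVTMIGVGIALFMLFTNPLGAML",
    "entry #3": "MDMSLQVLGNLNGLTAVAVALLISLPALGTAIGFGVLGGKYLEGVARQPELGGMLLGRMFIVAAFVDAFAAISIAIGFLVLYANPLAIPGLAETAQKVIGS",
}


def word_hits(query, subject, w):
    """Return (query_pos, subject_pos) pairs where identical w-letter words occur.

    Assumption: only exact word matches are hits (the T = 999 case).
    """
    # Index every word of the query by its start position(s).
    index = {}
    for i in range(len(query) - w + 1):
        index.setdefault(query[i:i + w], []).append(i)
    # Scan the subject and look each of its words up in the index.
    hits = []
    for j in range(len(subject) - w + 1):
        for i in index.get(subject[j:j + w], []):
            hits.append((i, j))
    return hits


# Count hits per database entry for word sizes 3 and 2.
for w in (3, 2):
    for name, seq in DATABASE.items():
        print(f"word size {w}, {name}: {len(word_hits(QUERY, seq, w))} hits")
```

Each hit is reported as a pair of 0-based start positions, so hits that fall on the same diagonal (equal `subject_pos - query_pos`) are easy to spot.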
How BLAST Works: Role of X in Extension

Query: METGAAAALGMALAAGLGALGAAIGDGICTSKLLEGVARQPEARGQLMTLMFISVGLIESIPIIAVVVAFMLMGKIA
Database entry #1: MEVGAAAAIATGLAVGLGALGAAVGDGICTGKAIESIARQPEAKGTIQTTMFISVGLIESIPIIAVVLAFMLFGKLG
Database entry #2: MEIVLGMTAIAVALLIGMGALGTAIGFGLLGGKFLEGAARQPEMAPMLQVKMFIVAGLLDAVTMIGVGIALFMLFTNPLGAML
Database entry #3: MDMSLQVLGNLNGLTAVAVALLISLPALGTAIGFGVLGGKYLEGVARQPELGGMLLGRMFIVAAFVDAFAAISIAIGFLVLYANPLAIPGLAETAQKVIGS

• (1) Try to extend off of the seed shown in red using X = 5.
• (2) Try to extend off of the seed shown in red using X = 2.
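The effect of X can be sketched with a small ungapped extension routine. The red seed from the slide is not recoverable from the text, so the example below uses toy sequences. `simple_score` is a hypothetical stand-in for a BLOSUM62 lookup, and stopping once the running score has dropped by at least X (rather than strictly more than X) is also an assumption.

```python
def simple_score(a, b):
    # Hypothetical stand-in for BLOSUM62: +4 for identical residues, -2 otherwise.
    return 4 if a == b else -2


def extend_right(query, subject, qi, sj, x_drop, score=simple_score):
    """Extend without gaps to the right, starting at query[qi] / subject[sj].

    Returns (best_score, best_length): the highest running score seen and the
    number of aligned residue pairs needed to reach it.
    """
    best = running = 0
    best_len = 0
    k = 0
    while qi + k < len(query) and sj + k < len(subject):
        running += score(query[qi + k], subject[sj + k])
        k += 1
        if running > best:
            # New maximum: remember how far the extension has come.
            best, best_len = running, k
        elif best - running >= x_drop:
            # Score has fallen X or more below the maximum: give up.
            break
    return best, best_len


# Two mismatches separate two matching stretches.
print(extend_right("AAXXAA", "AAYYAA", 0, 0, x_drop=5))  # (12, 6)
print(extend_right("AAXXAA", "AAYYAA", 0, 0, x_drop=2))  # (8, 2)
```

With X = 5 the extension survives the two mismatches and reaches the second matching stretch; with X = 2 it stops at the first mismatch. Extension to the left works the same way with decreasing indices.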
How BLAST Works: Role of Scoring Matrix in Evaluation

Query  8   ALGMALAAGLGALGAAIGDGICTSKLLEGVARQPEARGQLMTLMFISVGLIESIPIIAVV  67
           A+ +AL  G+GALG AIG G+   K LEG ARQPE    L   MFI  GL++++ +I V
Sbjct  9   AIAVALLIGMGALGTAIGFGLLGGKFLEGAARQPEMAPMLQVKMFIVAGLLDAVTMIGVG  68

Query  68  VA-FML  72
           +A FML
Sbjct  69  IALFML  74

• (1) Score the alignment above with the BLOSUM62 matrix and gap penalties (Existence: 15, Extension: 2).
• (2) How would the score change if you only scored up to the end of the first line?
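A hand-computed score can be checked with the sketch below. It embeds the standard BLOSUM62 values for the 20 amino acids and assumes the NCBI convention that a gap of length k costs Existence + k × Extension. The function names are illustrative, and the sketch treats any run of consecutive gap columns as a single gap.

```python
ORDER = "ARNDCQEGHILKMFPSTWYV"
BLOSUM62_ROWS = """
A  4 -1 -2 -2  0 -1 -1  0 -2 -1 -1 -1 -1 -2 -1  1  0 -3 -2  0
R -1  5  0 -2 -3  1  0 -2  0 -3 -2  2 -1 -3 -2 -1 -1 -3 -2 -3
N -2  0  6  1 -3  0  0  0  1 -3 -3  0 -2 -3 -2  1  0 -4 -2 -3
D -2 -2  1  6 -3  0  2 -1 -1 -3 -4 -1 -3 -3 -1  0 -1 -4 -3 -3
C  0 -3 -3 -3  9 -3 -4 -3 -3 -1 -1 -3 -1 -2 -3 -1 -1 -2 -2 -1
Q -1  1  0  0 -3  5  2 -2  0 -3 -2  1  0 -3 -1  0 -1 -2 -1 -2
E -1  0  0  2 -4  2  5 -2  0 -3 -3  1 -2 -3 -1  0 -1 -3 -2 -2
G  0 -2  0 -1 -3 -2 -2  6 -2 -4 -4 -2 -3 -3 -2  0 -2 -2 -3 -3
H -2  0  1 -1 -3  0  0 -2  8 -3 -3 -1 -2 -1 -2 -1 -2 -2  2 -3
I -1 -3 -3 -3 -1 -3 -3 -4 -3  4  2 -3  1  0 -3 -2 -1 -3 -1  3
L -1 -2 -3 -4 -1 -2 -3 -4 -3  2  4 -2  2  0 -3 -2 -1 -2 -1  1
K -1  2  0 -1 -3  1  1 -2 -1 -3 -2  5 -1 -3 -1  0 -1 -3 -2 -2
M -1 -1 -2 -3 -1  0 -2 -3 -2  1  2 -1  5  0 -2 -1 -1 -1 -1  1
F -2 -3 -3 -3 -2 -3 -3 -3 -1  0  0 -3  0  6 -4 -2 -2  1  3 -1
P -1 -2 -2 -1 -3 -1 -1 -2 -2 -3 -3 -1 -2 -4  7 -1 -1 -4 -3 -2
S  1 -1  1  0 -1  0  0  0 -1 -2 -2  0 -1 -2 -1  4  1 -3 -2 -2
T  0 -1  0 -1 -1 -1 -1 -2 -2 -1 -1 -1 -1 -2 -1  1  5 -2 -2  0
W -3 -3 -4 -4 -2 -2 -3 -2 -2 -3 -2 -3 -1  1 -4 -3 -2 11  2 -3
Y -2 -2 -2 -3 -2 -1 -2 -3  2 -1 -1 -2 -1  3 -3 -2 -2  2  7 -1
V  0 -3 -3 -3 -1 -2 -2 -3 -3  3  1 -2  1 -1 -2 -2  0 -3 -1  4
"""


def parse_matrix(rows, order):
    """Turn the text table into a dict keyed by (residue, residue)."""
    matrix = {}
    for line in rows.strip().splitlines():
        fields = line.split()
        for b, value in zip(order, fields[1:]):
            matrix[fields[0], b] = int(value)
    return matrix


BLOSUM62 = parse_matrix(BLOSUM62_ROWS, ORDER)


def score_alignment(q, s, matrix, gap_open=15, gap_extend=2):
    """Score two equal-length aligned strings in which '-' marks a gap."""
    if len(q) != len(s):
        raise ValueError("aligned strings must have equal length")
    total, in_gap = 0, False
    for a, b in zip(q, s):
        if a == "-" or b == "-":
            # Assumed convention: first gap column pays open + extend, later ones extend only.
            total -= gap_extend + (0 if in_gap else gap_open)
            in_gap = True
        else:
            total += matrix[a, b]
            in_gap = False
    return total


Q1 = "ALGMALAAGLGALGAAIGDGICTSKLLEGVARQPEARGQLMTLMFISVGLIESIPIIAVV"
S1 = "AIAVALLIGMGALGTAIGFGLLGGKFLEGAARQPEMAPMLQVKMFIVAGLLDAVTMIGVG"
Q2, S2 = "VA-FML", "IALFML"

print("whole alignment:", score_alignment(Q1 + Q2, S1 + S2, BLOSUM62))
print("first line only:", score_alignment(Q1, S1, BLOSUM62))
```

The difference between the two printed values is the contribution of the short second block, where the single gap offsets most of the gain from the five aligned residue pairs.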
How BLAST Works: Try It For Yourself

METGAAAALGMALAAGLGALGAAIGDGICTSKLLEGVARQPEARGQLMTLMFISVGLIESIPIIAVVVAFMLMGKIA

• BLAST against the nr database of proteins.
• BLAST against just the microbial genome (proteins) database.
• BLAST against just the Desulfotomaculum reducens genome.