Multiple Alignment –

Multiple Alignment – Υλικό βασισμένο στο κεφάλαιο 14 του βιβλίου: Dan Gusfield, Algorithms on Strings, Trees and Sequences, Cambridge University Press

Three cοmmοn representations • There are three common kinds of family representations that come from multiple string comparison: • Profile representations • Consensus sequence representations • Signature representations.

Family representations and alignments with profiles • Definition: Given a multiple alignment of a set of strings, a profile for that multiple alignment specifies for each column the frequency that each character appears in the column. A profile is sometimes also called a weight matrix in the biological literature.

How to optimally align a string to a profile • Definition: For a character y and column j, let p(y,j) be the frequency that character y appears in column j of the profile, and let S(x,j) denote the score for aligning x with column j. • Let V(i,j) denote the value of the optimal alignment of substring S[1..i] with the first j columns of C

Signature representations οf families • The major collections of signatures in protein are the ΡROSΙTE database and the BLOCKS database derived from it. • Helicases are proteins that help unwind double-stranded DNΑ so that the DNA can be read for duplication, transcription, recombination, οr repair. • Α largefractionof the available information on the structure and possible functions of the helicases has been obtained by computer- assisted comparative analysis of their amino acid sequences. This approach has led to the delineation of motifs and patterns that are conserved in different subsets of the helicases.

Introduction to computing multiple string alignments • Definition: Given a set of k>2 strings S={S1,S2,..,Sk}, a local multiple alignment of S is obtained by selecting one substring Si’ from each string and then globally aligning those substrings

How to score multiple alignments • Definition: Given a multiple alignment M, the induced pairwise alignment of two strings Si and Sj is obtained from M by removing all rows except the two rows for Si and Sj. That is, the induced alignment is the multiple alignment M restricted to Si and Sj. Any two opposing spaces in that induced alignment can be removed if desired. • Definition: The score of an induced pairwise alignment is determined using any chosen scoring scheme for two-string alignment in the standard manner.

Multiple alignment with the sum-of-pairs (SP) objective function • Definition: The sum of pairs (SP) score of a multiple alignment M is the sum of the scores of pairwise global alignments induced by M. • The SΡ alignment problem Compute a global multiple alignment M with minimum sιm-of-pairs score.

An exact solution to the SP alignment problem • Definition: Let S1, S2 and S3 denote three strings of lengths n1, n2 and n3, respectively, and let D(i,j,k) be the optimal SP score for aligning S1[1..i], S2[1..j] and S3[1..k]. The score for a match, mismatch, or space is specified by the variables smatch, smis, and sspace, respectively.

Recurrences fοr a nonbοundary cell(i, j) For i=1 to n1 do For j=l to n2 do For k=l to n3 do begin if (S1(i) = S2(j)) then cij = smatch else cij = smis; if (S1(i) = S3(k)) then cik= smatch else cik = smis; if (S2(j) = S3(k)) then cjk= smatch else ιjk := smis; d1= D(i-1, j-1, k-1) + cij + cik + cjk; d2= D(i-1, j-1,k) + cij + 2*sspace; d3= D(i- 1, j, k- 1) + cik + 2xsspace; d4= D(i, j- 1,k-1) + cjk + 2*sspace; d5= D(i-1, j, k) + 2*sspace; d6= D(i, j- 1, k) + 2*sspace; d7= D(i, j, k- 1) + 2*sspace; D(i, j, k) :: Min[d1, d2, d3, d4, d5, d6, d7]; end;

A speedup for the exact solution • Definition: Let d1,2(i,j) be the edit distance between suffixes S1[l..n] and S2[j..n] of strings S1 and S2. Define d1,3(i,k) and d2,3(j,k) analogously. • Key idea Recall that D(i, j,k) is the optimal SP score for aligning S1[1..i], S2[1.. j],and S3[1 ..k). If D(i, j, k) + d1,2(i, j) + d1,3(i, k) + d2,3( j, k) is greater than z then node (i, j, k) cannot be on any optimal path and so (in a forward computation) D(i, j, k) need not be sent forward to any cell.

Multiple Alignment –

Multiple Alignment –

Presentation Transcript

Chapter 13 Introduction to Multiple Regression

“ WIDA Standards and ESL Curriculum Alignment”

Multiple Sclerosis

Image alignment

MULTIPLE REGRESSION ANALYSIS

MSA- multiple sequence alignment

Multiple Sclerosis for the Non-Specialist

Multiple Intelligences

Steering and Alignment

The Multiple Regression Model

PPI Network Alignment

Sequence Alignment and Phylogenetic Analysis

Multiple Alignment

Sequence Alignment

Sequence Alignment

DATA BASE ALIGNMENT PROGRAM (DAPs)

Global Sequence Alignment by Dynamic Programming

Shaft Alignment

DETECTOR ALIGNMENT with tracks

Multiple regression

BA572 – Week # 4 Strategic Alignment of IT with Business Strategy

NGS Bioinformatics Workshop 1.3 Sequence Alignment and Searching