Patterns, Profiles, and Multiple Alignment
OUTLINE • Profiles and Sequence Logos • Profile Hidden Markov Models • Aligning Profiles • Multiple Sequence Alignments by Gradual Sequence Addition • Other Ways of Obtaining Multiple Alignments • Sequence Pattern Discovery
Aligning Profiles Comparing two PSSMs by alignment: This cannot be done with standard alignment techniques. Consider the alignment of two columns, one from each PSSM: both hold scores rather than residues, so the alignment must use a measure of the similarity between the scores in the two columns.
Aligning Profiles Comparing two PSSMs by alignment: The program LAMA (Local Alignment of Multiple Alignments) does not allow gaps in the alignment of PSSMs. It uses the Pearson correlation coefficient as the similarity measure, so the score of each aligned column pair ranges from -1 to 1.
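The column-versus-column comparison described above can be sketched in Python. This is a minimal illustration, not LAMA's implementation: the function names, the representation of a PSSM as a list of score columns, and the choice to sum the column correlations over the overlap are all assumptions made for the example.

```python
from math import sqrt


def pearson(u, v):
    """Pearson correlation between two PSSM columns (equal-length score lists)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    sd_u = sqrt(sum((a - mean_u) ** 2 for a in u))
    sd_v = sqrt(sum((b - mean_v) ** 2 for b in v))
    if sd_u == 0 or sd_v == 0:
        return 0.0  # a constant column carries no signal (assumed convention)
    return cov / (sd_u * sd_v)


def best_ungapped_alignment(pssm_a, pssm_b):
    """Slide pssm_b along pssm_a without gaps.

    Returns (score, offset), where offset is the column of pssm_a that is
    aligned with column 0 of pssm_b, and score is the sum of the per-column
    correlations over the overlap (an assumption for this sketch).
    """
    best = None
    for offset in range(-(len(pssm_b) - 1), len(pssm_a)):
        score = 0.0
        for j, col_b in enumerate(pssm_b):
            i = offset + j
            if 0 <= i < len(pssm_a):
                score += pearson(pssm_a[i], col_b)
        if best is None or score > best[0]:
            best = (score, offset)
    return best


# Toy PSSMs over a 4-letter alphabet (real protein PSSMs have 20 rows per column).
c1, c2, c3 = [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]
score, offset = best_ungapped_alignment([c1, c2, c3], [c2, c3])
```

Because gaps are not allowed, the search space is just the set of relative offsets, which keeps the comparison linear in the number of offsets times the overlap length.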
Multiple Sequence Alignments by Gradual Sequence Addition Modified pairwise dynamic programming: Pairwise dynamic programming algorithms can be modified to find the optimal alignment of more than two sequences.
Multiple Sequence Alignments by Gradual Sequence Addition Modified pairwise dynamic programming: Align 3 sequences: Sequence 1, Sequence 2, Sequence 3.
Multiple Sequence Alignments by Gradual Sequence Addition Modified pairwise dynamic programming: RESULT: The dynamic programming approach for aligning two sequences extends directly to k sequences. For k sequences we need a k-dimensional matrix, so the method is impractical: its running time grows exponentially with k.
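To make the k-dimensional matrix concrete, here is a minimal Python sketch for k = 3 that computes only the optimal score. The scoring values (match 1, mismatch -1, gap -2, gap against gap 0) and the use of sum-of-pairs column scores are assumptions chosen for the example, and the function names are hypothetical.

```python
from itertools import product


def pair_cost(a, b, match=1, mismatch=-1, gap=-2):
    """Score one pair of symbols; a gap paired with a gap costs nothing."""
    if a == "-" and b == "-":
        return 0
    if a == "-" or b == "-":
        return gap
    return match if a == b else mismatch


def column_score(x, y, z):
    """Sum-of-pairs score of one three-row alignment column."""
    return pair_cost(x, y) + pair_cost(x, z) + pair_cost(y, z)


# Each cell has 2**3 - 1 = 7 predecessors: every non-empty choice of which
# sequences contribute a residue to the new column.
MOVES = [m for m in product((0, 1), repeat=3) if any(m)]


def align3_score(s1, s2, s3):
    """Optimal global alignment score of three sequences (3-D DP matrix)."""
    n1, n2, n3 = len(s1), len(s2), len(s3)
    neg_inf = float("-inf")
    d = [[[neg_inf] * (n3 + 1) for _ in range(n2 + 1)] for _ in range(n1 + 1)]
    d[0][0][0] = 0
    for i in range(n1 + 1):
        for j in range(n2 + 1):
            for k in range(n3 + 1):
                for di, dj, dk in MOVES:
                    pi, pj, pk = i - di, j - dj, k - dk
                    if pi < 0 or pj < 0 or pk < 0:
                        continue
                    col = (s1[pi] if di else "-",
                           s2[pj] if dj else "-",
                           s3[pk] if dk else "-")
                    cand = d[pi][pj][pk] + column_score(*col)
                    if cand > d[i][j][k]:
                        d[i][j][k] = cand
    return d[n1][n2][n3]
```

For k sequences of length n the matrix has on the order of n^k cells and each cell has 2^k - 1 predecessors, which is where the exponential running time comes from.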
Multiple Sequence Alignments by Gradual Sequence Addition Progressive alignment: Multiple alignments are built up by gradually adding sequences. The order in which they are added can be crucial to generating an accurate alignment, and there are different ways to determine this order.
Multiple Sequence Alignments by Gradual Sequence Addition Progressive alignment (ClustalW): Uses dynamic programming with the sum-of-pairs scoring method. The multiple sequence alignment is organized by a guide tree in which leaves represent sequences and internal nodes represent alignments.
Multiple Sequence Alignments by Gradual Sequence Addition Progressive alignment (ClustalW): Steps: 1. Find the similarity matrix. 2. Cluster analysis (tree construction). 3. Align sequences according to the order determined by the tree.
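The first two steps (similarity matrix, then tree construction) can be sketched in Python as follows. This is a simplified illustration and not ClustalW's procedure: it uses raw Needleman-Wunsch scores as similarities and a simple average-linkage clustering in place of ClustalW's own distance measure and tree-building method. All names and scoring values are assumptions.

```python
from itertools import combinations


def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global (Needleman-Wunsch) alignment score with a linear gap cost."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def avg_sim(sim, members_a, members_b):
    """Average pairwise similarity between two clusters of sequence indices."""
    total = sum(sim[min(i, j), max(i, j)] for i in members_a for j in members_b)
    return total / (len(members_a) * len(members_b))


def guide_tree(seqs):
    """Return a nested tuple of sequence indices describing the merge order."""
    n = len(seqs)
    # Step 1: similarity matrix from all pairwise alignments.
    sim = {(i, j): nw_score(seqs[i], seqs[j])
           for i in range(n) for j in range(i + 1, n)}
    # Step 2: repeatedly join the two most similar clusters.
    clusters = [((i,), i) for i in range(n)]  # (member indices, subtree)
    while len(clusters) > 1:
        a, b = max(combinations(range(len(clusters)), 2),
                   key=lambda p: avg_sim(sim, clusters[p[0]][0], clusters[p[1]][0]))
        (members_a, tree_a), (members_b, tree_b) = clusters[a], clusters[b]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)]
        clusters.append((members_a + members_b, (tree_a, tree_b)))
    return clusters[0][1]


tree = guide_tree(["ACGT", "ACGA", "TTTT", "TTTA"])
```

Reading the nested tuple from the innermost pairs outward gives the order for step 3: the most similar sequences are aligned first, and the resulting alignments are then aligned with each other.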
Multiple Sequence Alignments by Gradual Sequence Addition Progressive alignment (ClustalW): Depending on the internal node in the tree, we may have to align • a sequence with a sequence • a sequence with a profile • a profile with a profile. In all cases we can use dynamic programming; for the profile cases, use SP (sum-of-pairs) scoring.
Multiple Sequence Alignments by Gradual Sequence Addition • Progressive alignment (ClustalW): • Sum of Pairs Scoring: • Consider all possible pairs of symbols in each alignment column.
Multiple Sequence Alignments by Gradual Sequence Addition Progressive alignment (ClustalW): Sum of Pairs Scoring: Assume c(match) = 1, c(mismatch) = -1, and c(gap) = -2; also assume c(-, -) = 0 to prevent the double counting of gaps.
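With the costs stated above, sum-of-pairs scoring can be sketched in a few lines of Python. The function names and the example columns are illustrative assumptions; only the cost values come from the slide.

```python
from itertools import combinations


def pair_cost(a, b, match=1, mismatch=-1, gap=-2):
    """c(match) = 1, c(mismatch) = -1, c(gap) = -2, c(-, -) = 0."""
    if a == "-" and b == "-":
        return 0
    if a == "-" or b == "-":
        return gap
    return match if a == b else mismatch


def sp_column(column):
    """Sum the costs of all possible pairs of symbols in one column."""
    return sum(pair_cost(a, b) for a, b in combinations(column, 2))


def sp_alignment(rows):
    """Sum-of-pairs score of a whole alignment given as equal-length rows."""
    return sum(sp_column(col) for col in zip(*rows))


print(sp_column("AA-"))                      # (A,A) + (A,-) + (A,-) = 1 - 2 - 2
print(sp_alignment(["AC-", "AAG", "A-G"]))   # columns score 3, -5 and -3
```

A column of k symbols contributes k(k-1)/2 pair costs, so scoring an alignment of length L takes on the order of L·k² operations.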
Multiple Sequence Alignments by Gradual Sequence Addition Progressive alignment (Star Alignment): 1. Select a sequence xc as the center of the star. 2. For each sequence xi in x1, …, xk with i ≠ c, perform a Needleman-Wunsch global alignment of xi with xc. 3. Aggregate the pairwise alignments using the principle “once a gap, always a gap.”
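Multiple Sequence Alignments by Gradual Sequence Addition Progressive alignment (Star Alignment): Select the center sequence: Simply choose as xc (center sequence) the sequence xi that maximizes the sum of its pairwise alignment scores against all other sequences, $\sum_{j \neq i} S(x_i, x_j)$, where $S(x_i, x_j)$ is the global alignment score of xi and xj.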
Multiple Sequence Alignments by Gradual Sequence Addition Progressive alignment (Star Alignment): Select the center sequence EXAMPLE: Compute all pairwise (global) alignments and their scores. The center is the sequence most similar to the rest.
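Choosing the center can be sketched in Python as below. The scoring parameters (match 1, mismatch -1, linear gap -2) and the function names are assumptions for illustration, and the three short sequences are made up rather than taken from the slide's example.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global (Needleman-Wunsch) alignment score with a linear gap cost."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def choose_center(seqs):
    """Index of the sequence whose summed score against all others is largest."""
    totals = [
        sum(nw_score(s, t) for j, t in enumerate(seqs) if j != i)
        for i, s in enumerate(seqs)
    ]
    return max(range(len(seqs)), key=totals.__getitem__)


seqs = ["ACGT", "ACCT", "AGGT"]
center = choose_center(seqs)  # pairwise scores: (0,1)=2, (0,2)=2, (1,2)=0
```

This requires all k(k-1)/2 pairwise alignments, which is still far cheaper than a k-dimensional dynamic programming matrix.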
Multiple Sequence Alignments by Gradual Sequence Addition Progressive alignment (Star Alignment): Select the center sequence EXAMPLE: Build the alignment:
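One way to build the multiple alignment from the pairwise alignments against the center, following "once a gap, always a gap", is sketched below in Python. It assumes the pairwise alignments are already available as pairs of gapped strings (center row, other row); the function names and the tiny input are hypothetical.

```python
def gap_profile(center_aln):
    """gaps[p] = number of gap columns directly before center residue p.

    The last entry counts trailing gaps after the final residue.
    """
    gaps = [0]
    for ch in center_aln:
        if ch == "-":
            gaps[-1] += 1
        else:
            gaps.append(0)
    return gaps


def merge_star(center, pairwise):
    """Merge pairwise alignments (center_aln, other_aln) into one alignment.

    Every gap that any pairwise alignment opened in the center is kept in
    the merged center row, and the other rows are padded to match.
    """
    n = len(center)
    master = [0] * (n + 1)
    for center_aln, _ in pairwise:
        for p, g in enumerate(gap_profile(center_aln)):
            master[p] = max(master[p], g)

    rows = ["".join("-" * master[p] + (center[p] if p < n else "")
                    for p in range(n + 1))]
    for center_aln, other_aln in pairwise:
        out, run, p = [], [], 0
        for c, o in zip(center_aln, other_aln):
            if c == "-":
                run.append(o)  # residue inserted relative to the center
            else:
                out.append("".join(run) + "-" * (master[p] - len(run)) + o)
                run, p = [], p + 1
        out.append("".join(run) + "-" * (master[n] - len(run)))
        rows.append("".join(out))
    return rows


msa = merge_star("ACG", [("A-CG", "ATCG"), ("ACG", "A-G")])
# msa[0] is the center row; the remaining rows follow the input order.
```

Here the gap that the first pairwise alignment opened in the center is propagated to the second sequence, so all rows end up with the same length.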
Multiple Sequence Alignments by Gradual Sequence Addition Progressive alignment (Star Alignment): For highly similar sequences this method can generate a reasonable alignment. When the percentage identity between sequences is low, the multiple alignment obtained by star alignment can be very poor.
Other Ways of Obtaining Multiple Alignments DIALIGN: Focuses on short ungapped alignments. A complete alignment can be constructed from ungapped local alignments between pairs of sequences.
Other Ways of Obtaining Multiple Alignments DIALIGN: All possible diagonals between each pair of sequences are considered.
Other Ways of Obtaining Multiple Alignments SAGA: Uses a genetic algorithm to search for the optimal alignment.
Other Ways of Obtaining Multiple Alignments SAGA Steps in genetic algorithm (GENERAL):
Other Ways of Obtaining Multiple Alignments SAGA Crossover operations in SAGA:
Other Ways of Obtaining Multiple Alignments SAGA Crossover operations (another way) in SAGA:
Sequence Pattern Discovery Patterns can be found • from multiple sequence alignments • by searching for possible patterns in the set of sequences
Sequence Pattern Discovery eMOTIF: Uses 20 groups of amino acids, each denoting amino acids that can be substituted for each other.
Sequence Pattern Discovery eMOTIF: For every position of the alignment, determine which single group can cover the whole column. By examining the possible column combinations, identify patterns.
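The per-column step (find a single group that covers the whole column) can be sketched in Python. The five groups below are an illustrative subset chosen for the example, not eMOTIF's actual group list, and the function names and pattern notation are assumptions. The sketch covers only the column-to-group mapping, not the search over column combinations.

```python
# Illustrative substitution groups only; eMOTIF's real group set differs.
GROUPS = [set("ILV"), set("FWY"), set("KR"), set("DE"), set("ST")]


def column_pattern(column):
    """Describe one alignment column by the smallest group covering it.

    Returns the residue itself for a fully conserved column, "[...]" for a
    column covered by one group, and "." when no single group covers it.
    """
    residues = set(column)
    if len(residues) == 1:
        return column[0]
    covering = [g for g in GROUPS if residues <= g]
    if not covering:
        return "."
    return "[" + "".join(sorted(min(covering, key=len))) + "]"


def alignment_pattern(rows):
    """Concatenate the per-column descriptions for an ungapped alignment."""
    return "".join(column_pattern(col) for col in zip(*rows))


print(alignment_pattern(["KIDS", "RVDT", "KLES"]))
print(alignment_pattern(["AK", "GK"]))
```

Preferring the smallest covering group keeps each position as specific as possible, which matters when positions are later combined into a pattern.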
Sequence Pattern Discovery Example: