90 likes | 258 Views
Longest Common Subsequence (LCS) - Scoring. Dr. Nancy Warter-Perez June 25, 2003. Hydrophobicity Sliding Window Program Using Functions (1). #include <iostream> #include <string> using namespace std; double hydro[25] = {1.8,0,2.5,-3.5,-3.5,2.8,-0.4,-3.2,4.5,0,-3.9,3.8,1.9,-3.5,0,
E N D
Longest Common Subsequence (LCS) - Scoring Dr. Nancy Warter-Perez June 25, 2003
Hydrophobicity Sliding Window Program Using Functions (1) #include <iostream> #include <string> using namespace std; double hydro[25] = {1.8,0,2.5,-3.5,-3.5,2.8,-0.4,-3.2,4.5,0,-3.9,3.8,1.9,-3.5,0, -1.6,-3.5,-4.5,-0.8,-0.7,0,4.2,-0.9,0,-1.3}; void compute_hydro(string seq, int ws); void main () { string seq; int ws; cout << "This program will compute the hydrophobicity of an sequence of amino acids.\n”; cout << "Please enter the sequence: "<< flush; cin >> seq; cout << "Please enter the window size: "<< flush; cin >> ws; compute_hydro(seq, ws); } LCS Scoring
Hydrophobicity Sliding Window Program Using Functions (2) void compute_hydro(string seq, int ws) { int i; double sum = 0; cout << "\n\nThe hydrophocity values are:" << endl; for(i = 0; i <= seq.size(); i++) if((seq.data()[i] >= 'a') && (seq.data()[i] <= 'z')) seq.at(i) = seq.data()[i] - 32; for(i = 0; i < ws; i++) { sum += hydro[seq.data()[i] - 'A']; } for(i = 1; i <= seq.size() - ws; i++) { cout << "Hydrophocity value:\t" << sum/ws << endl; sum = sum - hydro[seq.data()[i-1] - 'A'] + hydro[seq.data()[i+ws-1] - 'A']; } cout << "Hydrophocity value:\t" << sum/ws << endl; } LCS Scoring
Reference • Computational Molecular Biology – An Algorithmic Approach, Pavel Pevzner LCS Scoring
Longest Common Subsequence (LCS) Problem • Can have insertion and deletions but no substitutions (no mismatches) • Ex: V: ATCTGAT W: TGCATA LCS: TCTA LCS Scoring
LCS Problem (cont.) • Similarity score si-1,j si,j = max { si,j-1 si-1,j-1 + 1, if vi = wj • On board example: Pevzner Fig 6.1 LCS Scoring
Indels – insertions and deletions (e.g., gaps) • alignment of V and W • V = rows of similarity matrix (vertical axis) • W = columns of similarity matrix (horizontal axis) • Space (gap) in W (UP) • insertion • Space (gap) in V (LEFT) • deletion • Match (no mismatch in LCS) (DIAG) LCS Scoring
LCS(V,W) Algorithm for i = 1 to n si,0 = 0 for j = 1 to n s0,j = 0 for i = 1 to n for j = 1 to m if vi = wj si,j = si-1,j-1 + 1; bi,j = DIAG else if si-1,j >= si,j-1 si,j = si-1,j; bi,j = UP else si,j = si,j-1; bi,j = LEFT LCS Scoring
Programming Workshop 5 • Implement the LCS scoring algorithm as a function • Inputs: 2 strings to score • Outputs: Scoring matrix and traceback matrix (these can be global variables) • Write a main functions to • prompt the user for 2 sequences • call the scoring function • print the 2 matrices LCS Scoring