Global/Local/Multiple Alignments by Boyang Wei
GlobalAlignment
How to run:
• GlobalAlignment -s seq1 seq2 (-m)
• GlobalAlignment -f file1 file2 (-m)
Sample output:
    rns202-15.cs.stolaf.edu% GlobalAlignment -s ABB AB
    ABB
    AB-

    ABB
    A-B
GlobalAlignment
// represent each box in the matrix
struct box {
    int row;            // row index
    int col;            // column index
    int score;          // best score
    vector<box*> from;  // where does the score come from
};
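Because from is a vector, one box can remember several equally good predecessors, which is how a run can report more than one optimal alignment. The following is a minimal sketch of that idea, not the author's implementation: the scoring values (match +2, mismatch -1, gap -1) and the function names trace and globalAlign are assumptions made for illustration.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>
using namespace std;

// Assumed scoring; the slides do not state the values used.
const int kMatch = 2, kMismatch = -1, kGap = -1;

struct box {
    int row;            // row index
    int col;            // column index
    int score;          // best score
    vector<box*> from;  // every predecessor that yields the best score
};

// Hypothetical helper: walk back along every stored predecessor and
// collect each complete alignment (strings are built reversed).
void trace(const box* cur, const string& a, const string& b,
           string topRev, string botRev, vector<pair<string, string>>& out) {
    if (cur->from.empty()) {  // only the top-left corner has no predecessor
        reverse(topRev.begin(), topRev.end());
        reverse(botRev.begin(), botRev.end());
        out.push_back({topRev, botRev});
        return;
    }
    for (const box* p : cur->from) {
        char x = (p->col < cur->col) ? a[cur->col - 1] : '-';  // consumed a letter of a?
        char y = (p->row < cur->row) ? b[cur->row - 1] : '-';  // consumed a letter of b?
        trace(p, a, b, topRev + x, botRev + y, out);
    }
}

// Hypothetical driver: columns follow a, rows follow b.
vector<pair<string, string>> globalAlign(const string& a, const string& b) {
    int rows = static_cast<int>(b.size()), cols = static_cast<int>(a.size());
    vector<vector<box>> m(rows + 1, vector<box>(cols + 1));
    for (int r = 0; r <= rows; ++r) {
        for (int c = 0; c <= cols; ++c) {
            box& cur = m[r][c];
            cur.row = r; cur.col = c; cur.score = 0;
            vector<pair<int, box*>> cand;  // (score, predecessor)
            if (r > 0 && c > 0) {
                int s = (a[c - 1] == b[r - 1]) ? kMatch : kMismatch;
                cand.push_back({m[r - 1][c - 1].score + s, &m[r - 1][c - 1]});
            }
            if (r > 0) cand.push_back({m[r - 1][c].score + kGap, &m[r - 1][c]});
            if (c > 0) cand.push_back({m[r][c - 1].score + kGap, &m[r][c - 1]});
            if (cand.empty()) continue;  // top-left corner keeps score 0
            int best = cand[0].first;
            for (const auto& x : cand) best = max(best, x.first);
            cur.score = best;
            for (const auto& x : cand)
                if (x.first == best) cur.from.push_back(x.second);  // keep all ties
        }
    }
    vector<pair<string, string>> out;
    trace(&m[rows][cols], a, b, "", "", out);
    return out;
}
```

Under these assumed scores, globalAlign("ABB", "AB") yields the same two alignments as the sample run, ABB over AB- and ABB over A-B, though not necessarily in the same order.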
LocalAlignment
How to run:
• LocalAlignment -s seq1 seq2 (-m)
• LocalAlignment -f file1 file2 (-m)
Sample output:
    rns202-15.cs.stolaf.edu% LocalAlignment -f seq1.txt seq2.txt
    AC|TAC|T
     G|TAC|
LocalAlignment
Sample output with matrix printed:
    rns202-15.cs.stolaf.edu% LocalAlignment -f seq1.txt seq2.txt -m
    AC|TAC|T
     G|TAC|

       -  A  C  T  A  C  T
    -  0  0  0  0  0  0  0
    G  0  0  0  0  0  0  0
    T  0  0  0  2  1  0  2
    A  0  2  1  1  4  3  2
    C  0  1  4  3  3  6  5
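The printed matrix can be reproduced with a standard Smith-Waterman fill. This is a sketch under assumptions: judging from the row and column labels, the inputs are taken to be ACTACT (columns) and GTAC (rows), and the scores match +2, mismatch -1, gap -1 were chosen because they reproduce the printed values; the slides do not state them. The name localMatrix is hypothetical.

```cpp
#include <algorithm>
#include <string>
#include <vector>
using namespace std;

// Fill a local-alignment score matrix: no cell may drop below 0.
// Default scores are assumptions, not values given on the slides.
vector<vector<int>> localMatrix(const string& colSeq, const string& rowSeq,
                                int match = 2, int mismatch = -1, int gap = -1) {
    vector<vector<int>> m(rowSeq.size() + 1, vector<int>(colSeq.size() + 1, 0));
    for (size_t r = 1; r <= rowSeq.size(); ++r) {
        for (size_t c = 1; c <= colSeq.size(); ++c) {
            int diag = m[r - 1][c - 1] +
                       (rowSeq[r - 1] == colSeq[c - 1] ? match : mismatch);
            int up = m[r - 1][c] + gap;    // gap in the column sequence
            int left = m[r][c - 1] + gap;  // gap in the row sequence
            m[r][c] = max({0, diag, up, left});
        }
    }
    return m;
}
```

Calling localMatrix("ACTACT", "GTAC") gives the five rows shown above. The largest entry, 6, sits where the second TAC of ACTACT meets the TAC of GTAC, which is the segment the program marks between the | characters.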
Global & Local Alignment
Input:
• both programs require exactly two sequences as input
• input is not limited to A, T, G, C
• characters do not need to be capitalized
• any characters will work
MultipleAlignment
How to run:
• MultipleAlignment -s seq1 seq2 seq3 ...
• MultipleAlignment -f file
Sample output:
    rns202-15.cs.stolaf.edu% MultipleAlignment -s ATGC ATG ATC
    ATGC
    ATG-
    AT-C
MultipleAlignment
// class to store a character
// its objects represent the letters in the sequence
// NOTE! since the letter class is set up in this way,
// it will store characters other than A, T, G, C as a gap,
// so this program only works for input consisting of A, T, G, C
struct letter {
    float A;    // fraction of A in this letter (0 to 1)
    float T;    // fraction of T in this letter (0 to 1)
    float G;    // fraction of G in this letter (0 to 1)
    float C;    // fraction of C in this letter (0 to 1)
    float gap;  // fraction of gap in this letter (0 to 1)
    ......
};
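To make the NOTE concrete, here is one way a plain character could be turned into a letter. This is only a sketch: letterFromChar and toLetters are hypothetical names, the struct is reduced to the five fields shown on the slide, and matching uppercase only is an assumption.

```cpp
#include <string>
#include <vector>
using namespace std;

// Reduced version of the slide's struct: five fractions and nothing else.
struct letter {
    float A, T, G, C, gap;
};

// Hypothetical helper: a single character becomes a "pure" letter.
// Anything other than A, T, G, C is stored as a gap, as the NOTE describes.
// Assumption: only uppercase characters are recognized.
letter letterFromChar(char ch) {
    switch (ch) {
        case 'A': return {1.0f, 0.0f, 0.0f, 0.0f, 0.0f};
        case 'T': return {0.0f, 1.0f, 0.0f, 0.0f, 0.0f};
        case 'G': return {0.0f, 0.0f, 1.0f, 0.0f, 0.0f};
        case 'C': return {0.0f, 0.0f, 0.0f, 1.0f, 0.0f};
        default:  return {0.0f, 0.0f, 0.0f, 0.0f, 1.0f};
    }
}

// Hypothetical helper: convert a whole input string.
vector<letter> toLetters(const string& s) {
    vector<letter> out;
    for (char ch : s) out.push_back(letterFromChar(ch));
    return out;
}
```

With this mapping, toLetters("AXG") produces a pure A, a gap, and a pure G, which shows why input outside A, T, G, C cannot be told apart from a real gap.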
MultipleAlignment
// a self-defined string class
// to store the sequence/string as a sequence of letter objects
struct sequence {
    vector<letter> seq;       // the sequence
    vector<int> gapPosition;  // the gap positions at the end of aligning
    int prev[2];              // indices of the previous two sequences
                              // (sometimes a sequence may be generated by
                              // combining two sequences)
    ......
};
MultipleAlignment
// calculate the score for two letters:
// either a match, mismatch, or partial match
float calculateScore(letter l1, letter l2) {
    float matchPercent = min(l1.A, l2.A) + min(l1.T, l2.T)
                       + min(l1.G, l2.G) + min(l1.C, l2.C);
    float misMatchPercent = 1 - matchPercent;
    return matchPercent*matchScore + misMatchPercent*misMatchScore;
}
MultipleAlignment
// combine two letters to generate a new one
letter sum(letter l1, letter l2) {
    // if one of them is a gap, return the other one
    if (l1.gap == 1) return l2;
    else if (l2.gap == 1) return l1;
    // otherwise, combine the percentages
    else {
        letter l((l1.A+l2.A)/2, (l1.T+l2.T)/2, (l1.G+l2.G)/2, (l1.C+l2.C)/2,
                 (l1.gap+l2.gap)/2); // could just put 0 for this line
        return l;
    }
}
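To see what sum produces, the function can be tried on a few pure letters. The sketch below is an illustration only: the five-argument constructor is inferred from the call on the slide, and the rest of the letter class is left out.

```cpp
// Reduced version of the slide's struct; the five-argument constructor is
// inferred from the call "letter l(...)" on the slide.
struct letter {
    float A, T, G, C, gap;
    letter(float a, float t, float g, float c, float gp)
        : A(a), T(t), G(g), C(c), gap(gp) {}
};

// Same logic as on the slide.
letter sum(letter l1, letter l2) {
    // if one of them is a pure gap, return the other one unchanged
    if (l1.gap == 1) return l2;
    else if (l2.gap == 1) return l1;
    // otherwise, average the two distributions field by field
    else {
        letter l((l1.A + l2.A) / 2, (l1.T + l2.T) / 2, (l1.G + l2.G) / 2,
                 (l1.C + l2.C) / 2, (l1.gap + l2.gap) / 2);
        return l;
    }
}
```

Combining a pure A with a pure T gives a letter that is half A and half T, while combining a pure gap with a pure G simply returns the G. Each call averages its two arguments equally, so sum(sum(A, T), G) comes out as a quarter A, a quarter T, and half G rather than one third each.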
MultipleAlignment
Input:
• requires at least one sequence as input
• if read from file:
  • first line: the number of input sequences
  • remaining lines: one sequence per line
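The file layout described above can be read with a few lines of stream code. This is a sketch, not the program's actual reader: readSequences is a hypothetical name, and it assumes that sequences contain no whitespace and that a count below one or a short file should be treated as an error.

```cpp
#include <istream>
#include <sstream>  // istringstream, handy for trying the reader without a file
#include <stdexcept>
#include <string>
#include <vector>
using namespace std;

// Hypothetical reader for the layout "count on the first line, then one
// sequence per line". Works on any istream, e.g. an ifstream.
vector<string> readSequences(istream& in) {
    int n = 0;
    if (!(in >> n) || n < 1)
        throw runtime_error("first line must be a positive sequence count");
    vector<string> seqs;
    string s;
    // Assumption: each sequence is a single whitespace-free token.
    while (static_cast<int>(seqs.size()) < n && in >> s) seqs.push_back(s);
    if (static_cast<int>(seqs.size()) != n)
        throw runtime_error("fewer sequences than the first line declares");
    return seqs;
}
```

For example, a file containing the four lines 3, ATGC, ATG, ATC would be read as the three sequences ATGC, ATG, and ATC, the same input as the -s sample run.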