CHAPTER 4 The Sequence Alignment Problem
Presenter: 高瑜珮
Team members: 吳智銘, 鄭惠玲, 黃文勝, 謝志忠, 張豊明, 許雯琇
The Sequence Alignment Problem
• The following two sequences illustrate the sequence alignment problem: S1 = GAACTG; S2 = GAGCTG
  GAACTG---      GAACTG
  GA---GCTG      GAGCTG
  Labels on the slide: after alignment, before alignment
• We can see that the former is better than the latter.
Solving the Problem by Scoring
• Let ai and bj be characters of the two sequences.
  1. If ai is aligned with bj and ai = bj, the score is +2.
  2. If ai or bj is aligned with a blank (-), the score is -1.
  3. If ai is aligned with bj and ai ≠ bj, the score is -1.
• For S1 = abbcad and S2 = aecb, the alignment below therefore scores:
  ab-bcad
  aecb---
  2 x (+2) + 5 x (-1) = -1
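The three scoring rules can be checked mechanically. The following is a minimal Python sketch; the function name `alignment_score` and its parameter names are illustrative and do not come from the slides.

```python
def alignment_score(row1, row2, match=2, mismatch=-1, blank=-1):
    """Score two equal-length alignment rows column by column."""
    if len(row1) != len(row2):
        raise ValueError("alignment rows must have equal length")
    total = 0
    for x, y in zip(row1, row2):
        if x == "-" or y == "-":
            total += blank       # rule 2: a character against a blank
        elif x == y:
            total += match       # rule 1: identical characters
        else:
            total += mismatch    # rule 3: different characters
    return total


print(alignment_score("ab-bcad", "aecb---"))  # -1
```

The call reproduces the slide's arithmetic: two matching columns and five penalized columns.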
The Dynamic Programming Approach
• Let A(i,j) denote the optimal alignment score of a1 a2 ... ai and b1 b2 ... bj.
• Recurrence:
  A(0,0) = 0
  A(i,0) = -i
  A(0,j) = -j
  A(i,j) = max{A(i-1,j-1) - 1, A(i-1,j) - 1, A(i,j-1) - 1}   if ai ≠ bj
  A(i,j) = A(i-1,j-1) + 2                                     if ai = bj
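The recurrence translates directly into a table-filling loop. This is a sketch that follows the two-case formula as written on the slide; the name `global_score_table` is hypothetical.

```python
def global_score_table(s1, s2):
    """Fill A(i,j) using the slide's recurrence (+2 match, -1 otherwise)."""
    n, m = len(s1), len(s2)
    A = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        A[i][0] = -i                      # a1..ai aligned against blanks
    for j in range(1, m + 1):
        A[0][j] = -j                      # b1..bj aligned against blanks
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if s1[i - 1] == s2[j - 1]:
                A[i][j] = A[i - 1][j - 1] + 2
            else:
                A[i][j] = max(A[i - 1][j - 1], A[i - 1][j], A[i][j - 1]) - 1
    return A


table = global_score_table("abbcad", "eacb")
print(table[-1][-1])  # -1
```

The bottom-right cell holds the optimal score for the full sequences; the strings are 0-indexed in Python, hence `s1[i - 1]` for ai.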
Figure 1. Let S1 = abbcad and S2 = eacb.
The alignment is recovered by tracing back through the table:
  1. If (i,j) points to (i-1,j-1), ai is aligned with bj.
  2. If (i,j) points to (i-1,j), ai is aligned with a blank (-).
  3. If (i,j) points to (i,j-1), bj is aligned with a blank (-).
• Results traced from Figure 1:
  -abbcad    -abbcad    -abbcad
  eacb---    ea--cb-    ea--c-b
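The traceback rules can be sketched as a recursive walk that follows every pointer whose predecessor explains the cell's value. This sketch uses the general three-way maximum rather than the slide's two-case form, so it may list more co-optimal alignments than the slide shows; `all_optimal_alignments` is a hypothetical name.

```python
def all_optimal_alignments(s1, s2, match=2, mismatch=-1, blank=-1):
    """Return the optimal score and every alignment that attains it."""
    n, m = len(s1), len(s2)
    A = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        A[i][0] = A[i - 1][0] + blank
    for j in range(1, m + 1):
        A[0][j] = A[0][j - 1] + blank

    def sub(i, j):
        return match if s1[i - 1] == s2[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            A[i][j] = max(A[i - 1][j - 1] + sub(i, j),
                          A[i - 1][j] + blank,
                          A[i][j - 1] + blank)

    def walk(i, j):
        if i == 0 and j == 0:
            yield "", ""
            return
        # (i,j) points to (i-1,j-1): ai is aligned with bj
        if i > 0 and j > 0 and A[i][j] == A[i - 1][j - 1] + sub(i, j):
            for x, y in walk(i - 1, j - 1):
                yield x + s1[i - 1], y + s2[j - 1]
        # (i,j) points to (i-1,j): ai is aligned with a blank
        if i > 0 and A[i][j] == A[i - 1][j] + blank:
            for x, y in walk(i - 1, j):
                yield x + s1[i - 1], y + "-"
        # (i,j) points to (i,j-1): bj is aligned with a blank
        if j > 0 and A[i][j] == A[i][j - 1] + blank:
            for x, y in walk(i, j - 1):
                yield x + "-", y + s2[j - 1]

    return A[n][m], sorted(set(walk(n, m)))


score, found = all_optimal_alignments("abbcad", "eacb")
print(score)                              # -1
print(("-abbcad", "eacb---") in found)    # True
```

Each returned pair is (S1 row, S2 row). Enumerating every optimal alignment is exponential in the worst case, so this is only suitable for short examples like the one on the slide.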
The Local Alignment Problem
• Recurrence, where α(x,y) is the score of aligning x with y:
  A(i,j) = max{0, A(i-1,j-1) + α(ai,bj), A(i,j-1) + α(-,bj), A(i-1,j) + α(ai,-)}
• Example: S1 = abbcdae; S2 = afgfde
Figure 2. From Figure 2 we obtain the result:
  dae
  d-e
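A sketch of the local alignment recurrence in Python follows. It rests on assumptions the slides do not state: the first row and column are zero, α uses the earlier scores (+2 match, -1 mismatch, -1 blank), and the traceback starts at the largest cell and stops at the first zero cell. The name `local_align` is hypothetical.

```python
def local_align(s1, s2, match=2, mismatch=-1, blank=-1):
    """Return (best score, S1 row, S2 row) of one best local alignment."""
    n, m = len(s1), len(s2)
    A = [[0] * (m + 1) for _ in range(n + 1)]   # assumed zero boundary
    best, bi, bj = 0, 0, 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if s1[i - 1] == s2[j - 1] else mismatch
            A[i][j] = max(0,
                          A[i - 1][j - 1] + sub,
                          A[i][j - 1] + blank,
                          A[i - 1][j] + blank)
            if A[i][j] > best:
                best, bi, bj = A[i][j], i, j

    # Trace back from the best cell until the score drops to zero.
    row1, row2 = [], []
    i, j = bi, bj
    while i > 0 and j > 0 and A[i][j] > 0:
        sub = match if s1[i - 1] == s2[j - 1] else mismatch
        if A[i][j] == A[i - 1][j - 1] + sub:
            row1.append(s1[i - 1]); row2.append(s2[j - 1])
            i, j = i - 1, j - 1
        elif A[i][j] == A[i - 1][j] + blank:
            row1.append(s1[i - 1]); row2.append("-")
            i -= 1
        else:
            row1.append("-"); row2.append(s2[j - 1])
            j -= 1
    return best, "".join(reversed(row1)), "".join(reversed(row2))


print(local_align("abbcdae", "afgfde"))  # (3, 'dae', 'd-e')
```

Under these assumptions the sketch returns the same pair of rows as the slide's result.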
The Affine Gap Penalty
• Definition: One long gap is often preferable to several short gaps. The affine gap penalty expresses this preference by charging a gap-opening cost in addition to a cost per blank.
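Let S1 = ACTTGATCC and S2 = AGTTAGTAGTCC. A gap of length k is penalized Pg + k x Pe, with Pg = 4 and Pe = 1.
  1. The alignment that is optimal under the earlier scoring, which has three gaps of length 1:
     ACTT-G-A-TCC
     AGTTAGTAGTCC
     score = 8 x 2 - 1 - 3 x (4 + 1 x 1) = 0
  2. An alignment with only one gap:
     ACTT---GATCC
     AGTTAGTAGTCC
     score = 6 x 2 - 3 x 1 - (4 + 3 x 1) = 2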
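The two scores can be re-derived with a small helper that charges Pg + k x Pe for every maximal run of k blanks in a row. This is an illustrative sketch; `affine_alignment_score` is a hypothetical name, and +2 / -1 for match / mismatch are taken from the earlier scoring slide.

```python
import re


def affine_alignment_score(row1, row2, match=2, mismatch=-1, pg=4, pe=1):
    """Score a given alignment under the affine gap penalty Pg + k*Pe."""
    score = 0
    # Columns where two characters face each other.
    for x, y in zip(row1, row2):
        if x != "-" and y != "-":
            score += match if x == y else mismatch
    # Each maximal run of blanks in either row is one gap of length k.
    for row in (row1, row2):
        for gap in re.findall(r"-+", row):
            score -= pg + len(gap) * pe
    return score


print(affine_alignment_score("ACTT-G-A-TCC", "AGTTAGTAGTCC"))  # 0
print(affine_alignment_score("ACTT---GATCC", "AGTTAGTAGTCC"))  # 2
```

The single long gap pays the opening cost Pg once, which is why it wins despite having fewer matching columns.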
Recurrence:
  A(0,0) = A2(0,0) = A3(0,0) = 0
  A2(i,0) = A(i,0) = -Pg - i x Pe   for i > 0
  A3(0,j) = A(0,j) = -Pg - j x Pe   for j > 0
  A2(0,j) = A3(i,0) = -∞            for i, j > 0
  A(i,j) = max{A1(i,j), A2(i,j), A3(i,j)}
  A1(i,j) = A(i-1,j-1) + α(ai,bj)
  A2(i,j) = max{A2(i-1,j) - Pe, A(i-1,j) - Pg - Pe}
  A3(i,j) = max{A3(i,j-1) - Pe, A(i,j-1) - Pg - Pe}
Result:
  ACTT---GATCC
  AGTTAGTAGTCC
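The affine recurrence can be sketched with three tables: A for the best score, A2 for alignments ending with ai against a blank, and A3 for alignments ending with bj against a blank. The sketch assumes A3 steps along j (that is, it uses A3(i,j-1) and A(i,j-1)) and that α is +2 for a match and -1 for a mismatch; `affine_global_score` is a hypothetical name.

```python
def affine_global_score(s1, s2, match=2, mismatch=-1, pg=4, pe=1):
    """Optimal global alignment score with gap penalty Pg + k*Pe."""
    NEG = float("-inf")
    n, m = len(s1), len(s2)
    A = [[NEG] * (m + 1) for _ in range(n + 1)]
    A2 = [[NEG] * (m + 1) for _ in range(n + 1)]  # ends with ai against a blank
    A3 = [[NEG] * (m + 1) for _ in range(n + 1)]  # ends with bj against a blank
    A[0][0] = 0
    for i in range(1, n + 1):
        A2[i][0] = A[i][0] = -pg - i * pe
    for j in range(1, m + 1):
        A3[0][j] = A[0][j] = -pg - j * pe
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            alpha = match if s1[i - 1] == s2[j - 1] else mismatch
            a1 = A[i - 1][j - 1] + alpha
            # Extend an existing gap (cost Pe) or open a new one (Pg + Pe).
            A2[i][j] = max(A2[i - 1][j] - pe, A[i - 1][j] - pg - pe)
            A3[i][j] = max(A3[i][j - 1] - pe, A[i][j - 1] - pg - pe)
            A[i][j] = max(a1, A2[i][j], A3[i][j])
    return A[n][m]


print(affine_global_score("ACTTGATCC", "AGTTAGTAGTCC"))  # 2
```

The score of 2 agrees with the single-gap alignment shown in the result. Several alignments tie at that score, so a traceback may return a different but equally good placement of the gap.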
Minimal Spanning Tree
S1 = ATGCTC, S2 = ATGAGC, S3 = TTCTG, S4 = ATGCATGC
  S1 = ATGCTC
  S2 = ATGAGC       D(S1,S2) = 2
  S1 = ATGC-T-C
  S4 = ATGCATGC     D(S1,S4) = 2
  S1 = ATGCTC
  S3 = TT-CTG       D(S1,S3) = 3
  D(S2,S3) = 4, D(S2,S4) = 2, D(S3,S4) = 4
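The slides do not define D. One reading that reproduces all six values is the edit distance: the smallest number of mismatched or blank columns over all alignments of the two sequences. A sketch under that assumption, with the hypothetical name `edit_distance`:

```python
from itertools import combinations


def edit_distance(s, t):
    """Minimum number of mismatch or blank columns in an alignment of s and t."""
    prev = list(range(len(t) + 1))          # distances from the empty prefix of s
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j - 1] + (cs != ct),   # match or mismatch
                           prev[j] + 1,                # cs against a blank
                           cur[j - 1] + 1))            # ct against a blank
        prev = cur
    return prev[-1]


seqs = {"S1": "ATGCTC", "S2": "ATGAGC", "S3": "TTCTG", "S4": "ATGCATGC"}
for a, b in combinations(seqs, 2):
    print(a, b, edit_distance(seqs[a], seqs[b]))
```

The loop prints one distance per pair, which can be compared against the D values listed on the slide.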
Minimal spanning tree: S3 --3-- S1 --2-- S2 --2-- S4
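A minimal spanning tree over the pairwise distances can be found with Kruskal's algorithm, a standard choice; the slides do not say which algorithm was used. The sketch below hard-codes the six D values from the slide, and `minimal_spanning_tree` is a hypothetical name.

```python
def minimal_spanning_tree(nodes, dist):
    """Kruskal's algorithm: take the cheapest edges that do not form a cycle."""
    parent = {v: v for v in nodes}

    def find(v):
        # Find the component representative, compressing the path on the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    for (u, v), w in sorted(dist.items(), key=lambda kv: (kv[1], kv[0])):
        ru, rv = find(u), find(v)
        if ru != rv:                 # the edge joins two separate components
            parent[ru] = rv
            tree.append((u, v, w))
    return tree


D = {("S1", "S2"): 2, ("S1", "S3"): 3, ("S1", "S4"): 2,
     ("S2", "S3"): 4, ("S2", "S4"): 2, ("S3", "S4"): 4}
tree = minimal_spanning_tree(["S1", "S2", "S3", "S4"], D)
print(tree)                          # [('S1', 'S2', 2), ('S1', 'S4', 2), ('S1', 'S3', 3)]
print(sum(w for _, _, w in tree))    # 7
```

Three edges have weight 2, so the minimal spanning tree is not unique. This sketch breaks ties alphabetically and picks S1-S4, whereas the slide's tree uses S2-S4; both have total weight 7.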
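Aligning along the tree edges (sequences in parentheses are carried along from an earlier step):
  Edge S1-S2:
     S1 = ATGCTC
     S2 = ATGAGC
  Edge S2-S4:
    (S1 = ATG-C-TC)
     S2 = ATG-A-GC
     S4 = ATGCATGC
  Edge S1-S3:
     S1 = ATG-C-TC
    (S2 = ATG-A-GC)
     S3 = TT--C-TG
    (S4 = ATGCATGC)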