Alignment



Presentation Transcript


  1. Alignment 陳致嘉

  2. Questions
     • Question 1: Given an unknown protein, how can we tell which organism it is most closely related to?
     • Ans: build a protein database, but….
     • Question 2: How do we define the similarity between two proteins?
     • Ans: Alignment techniques

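  3. Alignment
     • Proteins are made up of amino acids, so a protein can also be described as an amino acid sequence
     • Sequence alignment (Alignment) is defined as analyzing the similarity between two amino acid sequences
     • Pairwise Alignment
     • Multiple sequence Alignment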

  4. Compare two sequences
     • A vertical bar "|" marks identical amino acid residues; gaps are inserted so that two sequences of unequal length can be lined up one above the other
     • A simple example

       SEQUENCE 1 (query)    AGGVLTTQVG
                             ||||||
       SEQUENCE 2 (subject)  AGGVLTQVG

       SEQUENCE 1 (query)    AGGVLTTQVG
                             ||||| ||||
       SEQUENCE 2 (subject)  AGGVL-TQVG
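The marker row in the example can be produced mechanically. The following is a minimal Python sketch, assuming the two rows are already aligned to equal length and that a single "-" stands for a gap; the function name `match_line` is hypothetical and not part of the slides.

```python
def match_line(a: str, b: str, gap: str = "-") -> str:
    """Return a marker row with '|' under every column where both aligned
    sequences carry the same residue, and a space everywhere else."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return "".join("|" if x == y and x != gap else " " for x, y in zip(a, b))


query = "AGGVLTTQVG"
subject = "AGGVL-TQVG"  # assumption: one gap so both rows have 10 columns

print(query)
print(match_line(query, subject))
print(subject)
```

Counting the "|" characters in the marker row gives the number of identical residues, which is the quantity the simple scoring scheme on the next slide rewards.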

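  5. Compare two sequences
     • A simple scoring method
       [1] Identical amino acid residues score +1
       [2] A gap scores -1
       [3] Extending a gap incurs an extension penalty
       [4] Use an algorithm to find the combination with the highest score
     • This score is defined as the edit distance; note that the conventional edit distance is a cost to be minimized, whereas this score is maximized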
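Rules [1] to [3] can be applied to an alignment that has already been written out. This is a sketch only: the slide does not give a value for the extension penalty or for mismatched residues, so `gap_extend=-1` and `mismatch=0` are assumptions, and `score_alignment` is a hypothetical name.

```python
def score_alignment(a: str, b: str, match: int = 1, mismatch: int = 0,
                    gap_open: int = -1, gap_extend: int = -1,
                    gap: str = "-") -> int:
    """Score two already-aligned rows column by column.

    match      : score for identical residues (+1 on the slide)
    gap_open   : score for the first column of a gap (-1 on the slide)
    gap_extend : score for each further column of the same gap (assumed)
    mismatch   : score for two different residues (assumed)
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    score = 0
    prev_gap_row = None  # which row the previous column's gap was in, if any
    for x, y in zip(a, b):
        if x == gap or y == gap:
            row = "a" if x == gap else "b"
            score += gap_extend if row == prev_gap_row else gap_open
            prev_gap_row = row
        else:
            score += match if x == y else mismatch
            prev_gap_row = None
    return score


# Gapped alignment from the example: 9 identical residues and 1 gap.
print(score_alignment("AGGVLTTQVG", "AGGVL-TQVG"))  # 8
# Same sequences with the gap pushed to the end: 6 identities and 1 gap.
print(score_alignment("AGGVLTTQVG", "AGGVLTQVG-"))  # 5
```

Rule [4] is the search over all possible gap placements for the highest score, which is what the dynamic-programming algorithms named later in the slides do.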

  6. Pairwise alignment techniques
     • Question: What if we do not want to compare the similarity of the whole protein, but only want to look at a short functional segment?
     • Global alignment
     • Local alignment

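  7. Use matrix
     • Question: When comparing two proteins, a mutation has sometimes turned two originally different amino acid residues into the same one. How should this be handled?
     • Ans: Bioinformatics uses matrices to analyze the likelihood of mutation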

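  8. Use matrix
     • The Dayhoff Mutation Data Matrix [used when at least 85% of the amino acids in the two proteins are identical]
     • The BLOSUM matrices [used when fewer than 85% of the amino acids in the two proteins are identical]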

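  9. The Dayhoff Mutation Data Matrix
     • A relatedness odds matrix
     • An element with a value greater than 0 means mutation between the two corresponding residues is more likely
     • An element with a value less than 0 means mutation between the two corresponding residues is less likely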
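The sign rule above follows from taking a logarithm of an odds ratio. The sketch below assumes the commonly described scaling of 10 × log10(observed / expected), rounded to an integer; the frequencies are invented purely for illustration and are not values from any published matrix, and `log_odds_score` is a hypothetical name.

```python
import math


def log_odds_score(observed: float, expected: float) -> int:
    """Return 10 * log10(observed / expected), rounded to the nearest integer.

    observed : how often a residue pair is seen substituting for each other
    expected : how often the pair would co-occur by chance alone
    """
    return round(10 * math.log10(observed / expected))


# Hypothetical frequencies, chosen only to show the sign behaviour.
print(log_odds_score(0.020, 0.005))  # seen more often than chance -> positive
print(log_odds_score(0.001, 0.004))  # seen less often than chance -> negative
print(log_odds_score(0.003, 0.003))  # exactly as often as chance   -> zero
```

A positive entry therefore means the substitution is observed more often than chance would predict, and a negative entry means it is observed less often.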

  10. The Dayhoff Mutation Data Matrix

  11. The Dayhoff Mutation Data Matrix

  12. The BLOSUM matrices?
      • The technique is almost the same!

  13. Use Algorithm
      • For simplicity, set aside the likelihood of mutation for now and use an algorithm to compute the similarity between two proteins
      • Global alignment: Needleman-Wunsch algorithm
      • Local alignment: Smith-Waterman algorithm

  14. Needleman-Wunsch algorithm

  15. Needleman-Wunsch algorithm

  16. Needleman-Wunsch algorithm

  17. Needleman-Wunsch algorithm

  18. Needleman-Wunsch algorithm
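The worked figures for these slides did not survive in the transcript, so here is a minimal Python sketch of Needleman-Wunsch global alignment. It assumes the simple scores from the earlier slide (match +1, gap -1), a mismatch score of 0, and a linear gap penalty with no separate extension penalty; these parameter choices and the function name `needleman_wunsch` are assumptions, not the presenter's own code.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = 0,
                     gap: int = -1):
    """Globally align a and b; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap  # a[:i] aligned against gaps only
    for j in range(1, m + 1):
        score[0][j] = j * gap  # b[:j] aligned against gaps only
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # pair the residues
                              score[i - 1][j] + gap,      # gap in b
                              score[i][j - 1] + gap)      # gap in a
    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


best, row_a, row_b = needleman_wunsch("AGGVLTTQVG", "AGGVLTQVG")
print(best)
print(row_a)
print(row_b)
```

For the two example sequences the best global score under these assumptions is 8: nine identical residues minus one gap. When several alignments tie, the traceback returns only one of them.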

  19. Smith-Waterman algorithm
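Local alignment changes the recurrence in two ways: cell scores are never allowed to drop below 0, and the traceback starts at the highest-scoring cell rather than the corner. The sketch below is a minimal Python version under assumed scores (match +1, mismatch -1, gap -1); the mismatch value is not given in the slides, and `smith_waterman` is a hypothetical name.

```python
def smith_waterman(a: str, b: str, match: int = 1, mismatch: int = -1,
                   gap: int = -1):
    """Locally align a and b; return (score, segment_a, segment_b)."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score of a local alignment ending at
    # a[i-1] and b[j-1]; 0 means "start a fresh alignment here".
    score = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(0,
                              score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)
    # Trace back from the best cell until a cell with score 0 is reached.
    out_a, out_b = [], []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


# Made-up sequences: only the shared AGGVL segment should be reported.
print(smith_waterman("XXXAGGVLYYY", "ZZAGGVLZZ"))
```

This addresses the earlier question about comparing only a short functional segment: the unrelated flanking residues do not drag the score down, because the alignment is free to start and stop wherever the score is highest.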

  20. Software
      • FASTA family: sensitive, but slower
      • BLAST family: very fast searches, but can make errors when sequence similarity is low

  21. Multiple sequence Alignment
      • Multiple sequence alignment analyzes one or more proteins and assigns them to a gene family
      • Used for protein evolution analysis
      • It is the foundation for building a protein database

  22. Demo (use Jalview)

  23. Reference
      • Introduction to Bioinformatics (Teresa K. Attwood and David J. Parry-Smith)
      • http://www.cs.fiu.edu/~giri/teach
      • http://www.chinagenenet.com/commInfo
