Chapter 7 Bioinformatics

Chapter 7 Bioinformatics. The development of bioinformatics. 1990s: the Human Genome Project. 1982: the U.S. National Institutes of Health (NIH) established GenBank. 1988: NCBI (National Center for Biotechnology Information) was established. Definition of Bioinformatics.

premala

Presentation Transcript


  1. Chapter 7 Bioinformatics

  2. The development of bioinformatics • 1990s: the Human Genome Project • 1982: the U.S. National Institutes of Health (NIH) established GenBank • 1988: NCBI (National Center for Biotechnology Information) was established

  3. Definition of Bioinformatics • Research, development, or application of computational tools and approaches for expanding the use of biological, medical, behavioral or health data, including those to acquire, store, organize, archive, analyze, or visualize such data.

  5. Why use bioinformatics? • An explosive growth in the amount of biological information • A more global perspective in experimental design • Data mining: the process by which testable hypotheses are generated regarding the function or structure of a gene or protein of interest by identifying similar sequences in better-characterized organisms. From http://www.ncbi.nlm.nih.gov

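  6. Categories of biological information • Biological information can be roughly divided into four categories: • Macroscopic and microscopic information about the structure, morphology, color, and similar traits of organisms • Information on the genetic material of organisms: DNA and genome sequences and their properties • Information on the structure and properties of biological macromolecules such as proteins and carbohydrates • Other biochemical, physiological, genetic, and evolutionary characteristics of organisms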

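  7. Types of bioinformatics tools • Database • Software • Web resource • Algorithms • Image and signal processing • Computer architecture and database management • Computer languages • Programming • Artificial intelligence and information theory • Design and simulation • Numerical analysis • Statistics • Software engineering and automation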

  8. Major bioinformatics websites • NCBI (National Center for Biotechnology Information) • ExPASy (Expert Protein Analysis System) • EMBnet (European Molecular Biology network)

  9. Major nucleic acid and protein databases • GenBank (USA), EMBL (Europe), and DDBJ (Japan) • PDB/RCSB (Protein Data Bank), PIR (Protein Information Resource), Pfam (Protein Family database)

  11. Web-based workbenches for analyzing biological information • EMBOSS (European Molecular Biology Open Software Suite) • SDSC Biology Workbench

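  12. Applications of bioinformatics • (1) Data acquisition and processing • (2) Gene mapping • (3) Genome mapping and comparison • (4) Molecular model building and simulation • (5) Comparison of DNA and protein sequences and structures • (6) Macromolecular structure prediction and drug design • (7) Molecular evolution and related fields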

  16. DNA Sequencing • Acquire: sample information, chromatograms, assembled data • Store: data and information, backup data • Analyze: quality assessment, filter and assemble data; predict and discover gene function; study genetic variation and gene expression • Distribute: data to collaborators and customers; research findings to the scientific community

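  17. Discovering new genes: the traditional method • Find individuals carrying the disease (mutation) • Compare where gene expression differs between normal and mutant individuals • Search for the mutated gene • Identify the disease-causing gene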

  18. Discovering new genes: genomics • Sequence data • Gene finding • Function prediction • Novel gene?

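  19. Discovering disease genes: genetic linkage • Individuals with a hereditary disease • Use family pedigrees to find the relationship between genetic markers and inheritance of the disease • Find the disease-causing gene • Find markers associated with the disease-causing gene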

  21. Some Problems in Bioinformatics • Sequence comparison • Fragment assembly of DNA sequences • Physical mapping • Evolutionary trees • Molecular structure prediction

  22. Sequence Comparison • Goals: • Database search: given a sequence S and a set of sequences G, find all the sequences in G that are similar to S. • Similarity: find which parts of the sequences are alike and which parts differ. • Sequence alignment (global alignment) • Local alignment

  23. Sequence Alignment • Global alignment • Local alignment
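
The difference between the two alignment types can be illustrated with a local-alignment score, which looks for the best-matching pair of substrings instead of aligning both sequences end to end. The sketch below uses the Smith-Waterman dynamic program, a standard technique that the slides do not name; the function name `local_score` and the scores (match +2, mismatch -1, gap -1) are assumptions chosen only for illustration.

```python
def local_score(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -1) -> int:
    """Best local-alignment score of a and b (Smith-Waterman, score only).

    The scoring values are illustrative assumptions, not taken from the slides.
    """
    prev = [0] * (len(b) + 1)  # scores for the previous row of the table
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            p = match if a[i - 1] == b[j - 1] else mismatch
            # A local alignment may restart anywhere, so scores never drop below 0.
            cur[j] = max(0, prev[j - 1] + p, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best


# The shared substring "ACG" scores 3 matches * 2 = 6; the flanking letters are ignored.
print(local_score("TTACGTT", "GGACGGG"))
```

A global alignment of the same two strings would also have to pay for the mismatched flanking letters, which is why local alignment is usually preferred when only part of each sequence is expected to be similar.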

  24. Longest Common Subsequence (1) • To find a longest common subsequence between two strings. • string1: TAGTCACG • string2: AGACTGTC • LCS: AGACG • Solved with dynamic programming

  25. Longest Common Subsequence (2) • string1: TAGTCACG • string2: AGACTGTC • LCS: AGACG
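
The dynamic-programming approach to the longest common subsequence can be sketched as follows. This is a minimal illustration using the textbook recurrence, not necessarily the exact formulation on the original slide; the function name `lcs` and the table name `c` are assumptions.

```python
def lcs(a: str, b: str) -> str:
    """Return one longest common subsequence of a and b."""
    n, m = len(a), len(b)
    # c[i][j] holds the LCS length of the prefixes a[:i] and b[:j].
    c = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])

    # Trace back from the bottom-right corner to recover one LCS.
    out = []
    i, j = n, m
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif c[i - 1][j] >= c[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


print(lcs("TAGTCACG", "AGACTGTC"))  # AGACG
```

Filling the table takes time proportional to the product of the two string lengths, and the traceback adds only a linear pass.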

  26. Edit Distance (1) • To find the smallest sequence of edit operations that transforms one string into the other. • string1: TAGTCACG • string2: AGACTGTC • Operations: DMMDDMMIMII (D = delete, M = match, I = insert)

  27. Edit Distance (2) • string1: TAGTCACG • string2: AGACTGTC
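
An edit script of this kind can be computed with a small dynamic program. The sketch below assumes that only deletions and insertions carry a cost of 1 each and that matches are free, which is consistent with the D/M/I operation string on the slide but is still an assumption (the common Levenshtein variant also allows substitutions). The function name `edit_script` is hypothetical.

```python
def edit_script(a: str, b: str) -> tuple[int, str]:
    """Return (distance, operations) turning a into b using D, M, and I only."""
    n, m = len(a), len(b)
    # d[i][j] = minimum number of deletions + insertions to turn a[:i] into b[:j].
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i  # delete every character of a[:i]
    for j in range(1, m + 1):
        d[0][j] = j  # insert every character of b[:j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best = min(d[i - 1][j], d[i][j - 1]) + 1
            if a[i - 1] == b[j - 1]:
                best = min(best, d[i - 1][j - 1])  # a match costs nothing
            d[i][j] = best

    # Trace back to recover one optimal operation string.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and a[i - 1] == b[j - 1] and d[i][j] == d[i - 1][j - 1]:
            ops.append("M")
            i -= 1
            j -= 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            ops.append("D")
            i -= 1
        else:
            ops.append("I")
            j -= 1
    return d[n][m], "".join(reversed(ops))


distance, ops = edit_script("TAGTCACG", "AGACTGTC")
print(distance, ops)  # the distance is 6 for this pair
```

Several operation strings can share the same minimum cost, so the string returned here may differ from the one on the slide while having the same number of D, M, and I steps.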

  28. Similarity • Two sequences s1 and s2. • p is the match value if $a_i = b_j$; otherwise it is the mismatch value. • g is the gap penalty.

  29. Sequence Alignment • a = TAGTCACG • b = AGACTGTC • Alignment 1: a = ----TAGTCACG, b = AGACT-GTC--- • Alignment 2: a = TAGTCAC-G--, b = -AG--ACTGTC • Which one is better?
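
One way to answer that question is to score each candidate alignment column by column. The sketch below is illustrative only: the function name `score_alignment` and the scores (match +2, mismatch -1, gap -1) are assumptions, since the slides do not give numeric values for p and g.

```python
def score_alignment(row_a: str, row_b: str, match: int = 2,
                    mismatch: int = -1, gap: int = -1) -> int:
    """Sum the column scores of two equal-length gapped rows ('-' marks a gap)."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    total = 0
    for x, y in zip(row_a, row_b):
        if x == "-" or y == "-":
            total += gap        # one sequence has a gap in this column
        elif x == y:
            total += match      # identical letters
        else:
            total += mismatch   # different letters
    return total


print(score_alignment("----TAGTCACG", "AGACT-GTC---"))  # 0  (4 matches, 8 gaps)
print(score_alignment("TAGTCAC-G--", "-AG--ACTGTC"))    # 4  (5 matches, 6 gaps)
```

Under these assumed scores the second alignment is better, because it has more matching columns and fewer gaps. A different choice of scores could change the margin.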

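  30. Sequence Alignment Formula • $c_{0,0} = 0$ • $c_{i,0} = i \cdot g$ • $c_{0,j} = j \cdot g$ • $c_{i,j} = \max\{\, c_{i-1,j} + g,\ c_{i,j-1} + g,\ c_{i-1,j-1} + p \,\}$, where $p$ is the mismatch value if $a_i \neq b_j$ and the match value if $a_i = b_j$, and $g$ is taken as a signed gap score (negative when gaps are penalized)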

  31. Sequence Alignment Example • a: TAGTCAC-G-- • b: -AG--ACTGTC
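
A global alignment like the one above can be computed with a dynamic program over a match/mismatch value p and a gap penalty g. The sketch below uses the Needleman-Wunsch formulation; the function name `global_align` and the numeric scores (match +2, mismatch -1, gap -1) are assumptions, because the slides do not state the values they used.

```python
def global_align(a: str, b: str, match: int = 2, mismatch: int = -1,
                 gap: int = -1) -> tuple[int, str, str]:
    """Return (score, aligned_a, aligned_b) for one optimal global alignment."""
    n, m = len(a), len(b)
    # c[i][j] = best score for aligning a[:i] with b[:j].
    c = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        c[i][0] = i * gap
    for j in range(1, m + 1):
        c[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            p = match if a[i - 1] == b[j - 1] else mismatch
            c[i][j] = max(c[i - 1][j] + gap, c[i][j - 1] + gap, c[i - 1][j - 1] + p)

    # Trace back from c[n][m] to rebuild the two gapped rows.
    row_a, row_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        p = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and c[i][j] == c[i - 1][j - 1] + p:
            row_a.append(a[i - 1])
            row_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and c[i][j] == c[i - 1][j] + gap:
            row_a.append(a[i - 1])
            row_b.append("-")
            i -= 1
        else:
            row_a.append("-")
            row_b.append(b[j - 1])
            j -= 1
    return c[n][m], "".join(reversed(row_a)), "".join(reversed(row_b))


score, row_a, row_b = global_align("TAGTCACG", "AGACTGTC")
print(score)
print(row_a)
print(row_b)
```

Several alignments can tie for the best score, so the rows printed here need not be identical to the example on the slide.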
