
Chapter 1


goro

Presentation Transcript


  1. Chapter 1 Introduction

  2. Introduction – Gene History • 1865 Mendel: the basic unit of inheritance is the gene. • Mendel's work was forgotten until around 1900. • 1944: genes were shown to be made of DNA (deoxyribonucleic acid). • 1953 James Watson and Francis Crick: the double-helix structure of DNA.

  3. Introduction – Gene History (Cont.) • 1990: The Human Genome Project started. • 1995: The first free-living organism was sequenced: Haemophilus influenzae. • 1998: Celera joined the genome sequencing effort. • 2000: The draft human DNA sequence was completed (published in 2001).

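  4. Bioinformatics - Related Domestic Programs • 2000: National Science Council (NSC) interdisciplinary research on "Bioinformatics" • 2001: NSC National Research Programs - National Research Program for Genomic Medicine • 2001: NSC interdisciplinary project research - Engineering division: information technology - Life sciences division: bioinformatics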

  5. Animal Cell (nucleus, cytoplasm, cell membrane) • DNA is located in the cell nucleus.

  6. DNA Double Helix

  7. DNA Double Helix

  8. Bonds Between Nucleotides in DNA

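  9. Nucleotides • A nucleotide is the building block of nucleic acid molecules. • A nucleotide consists of: - a five-carbon sugar (deoxyribose in DNA) - a phosphate group - one nitrogenous base (A, G, C, T, or U) • Figure label: cytosine (C)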

  10. The Four Nitrogenous Bases of DNA

  11. DNA Double Helix

  12. DNA Sequence

  13. DNA and RNA • Nucleotide bases: adenine (A), guanine (G), cytosine (C), thymine (T), uracil (U) • DNA (deoxyribonucleic acid): {A, G, C, T} (base pairs: G≡C, A=T) • RNA (ribonucleic acid): {A, G, C, U} (base pairs: G≡C, A=U, G-U)
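
The DNA pairing rules on this slide (A with T, G with C) determine one strand from the other. The following is a minimal Python sketch of that idea; the function names `complement`, `reverse_complement`, and `transcribe` are illustrative choices and do not come from the slides.

```python
# Watson-Crick pairing for DNA: A pairs with T, G pairs with C.
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    return "".join(DNA_PAIR[base] for base in strand)


def reverse_complement(strand: str) -> str:
    """Return the complementary strand, reversed so it reads in its own direction."""
    return complement(strand)[::-1]


def transcribe(strand: str) -> str:
    """Rewrite a DNA strand in the RNA alphabet by replacing T with U."""
    return strand.replace("T", "U")


if __name__ == "__main__":
    print(complement("TGCACTTG"))
    print(reverse_complement("TGCACTTG"))
    print(transcribe("TGCACTTG"))
```

A dictionary lookup keeps the pairing rule in one place, so an RNA variant would only need a different table.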

  14. DNA Length • The total length of human DNA is about 3 × 10^9 (3 billion) base pairs. • Only about 1% to 1.5% of the DNA sequence codes for proteins. • Number of human genes: 30,000 to 40,000 (a conclusion of the Human Genome Project); the number originally expected was about 100,000.

  15. DNA Sequencing • Given DNA sequence: TGCACTTGAACGCATGCT • Cut the sequence just before a randomly chosen A; the possible fragments are: - ATGCT (length = 5) - ACGCATGCT (length = 9) - AACGCATGCT (length = 10) - ACTTGAACGCATGCT (length = 15)
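
The fragment list on this slide can be reproduced with a few lines of Python. This is only a sketch: it assumes the 18-base sequence TGCACTTGAACGCATGCT, which is the sequence consistent with the four listed fragments (lengths 5, 9, 10, 15), and the helper name `fragments_starting_at` is hypothetical.

```python
def fragments_starting_at(seq: str, base: str = "A") -> list[str]:
    """Return every suffix of seq that begins with the given base, shortest first."""
    suffixes = [seq[i:] for i, ch in enumerate(seq) if ch == base]
    return sorted(suffixes, key=len)


if __name__ == "__main__":
    sequence = "TGCACTTGAACGCATGCT"  # assumed sequence, see the note above
    for fragment in fragments_starting_at(sequence):
        # A fragment of length L starts at position len(sequence) - L + 1 (1-based).
        position = len(sequence) - len(fragment) + 1
        print(fragment, "length =", len(fragment), "A at position", position)
```

Sorting by length mirrors what electrophoresis does physically: the fragment lengths reveal where each A sits in the sequence.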

  16. DNA Sequencing • Electrophoresis

  17. DNA Sequencing

  18. Amino Acids • Amino acids are the basic building blocks of proteins; there are 20 kinds.

  19. General Structure of an Amino Acid • Three groups: - Amino group - Carboxyl group - R group

  20. Amino Acid Molecules

  21. Amino Acid Molecules

  22. Protein Molecules

  23. Amino Acids and RNA • Every three nucleotides (a codon, the unit of the genetic code) correspond to one amino acid. • AUG codes for methionine and is also the "start" codon.
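
The codon rule can be illustrated with a short Python sketch. The codon table below is deliberately partial (the full genetic code has 64 entries), and the function name `translate` and the "?" placeholder for codons missing from the table are assumptions made for this example.

```python
# Partial codon table for illustration only; the full table has 64 codons.
CODON_TABLE = {
    "AUG": "Met",  # methionine, also the start codon
    "UUU": "Phe",
    "GGC": "Gly",
    "GCU": "Ala",
    "AAA": "Lys",
    "UAA": None,   # stop codons map to None
    "UAG": None,
    "UGA": None,
}


def translate(mrna: str) -> list[str]:
    """Translate from the first AUG, three bases at a time, until a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return []  # no start codon, nothing to translate
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon not in CODON_TABLE:
            protein.append("?")  # codon not covered by this partial table
            continue
        amino_acid = CODON_TABLE[codon]
        if amino_acid is None:
            break  # stop codon reached
        protein.append(amino_acid)
    return protein


if __name__ == "__main__":
    print(translate("GGAUGUUUGGCUAAUU"))
```

Reading starts at the first AUG rather than at position 0, which reflects the role of AUG as the start codon.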

  24. From DNA via RNA to Protein

  25. DNA, Genes and Proteins • DNA: program for cell processes • Proteins: execute cell processes [Figure labels: DNA (TCCAACGGTGCTGAGGTGCAC), gene, protein]

  26. Promoter and Gene

  27. Regulation of Genes [Figure by Blanchette. Labels: transcription factor (protein), RNA polymerase (protein), DNA, regulatory element, gene]

  28. Regulation of Genes [Figure by Blanchette. Labels: transcription factor (protein), RNA polymerase, DNA, regulatory element, gene]

  29. Regulation of Genes [Figure by Blanchette. Labels: new protein, RNA polymerase, transcription factor, DNA, regulatory element, gene]

  30. From DNA via RNA to Protein

  31. From RNA to Protein

  32. From RNA to Protein

  33. Primary Structure of Protein • The amino acid sequence of bovine insulin (a protein)

  34. Secondary Structure of Protein

  35. Tertiary Structure of Protein • Tertiary structure of the hemoglobin molecule

  36. Quaternary Structure of Protein • Quaternary structure of the hemoglobin molecule

  37. Problems on Different Levels

  38. Some Problems in Bioinformatics • Sequence comparison - Longest common subsequence - Edit distance - Similarity - Multiple sequence alignment • Fragment assembly of DNA sequences - Shortest common superstring • Physical mapping - Double digest problem - Consecutive ones problem • Evolutionary trees • Molecular structure prediction - Protein folding

  39. Sequence Comparison • Goals: • Database search: given a sequence S and a set of sequences G, find all sequences in G that are similar to S. • Similarity: find which parts of the sequences are alike and which parts differ. - Sequence alignment (global alignment) - Local alignment

  40. Sequence Alignment • Global alignment • Local alignment

  41. Longest Common Subsequence (1) • Find a longest common subsequence of two strings. • string1: TAGTCACG, string2: AGACTGTC ⇒ LCS: AGACG • Solved by dynamic programming.

  42. Longest Common Subsequence (2) [Dynamic programming table figure: S1 = TAGTCACG, S2 = AGACTGTC; LCS: AGACG]
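
The dynamic programming table for the longest common subsequence can be filled in as sketched below. This is the textbook recurrence, not necessarily the exact formulation shown on the original slide, and the names `lcs` and `is_subsequence` are illustrative. Two strings can have several longest common subsequences, so the traceback may return one that differs from AGACG while having the same length.

```python
def lcs(a: str, b: str) -> str:
    """Return one longest common subsequence of a and b."""
    m, n = len(a), len(b)
    # length[i][j] holds the LCS length of the prefixes a[:i] and b[:j].
    length = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                length[i][j] = length[i - 1][j - 1] + 1
            else:
                length[i][j] = max(length[i - 1][j], length[i][j - 1])
    # Trace back from the bottom-right cell to recover one subsequence.
    result = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            result.append(a[i - 1])
            i -= 1
            j -= 1
        elif length[i - 1][j] >= length[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(result))


def is_subsequence(sub: str, text: str) -> bool:
    """Check that the characters of sub appear in text in the same order."""
    remaining = iter(text)
    return all(ch in remaining for ch in sub)


if __name__ == "__main__":
    print(lcs("TAGTCACG", "AGACTGTC"))
```

The table has (m + 1) × (n + 1) cells, so time and space are both proportional to the product of the two string lengths.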

  43. Edit Distance (1) • Find the shortest sequence of edit operations that transforms one string into the other. • S1: TAGTCACG, S2: AGACTGTC • Operations: DMMDDMMIMII (D = delete, M = match, I = insert)

  44. Edit Distance (2) [Dynamic programming table figure: S1 = TAGTCACG, S2 = AGACTGTC; operations: DMMDDMMIMII]
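
The operation string DMMDDMMIMII uses only deletions, matches, and insertions, so the sketch below computes an edit distance restricted to insertions and deletions. That restriction is an assumption read off the slide's operation string; the common Levenshtein variant also allows substitutions and can give a smaller value. The names `indel_distance` and `check_script` are hypothetical.

```python
def indel_distance(s1: str, s2: str) -> int:
    """Minimum number of single-character insertions and deletions turning s1 into s2."""
    m, n = len(s1), len(s2)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i  # delete all i characters of the prefix of s1
    for j in range(n + 1):
        d[0][j] = j  # insert all j characters of the prefix of s2
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s1[i - 1] == s2[j - 1]:
                d[i][j] = d[i - 1][j - 1]  # match costs nothing
            else:
                d[i][j] = 1 + min(d[i - 1][j], d[i][j - 1])
    return d[m][n]


def check_script(s1: str, s2: str, ops: str) -> bool:
    """Check that an operation string of D, M, and I really transforms s1 into s2."""
    i = j = 0
    for op in ops:
        if op == "D":
            i += 1  # drop a character of s1
        elif op == "I":
            j += 1  # emit a character of s2
        elif op == "M":
            if i >= len(s1) or j >= len(s2) or s1[i] != s2[j]:
                return False
            i += 1
            j += 1
        else:
            return False
    return i == len(s1) and j == len(s2)


if __name__ == "__main__":
    print(indel_distance("TAGTCACG", "AGACTGTC"))
    print(check_script("TAGTCACG", "AGACTGTC", "DMMDDMMIMII"))
```

The slide's script contains three deletions and three insertions, which matches the distance this recurrence computes for the two strings.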

  45. Similarity • Two sequences a and b, with characters ai and bj. • p(ai, bj) is the match value if ai = bj; otherwise it is the mismatch value. • g is the gap penalty.

  46. Sequence Alignment • a = TAGTCACG, b = AGACTGTC ⇒ two possible alignments:
      Alignment 1:
        ----TAGTCACG
        AGACT-GTC---
      Alignment 2:
        TAGTCAC-G--
        -AG--ACTGTC
      • Which one is better?

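  47. Sequence Alignment Formula (taking g as a non-negative penalty that is subtracted) • c(0,0) = 0 • c(i,0) = −g·i • c(0,j) = −g·j • c(i,j) = max{ c(i−1,j−1) + p(ai,bj), c(i−1,j) − g, c(i,j−1) − g }, where p(ai,bj) is the mismatch value if ai ≠ bj and the match value if ai = bj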

  48. Sequence Alignment Example
      TAGTCAC-G--
      -AG--ACTGTC
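
To compare two candidate alignments, each column can be scored with a match value, a mismatch value, and a gap penalty, and the best achievable score follows from the usual global alignment recurrence. The sketch below assumes match = +2, mismatch = −1, and a gap penalty g = 1 that is subtracted; these numbers are hypothetical and do not come from the slides, as are the names `score_alignment` and `best_score`.

```python
MATCH, MISMATCH, GAP = 2, -1, 1  # hypothetical scoring values


def p(x: str, y: str) -> int:
    """Match value if the characters are equal, mismatch value otherwise."""
    return MATCH if x == y else MISMATCH


def score_alignment(row_a: str, row_b: str) -> int:
    """Score a given alignment column by column; '-' marks a gap."""
    total = 0
    for x, y in zip(row_a, row_b):
        if x == "-" or y == "-":
            total -= GAP
        else:
            total += p(x, y)
    return total


def best_score(a: str, b: str) -> int:
    """Best global alignment score of a and b by dynamic programming."""
    m, n = len(a), len(b)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        c[i][0] = -GAP * i  # a[:i] aligned against gaps only
    for j in range(1, n + 1):
        c[0][j] = -GAP * j  # b[:j] aligned against gaps only
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            c[i][j] = max(
                c[i - 1][j - 1] + p(a[i - 1], b[j - 1]),  # align a[i-1] with b[j-1]
                c[i - 1][j] - GAP,                        # gap in b
                c[i][j - 1] - GAP,                        # gap in a
            )
    return c[m][n]


if __name__ == "__main__":
    print(score_alignment("----TAGTCACG", "AGACT-GTC---"))
    print(score_alignment("TAGTCAC-G--", "-AG--ACTGTC"))
    print(best_score("TAGTCACG", "AGACTGTC"))
```

Under these assumed values, the alignment with five matches and six gaps scores higher than the one with four matches and eight gaps; a different choice of values could change the ranking.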

  49. Multiple Sequence Alignment • s1 = ATTCGAT, s2 = TTGAG, s3 = ATGCT ⇒ alignment:
      s1 = ATTCGAT
      s2 = -TT-GAG
      s3 = AT--GCT
      • If the number of sequences k is large, how can the problem be solved? • Multiple sequence alignment is an NP-complete problem.

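  50. Multiple Sequence Alignment - SP • Sum-of-pairs score = the sum, over all pairs 1 ≤ i < j ≤ k, of the score of the pairwise alignment that the multiple alignment induces on si and sj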
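
A sum-of-pairs score adds up the pairwise scores of every pair of rows in a multiple alignment. The sketch below applies that idea to the three-sequence alignment from the previous slide. The scoring values (match = +2, mismatch = −1, gap penalty 1, and 0 for a column where both rows have a gap) are assumptions for illustration, and `column_score` and `sum_of_pairs` are hypothetical names.

```python
from itertools import combinations

MATCH, MISMATCH, GAP = 2, -1, 1  # hypothetical scoring values


def column_score(x: str, y: str) -> int:
    """Score one column of a pairwise alignment taken from a multiple alignment."""
    if x == "-" and y == "-":
        return 0  # assumed: two gaps facing each other cost nothing
    if x == "-" or y == "-":
        return -GAP
    return MATCH if x == y else MISMATCH


def sum_of_pairs(rows: list[str]) -> int:
    """Sum the pairwise alignment scores over every pair of rows."""
    return sum(
        column_score(x, y)
        for row1, row2 in combinations(rows, 2)
        for x, y in zip(row1, row2)
    )


if __name__ == "__main__":
    alignment = ["ATTCGAT", "-TT-GAG", "AT--GCT"]
    print(sum_of_pairs(alignment))
```

With k rows there are k(k − 1)/2 pairs, so scoring a given alignment is cheap; the hard part is searching for the alignment that maximizes this score.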
