Bioinformatics, Computational Biology: An Introduction
“…the most wondrous map ever produced by mankind” — Bill Clinton
Post Genome Era: Why small variation, BIG DIFFERENCE? • The difference between you & chimp is ~1.24% • The difference between you and Maggie is ~0.1%
Genetics: From DNA to population Source: gsk
Introduction – Gene History • 1865 Mendel: The basic unit of inheritance is a gene. • Mendel’s work was largely forgotten until its rediscovery in 1900. • 1944 The gene was shown to be made of DNA (deoxyribonucleic acid). • 1953 James Watson and Francis Crick: Double-helical structure of DNA.
Introduction – Gene History (Cont.) • 1990 The Human Genome Project started. • 1995 The first free-living organism to be sequenced: Haemophilus influenzae. • 1998 Celera joined the genome sequencing effort. • 2000 The draft human DNA sequence was completed (published in 2001).
Animal cell (nucleus, cytoplasm, cell membrane) • DNA is located inside the cell nucleus
DNA Length • The total length of human DNA is about 3 × 10^9 (3 billion) base pairs. • Only 1% to 1.5% of the DNA sequence is useful (protein-coding). • Number of human genes: 30,000 to 40,000 (conclusion of the Human Genome Project) • The originally expected number was 100,000.
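The percentages on these slides can be turned into rough base-pair counts. The sketch below is only back-of-the-envelope arithmetic: it assumes every percentage applies to the full 3 × 10^9 bp genome, which is a simplification, and the helper name `bp_fraction` is made up for illustration.

```python
# Rough arithmetic on the figures quoted in the slides.
# Assumption: each percentage is applied to the whole 3 x 10^9 bp genome.
GENOME_BP = 3 * 10**9  # total human genome length in base pairs (from the slide)


def bp_fraction(percent, total_bp=GENOME_BP):
    """Return how many base pairs `percent` percent of the genome represents."""
    return round(total_bp * percent / 100)


coding_low = bp_fraction(1.0)    # lower bound of the "useful" fraction
coding_high = bp_fraction(1.5)   # upper bound of the "useful" fraction
human_chimp = bp_fraction(1.24)  # human vs. chimp difference
human_human = bp_fraction(0.1)   # difference between two people

print(f"Useful DNA: {coding_low:,} to {coding_high:,} bp")
print(f"Human vs. chimp: about {human_chimp:,} bp differ")
print(f"Human vs. human: about {human_human:,} bp differ")
```

Under these assumptions, even the "small" 0.1% variation between two people corresponds to about 3 million base pairs.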
DNA/RNA Nucleotide Molecules • A nucleotide consists of: • A five-carbon sugar (deoxyribose in DNA, ribose in RNA) • A phosphate group • One of four nitrogenous bases (A, G, C, T/U)
Biochemical Context of Genomics and Proteomics • DNA → mRNA → Proteins → Cell functions • Genome: studied by “Genomics” • Proteome: studied by “Proteomics”
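The flow from DNA to mRNA to proteins can be illustrated with a few lines of code. This is a minimal sketch, not a full implementation: `CODON_TABLE` holds only the handful of standard-genetic-code codons needed for the demo, the input is assumed to be the coding strand read 5' to 3', and the function names are chosen here for illustration.

```python
# Partial codon table (standard genetic code); only the codons used in the demo.
# A real table has 64 entries.
CODON_TABLE = {
    "AUG": "M",  # methionine (start codon)
    "GCC": "A",  # alanine
    "UUU": "F",  # phenylalanine
    "GGA": "G",  # glycine
    "UAA": "*", "UAG": "*", "UGA": "*",  # stop codons
}


def transcribe(coding_dna):
    """DNA -> mRNA: the mRNA matches the coding strand with T replaced by U."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """mRNA -> protein: read codons in frame from position 0 until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "?")  # "?" = codon not in table
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


dna = "ATGGCCTTTGGATAA"
mrna = transcribe(dna)
print(mrna)             # AUGGCCUUUGGAUAA
print(translate(mrna))  # MAFG
```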
What is Bioinformatics? Deduction of knowledge by computer analysis of biological data (see 988,000 pages on this topic on the WWW) • Information stored in the genetic code (DNA), protein sequences • Protein 3D structures, chromosome structure • Protein interaction, transcription factor, motif • Microarray gene expression, functional MRI, 2D gel • Experimental results • Patient statistics • Scientific literature • Analysis tools
Computational Biology & Bioinformatics (diagram labels) • Computational Biology • Biological Hypothesis • Formal Specifications • Raw Data • Algorithms • Information • Bioinformatics • End with Experiments
Key Strategy for Analysis • In Biology: Functions, Evolution, Sequences, Structures (FESS) • In Computer Sciences: Consensus, Clustering, Distance Measurement • Other diagram labels: Information, Data
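Two of the computer-science terms on this slide, consensus and distance measurement, can be shown on a toy set of aligned sequences. The sketch below assumes equal-length, already aligned sequences and uses Hamming distance as one simple choice of distance; the slide does not say which measure is intended, and the function names are illustrative. Clustering would typically build on such pairwise distances and is not shown.

```python
from collections import Counter


def hamming_distance(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def consensus(seqs):
    """Return the most common symbol in each column of aligned sequences.

    Ties go to the symbol seen first in the column.
    """
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*seqs))


seqs = ["ACGT", "ACGA", "TCGT"]
print(consensus(seqs))                     # ACGT
print(hamming_distance(seqs[0], seqs[1]))  # 1
print(hamming_distance(seqs[1], seqs[2]))  # 2
```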
Key Strategy for Systems Biology • Experiment • Computer-Aided Design • Specification, Simulation and Reverse Engineering
Some Problems in Bioinformatics • Sequence comparison: longest common subsequence, edit distance, similarity, multiple sequence alignment • Fragment assembly of DNA sequences: shortest common superstring • Physical mapping: double digest problem, consecutive ones problem • Evolutionary trees • Molecular structure prediction: protein folding
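Two of the sequence-comparison problems listed here, edit distance and longest common subsequence, have classic dynamic-programming solutions. The sketch below uses unit costs for insertion, deletion and substitution (Levenshtein distance), which is an assumption, since the slide does not specify a scoring scheme. Both functions keep only one previous row of the table, so memory use is proportional to the length of the second string.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character
    insertions, deletions and substitutions that turn `a` into `b`."""
    prev = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def lcs_length(a, b):
    """Length of the longest common subsequence of `a` and `b`."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


print(edit_distance("kitten", "sitting"))  # 3
print(edit_distance("ACGT", "AGT"))        # 1
print(lcs_length("AGGTAB", "GXTXAYB"))     # 4 (for example "GTAB")
```

Real alignment tools usually replace the unit costs with substitution matrices and gap penalties, but the table-filling idea is the same.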
Bioinformatics and Computer Science • Algorithms: needed for all of the computing problems. • Image processing: 3D images of RNA folds or proteins. • Databases: massive databases and retrieval. • Distributed systems and parallel processing: massive storage and accelerated computation.
Conclusion Biology easily has 500 years of exciting problems to work on. -- Donald E. Knuth
Go work on integrating Nano, Cognition, Biology and Informatics!
Reference – Journals • Bioinformatics (SCI) • Bulletin of Mathematical Biology (SCI) • Computer Applications in the Biosciences • Journal of Computational Biology (SCI expanded) • Journal of Mathematical Biology (SCI) • Journal of Molecular Biology (SCI) • Nucleic Acids Research (SCI) • Gene (SCI) • Science (SCI)
Reference – Web Sites • BioWeb http://bioweb.uwlax.edu/ • MIT Biology Hypertextbook http://esg-www.mit.edu:8001/esgbio/ • Bioinformatics Related Journals http://www.iscb.org/journals.html • NCBI http://www.ncbi.nlm.nih.gov/