Introduction to Bioinformatics
GENERAL INFORMATION
• Course Methodology
• The course consists of the following components: i. a series of 10 lectures and 10 mini-exams, ii. 7 skills classes, each with one programming task, iii. one final written exam.
• The lectures present the main theoretical aspects.
• Each lecture starts with a "mini-exam" of three short questions on the previous lecture.
• In the skills classes (SCs) several programming tasks are performed, one of which must be submitted before the next SC.
• Finally, the course ends with an open-book exam.
GENERAL INFORMATION 10 lectures and 10 mini-exams
Prologue (In praise of cells)
Chapter 1. The first look at a genome (sequence statistics)
Chapter 2. All the sequence's men (gene finding)
Chapter 3. All in the family (sequence alignment)
Chapter 4. The boulevard of broken genes (hidden Markov models)
Chapter 5. Are Neanderthals among us? (variation within and between species)
Chapter 6. Fighting HIV (natural selection at the molecular level)
Chapter 7. SARS: a post-genomic epidemic (phylogenetic analysis)
Chapter 8. Welcome to the hotel Chlamydia (whole genome comparisons)
Chapter 9. The genomics of wine-making (analysis of gene expression)
Chapter 10. A bed-time story (identification of regulatory sequences)
GENERAL INFORMATION mini-exams * First 15 minutes of the lecture * Closed book * Three short questions on the previous lecture * Counts as bonus points for the final mark … * There is a resit in which you can redo individual mini-exams that you missed with legitimate leave
GENERAL INFORMATION Skills Class: * Each Friday: one hour hands-on with real data * Hand in one task a week for a bonus point
GENERAL INFORMATION Final Exam: * 10 short questions regarding the course material * Open book
GENERAL INFORMATION Grading: The relative weights of the components are:
i. 10 mini-exams: B1 bonus points (max 1)
ii. 7 skills class programming tasks: B2 bonus points (max 1)
iii. final written exam (open-book, three hours): E points (max 10)
Final grade = min(E + (B1 + B2), 10)
Study Points: 6 ECTS / 4 NSP
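The grading rule can be written out as a short Python sketch. The function name `final_grade` and the range checks are illustrative assumptions; only the formula min(E + (B1 + B2), 10) and the stated maxima come from the slide.

```python
def final_grade(exam: float, mini_bonus: float, skills_bonus: float) -> float:
    """Final grade = min(E + (B1 + B2), 10), as stated on the grading slide.

    exam         -- E, final written exam points (0..10)
    mini_bonus   -- B1, bonus from the mini-exams (0..1)
    skills_bonus -- B2, bonus from the skills class tasks (0..1)
    The range checks below are an assumption added for illustration.
    """
    if not 0 <= exam <= 10:
        raise ValueError("E must lie between 0 and 10")
    if not 0 <= mini_bonus <= 1 or not 0 <= skills_bonus <= 1:
        raise ValueError("B1 and B2 must each lie between 0 and 1")
    # The bonus is added on top of the exam mark, capped at 10.
    return min(exam + (mini_bonus + skills_bonus), 10.0)


print(final_grade(7.5, 0.5, 1.0))  # 9.0
print(final_grade(9.5, 1.0, 1.0))  # 10.0 (capped)
```

The bonus points can only raise the exam mark, and the cap keeps the result on the 10-point scale.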
GENERAL INFORMATION Course Book: Introduction to Computational Genomics A Case Studies Approach Nello Cristianini, Matthew W. Hahn
GENERAL INFORMATION • Additional recommended texts: • Bioinformatics: the machine learning approach, Baldi & Brunak. • Introduction to Bioinformatics, Lesk, and: Introduction to Bioinformatics, Attwood & Parry-Smith.
Introduction to Bioinformatics. LECTURES
Introduction to Bioinformatics. LECTURE 1: * Prologue (In praise of cells) * Chapter 1. The first look at a genome (sequence statistics)
Introduction to Bioinformatics. Prologue : In praise of cells * Nothing in Biology Makes Sense Except in the Light of Evolution(Theodosius Dobzhansky)
GENOMICS and PROTEOMICS Genomics is the study of an organism's genome and the use of the genes. It deals with the systematic use of genome information, associated with other data, to provide answers in biology, medicine, and industry. Proteomics is the large-scale study of proteins, particularly their structures and functions. Proteomics is much more complicated than genomics. Most importantly, while the genome is a rather constant entity, the proteome differs from cell to cell and is constantly changing through its biochemical interactions with the genome and the environment. One organism will have radically different protein expression in different parts of its body, in different stages of its life cycle and in different environmental conditions.
Development of Genomics/ Proteomics Databases
modern map-makers have mapped the entire human genome Hurrah – we know the entire 3.3 billion bps of the human genome !!! … but what does it mean ???
How can we measure metabolic processes and gene activity ???
EXAMPLE: Caenorhabditis elegans
Some fine day in 1982 …
Until recently we lacked tools to measure gene activity. 1989 saw the introduction of the microarray technique by Stephen Fodor, but only in 1992 did this technique become generally available, and it was still very costly.
Figure captions: Developed microarray · Microarray developer Stephen Fodor
Using microarray technology we can now make time series of the activity of our 22,000 genes: so-called genome-wide expression profiles
The identification of genetic pathways from microarray time series: sequences of genome-wide expression profiles at consecutive time points become more feasible as costs decrease …
Now the problem is to map these microarray series of genome-wide expression profiles into something that tells us what the genes are actually doing … for instance a network representing their interaction
DNA Deoxyribonucleic acid (DNA) is a nucleic acid that contains the genetic instructions specifying the biological development of all cellular forms of life (and most viruses). DNA is a long polymer of nucleotides and encodes the sequence of the amino acid residues in proteins using the genetic code, a triplet code of nucleotides.
Genetic code
The genetic code is a set of rules that maps DNA sequences to proteins in the living cell, and is employed in the process of protein synthesis. Nearly all living things use the same genetic code, called the standard genetic code, although a few organisms use minor variations of the standard code.
Fundamental code in DNA: { x(i) | i = 1..N, x(i) in {C, A, T, G} }
Human: N = 3.3 billion
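As a small illustration of the definition above, a DNA sequence can be treated as a string x(1)..x(N) over the alphabet {C, A, T, G}. The sketch below checks that property and counts the symbols; the names `DNA_ALPHABET`, `is_dna` and `base_counts` are hypothetical, and the example sequence is made up.

```python
from collections import Counter

# The four-letter alphabet from the slide.
DNA_ALPHABET = frozenset("CATG")


def is_dna(sequence: str) -> bool:
    """True if the string is non-empty and every symbol is one of C, A, T, G."""
    return len(sequence) > 0 and set(sequence) <= DNA_ALPHABET


def base_counts(sequence: str) -> Counter:
    """Count how often each base occurs in a valid DNA string."""
    if not is_dna(sequence):
        raise ValueError("sequence contains symbols outside {C, A, T, G}")
    return Counter(sequence)


counts = base_counts("GATTACA")  # made-up example with N = 7
print(counts["A"], counts["C"], counts["G"], counts["T"])  # 3 1 1 2
```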
Replication of DNA
Genetic code: TRANSCRIPTION DNA → RNA
Transcription is the process through which a DNA sequence is enzymatically copied by an RNA polymerase to produce a complementary RNA. In other words, it is the transfer of genetic information from DNA into RNA. In the case of protein-encoding DNA, transcription is the beginning of the process that ultimately leads to the translation of the genetic code (via the mRNA intermediate) into a functional peptide or protein.
Transcription has some proofreading mechanisms, but they are fewer and less effective than the controls for DNA replication; therefore, transcription has a lower copying fidelity than DNA replication.
Like DNA replication, transcription proceeds in the 5' → 3' direction (i.e. the old polymer is read in the 3' → 5' direction and the new, complementary fragments are generated in the 5' → 3' direction).
In RNA, thymine (T) is replaced by uracil (U): T → U
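The transcription rule can be sketched in a few lines of Python. This is a minimal illustration, not a model of the enzyme: it assumes the template strand is given as a string written in the 3' → 5' direction, so the complementary RNA comes out in the 5' → 3' direction. The names `transcribe_template` and `transcribe_coding` are hypothetical.

```python
# Base pairing used when RNA is built against a DNA template:
# A pairs with U (instead of T), T with A, C with G, G with C.
DNA_TO_RNA_COMPLEMENT = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe_template(template_3to5: str) -> str:
    """Complementary RNA (5'->3') for a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA_COMPLEMENT[base] for base in template_3to5.upper())


def transcribe_coding(coding_5to3: str) -> str:
    """Shortcut: the RNA equals the coding strand with T replaced by U."""
    return coding_5to3.upper().replace("T", "U")


# Coding strand 5'-ATGGCC-3' pairs with template strand 3'-TACCGG-5'.
print(transcribe_template("TACCGG"))  # AUGGCC
print(transcribe_coding("ATGGCC"))    # AUGGCC
```

Both routes give the same RNA, which is why the T → U substitution on the coding strand is the usual shorthand.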
Genetic code: TRANSLATION
DNA-triplet → RNA-triplet = codon → amino acid
RNA codon table
There are 20 standard amino acids used in proteins. Here are some of the RNA codons that code for each amino acid:
Ala  A  GCU, GCC, GCA, GCG
Arg  R  CGU, CGC, CGA, CGG, AGA, AGG
Asn  N  AAU, AAC
Asp  D  GAU, GAC
Cys  C  UGU, UGC
Leu  L  UUA, UUG, CUU, CUC, CUA, CUG
Lys  K  AAA, AAG
Met  M  AUG
Phe  F  UUU, UUC
Pro  P  CCU, CCC, CCA, CCG
...
Start   AUG, GUG
Stop    UAG, UGA, UAA
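The codon lookup can be sketched as follows. The table holds only the ten amino acids listed on the slide, so it is deliberately partial; codons outside it are reported as "?". As simplifying assumptions, the sketch starts reading at the first AUG (it ignores the alternative start GUG) and stops at the first stop codon. The names `translate` and `CODON_TO_AMINO_ACID` are hypothetical.

```python
# Partial RNA codon table: only the entries shown on the slide.
CODONS_BY_AMINO_ACID = {
    "A": ["GCU", "GCC", "GCA", "GCG"],
    "R": ["CGU", "CGC", "CGA", "CGG", "AGA", "AGG"],
    "N": ["AAU", "AAC"],
    "D": ["GAU", "GAC"],
    "C": ["UGU", "UGC"],
    "L": ["UUA", "UUG", "CUU", "CUC", "CUA", "CUG"],
    "K": ["AAA", "AAG"],
    "M": ["AUG"],
    "F": ["UUU", "UUC"],
    "P": ["CCU", "CCC", "CCA", "CCG"],
}
STOP_CODONS = {"UAG", "UGA", "UAA"}

# Invert the table: codon -> one-letter amino acid code.
CODON_TO_AMINO_ACID = {
    codon: amino_acid
    for amino_acid, codons in CODONS_BY_AMINO_ACID.items()
    for codon in codons
}


def translate(mrna: str) -> str:
    """Translate from the first AUG up to the first stop codon."""
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon found
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break
        # "?" marks codons missing from this partial table.
        protein.append(CODON_TO_AMINO_ACID.get(codon, "?"))
    return "".join(protein)


print(translate("AUGGCCAAAUUUUGA"))  # MAKF
```

Extending the dictionary to all 64 codons would turn this into a full standard-code translator.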
Protein Structure: primary structure
Protein Structure: secondary Structure a: Alpha-helix, b: Beta-sheet
Protein Structure: super-secondary Structure