Introduction to the course: Bioinformatics and Genome Data Analysis
Fredj Tekaia, Institut Pasteur, tekaia@pasteur.fr
• Access to entire genome sequences is revolutionizing our understanding of how genetic information is stored and organized in DNA, and how it has evolved over time.
• The sequence of a genome provides a detailed view of the gene catalogue of a species.
• Recent analyses and comparisons of complete genome sequences are accelerating our understanding of species organisation, links between genes, gene functions, and the evolution of genes, genomes and species.
Complete genomes (http://www.genomesonline.org/)
• >1387 projects
• 261 published (01-03-05)
• 654 prokaryotes
• 472 eukaryotes
[Chart labels: 207, 21, 33]
[Figure: number of available completely sequenced genomes per year, from 95 to 03-05]
Huge amount of data with exponential increase
• Few scientists are involved in generating data; many more are involved in data analysis.
Genomics is the study of genomes, including the study of their genes and proteins and of their sequences, functions, evolution, ...
Bioinformatics is needed to transform the flood of raw data (sequences) into scientific knowledge.
[Diagram: Bioinformatics, linking Mathematics, Statistics, Informatics and Biology around nucleotide and protein sequences and related information]
Bioinformatics is defined as the mathematical, statistical and computing methods and tools that, based on sequences and related information, aim at solving biological questions.
Biological problems and the corresponding Math/Stat/Info methods

Biological problem:
• Pairwise sequence alignment
• Search for similar sequences
• Multiple sequence alignment
• Phylogenetic tree reconstruction
• Protein 3D structure alignment
Math/Stat/Info methods:
• Dynamic programming (DP)
• Markov chain, Monte Carlo, Gibbs samplers
• Hidden Markov models (HMM)

Biological problem:
• Motif extraction
• Functional site prediction
• Coding region prediction
• Transmembrane domain prediction
• Protein 3D structure prediction
Math/Stat/Info methods:
• Discriminant analysis
• Neural networks
• Hidden Markov models (HMM)
• Formal grammar

Biological problem:
• Superfamily classification
• Ortholog/paralog clustering of genes
Math/Stat/Info methods:
• Clustering algorithms (hierarchical, k-means, ...)
• CA, PCA, MDS, ...
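As an illustration of the first pairing above (pairwise sequence alignment solved by dynamic programming), here is a minimal Python sketch of a Needleman-Wunsch style global alignment score. It is not taken from the course material: the function name `global_alignment_score` and the scoring values (match +1, mismatch -1, linear gap -2) are assumptions chosen for the example, and real analyses would normally use a substitution matrix and an established tool.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score of sequences a and b.

    Hypothetical helper for illustration; the scoring scheme
    (match/mismatch/linear gap) is an assumption, not the course's.
    """
    # prev[j] = best score for aligning the processed prefix of a with b[:j].
    # Initially no character of a is used, so b[:j] is aligned against j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # First column: a[:i] aligned against i gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            # Align a[i-1] with b[j-1] (match or mismatch).
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Align a[i-1] with a gap.
            up = prev[j] + gap
            # Align b[j-1] with a gap.
            left = curr[j - 1] + gap
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("GATTACA", "GATTACA"))
    print(global_alignment_score("ACGT", "ACT"))
```

Only two rows of the dynamic programming table are kept, which is enough to obtain the score; recovering the alignment itself would require storing the full table and tracing back from the last cell.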
Information system: http://www.ncbi.nlm.nih.gov
Bibliography:
http://www.ncbi.nlm.nih.gov/PubMed
http://www.plosbiology.org
GENOMED-HEALTH 2005, Institut Pasteur Tunis, 04-06 March 2005
Internet means data has no geographical boundaries
• Local application
• Local generation/integration of data
• Local resources
• But international discoveries
(Winston Hide, SANBI)
Access to the genome in developing countries is limited
• Lack of Internet access (State responsibility)
• Lack of infrastructure to exploit genome information (State responsibility)
• Lack of skills in interpretation, although skills exist in Mathematics, Computer science, Statistics, Biology, ...
(Winston Hide, SANBI)
Recommendations
• Technological infrastructure
• Human resources
• Capacity building
Bioinformatics and Genome Data Analysis
This advanced course aims to bring together scientists from multiple disciplines around:
a) What we have learned from completely sequenced genomes (conferences: 1/2 of the time).
b) Algorithms used in sequence and genome analysis (lectures: 1/4; practical sessions: 1/4).
Day 1: Thursday - March 24
09 - 12: Introduction to Unix and Perl programming. Fredj Tekaia - Institut Pasteur Paris
14 - 18: Introduction to useful algorithms and multivariate analysis methods. Ahmed Rebai - Biotechnology Center Sfax
Day 2: Friday - March 25
09 - 12: Introduction to Molecular Biology data. Odile Kalogeropoulos - Institut Pasteur Paris
14 - 18: Introduction to Web resources in Bioinformatics and Genomes. Marie-Paule Lefranc - Montpellier II University
Day 3: Saturday - March 26
09 - 11: Conference: “Genome annotation: What can we learn in bacterial biochemical reactions?” Georges Cohen - Institut Pasteur Paris
11H30 - 13H: Conference: “The human genome: impact in the biomedical domain”. Sonia Abdelhak - Institut Pasteur Tunis
Day 4: Sunday - March 27
Social program: visit to ElDjem
Day 5: Monday - March 28
9H - 12H: Conference: “How eukaryotic genomes evolve: the Yeast example”. Bernard Dujon - Institut Pasteur Paris
14H - 18H: Theoretical and practical session: Substitution matrices - Algorithms for sequence comparisons. Ahmed Rebai - Fredj Tekaia
Day 6: Tuesday - March 29
9H - 12H: Conference: « Genomes of bacterial pathogens and their diversity ». Philippe Glaser - Institut Pasteur Paris
14 - 18: Practical session: Large scale proteome comparisons. Fredj Tekaia - Ahmed Rebai
Day 7: Wednesday - March 30
9H - 13H: Conference: « Principles and methods for the analysis of gene and genome sequences ». Edouard Yeramian - Institut Pasteur Paris
14 - 18: Motif detection - Multiple alignment. Ahmed Rebai - Fredj Tekaia
Day 8: Thursday - March 31
9H - 10H30: Conference: « Transcriptional profiling of the hypha-to-yeast transition of the pathogenic fungus Paracoccidioides brasiliensis ». Gustavo H. Goldman - Universidade de Sao Paulo, Brazil
11H - 12H30: « Large scale proteome comparisons ». Fredj Tekaia
14 - 18: « Molecular evolution and Phylogeny ». Fredj Tekaia - Ahmed Rebai
Day 9: Friday - April 1st
9H - 12H: Conference: « Genome organization and genome evolution ». Giorgio Bernardi - Stazione Zoologica Anton Dohrn, Napoli, Italy
14 - 18H: « Molecular Evolution and Phylogeny ». Fredj Tekaia - Ahmed Rebai
Day 10: Saturday - April 2
09H - 12H: Conference: « Introduction to Structural Bioinformatics ». Michael Nilges - Institut Pasteur Paris
14 - 16H: Course evaluation and end.
Acknowledgements:
• The Institut Pasteur Paris (http://www.pasteur.fr): Direction des Affaires Internationales (Mme Michèle Boccoz, Dr Jean-Luc Guesdon and Mme Leclerc).
• ICGEB (http://www.icgeb.trieste.it/): Decio Ripandelli, Director, Administration and External Relations.
• EMBO (http://www.embo.org/projects/world/): Mary Gannon, Programme Manager; Kathy Oswald, EMBO Courses and Workshops.
• IUBMB (International Union of Biochemistry and Molecular Biology) (http://www.iubmb.unibe.ch/): Prof. Mary Osborn, President IUBMB, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany; Prof. Jan Joep de Pont, Treasurer, Radboud University Nijmegen, The Netherlands.
• ICRO (International Cell Research Organization) (http://www.unesco.org/ngo/icro/): Prof. Georges Cohen, Institut Pasteur Paris; Mme Claudine SCULLINO.
• Marie-Paule LEFRANC (lefranc@ligm.igh.cnrs.fr), Université Montpellier II, IMGT, the international ImMunoGeneTics information system® http://imgt.cines.fr
• Giorgio Bernardi (bernardi@szn.it), Stazione Zoologica Anton Dohrn; Villa Comunale, 8012, Napoli; Italy
• Sonia Abdelhak (sonia.abdelhak@pasteur.rns.tn), Institut Pasteur Tunis
• Ahmed Rebai (ahmed.rebai@cbs.rnrt.tn), Centre de Biotechnologie Sfax
• Georges Cohen (gncohen@pasteur.fr)
• Bernard Dujon (bdujon@pasteur.fr)
• Odile Kalogeropoulos (odozier@pasteur.fr)
• Philippe Glaser (glaser@pasteur.fr)
• Michael Nilges (nilges@pasteur.fr)
• Edouard Yeramian (yeramian@pasteur.fr)
• Fredj Tekaia (tekaia@pasteur.fr)
Institut Pasteur Paris
• Gustavo Goldman (ggoldman@usp.br), Universidade de São Paulo, Ribeirão Preto, Brazil
• Hamed Ben Dhia (President of the University)
• Ali Gargouri (discussion and comments along the talks)