
Statistical Bioinformatics


Presentation Transcript


  1. Statistical Bioinformatics • Genomics • Transcriptomics • Proteomics • Systems Biology

  2. Statistical Bioinformatics • Genomics • Transcriptomics • Proteomics • Systems Biology

  3. Multiple Sequence Alignment (MSA)

  4. Multiple Sequence Alignments (MSA):

  5. Some past forces shaping MSAs • Divergence of sequences by speciation and nucleotide substitution (Phylogenetics). • Horizontal gene transfer (recombination), especially in bacteria and viruses.

  6. TOPALi v.1 Recombination detection • Frank Wright, Iain Milne & Dirk Husmeier

  7. TOPALi applied to Roseburia and Eubacterium sequences

  8. Some past forces shaping MSAs • Divergence of sequences by speciation and nucleotide substitution (Phylogenetics). • Horizontal gene transfer (recombination), especially in bacteria and viruses. • Selective pressure acting on functional domains.

  9. TOPALi v2 Future plans • Detect genomic regions under selective pressure (functional domains in proteins). • Methodology development: combined prediction of breakpoints due to recombination and evolutionary rate change. • Improved phylogenetic analysis • Investigate use of UK GRID computational resources for faster analyses

  10. Statistical Bioinformatics • Genomics • Transcriptomics • Proteomics • Systems Biology

  11. Genes differentially expressed between two conditions • Affymetrix microarray mouse liver experiment • Low-fat diet vs. high-fat diet (6 per group) • Plot of log-fold change vs. average log intensity. • Points far away from the horizontal line seem “differentially expressed”. • Which are significant?
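The plot described on this slide can be sketched in a few lines. This is a minimal illustration, not the analysis used in the presentation: the function name `ma_values`, the argument names, and the toy intensities are all assumptions, and real Affymetrix data would first be background-corrected and normalised. For each gene it returns M (the difference of mean log2 intensities between the two diets) and A (the average of the two means), which are the vertical and horizontal coordinates of the plot.

```python
import math
from statistics import mean


def ma_values(low_fat, high_fat):
    """Return per-gene (M, A) lists from raw intensities.

    low_fat, high_fat: one inner list of positive intensities per gene,
    one value per array (hypothetical input layout).
    M = mean log2 intensity (high fat) - mean log2 intensity (low fat)
    A = average of the two mean log2 intensities
    """
    m_values, a_values = [], []
    for gene_low, gene_high in zip(low_fat, high_fat):
        log_low = mean(math.log2(x) for x in gene_low)
        log_high = mean(math.log2(x) for x in gene_high)
        m_values.append(log_high - log_low)
        a_values.append(0.5 * (log_high + log_low))
    return m_values, a_values


# Toy data: gene 0 is four-fold higher on the high-fat diet, gene 1 is unchanged.
m, a = ma_values([[4, 4], [8, 8]], [[16, 16], [8, 8]])
print(m, a)
```

Genes with M near zero lie on the horizontal line; the distance from that line alone does not say whether a change is significant, which is the question the slide raises.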

  12. Statistical Methods (SAM, Limma,…) help to detect significant genes • BUT: Many methods assume that the variances in both groups are the same • If this is not the case: • Algorithms might give wrong answers • The definition of “differential expression” becomes more difficult
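To make the equal-variance point concrete, the sketch below compares the classical pooled-variance t statistic with Welch's statistic, which does not assume equal variances. It is only an illustration of the assumption: SAM and Limma use moderated statistics that are not reproduced here, and the function names and toy numbers are invented for this example.

```python
from statistics import mean, variance


def pooled_t(x, y):
    """Two-sample t statistic assuming equal variances in both groups."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / (pooled_var * (1 / nx + 1 / ny)) ** 0.5


def welch_t(x, y):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    nx, ny = len(x), len(y)
    vx, vy = variance(x) / nx, variance(y) / ny  # squared standard errors
    t = (mean(x) - mean(y)) / (vx + vy) ** 0.5
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, df


# Toy log-expression values for one gene: the second group is far more variable.
group_a = [1, 2, 3]
group_b = [2, 4, 6, 8, 10]
print(pooled_t(group_a, group_b))
print(welch_t(group_a, group_b))
```

With unequal variances and unequal group sizes the two statistics differ, and Welch's degrees of freedom fall below the pooled value of n1 + n2 - 2, so the two tests can disagree about which genes are significant.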

  13. Claus Mayer (BioSS) • More complex statistical tests for detecting differential gene expression. • Situations where standard assumptions are violated. • Allows for different variance-covariance structures in both populations.

  14. Statistical Bioinformatics • Genomics • Transcriptomics • Proteomics • Systems Biology

  15. Proteomics: 2-D Gels • How to compare gels 1 and 2? (Figure: gel 1, gel 2)

  16. Chris Glasbey: Nonlinear Warping • John Gustafsson, Chalmers University, Sweden • WARP

  17. 2-D Gel Comparison Two gels superimposed (in different colours)

  18. Proteomics: 2-D Gel Interpretation • Graham Horgan • Identify spots which differ between treatments (differentially expressed proteins), using variance and covariance information from other spots. • Assessment of associations between spot densities and physiological variables.

  19. Statistical Bioinformatics • Genomics • Transcriptomics • Proteomics • Systems Biology

  20. Detect active pathways in a “known” network • Network of protein-protein and protein-DNA interactions “known” from the literature • Gene expression profiling for different conditions • Bacterial strains: promoting - preventing inflammation • Mice on a low-fat vs. high-fat diet • Can we identify different pathways associated with these conditions? • We need a robust method • Expression data: noisy, missing values • Post-translational modifications

  21. Cytokine Network • Collaboration with SCGTI • Interferon Pathway • Cytokines • Pivotal role in modulating the innate and adaptive mammalian immune system • Network of protein-protein and protein-DNA interactions from the literature • Two gene expression time series from bone marrow-derived macrophages in mice • Infected with cytomegalovirus • Infected and treated with IFN-gamma

  22. Reverse Engineering of Regulatory Networks • Can we learn the network structure from postgenomic data themselves? • Statistical methods to distinguish between • Direct correlations • Indirect correlations • Challenge: Distinguish between • Correlations • Causal interactions • Breaking symmetries with active interventions: • Gene knockouts (VIGS, RNAi)
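One standard way to separate direct from indirect correlations is the partial correlation, sketched below for three variables. The slides do not say which method the authors use, so this is an assumed illustration only; the helper names and the toy data are hypothetical. In the toy data, X and Y are both driven by Z and have no direct link, so they correlate, but the correlation vanishes once Z is conditioned on.

```python
from statistics import mean


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5


def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / ((1 - r_xz ** 2) * (1 - r_yz ** 2)) ** 0.5


# Toy data: x = z + e1 and y = z + e2, with e1, e2 and z mutually orthogonal.
z = [1, 1, -1, -1]
x = [2, 0, 0, -2]
y = [2, 0, -2, 0]
print(pearson(x, y))          # marginal correlation between x and y
print(partial_corr(x, y, z))  # correlation left after conditioning on z
```

Partial correlations still describe association rather than causation: they cannot say which way an edge points, which is why the slide turns to active interventions such as gene knockouts.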

  23. Evaluation: Raf signalling pathway • Cellular signalling network of 11 phosphorylated proteins and phospholipids in human immune system cells • Laboratory data from cytometry experiments • Down-sampled to 100 measurements • Sample size indicative of microarray experiments • Two types of experiments: • Passive observations • Active interventions (gene knockouts) • Literature: “gold-standard” network
