
Probing the systems biology of Mycobacterium tuberculosis through gene expression and genomic data

Luke Alden Yancy, Jr. Mentor: Robert Riley. Broad Institute of MIT & Harvard, Cambridge, MA.



Presentation Transcript


  1. Probing the systems biology of Mycobacterium tuberculosis through gene expression and genomic data. Luke Alden Yancy, Jr. Mentor: Robert Riley. Broad Institute of MIT & Harvard, Cambridge, MA

  2. What is Tuberculosis? Source: http://staff.vbi.vt.edu/pathport/pathinfo_images/Mycobacterium_tuberculosis/AerosolTransmission.jpg

  3. The Problem. Figure: TB mortality, all forms (per 100 000 population per year), by country, total, 2006. Source: WHO Stop TB Department, website: www.who.int/tb

  4. Why this study?
     • Biclustering
       • Bimax (Prelic et al. 2006)
       • CC (Cheng and Church, 2000)
       • Plaid Model (Turner et al. 2003)
       • Spectral (Kluger et al. 2003)
       • Xmotifs (Murali and Kasif, 2003)
     • Traditional Clustering
       • K-Means (MacQueen, 1967)
       • Hierarchical (Eisen et al. 1998)
     • Learn more about Mycobacterium tuberculosis (Mtb) using analysis of gene expression data

  5. What are clustering and biclustering?

  6. Biclustering vs. Standard Clustering Source: Machine Learning and Its Applications to Biology, Tarca et al. 2007. (Editor: Fran Lewitter, Whitehead Institute)

  7. What did we do? Workflow diagram: Boshoff data (processed: 3924 genes, 359 experiments), clustered with Bimax and K-Means, yielding clusters of genes. Source: The Transcriptional Responses of Mycobacterium tuberculosis to Inhibitors of Metabolism. (Boshoff et al. 2004)

  8. Benchmarking Biclusters Using Operons. Figure: proS loci of Mtb. Significance of the overlap k is estimated using the hypergeometric distribution. Diagram labels: Operon, Cluster, Gene Pair, with counts (N), (m), (k), (n). (Source: http://www.nature.com/nature/journal/v409/n6823/full/4091007a0.html)

  9. Algorithm Performance. Figure labels: Bimax Biclustering; Operon Overlap. Source: Prolinks: a database of protein functional linkages derived from coevolution (Bowers et al. 2005)

  10. Problems with Biclustering
     • Random step – lacks reproducibility
     • No biological soundness
     • Artificial arrangement of data
     • Large data sets produce statistically significant, but small clusters
     • Practicality
     • Implementation
     • Large input data sets

  11. Conclusions & Next Steps
     • K-Means clustering performs better than biclustering on our data set
     • Next, use motif recognition methods to identify regulatory motifs in clusters
     • Further development of improved biclustering algorithms

  12. Acknowledgments
     • Project Team: Robert Riley (Mentor), Brian Weiner
     • The Broad Institute: Eric Lander, Core Members, SRPG Program Members
     • Summer Research Program in Genomics (SRPG)
     • Shawna Young
     • Bruce Birren
     • Lucia Vielma
     • Maura Silverstein
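
The operon benchmark on slide 8 names a hypergeometric test, but the formula itself did not survive in this transcript. The sketch below shows one common way such a test is set up; it is not necessarily the authors' exact procedure. It assumes that N is the number of all possible gene pairs, m the number of gene pairs sharing an operon, n the number of gene pairs sharing a cluster, and k the number of pairs in both sets. The function names and the toy gene identifiers are hypothetical.

```python
from itertools import combinations
from math import comb


def gene_pairs(groups):
    """Return the set of unordered gene pairs that share a group."""
    pairs = set()
    for genes in groups:
        pairs.update(frozenset(p) for p in combinations(sorted(set(genes)), 2))
    return pairs


def hypergeom_tail(N, m, n, k):
    """P(X >= k): chance of at least k operon pairs among n pairs drawn
    without replacement from N pairs, m of which are operon pairs."""
    upper = min(m, n)
    hits = sum(comb(m, i) * comb(N - m, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def operon_overlap_pvalue(all_genes, operons, clusters):
    """Count shared gene pairs (k) and return (k, p-value).

    Assumption: N is taken as every unordered pair of genes in all_genes.
    """
    N = comb(len(all_genes), 2)
    operon_pairs = gene_pairs(operons)
    cluster_pairs = gene_pairs(clusters)
    k = len(operon_pairs & cluster_pairs)
    return k, hypergeom_tail(N, len(operon_pairs), len(cluster_pairs), k)


# Toy data with made-up gene names, for illustration only.
toy_genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
toy_operons = [["g1", "g2", "g3"], ["g4", "g5"]]
toy_clusters = [["g1", "g2"], ["g4", "g5", "g6"]]
k, p = operon_overlap_pvalue(toy_genes, toy_operons, toy_clusters)
print(f"shared pairs k = {k}, p-value = {p:.4f}")
```

A small p-value would suggest that the clusters recover operon gene pairs more often than expected by chance. Working on gene pairs rather than whole groups lets clusters and operons of different sizes be compared on the same footing.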
