TB Data Visualization and correlations in TB Patient Networks

Presentation Transcript


  1. TB Data Visualization and correlations in TB Patient Networks

  2. Outline 1. Spoligoforests 2. Correlations in Spoligoforests 3. Patient graphs

  3. Outline 1. Spoligoforests 2. Correlations in Spoligoforests 3. Patient graphs

  4. 1. Spoligoforests: The 3-step algorithm that decides the deletion events in the spoligoforest relies on two assumptions: a) Hidden Parent Assumption: each spoligotype loses one or more contiguous spacers in a deletion event. b) Single Inheritance: each spoligotype mutates from exactly one parent spoligotype.

  5. Child node and its possible parents • The Hidden Parent Assumption assigns possible parents to a child node. Each node represents a spoligotype in a spoligoforest. • Before Single Inheritance is applied, a node can have multiple candidate parents, meaning that there are multiple possible sources of the mutation that results in the spoligotype of the child node. • Single Inheritance selects the unique, most likely source of mutation.
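
The candidate-parent step described above can be sketched in Python. This is a minimal illustration, not the presenters' implementation. It assumes each spoligotype is a string of '1'/'0' characters (spacer present/absent), that spacers are never regained, and that "contiguous" means no surviving spacer lies inside the deleted span. The function names and the 8-spacer toy patterns are hypothetical.

```python
def deleted_block(parent: str, child: str):
    """Return (start, end) of the deleted span if `child` can arise from
    `parent` by one contiguous deletion, otherwise None."""
    if len(parent) != len(child) or parent == child:
        return None
    # Assumption: a child never carries a spacer that its parent lacks.
    if any(c == "1" and p == "0" for p, c in zip(parent, child)):
        return None
    # Positions of spacers present in the parent but absent in the child.
    lost = [i for i, (p, c) in enumerate(zip(parent, child))
            if p == "1" and c == "0"]
    start, end = lost[0], lost[-1]
    # Assumption: contiguity means no surviving spacer inside the span.
    if "1" in child[start:end + 1]:
        return None
    return start, end


def candidate_parents(child: str, spoligotypes):
    """All spoligotypes from which `child` is reachable by one deletion."""
    return [p for p in spoligotypes if deleted_block(p, child) is not None]


# Toy 8-spacer patterns (real spoligotypes have 43 spacers).
types = ["11111111", "11100111", "11100011", "10100111", "01111111"]
print(candidate_parents("11100011", types))
```

In this toy set the child has two candidate parents, so a Single Inheritance rule is still needed to choose between them.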

  6. 1. Spoligoforests - MAKESPOLIGOFOREST algorithm

  7. MAKESPOLIGOFOREST algorithm (diagram labels): HPA, MiruHamming, SpolHamming, MiruL2, RandomPick
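
The slide lists only the step labels, so the following Python sketch is one plausible reading rather than the algorithm as presented. It assumes the labels act as successive tie-breakers on the candidate parents produced by the Hidden Parent Assumption: smallest Hamming distance between MIRU patterns, then smallest Hamming distance between spoligotypes, then smallest L2 distance between MIRU patterns, then a random pick. The data layout (a spoligotype string paired with a tuple of MIRU repeat counts) and all function names are hypothetical.

```python
import random


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def l2(a, b):
    """Euclidean distance between two numeric sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def keep_closest(child, candidates, distance):
    """Keep only the candidates at minimum distance from the child."""
    best = min(distance(child, p) for p in candidates)
    return [p for p in candidates if distance(child, p) == best]


def choose_parent(child, candidates, rng=random):
    """child and candidates are (spoligotype_string, miru_tuple) pairs.
    Assumed order of tie-breakers: MiruHamming, SpolHamming, MiruL2,
    RandomPick."""
    tie_breakers = [
        lambda c, p: hamming(c[1], p[1]),  # MiruHamming
        lambda c, p: hamming(c[0], p[0]),  # SpolHamming
        lambda c, p: l2(c[1], p[1]),       # MiruL2
    ]
    for distance in tie_breakers:
        candidates = keep_closest(child, candidates, distance)
        if len(candidates) == 1:
            return candidates[0]
    return rng.choice(candidates)          # RandomPick


# Toy data: 8 spacers and 3 MIRU loci instead of 43 and 12.
child = ("11100011", (2, 3, 4))
cands = [("11111111", (2, 3, 5)),
         ("11100111", (2, 3, 6)),
         ("11110011", (1, 1, 4))]
print(choose_parent(child, cands))
```

Each tie-breaker only narrows the candidate set, so the random pick is reached only when every distance-based criterion leaves a tie.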

  8. CDC DATA

  9. Lineages (figure legend): East African Indian, Indo-Oceanic, M. africanum, Euro-American, M. bovis, East Asian

  10. Genetic Diversity of TB in US

  11. Genetic Diversity of TB in NYC NYC Isolates

  12. Tanaka’s Model • Unambiguous edges (mutations, deletions): After the Hidden Parent Assumption is applied, some nodes in the spoligoforest have exactly one candidate parent, so there is no need to apply the Single Inheritance rule to them. • Tanaka et al. found that the frequency of deletion lengths among unambiguous edges follows a Zipf distribution.

  13. Tanaka’s Model: Use of Zipf distribution and Single Inheritance • After assigning edge weights to all possible deletions according to this model, Tanaka’s model picks the unique parent by choosing the deletion with the maximum weight.
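
The parent selection described above can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not the model's published parameterisation: the weight of a deletion is taken to be proportional to length^(-exponent), the deletion length is counted as the number of spacers lost, and the exponent value 1.0 is a placeholder rather than a fitted value. All names are hypothetical.

```python
def deletion_length(parent: str, child: str) -> int:
    """Number of spacers present in the parent but absent in the child."""
    return sum(p == "1" and c == "0" for p, c in zip(parent, child))


def zipf_weight(length: int, exponent: float = 1.0) -> float:
    """Unnormalised Zipf weight; the exponent here is a placeholder."""
    return length ** -exponent


def pick_parent(child: str, candidates, exponent: float = 1.0) -> str:
    """Choose the candidate whose deletion has the maximum weight.
    Candidates are assumed to differ from the child by at least one spacer."""
    return max(
        candidates,
        key=lambda p: zipf_weight(deletion_length(p, child), exponent),
    )


child = "11100011"
candidates = ["11111111", "11100111"]  # lose 3 spacers and 1 spacer
print(pick_parent(child, candidates))
```

With any positive exponent the weight decreases with deletion length, so the rule favours the candidate reachable by the shortest deletion.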

  14. Outline 1. Spoligoforests 2. Correlations in Spoligoforests 3. Patient graphs

  15. 2. Correlations in Spoligoforests: The outdegree distribution, plotted against outdegree, follows a Zipf distribution. A Zipf distribution is consistent with preferential attachment (a rich-gets-richer model). Outdegree of a spoligotype in the spoligoforest: the number of spoligotypes this spoligotype can mutate into by a deletion event.
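
A rough way to check such a claim is to tabulate the outdegree distribution and look at its slope on log-log axes, where a Zipf-like distribution appears approximately linear with a negative slope. The Python sketch below assumes the spoligoforest is available as a list of (parent, child) edges. The edge list shown is made up for illustration, and a least-squares fit on log-log axes is only a quick diagnostic, not a rigorous test of a power law.

```python
from collections import Counter
import math


def outdegree_distribution(edges):
    """Map each outdegree k >= 1 to the number of nodes with that outdegree.
    Leaves (outdegree 0) are omitted because log(0) is undefined."""
    outdegree = Counter(parent for parent, _ in edges)
    return dict(Counter(outdegree.values()))


def loglog_slope(distribution):
    """Least-squares slope of log(count) against log(outdegree).
    Requires at least two distinct outdegree values."""
    xs = [math.log(k) for k in distribution]
    ys = [math.log(v) for v in distribution.values()]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Made-up edges: A has three children, B and C have one each.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "E"), ("C", "F")]
dist = outdegree_distribution(edges)
print(dist, loglog_slope(dist))
```

The same tabulation applies to deletion lengths by counting edge lengths instead of parent occurrences.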

  16. Outdegree distribution vs. Outdegree

  17. Outdegree distribution vs. Outdegree by major lineages

  18. 2. Correlations in Spoligoforests: The frequency of deletion lengths, plotted against deletion length, follows a Zipf distribution. A Zipf distribution is consistent with preferential attachment (a rich-gets-richer model). We take all edges in the spoligoforest into account, unlike the unambiguous-edges-only approach in Tanaka’s model.

  19. Deletion length frequency vs. deletion length

  20. Outline 1. Spoligoforests 2. Correlations in Spoligoforests 3. Patient graphs

  21. Patient Graphs – NYC Data: 4984 patients; 137 countries; 793 spoligotypes; 2648 RFLPs; 3235 distinct genotypes; 594 “named” clusters

  22. Patient Graphs – Questions: Is there a patient-pathogen trend that TB transmission follows? Is the demographic distribution of patients infected by bacteria of the same genotype uneven? How can we fit a TB transmission and mutation model, given that the environment, such as the location on the world map, affects the transmission of TB?

  23. M. bovis

  24. M. africanum

  25. East Asian

  26. East-African Indian

  27. Euro American

  28. Indo Oceanic

  29. Named clusters of interest: Cluster 3; Spoligotype: S00030; RFLP: C(3); 166 patients; Euro-American

  30. Named clusters of interest: Cluster 33; Spoligotype: S00034; RFLP: W(18); 21 patients; East Asian, W-Beijing

  31. Named clusters of interest: Cluster 4; Spoligotype: S00009; RFLP: H(2); 99 patients; Euro-American

  32. Named clusters of interest: Cluster 29; Spoligotype: S00034; RFLP: N3(13); 21 patients; East Asian

  33. Questions: Does a high transmission rate in an area increase the likelihood of mutation? How do MIRUs mutate? Is there a pattern of deletion events, or an assumption analogous to the Hidden Parent Assumption, for 12-bit MIRU? Can we map the patterns of mutation events in SNPs of MIRU to 12-bit MIRU?