
MLST

MLST. Multilocus Sequence Typing. HMC 2012. What is MLST? A powerful population genetics technique: DNA sequences of internal fragments of multiple genes; identifies allelic variants to characterize, subtype, and classify members of bacterial populations. Population Genetics.

aaron




Presentation Transcript


  1. MLST Multilocus Sequence Typing HMC 2012

  2. what is MLST? • powerful population genetics technique • DNA sequences of internal fragments of multiple genes • identifies allelic variants to characterize, subtype, and classify members of bacterial populations
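The typing idea on this slide can be sketched in a few lines. This is a minimal illustration, not the presenters' pipeline: it assumes the conventional MLST scheme in which every distinct sequence at a locus gets its own allele number (exact matching), and the function name `assign_alleles`, the isolate names, and the toy sequences are all hypothetical.

```python
from typing import Dict, List, Tuple


def assign_alleles(
    seqs_by_isolate: Dict[str, Dict[str, str]], loci: List[str]
) -> Dict[str, Tuple[Tuple[int, ...], int]]:
    """Number distinct sequences per locus, then number distinct profiles.

    Assumption: alleles are defined by exact sequence identity.
    Returns isolate -> (allele profile, sequence type number).
    """
    allele_ids: Dict[str, Dict[str, int]] = {locus: {} for locus in loci}
    st_ids: Dict[Tuple[int, ...], int] = {}
    result: Dict[str, Tuple[Tuple[int, ...], int]] = {}
    for isolate, seqs in seqs_by_isolate.items():
        profile = []
        for locus in loci:
            table = allele_ids[locus]
            seq = seqs[locus].upper()
            if seq not in table:
                table[seq] = len(table) + 1  # next unused allele number
            profile.append(table[seq])
        key = tuple(profile)
        if key not in st_ids:
            st_ids[key] = len(st_ids) + 1  # next unused sequence type
        result[isolate] = (key, st_ids[key])
    return result


# Hypothetical toy data: three isolates typed at three loci.
isolates = {
    "iso1": {"gyrB": "ATGC", "mdh": "GGTA", "recA": "TTAC"},
    "iso2": {"gyrB": "ATGC", "mdh": "GGTA", "recA": "TTAC"},
    "iso3": {"gyrB": "ATGA", "mdh": "GGTA", "recA": "TTAC"},
}
typed = assign_alleles(isolates, ["gyrB", "mdh", "recA"])
for name, (profile, st) in typed.items():
    print(name, profile, "ST", st)
```

Here iso1 and iso2 share a profile and therefore a sequence type, while iso3 differs at gyrB and gets a new one.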

  3. Population Genetics • Evolutionary Process: mutation, recombination, natural selection, genetic drift, migration • Evolutionary Pattern: Diversity (form & function, morphology, physiology, phenotype/genotype, allelic diversity) • MLST data

  4. general protocol 1) sampling and isolation of marine Vibrio on TCBS agar plates 2) restreak and pick single colonies --> liquid cultures

  5. general protocol 3) extract DNA, perform PCR to amplify genes 4) Sanger sequencing (forward and reverse primer for each gene) 5) alignment of reads, quality check, computational analysis. Workflow: DNA Extraction, Gene Amplification, Sequencing, Phylogenetic Analysis

  6. metadata local site (Hopkins and Point Lobos)

  7. metadata • global site (California and New Zealand) 2 sites 2 sites

  8. metadata time (2007 - 2012) source (anemone and water)

  9. Gyrase • Two subunits encoded by gyrA and gyrB • Topoisomerase Type II: • ATP-dependent initiation of the double-strand breakage of DNA • Introduces negative supercoils in DNA • Only present in bacteria: target for antibiotics

  10. Malate dehydrogenase • NAD+-dependent oxidation of malate to oxaloacetate • Involved in several metabolic pathways (e.g. TCA cycle) • Homodimeric soluble 30-35 kDa protein

  11. RecA • Protein responsible for maintenance and repair of DNA • DNA-dependent ATPase • Several functions all related to DNA repair

  12. ompK • 26 kDa outer membrane protein • transmembrane beta-barrel protein involved in ion transport • receptor for KVP40, a broad-host-range vibriophage isolated from sea water

  13. Main Questions • Is there an observable population structure? If so, how does this relate to the ecology of the environment? (Null) Hypothesis: No structure/pattern. “Everything is everywhere.” • How is evolution occurring in the environment? • Rate: Are the genes evolving at a similar rate? Hypothesis: Housekeeping genes are evolving more slowly than ompK. • Process: How much recombination vs. mutation is observed in populations? Hypothesis: Significant amounts of both recombination and mutation at a fixed ratio within the population.

  14. Data Overview

  15. Rarefaction Analysis [Plot: number of alleles (98% similarity) vs. number of sequences]
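A rarefaction curve of this kind can be approximated by repeated random subsampling. The sketch below is only an illustration under stated assumptions: each sequence has already been assigned an allele label, the curve is estimated by sampling without replacement, and the function name `rarefaction_curve` and the toy labels are hypothetical. The slides do not say which software produced the original plot.

```python
import random
from typing import List, Sequence


def rarefaction_curve(
    allele_labels: Sequence[str],
    depths: Sequence[int],
    n_reps: int = 200,
    seed: int = 0,
) -> List[float]:
    """Mean number of distinct alleles seen when subsampling to each depth."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    labels = list(allele_labels)
    curve = []
    for depth in depths:
        if depth > len(labels):
            raise ValueError("depth exceeds the number of sequences")
        total = 0
        for _ in range(n_reps):
            # Sample sequences without replacement, count distinct alleles.
            total += len(set(rng.sample(labels, depth)))
        curve.append(total / n_reps)
    return curve


# Hypothetical toy data: 10 sequences falling into 4 alleles.
labels = ["a1"] * 5 + ["a2"] * 3 + ["a3", "a4"]
curve = rarefaction_curve(labels, depths=[1, 5, 10])
print(curve)
```

A curve that flattens as the number of sequences grows suggests most alleles have been sampled; a curve that keeps climbing suggests more sampling would uncover more alleles.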

  16. Number of shared alleles (‘allele’ defined at 98% sequence similarity) [Figure: one panel each for gyrB, mdh, and recA, comparing New Zealand, 2007-2011, and 2012 isolates]
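One way to arrive at counts of shared alleles under a 98% similarity definition is sketched below. It is an assumption-laden illustration rather than the analysis behind the slide: sequences are taken to be pre-aligned and of equal length, clustering is greedy (so the result can depend on input order), and the names `identity`, `cluster_alleles`, `allele_sharing` and the toy sequences are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, FrozenSet, List


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster_alleles(seqs: Dict[str, str], threshold: float = 0.98) -> Dict[str, int]:
    """Greedy clustering: join the first representative within the threshold."""
    representatives: List[str] = []
    assignment: Dict[str, int] = {}
    for name, seq in seqs.items():
        for idx, rep in enumerate(representatives):
            if identity(seq, rep) >= threshold:
                assignment[name] = idx
                break
        else:
            representatives.append(seq)  # seq founds a new allele
            assignment[name] = len(representatives) - 1
    return assignment


def allele_sharing(
    assignment: Dict[str, int], group_of: Dict[str, str]
) -> Dict[FrozenSet[str], int]:
    """Count alleles by the exact set of groups in which they occur."""
    groups_per_allele = defaultdict(set)
    for name, allele in assignment.items():
        groups_per_allele[allele].add(group_of[name])
    return dict(Counter(frozenset(g) for g in groups_per_allele.values()))


# Hypothetical 100 bp toy sequences: 'near' is 99% identical to 'base',
# 'far' is 96% identical to 'base'.
base = "ACGT" * 25
near = "T" + base[1:]
far = "TTTTT" + base[5:]
seqs = {"nz1": base, "ca2011": near, "ca2012": far}
groups = {"nz1": "New Zealand", "ca2011": "2007-2011", "ca2012": "2012"}

assignment = cluster_alleles(seqs)
sharing = allele_sharing(assignment, groups)
print(assignment)
print(sharing)
```

Each key of `sharing` corresponds to one region of a set-overlap diagram, for example alleles found in both New Zealand and 2007-2011 isolates but not in 2012.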

  17. Main Questions • Is there an observable population structure? If so, how does this relate to the ecology of the environment? (Null) Hypothesis: No structure/pattern. “Everything is everywhere.”

  18. Concatenated housekeeping genes

  19. A homogeneous and phylogenetically distant clade [Tree legend: Hopkins, Point Lobos; Water, Anemone; Years] These are different from the rest of the isolates with 100% bootstrap support. Unfortunately, we didn’t isolate any this year.

  20. A ubiquitous, low diversity clade 66 60 82

  21. Recent diversification? 86

  22. population structure, but no clear sorting by geography or time

  23. Individual gene trees gyrB recA mdh

  24. Derived clades gyrB

  25. Derived clades recA

  26. Derived clades mdh

  27. Opportunity to look for recombination • Divergent • Low diversity • 100% bootstrap support • We might expect them to share an evolutionary history

  28. MultiDimensional Scaling Finds the best way to visualize distance information on a 2 or 3 dimensional plot. In this case we used the maximum likelihood estimates for evolutionary distances between isolates. On the plot, evolutionarily distant isolates are far apart while closely related isolates are close together.
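For readers who want to see what MDS does numerically, here is a minimal sketch of classical (Torgerson) MDS in plain Python. It is an illustration only: the slides do not say which MDS variant or software was used, the function names are hypothetical, and the toy distances come from four made-up points rather than from maximum likelihood estimates. It assumes a symmetric distance matrix that is close to Euclidean; strongly non-Euclidean distances would need a more careful eigen-solver.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def double_center(dist: Matrix) -> Matrix:
    """Turn squared distances into an inner-product matrix B = -1/2 J D^2 J."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]  # equals column means (symmetric)
    grand_mean = sum(row_mean) / n
    return [
        [-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean) for j in range(n)]
        for i in range(n)
    ]


def top_eigenpair(mat: Matrix, iters: int = 500) -> Tuple[float, List[float]]:
    """Dominant eigenvalue and unit eigenvector by power iteration."""
    n = len(mat)
    vec = [float(i + 1) for i in range(n)]  # deterministic, non-constant start
    for _ in range(iters):
        nxt = [sum(mat[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            return 0.0, [0.0] * n
        vec = [x / norm for x in nxt]
    mv = [sum(mat[i][j] * vec[j] for j in range(n)) for i in range(n)]
    return sum(v * m for v, m in zip(vec, mv)), vec


def classical_mds(dist: Matrix, dims: int = 2) -> List[List[float]]:
    """Embed points so that plotted distances approximate the input distances."""
    b = double_center(dist)
    n = len(b)
    coords: List[List[float]] = [[] for _ in range(n)]
    for _ in range(dims):
        value, vec = top_eigenpair(b)
        scale = math.sqrt(max(value, 0.0))  # guard against tiny negative values
        for i in range(n):
            coords[i].append(scale * vec[i])
        for i in range(n):  # deflate: remove the component just extracted
            for j in range(n):
                b[i][j] -= value * vec[i] * vec[j]
    return coords


# Hypothetical toy input: distances among four points on a 4 x 1 rectangle.
points = [(0.0, 0.0), (4.0, 0.0), (0.0, 1.0), (4.0, 1.0)]
dist = [[math.dist(p, q) for q in points] for p in points]
embedding = classical_mds(dist, dims=2)
for row in embedding:
    print([round(x, 3) for x in row])
```

The recovered coordinates may be rotated or reflected relative to the original points, but the pairwise distances on the plot match the input distances, which is the property the slide describes.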

  29. The difference between multidimensional scaling and trees • A is equally related to B, C, and D. [Figure: tree of A, B, C, D beside an MDS plot with axes MDS 1 and MDS 2]

  30. The difference between multidimensional scaling and trees • A is equally related to B, C, and D. • On the tree, A is only close to B [Figure: tree of A, B, C, D beside an MDS plot with axes MDS 1 and MDS 2]

  31. The difference between multidimensional scaling and trees • A is equally related to B, C, and D. • On the tree, A is only close to B • This can make it hard to visualize categorical clusters [Figure: tree of A, B, C, D beside an MDS plot with axes MDS 1 and MDS 2]

  32. The difference between multidimensional scaling and trees • A is equally related to B, C, and D. • On the tree, A is only close to B • This can make it hard to visualize categorical clusters • The MDS plot solves this problem by putting closely related species close together on the page [Figure: tree of A, B, C, D beside an MDS plot with axes MDS 1 and MDS 2]

  33. MDS - multidimensional scaling Site Hopkins Pt. Lobos

  34. MDS - multidimensional scaling Source Water Anemone

  35. MDS - multidimensional scaling Year 2007 2008 2009 2010 2011 2012

  36. ompK phylogenetic tree

  37. Protein similarity network: ompK • Pairwise BLAST of all sequences • Similarity network not based on alignment • Analysis with Cytoscape software • Used to represent similarity network between protein families

  38. Protein similarity network: no identity cut off

  39. Protein similarity network: 90% identity cut off

  40. Protein similarity network: 99% identity cut off
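The effect of raising the identity cut-off can be shown with a small graph sketch. This is a hedged illustration, not the Cytoscape workflow from the slides: it assumes pairwise percent identities are already available (for example from BLAST output), and the function name `network_components`, the node names, and the identity values are all hypothetical.

```python
from typing import Dict, List, Sequence, Set, Tuple


def network_components(
    nodes: Sequence[str],
    identities: Dict[Tuple[str, str], float],
    cutoff: float,
) -> List[Set[str]]:
    """Connected components after keeping only edges with identity >= cutoff."""
    neighbours: Dict[str, Set[str]] = {node: set() for node in nodes}
    for (a, b), pct in identities.items():
        if a != b and pct >= cutoff:
            neighbours[a].add(b)
            neighbours[b].add(a)
    seen: Set[str] = set()
    components: List[Set[str]] = []
    for node in nodes:
        if node in seen:
            continue
        stack, component = [node], set()
        while stack:  # depth-first walk over the retained edges
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbours[current] - component)
        seen |= component
        components.append(component)
    return components


# Hypothetical pairwise percent identities among four protein sequences.
nodes = ["p1", "p2", "p3", "p4"]
identities = {
    ("p1", "p2"): 99.5,
    ("p1", "p3"): 92.0,
    ("p2", "p3"): 93.0,
    ("p3", "p4"): 70.0,
}
for cutoff in (0.0, 90.0, 99.0):
    print(cutoff, network_components(nodes, identities, cutoff))
```

With no cut-off every sequence falls in one cluster; at 90% the most divergent sequence separates; at 99% only near-identical sequences remain connected.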

  41. Structure • Analyzes population structure based on distances at multiple loci for each genotype provided. [Panels: K = 6; K = 4; K = 4, sorted by year (2007, 2008, 2009, 2010, 2011, 2012)]

  42. Structure [Panels: K = 4; K = 4, sorted by location (Hopkins, Point Lobos)]

  43. Main Questions • Is there an observable population structure? If so, how does this relate to the ecology of the environment? (Null) Hypothesis: No structure/pattern. “Everything is everywhere.” Finding: There is population structure. Difficult to tell what factors determine it.

  44. Main Questions • Is there an observable population structure? If so, how does this relate to the ecology of the environment? (Null) Hypothesis: No structure/pattern. “Everything is everywhere.” • How is evolution occurring in the environment? • Rate: Are the genes evolving at a similar rate? Hypothesis: Housekeeping genes are evolving more slowly than ompK. • Process: How much recombination vs. mutation is observed in populations? Hypothesis: Significant amounts of both recombination and mutation at a fixed ratio within the population.

  45. Do genes evolve at similar rates? [Figure: pairwise comparisons among recA, mdh, and gyrB]

  46. What about the outliers?

  47. But ompK is different… ompK gyrB

  48. Positive vs. Negative Selection within Genes: dN/dS recA gyrB
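As a rough illustration of what a dN/dS comparison measures, the sketch below computes an uncorrected pN/pS ratio for two codon-aligned sequences. It is a deliberate simplification and not the method used for the slides, which is not stated: sites are counted in the style of Nei and Gojobori, codons differing at more than one position are skipped, changes to stop codons are counted as nonsynonymous, and no multiple-hit correction is applied. The function names and toy sequences are hypothetical.

```python
from itertools import product
from typing import Dict, Optional

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_sites(codon: str) -> float:
    """Expected number of synonymous sites in one codon (0 to 3)."""
    syn = 0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                syn += 1
    return syn / 3.0


def count_substitutions(seq1: str, seq2: str) -> Dict[str, float]:
    """Synonymous/nonsynonymous sites and differences for an aligned pair."""
    if len(seq1) != len(seq2) or len(seq1) % 3:
        raise ValueError("sequences must be codon-aligned and of equal length")
    out = {"syn_sites": 0.0, "nonsyn_sites": 0.0, "syn_diffs": 0.0, "nonsyn_diffs": 0.0}
    for i in range(0, len(seq1), 3):
        c1, c2 = seq1[i:i + 3].upper(), seq2[i:i + 3].upper()
        if c1 not in CODON_TABLE or c2 not in CODON_TABLE:
            continue  # skip gaps and ambiguous bases
        s = (synonymous_sites(c1) + synonymous_sites(c2)) / 2
        out["syn_sites"] += s
        out["nonsyn_sites"] += 3 - s
        if sum(a != b for a, b in zip(c1, c2)) == 1:
            same_aa = CODON_TABLE[c1] == CODON_TABLE[c2]
            out["syn_diffs" if same_aa else "nonsyn_diffs"] += 1
        # Simplification: codons with 2 or 3 differences are not counted.
    return out


def pn_ps(counts: Dict[str, float]) -> Optional[float]:
    """Uncorrected pN/pS; None when it is undefined."""
    if counts["syn_sites"] == 0 or counts["nonsyn_sites"] == 0 or counts["syn_diffs"] == 0:
        return None
    p_s = counts["syn_diffs"] / counts["syn_sites"]
    p_n = counts["nonsyn_diffs"] / counts["nonsyn_sites"]
    return p_n / p_s


# Hypothetical toy pair: AAA->AAG is synonymous (Lys), GAT->GAA is not (Asp->Glu).
counts = count_substitutions("AAACTGGAT", "AAGCTGGAA")
ratio = pn_ps(counts)
print(counts, ratio)
```

A ratio well below 1 is usually read as purifying (negative) selection and a ratio above 1 as positive selection, which is the contrast this slide draws between genes.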
