MLST Multilocus Sequence Typing HMC 2012
what is MLST? A powerful population genetics technique that uses DNA sequences of internal fragments of multiple genes to identify allelic variants, in order to characterize, subtype, and classify members of bacterial populations
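The core bookkeeping of MLST can be sketched in a few lines: every distinct sequence at a locus gets an allele number, and every distinct combination of allele numbers (the allelic profile) gets a sequence type. This is a minimal sketch, not the pipeline used in this project. The function name, isolate names, and toy sequences are hypothetical, and alleles are defined here by exact sequence match rather than a similarity threshold.

```python
def assign_sequence_types(isolates, loci):
    """Number alleles per locus and sequence types per allelic profile.

    isolates: {isolate_name: {locus: sequence}} (hypothetical input layout).
    Alleles are defined by exact sequence identity in this sketch.
    """
    allele_ids = {locus: {} for locus in loci}
    profiles = {}
    for name, seqs in isolates.items():
        profile = []
        for locus in loci:
            known = allele_ids[locus]
            # A sequence seen for the first time gets the next free number.
            profile.append(known.setdefault(seqs[locus], len(known) + 1))
        profiles[name] = tuple(profile)

    st_ids = {}
    sequence_types = {
        name: st_ids.setdefault(profile, len(st_ids) + 1)
        for name, profile in profiles.items()
    }
    return profiles, sequence_types


# Toy data, invented for illustration only.
toy_isolates = {
    "iso1": {"gyrB": "ACGT", "recA": "GGCC"},
    "iso2": {"gyrB": "ACGT", "recA": "GGCA"},
    "iso3": {"gyrB": "ACGT", "recA": "GGCC"},
}
toy_profiles, toy_sts = assign_sequence_types(toy_isolates, ["gyrB", "recA"])
```

Isolates that share a sequence type are indistinguishable at the loci sequenced. Adding loci or relaxing the allele definition changes the resolution.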
Population Genetics • Evolutionary Process: mutation, recombination, natural selection, genetic drift, migration • Evolutionary Pattern: diversity of form & function (morphology, physiology), phenotype/genotype, allelic diversity (MLST data)
general protocol 1) sampling and isolation of marine Vibrio on TCBS agar plates 2) restreak and pick single colonies --> liquid cultures
general protocol 3) extract DNA, perform PCR to amplify genes 4) Sanger sequencing (forward and reverse primer for each gene) 5) alignment of reads, quality check, computational analysis [Figure: DNA Extraction, Gene Amplification, Sequencing, Phylogenetic Analysis]
metadata local site (Hopkins and Point Lobos)
metadata global site (California and New Zealand; 2 sites each)
metadata time (2007 - 2012) source (anemone and water)
Gyrase • Two subunits, encoded by gyrA and gyrB • Type II topoisomerase: ATP-dependent double-strand breakage of DNA; introduces negative supercoils in DNA • Present in bacteria but absent from humans: target for antibiotics
Malate dehydrogenase • NAD+-dependent oxidation of malate to oxaloacetate • Involved in several metabolic pathways (e.g. TCA cycle) • Homodimeric soluble 30-35 kDa protein
RecA • Protein responsible for maintenance and repair of DNA • DNA-dependent ATPase • Several functions, all related to DNA repair
ompK • 26 kDa outer membrane protein • Transmembrane beta-barrel protein involved in ion transport • Receptor for KVP40, a broad-host-range vibriophage isolated from seawater
Main Questions • Is there an observable population structure? If so, how does this relate to the ecology of the environment? (Null) Hypothesis: No structure/pattern. “Everything is everywhere.” • How is evolution occurring in the environment? • Rate: Are the genes evolving at a similar rate? Hypothesis: Housekeeping genes are evolving more slowly than ompK. • Process: How much recombination vs. mutation is observed in populations? Hypothesis: Significant amounts of both recombination and mutation at a fixed ratio within the population.
Rarefaction Analysis [Figure: rarefaction curves; x-axis: number of sequences; y-axis: number of alleles (98% similarity)]
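A rarefaction curve asks how many distinct alleles we would expect to see if only n of the sequences had been sampled. The sketch below uses the standard analytic expectation, E[S_n] = Σ_i (1 − C(N − N_i, n) / C(N, n)), where N_i is the number of sequences assigned to allele i and N is the total. It assumes allele labels have already been assigned (for example at the 98% similarity cutoff). The function name and toy labels are hypothetical.

```python
from collections import Counter
from math import comb


def rarefaction_curve(allele_labels):
    """Expected number of distinct alleles for subsamples of size 1..N."""
    counts = list(Counter(allele_labels).values())
    total = sum(counts)
    curve = []
    for n in range(1, total + 1):
        # Probability that allele i is missed entirely by a subsample of
        # size n is C(total - count_i, n) / C(total, n).
        expected = sum(1 - comb(total - c, n) / comb(total, n) for c in counts)
        curve.append(expected)
    return curve


# Toy labels, invented for illustration: three alleles among six sequences.
toy_curve = rarefaction_curve(["a", "a", "a", "b", "b", "c"])
```

A curve that is still climbing at the full sample size suggests that more sampling would keep turning up new alleles. A plateau suggests the allelic diversity at that cutoff has largely been captured.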
Number of shared alleles - ‘allele’ defined at 98% sequence similarity [Figure: Venn diagrams for gyrB, mdh, and recA showing alleles shared among New Zealand, 2007-2011, and 2012 isolates]
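One way to count shared alleles under a 98% similarity definition is to group sequences greedily against allele representatives and then intersect the allele sets of each sample group. This is a minimal sketch under stated assumptions: sequences are already aligned to equal length, clustering is greedy (so results can depend on input order), and the function names and toy sequences are hypothetical rather than the method used for the slides.

```python
def identity(a, b):
    """Fraction of identical positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster_alleles(sequences, threshold=0.98):
    """Greedy clustering: join the first representative within threshold."""
    representatives, labels = [], []
    for seq in sequences:
        for idx, rep in enumerate(representatives):
            if identity(seq, rep) >= threshold:
                labels.append(idx)
                break
        else:
            # No representative is close enough: start a new allele.
            representatives.append(seq)
            labels.append(len(representatives) - 1)
    return labels


def alleles_by_group(labels, groups):
    """Map each sample group (e.g. a year range) to its set of alleles."""
    result = {}
    for label, group in zip(labels, groups):
        result.setdefault(group, set()).add(label)
    return result


# Toy data, invented for illustration: 100 nt sequences.
base = "ACGT" * 25
near = "T" + base[1:]          # 1 mismatch, 99% identical to base
far = "T" * 10 + base[10:]     # 8 mismatches, 92% identical to base
toy_labels = cluster_alleles([base, near, far])
toy_sets = alleles_by_group(toy_labels, ["2007-2011", "2012", "2012"])
toy_shared = toy_sets["2007-2011"] & toy_sets["2012"]
```

The sizes of the set intersections and differences are the numbers a Venn diagram would display.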
Main Questions • Is there an observable population structure? If so, how does this relate to the ecology of the environment? (Null) Hypothesis: No structure/pattern. “Everything is everywhere.”
A homogeneous and phylogenetically distant clade [Figure legend: site (Hopkins, Point Lobos), source (water, anemone), year] These isolates are separated from the rest with 100% bootstrap support. Unfortunately, we didn’t isolate any this year.
A ubiquitous, low-diversity clade [Figure: tree with node labels 66, 60, 82]
population structure, but no clear sorting by geography or time
Individual gene trees gyrB recA mdh
Derived clades gyrB
Derived clades recA
Derived clades mdh
Opportunity to look for recombination • Divergent • Low diversity • 100% bootstrap support • We might expect them to share an evolutionary history
Multidimensional Scaling Finds the best way to visualize distance information on a 2- or 3-dimensional plot. In this case we used the maximum likelihood estimates of evolutionary distances between isolates. On the plot, evolutionarily distant isolates are far apart while closely related isolates are close together.
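The idea can be sketched with classical (metric) MDS: square the distances, double-center the matrix, and use its leading eigenvectors as plot coordinates. The version below is a dependency-free illustration using power iteration, not necessarily the MDS variant or software behind the slides. It assumes a symmetric distance matrix that is roughly Euclidean. The function name and parameters are hypothetical.

```python
import random
from math import sqrt


def classical_mds(dist, dims=2, iterations=500, seed=0):
    """Embed n items in `dims` dimensions from an n x n distance matrix."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    # Double centering: B = -1/2 * (D^2 - row means - column means + grand mean).
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]

    rng = random.Random(seed)
    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        # Power iteration for the current leading eigenvector of B.
        v = [rng.random() - 0.5 for _ in range(n)]
        for _ in range(iterations):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigval = sum(v[i] * bv[i] for i in range(n))
        if eigval <= 1e-12:
            break  # no further positive axis; remaining coordinates stay 0
        scale = sqrt(eigval)
        for i in range(n):
            coords[i][k] = v[i] * scale
        # Deflate so the next pass finds the next eigenvector.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigval * v[i] * v[j]
    return coords
```

When the input distances really are two-dimensional, the output reproduces them exactly up to rotation and reflection. For evolutionary distances the plot is an approximation, and the size of the discarded eigenvalues indicates how much is lost.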
The difference between multidimensional scaling and trees • A is equally related to B, C and D. • On the tree, A is only close to B. • This can make it hard to visualize categorical clusters. • The MDS plot solves this problem by putting closely related isolates close together on the page. [Figure: tree of A, B, C, D beside an MDS plot of the same four isolates; axes MDS 1 and MDS 2]
MDS - multidimensional scaling Site: Hopkins, Pt. Lobos
MDS - multidimensional scaling Source: Water, Anemone
MDS - multidimensional scaling Year: 2007, 2008, 2009, 2010, 2011, 2012
Protein similarity network: ompK • Pairwise BLAST of all sequences • Similarity network not based on a multiple sequence alignment • Analysis with Cytoscape software • Used to represent similarity networks between protein families
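The network construction can be sketched as follows: read all-vs-all BLAST hits, keep the pairs whose E-value passes a cutoff as edges, and report the connected components as candidate groups. The sketch assumes BLAST tabular output (`-outfmt 6`, where the first two columns are query and subject IDs and the eleventh is the E-value). The cutoff value and function names are hypothetical, and the slides used Cytoscape for the actual analysis and layout.

```python
from collections import defaultdict


def build_network(blast_tabular, max_evalue=1e-20):
    """Build an undirected similarity graph from BLAST tabular text."""
    graph = defaultdict(set)
    for line in blast_tabular.strip().splitlines():
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        # Register both sequences as nodes even if the hit is filtered out.
        graph[query]
        graph[subject]
        if query != subject and evalue <= max_evalue:
            graph[query].add(subject)
            graph[subject].add(query)
    return graph


def connected_components(graph):
    """Return the groups of sequences linked by at least one chain of edges."""
    seen, components = set(), []
    for node in graph:
        if node in seen:
            continue
        stack, component = [node], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(graph[current] - component)
        seen |= component
        components.append(component)
    return components
```

Tightening the E-value cutoff splits large components into smaller ones, which is one way to look for substructure within a protein family.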
Structure Analyzes population structure from the genotype of each isolate at multiple loci, assigning isolates to K clusters. [Figures: K = 6; K = 4; K = 4, sorted by year (2007, 2008, 2009, 2010, 2011, 2012)]
Structure [Figures: K = 4; K = 4, sorted by location (Hopkins, Point Lobos)]
Main Questions • Is there an observable population structure? If so, how does this relate to the ecology of the environment? (Null) Hypothesis: No structure/pattern. “Everything is everywhere.” Finding: There is population structure. Difficult to tell what factors determine it.
Main Questions • Is there an observable population structure? If so, how does this relate to the ecology of the environment? (Null) Hypothesis: No structure/pattern. “Everything is everywhere.” • How is evolution occurring in the environment? • Rate: Are the genes evolving at a similar rate? Hypothesis: Housekeeping genes are evolving more slowly than ompK. • Process: How much recombination vs. mutation is observed in populations? Hypothesis: Significant amounts of both recombination and mutation at a fixed ratio within the population.
Do genes evolve at similar rates? [Figure: pairwise comparisons of recA, mdh, and gyrB]
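A simple way to compare rates between loci is to compute, for each gene, the pairwise distance between every pair of isolates and then compare those distances across genes. The sketch below uses the uncorrected p-distance (proportion of differing sites) rather than the maximum likelihood distances mentioned earlier, so it is a stand-in for illustration only. The function names and toy alignment are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites, skipping positions with a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    return sum(x != y for x, y in pairs) / len(pairs)


def mean_pairwise_distance(alignment):
    """Average p-distance over all pairs of sequences in one gene alignment."""
    distances = [p_distance(a, b) for a, b in combinations(alignment, 2)]
    return sum(distances) / len(distances)


# Toy alignment, invented for illustration.
toy_mean = mean_pairwise_distance(["ACGT", "ACGA", "ACGT"])
```

If two genes evolve at similar rates, plotting one gene's pairwise distances against the other's for the same isolate pairs should give points scattered around a line of slope close to 1.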
But ompK is different… [Figure: ompK vs. gyrB]
Positive vs. Negative Selection within Genes: dN/dS [Figure: recA, gyrB]
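dN/dS compares the rate of nonsynonymous substitutions per nonsynonymous site (dN) with the rate of synonymous substitutions per synonymous site (dS). Ratios well below 1 suggest negative (purifying) selection and ratios above 1 suggest positive selection. Below is a minimal sketch in the style of the Nei-Gojobori counting method for two aligned, in-frame, gap-free coding sequences. It is simplified in that mutation paths through stop codons are not excluded, and it is not necessarily the method or software used for the slides. All function names are hypothetical.

```python
from itertools import permutations, product
from math import log

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def syn_sites(codon):
    """Number of synonymous sites in a codon (each position contributes 0..1)."""
    total = 0.0
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                    total += 1 / 3
    return total


def count_diffs(c1, c2):
    """Synonymous and nonsynonymous changes, averaged over mutation orders."""
    positions = [i for i in range(3) if c1[i] != c2[i]]
    if not positions:
        return 0.0, 0.0
    syn = non = 0.0
    paths = list(permutations(positions))
    for path in paths:
        current = c1
        for pos in path:
            step = current[:pos] + c2[pos] + current[pos + 1:]
            if CODON_TABLE[step] == CODON_TABLE[current]:
                syn += 1
            else:
                non += 1
            current = step
    return syn / len(paths), non / len(paths)


def jukes_cantor(p):
    """Correct a proportion of differences for multiple hits."""
    return float("inf") if p >= 0.75 else -0.75 * log(1 - 4 * p / 3)


def dn_ds(seq1, seq2):
    """Return (dN, dS) for two aligned, uppercase, in-frame sequences."""
    if len(seq1) != len(seq2) or len(seq1) % 3:
        raise ValueError("need equal-length sequences of whole codons")
    codons = [(seq1[i:i + 3], seq2[i:i + 3]) for i in range(0, len(seq1), 3)]
    s_sites = sum((syn_sites(a) + syn_sites(b)) / 2 for a, b in codons)
    n_sites = 3 * len(codons) - s_sites
    s_diff = sum(count_diffs(a, b)[0] for a, b in codons)
    n_diff = sum(count_diffs(a, b)[1] for a, b in codons)
    p_s = s_diff / s_sites if s_sites else 0.0
    p_n = n_diff / n_sites if n_sites else 0.0
    return jukes_cantor(p_n), jukes_cantor(p_s)
```

The ratio itself is dN divided by dS and is undefined when dS is 0, which happens easily for short or nearly identical sequences, so in practice it is estimated over many codons or many sequence pairs.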