1 / 29

Deconvolution of mixed DNA samples

Deconvolution of mixed DNA samples . Michael L. Raymer, Ph.D. Dept. of Computer Science & Engineering Wright State University. Introduction. Mixture interpretation is not trivial. Difficult tasks include: Determining the number of possible contributors

isanne
Download Presentation

Deconvolution of mixed DNA samples

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Deconvolution of mixed DNA samples Michael L. Raymer, Ph.D. Dept. of Computer Science & Engineering Wright State University

  2. Introduction • Mixture interpretation is not trivial. • Difficult tasks include: • Determining the number of possible contributors • Determining the genotypes (or possible genotypes) of each contributor Doom/Raymer

  3. Introduction to Mixtures • Determining if two genotypes could be contributors is relatively easy Possible contributors to a mixture: D3 locus genotype Individual #1: 15, 18 Individual #2: 14, 18 Mixture: 14, 15, 18 • But beware – the opposite is not true Doom/Raymer

  4. Introduction to Mixtures • Mixtures can exhibit up to two peaks per contributor at any given locus • Mixtures can exhibit as few as 1 peak at any given locus (regardless of the number of contributors) Doom/Raymer

  5. Introductionto Mixtures • Even determining the number of contributors is non-trivial D3 locus genotype Mixture: 14, 15, 18 Another Option Individual C: 15, 15 Individual D: 14, 15 Individual E: 18, 18 • There is no “hard” mathematical upper bound to the number of contributors possible Doom/Raymer

  6. Introductionto Mixtures • Determining what genotypes created the mixture is non-trivial D3 locus genotype Mixture: 14, 15, 18 Option #1Option #3 Individual A: 15, 18 Individual #D: 14, 15 Individual B: 14, 18 Individual #E: 18, 18 Option #2Option #4 Individual B: 14, 18 Individual #A: 15, 18 Individual C: 15, 15 Individual #F: 14, 14 Doom/Raymer

  7. Introductionto Mixtures • Often the victim’s genotype is known, but this does not always make the defendant’s genotype clear D3 locus genotype Mixture: 14, 15, 18 Victim: 14, 15 Possible genotypes for a single perpetrator: Individual C: 14, 18 Individual D: 15, 18 Individual E: 18, 18 Individual F: 14, 14 ? Doom/Raymer

  8. Making sense of mixtures • How many contributors were there? • What are the genotypes of each contributor? Doom/Raymer

  9. Number of contributors • D. Paoletti, T. Doom, C. Krane, M. Raymer, and D. Krane (2005) “Empirical Analysis of the STR Profiles Resulting from Conceptual Mixtures”, Journal of Forensic Sciences, Vol. 50, No. 6, Nov. 2005. • 13-loci genotypes from 959 individuals from six racial groups • Combine into hypothetical mixtures • 146,536,159 three-person mixtures considered Doom/Raymer

  10. All 3-way FBI mixtures Maximum # of alleles observed # of occurrences As Percent 2 0 0.00% 3 78 0.00% 4 4,967,034 3.39% 5 93,037,010 63.49% 6 48,532,037 33.12% • 3.4% of three contributors mixtures “look like” two contributors Doom/Raymer

  11. 4-way FBI mixtures Maximum # of alleles observed # of occurrences As Percent 1, 2, 3 0 0.00% 4 13,480 0.02% 5 8,596,320 15.03% 6 35,068,040 61.30% 7 12,637,101 22.09% 8 896,435 1.57% • ~76% of four contributors mixtures “look like” three contributors Doom/Raymer

  12. Contrib 1Contrib 2 16,16 17,17 16,17 16,17 16,16 16,17 17,17 16,17 Mixture Deconvolution • Even when the number of contributorsis known (or assumed), separating mixtures into their components can be difficult Doom/Raymer

  13. High peak avg. Low peak avg. Simple example: All loci heterozygous, two contributors Current Methods • Most methods start by inferring themixture ratio: Doom/Raymer

  14. Minimal Basic Assumptions • A primary assumption of all methods is peak additivity • Consistency: Most labs assume peaks from the same source will vary by  30% Doom/Raymer

  15. Evidence of consistency Relationship between the smaller and larger peaks in heterozygous loci of reference samples. Doom/Raymer

  16. Homozygous peak additivity • y = 1.865x + 208.5 => Hom = 1.9 * Het • R2 = 0.845 Doom/Raymer

  17. Objectives • Start with simple assumptions: • Additivity with constant variance: c • Peaks below a minimum threshold (often 50 or 150 RFU) are not observable • Peaks above the saturation threshold (often 4000 RFU) are not measurable • Obtain provably correct deconvolution where possible • Identify when this is not possible Doom/Raymer

  18. Method • Assume the number of contributors • Enumerate all possible mixture contributor combinations • Determine which pairs of profiles contain peaks in balance Doom/Raymer

  19. 2080 RFU 2030 RFU 210 RFU 180 RFU P3 P4 P1 P2 Peak Balance • Example: assume two contributors, four peaks: • For this locus, and c = 1.3, the combination (P1,P3) is out of balance because: Peaks are numbered by height Doom/Raymer

  20. Example: Mixture of four peaks • P4 >= P3 >= P2 >= P1 >= Min. Threshold Doom/Raymer

  21. Sweet Spot • If only one row is satisfied, then the genotypes can be unambiguously and provably determined Doom/Raymer

  22. 2080 RFU 2030 RFU 210 RFU 180 RFU P3 P4 P1 P2 Example: In the sweet spot • P4 > cP2so we can’t have (P4,P2) • P4 > cP1so we can’t have (P4, P1) Doom/Raymer

  23. 245 RFU 230 RFU 190 RFU 180 RFU P3 P4 P1 P2 Example: Ambiguous Locus • P2 is within c of both P1 and P4, so we can have • (P1,P3) (P2,P4), or • (P1,P2) (P3,P4) • P4 cannot pair with P1 Doom/Raymer

  24. 700 RFU 500 RFU 300 RFU 200 RFU P3 P4 P1 P2 Example: No row satisfied • P4 (for example) cannot pair with any other peak • One of our assumptions (c or the number of contributors) is incorrect Doom/Raymer

  25. Dealing with homozygotes • If we assume two contributors and there are only three peaks, then one of several possibilities exists: • One contributor is homozygous • One of the peaks is there, but not observable because it is below the minimum threshold • The contributors share an allele at this locus Doom/Raymer

  26. Three Peaks Doom/Raymer

  27. Advantages of the method • If you accept the simple assumptions, the resulting mixture interpretations directly follow • Interprets mixtures on a locus by locus basis • Does not interpret ambiguous loci Doom/Raymer

  28. Future work • Mixture ratio can be inferred only from unambiguous loci, and then applied to perform an more aggressive interpretation of the ambiguous loci when desired • Confidence values can be applied to the more aggressively interpreted possitions Doom/Raymer

  29. Acknowledgements • Research Students • David Paoletti (analysis of allele sharing) • Jason Gilder (data collection, additivity study, mixture deconvolution) • Faculty • Travis Doom • Dan Krane • Michael Raymer Doom/Raymer

More Related