Other stuff about DNA profile evidence
David Balding, Imperial College London
Coancestry correction
• The random match probability (RMP) calculation assumes no relatedness between the defendant and the alternative possible culprits
  • this never holds in reality
  • the assumption is always unfair to defendants
• A solution is available: the coancestry coefficient θ (also written FST)
• The value of θ varies over alternative possible culprits; in large, well-mixed populations it will be < 3% for almost all of them
• Don’t use an average value of θ: the alternative culprits with the highest values contribute most to the match probability
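To make the effect of θ concrete, here is a minimal sketch of a single-locus match probability with a coancestry correction. The slides do not state a formula; the sketch assumes the widely used Balding-Nichols form, and the allele frequencies and θ values are illustrative only.

```python
def match_probability(p_a, p_b=None, theta=0.0):
    """Single-locus match probability with coancestry correction.

    Assumes the Balding-Nichols formulas (not given on the slides).
    p_a, p_b: population allele frequencies; p_b=None means homozygote AA.
    theta:    coancestry coefficient between the suspect and the
              alternative possible culprit.
    """
    denom = (1 + theta) * (1 + 2 * theta)
    if p_b is None:
        # Homozygote AA
        return (2 * theta + (1 - theta) * p_a) * (3 * theta + (1 - theta) * p_a) / denom
    # Heterozygote AB
    return 2 * (theta + (1 - theta) * p_a) * (theta + (1 - theta) * p_b) / denom


# Illustrative frequencies (hypothetical, not from the slides)
for theta in (0.0, 0.01, 0.03):
    het = match_probability(0.1, 0.2, theta)
    hom = match_probability(0.1, theta=theta)
    print(f"theta={theta:.2f}  AB: {het:.4f}  AA: {hom:.4f}")
```

With θ = 0 the function reduces to the usual 2·pA·pB and pA² values; raising θ raises the match probability, which is why the alternative culprits with the highest θ dominate.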
Coancestry values
θ measures the average amount of shared ancestry of each alternative culprit with the suspect (call him s).
[Figure: alternative possible culprits labelled with their θ values relative to s: 3%, 2%, 2%, 1.5%, 1%, 1%, 1%, 0.5%, 0.5%]
Uniqueness
• No matter how strong the DNA evidence is, it can be offset by exculpatory evidence
• Uniqueness implies guilt, so P(U) < P(G)
• In fact
  $$P(U) > \frac{1 - \sum_X R_X}{1 + \sum_X R_X w_X} > 1 - 2\sum_X R_X \quad \text{if } w_X \le 1$$
• More details in Sci & Just (1999): When can a DNA profile be regarded as unique?
Uniqueness
• Need to consider relatives of the defendant, both close and distant
• Plausible upper bounds: 10 siblings, …, 10⁷ unrelated (θ = 2%)
• A match at 11 STR loci usually suffices to have P(U) > 99.9%
• BUT:
  • assumes no error
  • how high does P(U) need to be to claim “unique”?
  • relies on w_X ≤ 1, which is not in the domain of the scientist
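The bounds on P(U) can be evaluated with a few lines of arithmetic. The sketch below assumes that R_X is the match probability for alternative culprit X and w_X the weight of the other evidence against X relative to the defendant (the slides do not define them), and it uses hypothetical group sizes and match probabilities purely for illustration.

```python
def uniqueness_bounds(groups, w=1.0):
    """Lower bounds on P(U), the probability that the profile is unique.

    groups: list of (number_of_people, match_probability_per_person);
            each person X in a group contributes R_X to the sum.
    w:      common value assumed for every w_X (hypothetical simplification).
    Returns (sum of R_X, tighter bound, looser bound 1 - 2*sum).
    """
    sum_r = sum(n * r for n, r in groups)
    tighter = (1 - sum_r) / (1 + w * sum_r)
    looser = 1 - 2 * sum_r
    return sum_r, tighter, looser


# Hypothetical numbers: 10 siblings and 10**7 unrelated individuals
example_groups = [(10, 1e-5), (10**7, 1e-12)]
sum_r, tighter, looser = uniqueness_bounds(example_groups)
print(f"sum R_X = {sum_r:.2e}, P(U) > {tighter:.6f} > {looser:.6f}")
```

The sum of R_X is also the expected number of alternative culprits who share the profile, so close relatives can dominate it even though they are few.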
Familial searches
• If there are no complete matches in the database, there may be near-matches suggestive of relatedness between the offender and member(s) of the database
  • most useful for parent/child or full sibs
• When is this “suggestion” strong enough to justify further investigation?
  • the answer uses a likelihood ratio
LR for sibs
$$\mathrm{LR}(\text{sib vs unrelated}) = \frac{\Pr(\text{data} \mid \text{sibs})}{\Pr(\text{data} \mid \text{unrelated})}$$
where data = offender profile & near-match profile.
For more details and discussion see Paoletti, Doom, Raymer & Krane (Jurimetrics 2006). Also Marjan & Sjerps (1999).
Short-cut formula for relatedness LR at a single locus:
LR(related vs unrelated) = k₂ + k₁ × LR(paternity) + k₀ × LR(unrelated)
where kᵢ = probability of sharing i alleles ibd. For sibs, k₀ = ¼, k₁ = ½, k₂ = ¼.
• should include the θ correction in the LR
• should also consider LR(sibs vs half-sibs), LR(sibs vs cousins), etc.
• full analysis: add up prior probability for each possible relationship × LR for that relationship
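A minimal sketch of a single-locus relatedness LR built from the ibd coefficients follows. It assumes the usual convention in which the k₂ term is weighted by the LR for identical genotypes (1/Pr(genotype) when the two genotypes match, 0 otherwise) and LR(unrelated) = 1; the slide does not spell these factors out. It also omits the θ correction, and the allele frequencies are hypothetical.

```python
def genotype_prob(g, freqs):
    """Hardy-Weinberg probability of an unordered genotype (a, b)."""
    a, b = g
    return freqs[a] ** 2 if a == b else 2 * freqs[a] * freqs[b]


def prob_given_one_ibd(g2, g1, freqs):
    """Pr(g2 | g1, exactly one allele shared ibd); no theta, no mutation."""
    c, d = g2
    total = 0.0
    for s in g1:  # each allele of g1 is the ibd allele with probability 1/2
        if c == d:
            total += 0.5 * (freqs[c] if s == c else 0.0)
        elif s == c:
            total += 0.5 * freqs[d]
        elif s == d:
            total += 0.5 * freqs[c]
    return total


def kinship_lr(g1, g2, freqs, k):
    """LR(related vs unrelated) at one locus for ibd coefficients k = (k0, k1, k2)."""
    k0, k1, k2 = k
    p2 = genotype_prob(g2, freqs)
    lr_one_ibd = prob_given_one_ibd(g2, g1, freqs) / p2   # parent/child-type LR
    lr_two_ibd = 1.0 / p2 if sorted(g1) == sorted(g2) else 0.0
    return k0 * 1.0 + k1 * lr_one_ibd + k2 * lr_two_ibd


SIBS = (0.25, 0.5, 0.25)
PARENT_CHILD = (0.0, 1.0, 0.0)

freqs = {"A": 0.1, "B": 0.2, "C": 0.7}  # hypothetical frequencies
print(kinship_lr(("A", "B"), ("A", "B"), freqs, SIBS))
print(kinship_lr(("A", "B"), ("A", "C"), freqs, SIBS))
```

Under the assumption of independent loci, the per-locus values would be multiplied to give the overall LR.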
DI Offspring
• Problem
  • 13 individuals: 10 are the offspring of donor insemination (DI), 3 are putative natural offspring of known donors.
  • The DI offspring were conceived at the London clinic of a pioneer of human artificial insemination. No records are available.
• Question: which of the individuals, if any, are related to one another through their fathers?
Data
[Figure: diagram of the typed individuals, labelled MS, EN, BM, WM, BS, JS, DG, BE, WE, MA, JW, EV, NI, PM, MM, SF, SH, MF]
• Genotypes consist of 9 to 17 STR loci.
• JW, EV and SF were all thought to be the offspring of the same, known donor
  • we ignored this information
• A Canadian company typed JS, BS and DG at 3 RFLP loci and 13 STRs; it confirmed the full-sib relationship of JS and BS, and that both are half-sibs of DG
  • we reanalysed the STR data plus additional STRs
• Further testing in California:
  • JW not related to JS and BS
• Typing in London:
  • NI and SF have a common father
• Use of the LR is infeasible for many, complex pedigrees; it is easier for pairs, so start with that
• Hypotheses (not exhaustive): HS = the two individuals are (paternal) half-siblings; UN = unrelated.
• Under HS there are two equally likely cases:
  • paternal allele shared ibd
  • paternal alleles not ibd
• LR for a single locus: (1)
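The two equally likely cases under HS lead directly to a mixture form for the single-locus LR. The derivation below is a sketch reconstructed from that statement and is not necessarily identical to the slide’s equation (1); it ignores the θ correction, mutation and the mothers’ genotypes, and writes G₁, G₂ for the two offspring genotypes and p_A, p_B for allele frequencies (notation introduced here).

```latex
\[
\mathrm{LR}(HS \text{ vs } UN)
  = \frac{\Pr(G_1, G_2 \mid HS)}{\Pr(G_1, G_2 \mid UN)}
  = \frac{1}{2}\,
    \frac{\Pr(G_1, G_2 \mid \text{paternal allele ibd})}{\Pr(G_1, G_2 \mid UN)}
    + \frac{1}{2}
\]
% Example: both individuals have genotype AB and no maternal genotypes are used.
% Then Pr(G_2 | G_1, one allele ibd) = (p_A + p_B)/2 and Pr(G_2 | UN) = 2 p_A p_B, so
\[
\mathrm{LR}(HS \text{ vs } UN) = \frac{1}{2} + \frac{p_A + p_B}{8\, p_A\, p_B}.
\]
```

Because the "not ibd" case always contributes ½, a single locus can never drive this LR below ½, which is one way to see why a pair of individuals can never be excluded as half-sibs.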
Pairwise LR with mothers’ genotypes
• Mothers’ genotypes, where available, help to identify the paternal allele in the offspring.
• After some manipulation we can obtain:
• Assumes the mothers’ genotypes are independent, which is not strictly true.
• Only 4 distinct forms for the LR
Mayor & Balding, Forensic Sci Int (2006)
• Reliable inferences in the absence of maternal genotypes require many more than the 10–25 loci routinely used.
• Inclusion of maternal genotypes more than halves the number of loci required (~22 with mothers, ~50 without).
• Including maternal genotypes gives more power to discriminate half-sibs than profiling the same number of additional loci in the offspring alone.
Pairwise LRs: reference alleles from UK Donorlink/LGC
Includes maternal data where available
[Figure: individuals BS, SF, JS, DG, MM, BE, MF, SH, NI, WE, JW, EV, PM, with pairs marked LR > 100 or LR > 50]
Trio LRs
• With pairs we can never make an exclusion.
• With trios we can exclude a common father for all three individuals (assuming no mutation).
• LR for trios: compare one father vs three fathers
  • very clearly not exhaustive hypotheses.
• Here, ibd ⇒ all the individuals share a paternal allele.
• Also need the corresponding LR when mothers are available.
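The difference between pairs and trios can be illustrated with a small compatibility check. This is a sketch under simplifying assumptions (no mutation, no maternal genotypes, hypothetical allele labels): a common father has only two alleles at a locus, so he is excluded when no pair of alleles can supply one allele to every child.

```python
from itertools import combinations_with_replacement


def common_father_possible(children):
    """True if some two-allele father genotype shares an allele with every child.

    children: list of (allele, allele) genotypes at a single locus.
    Only alleles seen in the children need be tried, since any other
    paternal allele could not have been transmitted to them.
    """
    alleles = sorted({a for g in children for a in g})
    for father in combinations_with_replacement(alleles, 2):
        if all(set(g) & set(father) for g in children):
            return True
    return False


def common_father_excluded(profiles):
    """profiles: list of per-locus lists of child genotypes; exclude if any locus fails."""
    return any(not common_father_possible(locus) for locus in profiles)


# Two individuals: a common father is always possible at any locus
print(common_father_possible([(1, 2), (3, 4)]))
# Three individuals with six distinct alleles: no two-allele father fits
print(common_father_possible([(1, 2), (3, 4), (5, 6)]))
```

For any pair, a father carrying one allele from each child is always compatible, which is why exclusions only become possible with three or more individuals.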
Trio LRs
With maternal information where present
[Figure: individuals MM, SF, JS, MF, BE, BS, DG, SH, JW, EV, PM, WE, with trios marked LR > 10 000 or LR > 1000]
Trio LRs
With maternal information where present
[Figure: the same individuals, with only the trios marked LR > 10 000]
Familias (math.chalmers.se/~mostad/familias/)
• The results of the pairwise and trio LRs allow us to reduce the number of possible pedigrees to 26.
• Familias: software that determines the most probable pedigree given genotype information (Egeland et al., 2000).
Familias pedigrees (mutation rate = 0.001)
[Figures: three candidate pedigrees, shown one per slide]
• Probability = 0.0003
• Probability = 0.1710
• Probability = 0.8287
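Pedigree probabilities of this kind can be thought of as prior × likelihood, normalised over the candidate pedigrees. The sketch below is plain Bayes' rule and is not a description of how Familias is implemented; the pedigree names and log-likelihood values are hypothetical.

```python
import math


def pedigree_posterior(log_likelihoods, priors=None):
    """Posterior probability of each candidate pedigree.

    log_likelihoods: dict name -> log Pr(genotype data | pedigree)
    priors:          dict name -> prior probability (flat prior if None)
    """
    names = list(log_likelihoods)
    if priors is None:
        priors = {n: 1.0 / len(names) for n in names}
    log_post = {n: math.log(priors[n]) + log_likelihoods[n] for n in names}
    # Subtract the maximum before exponentiating to avoid underflow
    m = max(log_post.values())
    weights = {n: math.exp(v - m) for n, v in log_post.items()}
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}


# Hypothetical log-likelihoods for three candidate pedigrees
example = {"pedigree_1": -412.0, "pedigree_2": -405.6, "pedigree_3": -404.0}
for name, prob in pedigree_posterior(example).items():
    print(f"{name}: {prob:.4f}")
```

The same pattern covers the "full analysis" for a pair of individuals: weight each possible relationship by its prior probability and its likelihood, then normalise.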
Linkage
• Increasing the number of loci used means that some loci will be linked, i.e. they tend to be co-inherited.
• We have also investigated the effect of linkage on the classification of half-sibs.
• Can locate 60 loci genome-wide with no spacing < 50 cM. At this level the effect of linkage is modest and we have neglected it.
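To get a feel for what a 50 cM spacing means, map distance can be converted to a recombination fraction. The slides do not say which map function was used; the sketch below assumes Haldane's map function (no crossover interference).

```python
import math


def haldane_recombination_fraction(distance_cm):
    """Recombination fraction for a map distance in centimorgans.

    Assumes Haldane's map function: r = (1 - exp(-2d)) / 2, d in Morgans.
    """
    d = distance_cm / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * d))


for cm in (1, 10, 50, 100):
    print(f"{cm:>3} cM -> r = {haldane_recombination_fraction(cm):.3f}")
```

Unlinked loci have r = 0.5; under this assumption loci 50 cM apart recombine with r ≈ 0.32, so some residual linkage remains at that spacing.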
Low copy number: partial profiles
• Crime scene profile = A; suspect s is AB
• Normally this is an exclusion, but it could be that s is the donor of the crime scene sample and the B allele suffered “drop-out”.
• Similarly, the true source of the crime stain could have any genotype that includes an A allele.
$$\mathrm{LR}(\text{different sources vs same source}) = \frac{p_A^2 + 2 p_A (1 - D_A) \sum_{X \ne A} p_X D_X}{(1 - D_A)\, D_B}$$
D_X = drop-out probability for allele X (under the conditions to which the stain was exposed).
If D is the same for all X, then
$$\mathrm{LR} = p_A^2\,\frac{1 - 2D(1 - D)}{(1 - D)\,D} + 2 p_A$$
LR >> 1 (different sources) if D is small or D is large.
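The drop-out LR is easy to evaluate numerically. The sketch below implements both the general form and the equal-D simplification; like the slide formula, it puts no drop-out factor on the AA homozygote term. The allele frequencies and drop-out probabilities are hypothetical.

```python
def dropout_lr(freqs, dropout, observed="A", missing="B"):
    """LR(different sources vs same source) for crime profile = observed allele only.

    freqs:   dict allele -> population frequency (should sum to 1)
    dropout: dict allele -> drop-out probability D_X
    The suspect is heterozygous (observed, missing).
    """
    p_a = freqs[observed]
    d_a = dropout[observed]
    # Alternative source is heterozygous A/X with X dropped out
    het_term = sum(freqs[x] * dropout[x] for x in freqs if x != observed)
    numerator = p_a ** 2 + 2 * p_a * (1 - d_a) * het_term
    denominator = (1 - d_a) * dropout[missing]
    return numerator / denominator


def dropout_lr_equal_d(p_a, d):
    """Simplified LR when every allele has the same drop-out probability d."""
    return p_a ** 2 * (1 - 2 * d * (1 - d)) / ((1 - d) * d) + 2 * p_a


freqs = {"A": 0.1, "B": 0.2, "C": 0.7}  # hypothetical frequencies
for d in (0.001, 0.1, 0.5, 0.9, 0.999):
    same_d = {allele: d for allele in freqs}
    print(f"D={d}: LR = {dropout_lr(freqs, same_d):.3f}")
```

The loop shows the LR rising as D approaches either 0 or 1: if drop-out is very rare, the missing B allele is hard to explain, and if it is very common, seeing the A allele at all is unlikely under the same-source hypothesis.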
Recent Old Bailey case
• Lots of victim DNA: 17 STR alleles at 10 loci.
• Minute trace of offender (?) DNA: 8 alleles not masked by victim alleles or artefacts.
• The defendant’s profile has 11 alleles not masked, including all 8 minor-component alleles.
• Question: how strong is the evidence? What to do about the 3 alleles that should have been there under the prosecution hypothesis?
  • trace peak in each position, not to reportable standards
  • 1 in a stutter position adjacent to a homozygote peak
  • 2 at high-molecular-weight (HMW) positions, more susceptible to dropout?
• The Forensic Science Service has no written procedures for dealing with this situation (“missing alleles”).
• In effect, the analysis assumed LR = 1 at the missing loci (so “neutral”, the same as if the three loci had never been tested).
• I argued that this can’t be right in principle: alleles that should be there under the prosecution case but are absent may favour the defendant.
Conclusion: What to say to jurors?
• No mention of “random man”, “random match probability” etc.
• The jurors’ task that the expert can help with is to assess whether there are alternative possible offenders with the same profile
  • real people with names; none of them is “random”
  • some are relatives of the defendant
  • all have shared ancestry with the defendant at some level
• Give the expected number(s) of matches