Observed and expected numbers of (partially) randomly matching profiles in the Dutch DNA database, and in international DNA searches • Marjan Sjerps, Kees van der Beek, Ate Kloosterman
The Dutch DNA offender database • National DNA database of the Netherlands at August 2009: • 86,929 persons • 40,170 stains • 19,876 hits (stain-person) • 4,628 hits (stain-stain) • Numbers are updated every month at www.DNAsporen.nl • We have used the offender database to empirically test our random match probability (RMP) calculations (adventitious) matches in databases
How good are the RMP estimates we report? • The method is given in: • Weir (2004) Matching and partially matching DNA profiles, J Forensic Sci, Vol. 49, 1009-1014 • Weir (2007) The rarity of DNA profiles, The Annals of Applied Statistics, Vol. 1, No. 2, 358–370 • Compare two 10-locus DNA profiles • Full match: both alleles match at all 10 loci • Partial match: e.g. both alleles match at 8 loci, but only 1 allele matches at the other 2 loci • Compare the expected number of partial matches to the observed number in a large database
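The full/partial match definition above can be sketched in a few lines of Python. This is only an illustration, not the authors' software: the function names, the dictionary layout of a profile, and the three-locus toy profiles are all assumptions made for the example.

```python
from collections import Counter

def shared_alleles(genotype_a, genotype_b):
    """Number of alleles (0, 1 or 2) that two genotypes at one locus share."""
    # Multiset intersection, so (12, 12) vs (12, 13) shares exactly one allele.
    return sum((Counter(genotype_a) & Counter(genotype_b)).values())

def match_category(profile_a, profile_b):
    """Return (loci where both alleles match, loci where exactly one allele matches)."""
    both = one = 0
    for locus, genotype_a in profile_a.items():
        n = shared_alleles(genotype_a, profile_b[locus])
        if n == 2:
            both += 1
        elif n == 1:
            one += 1
    return both, one

# Hypothetical toy profiles with 3 loci instead of the 10 SGM+ loci.
profile_a = {"TH01": (6, 9.3), "vWA": (16, 17), "FGA": (21, 21)}
profile_b = {"TH01": (6, 9.3), "vWA": (16, 18), "FGA": (22, 23)}
category = match_category(profile_a, profile_b)
print(category)
```

With 10-locus profiles, a full match corresponds to the category (10, 0), and the partial match used as an example on the slide corresponds to (8, 2).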
Applying Weir (2004, 2007) to NL data • The database contained 73,895 DNA profiles from suspects/offenders as of January 16, 2009 • Database pre-processing: • 773 matches were found (aliases and duplicates?); only a single copy was retained • 1,578 partial profiles were removed: only full SGM+ profiles (10 loci) were retained • 71,544 10-locus profiles of different persons were left • Consider all pairs of profiles (>2.5 billion pairs) • Count the pairs matching at e.g. 8 loci and compare to the expected number
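The pre-processing arithmetic and the pair count on this slide can be checked directly. The variable names are illustrative; the figures are the ones given on the slide.

```python
from math import comb

total_profiles = 73_895      # suspects/offenders profiles in the database
duplicates_removed = 773     # matching profiles (aliases/duplicates), one copy kept
partial_removed = 1_578      # partial (non 10-locus) profiles removed

retained = total_profiles - duplicates_removed - partial_removed
pairs = comb(retained, 2)    # every unordered pair of distinct profiles

print(retained)  # 71544
print(pairs)     # 2559236196
```

The pair count of roughly 2.56 billion agrees with the ">2.5 billion pairs" stated on the slide.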
Observed versus expected numbers of partially matching profiles
Observed versus expected numbers of partially matching profiles • Relatives?
How representative are reference databases? • We compared the allele frequencies of the offender database (n=71,544) to those of a reference database (n=231) of Dutch Caucasians
Allele frequencies: offender & reference database
Conclusion 1 • Investigating offender databases provides important empirical information about the validity of our RMP estimates • The Dutch data provide: • empirical support for the RMPs that are routinely reported (theta=0.01) • empirical support that theta is close to 0 • empirical support for the assumption that our reference dataset of Dutch Caucasians is sufficiently representative
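To show how theta enters an RMP, here is a minimal sketch of the widely used Balding-Nichols (NRC II) single-locus match probabilities. The slides do not say which formula is applied, so treating this as the relevant correction is an assumption, and the allele frequencies 0.1 and 0.2 are hypothetical.

```python
def match_prob_homozygote(p, theta):
    """Theta-corrected match probability for a homozygous genotype with allele frequency p."""
    numerator = (2 * theta + (1 - theta) * p) * (3 * theta + (1 - theta) * p)
    return numerator / ((1 + theta) * (1 + 2 * theta))

def match_prob_heterozygote(p, q, theta):
    """Theta-corrected match probability for a heterozygous genotype with allele frequencies p and q."""
    numerator = 2 * (theta + (1 - theta) * p) * (theta + (1 - theta) * q)
    return numerator / ((1 + theta) * (1 + 2 * theta))

# Hypothetical allele frequencies, for illustration only.
p, q = 0.1, 0.2
for theta in (0.0, 0.01):
    print(theta, match_prob_homozygote(p, theta), match_prob_heterozygote(p, q, theta))
```

At theta = 0 the formulas reduce to the product-rule values p² and 2pq. A positive theta raises the per-locus match probability, so an RMP reported with theta = 0.01 is larger than one computed with theta = 0.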
International database search • The Netherlands searches the databases of Germany, Austria, Slovenia, Luxembourg and Spain every day for matches with stains (Prüm treaty) • Other European databases will become available in the future • These searches produce huge numbers of pairwise comparisons of DNA profiles • Two kinds of DNA matches: • “assisting” matches: the matching profiles are indeed from the same person and hence support the investigation • “adventitious” matches: the matching profiles are from two different persons
Two examples of international searches • We report on data from two searches: • The search performed when starting the exchange with Germany (July 2008) • A search in the UK database with a selection of crime stains (February 2008)
NL-DE exchange: 6 and 7 locus matches • The Netherlands has searched the German DNA database (524,782 persons and 123,862 stains) with 25,249 DNA profiles from stains • 16 billion (1.6 × 10^10) pairs of profiles were compared • Most of the comparisons have 7 loci in common, sometimes 6 loci
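The number of comparisons follows from the database sizes on this slide, assuming every Dutch stain profile is compared with every German person and stain profile. Variable names are illustrative.

```python
german_persons = 524_782
german_stains = 123_862
dutch_stain_profiles = 25_249

german_profiles = german_persons + german_stains
comparisons = german_profiles * dutch_stain_profiles

print(german_profiles)       # 648644
print(comparisons)           # 16377612356
print(f"{comparisons:.1e}")  # 1.6e+10
```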
NL-DE exchange: 6 and 7 locus matches • Expected number of “assisting” matches = observed matches - expected adventitious matches = 1151 • But we expect about 81 adventitious matches • Therefore, as standard procedure, matches are upgraded by typing more loci (SE33 or SGM+) before any personal data are exchanged (following recommendation 5 in the ENFSI report) • Practical difficulty: Germany has a policy of immediately destroying DNA reference samples after analysis, which precludes any additional testing of the reference samples
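A rough back-of-the-envelope reading of these figures, under stated assumptions: the observed total is not given in the slide text and is only implied by the equation above, and the comparison count is the product of the database sizes reported for the NL-DE search (about 1.6 × 10^10). The per-comparison probability below is an average derived from the slide's numbers, not a reported RMP.

```python
expected_assisting = 1151
expected_adventitious = 81
comparisons = (524_782 + 123_862) * 25_249  # assumes all stain-vs-database pairs were compared

# Implied by: assisting = observed - expected adventitious
implied_observed = expected_assisting + expected_adventitious

# Average chance per comparison that would yield about 81 adventitious matches
implied_avg_match_prob = expected_adventitious / comparisons

print(implied_observed)                 # 1232
print(f"{implied_avg_match_prob:.1e}")  # 4.9e-09
```

Under these assumptions, roughly 7% of the matches (81 of about 1232) would be expected to be adventitious, which is why upgrading is done before personal data are exchanged.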
NL-UK exchange: 6 and 10 locus matches • The Netherlands has searched the UK DNA database (4.8 million reference profiles) with 2,159 DNA profiles from stains of serious unsolved crimes • Some of the NL stain profiles were SGM typed (6 loci)
NL-UK exchange: 6 and 10 locus matches • 28 SGM matches (6 loci) were upgraded to SGM+ (10 loci); only 5 still matched • So the other 23 SGM matches were adventitious • 5+17=22 SGM+ matches in total; no adventitious matches expected • So about 22 matches can be used to assist unsolved serious cases • Hence, searching with 6 loci was very useful in this example: upgrading only 28 profiles resulted in 5 SGM+ matches • However, when not upgraded, this kind of search generates many adventitious matches
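The counts on this slide can be tallied as follows. The variable names are illustrative, and the 17 is taken from the "5+17" on the slide, which presumably refers to matches that did not come from the upgraded SGM profiles.

```python
sgm_matches_upgraded = 28   # 6-locus matches retyped with SGM+ (10 loci)
still_matching = 5          # matches that survived the upgrade
other_sgm_plus_matches = 17 # the "17" in "5+17" on the slide

adventitious = sgm_matches_upgraded - still_matching
adventitious_fraction = adventitious / sgm_matches_upgraded
total_sgm_plus_matches = still_matching + other_sgm_plus_matches

print(adventitious)                    # 23
print(f"{adventitious_fraction:.0%}")  # 82%
print(total_sgm_plus_matches)          # 22
```

About four out of five 6-locus matches in this search turned out to be adventitious once more loci were typed.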
Conclusion 2 • Searching with partial profiles is very useful but produces adventitious matches • Therefore upgrading, if possible, is necessary before reporting (recommendation 5 ENFSI report) • Upgrading is not always possible due to practical or legal limitations • Newer kits and upgrading procedures will reduce the number of adventitious matches considerably • In the meantime, searching with mixtures and partial profiles in (inter)national databases without upgrading will produce adventitious matches • Database annual reports should report on matches that were reported but later turned out to be adventitious
Reporting database matches • Chakraborty and Ge, Forensic Science Communications, July 2009: • “Thus it can be reasoned that cold-hit cases in which the suspect is identified in the absence of valid alibis for not having access to the crime scene, a DNA match can and should be quantified by RMP alone without any additional changes” • We have argued (Meester and Sjerps 2003, 2004) that RMP alone is misleading
ENFSI document on DNA database management ENFSI-recommendation 22 • A DNA-database match report of a crime scene related DNA-profile with a person should be informative and apart from the usual indication of the evidential value of the match (RMP) it should also contain a warning indicating the possibility of finding adventitious matches (as mentioned in recommendation 21) and its implication that the match should be considered together with other information.
Box in NL database match reports • …As the number of DNA-profiles in the database increases, the probability of observing a match with a person who is not the stain donor also increases. The profile of this person matches “coincidentally” with the profile of the trace. • …One has to take this into account especially when a DNA-database match is observed involving an incomplete or mixture profile…. • …For assessing the possibility of an adventitious match it is important whether there is other tactical or technical evidence that associates this person with the crime. • …More information is available in “The essentials of forensic DNA testing” [NFI practical professional annex for jurists, also available in English, contact l.meulenbroek@nfi.minjus.nl]
Conclusion 3 • It is misleading to report a DNA database match by mentioning only an RMP • The report should include a warning (recommendations 21 and 22 of the ENFSI report) • The NFI and the custodian of the NL national database include a “point of attention” box in their reports