
The birthday problem and database searches: The implications of relatives in offender databases

Jason R. Gilder, 8/12/2006



Presentation Transcript


  1. The birthday problem and database searches: The implications of relatives in offender databases
     Jason R. Gilder, 8/12/2006

  2. The birthday problem
     • What is the probability that a randomly picked person has your birthday?
     • If births are evenly distributed, the probability of picking someone with your birthday is 1 in 365
     • Similar to the random match probability

  3. The birthday paradox
     • What if we change the question to: how many people do we need to search before we find two people with the same birthday?
     • Many more comparisons
     • Will find a match much more quickly
     • Greater than a 50% chance of finding a match with 23 individuals
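The 23-person figure can be checked directly. The following is a minimal sketch, assuming 365 equally likely birthdays (the same assumption the previous slide makes); the function names are illustrative and not from the presentation.

```python
def shared_birthday_probability(n, days=365):
    """Probability that at least two of n people share a birthday."""
    p_all_distinct = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


def smallest_group(threshold=0.5, days=365):
    """Smallest group size whose shared-birthday probability exceeds threshold."""
    n = 1
    while shared_birthday_probability(n, days) <= threshold:
        n += 1
    return n


print(smallest_group())                  # 23
print(shared_birthday_probability(23))   # about 0.507
```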

  4. Combinatorial calculations
     • How many pairs are present with n individuals?
     • 23 individuals => 253 pairs
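The transcript does not carry the slide's formula. The standard count of unordered pairs among n individuals, which reproduces the 253 figure, is:

```latex
\binom{n}{2} = \frac{n(n-1)}{2},
\qquad
\binom{23}{2} = \frac{23 \cdot 22}{2} = 253 .
```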

  5. Birthday paradox plot

  6. Database searches
     • The birthday paradox is similar to a database search
     • Arizona performed a complete pairwise search of its DNA database containing 65,493 profiles
     • 144 pairs of individuals matching at 9+ loci
     • Related individuals likely exist in DNA databases
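To make the scale of a complete pairwise search concrete, here is a small sketch. It assumes each profile is a list of genotypes, one per locus, with each genotype stored as a sorted pair of alleles; that representation, the function names, and the toy data are all hypothetical and not taken from the Arizona search.

```python
from itertools import combinations
from math import comb


def matching_loci(profile_a, profile_b):
    """Count the loci at which two profiles carry the same genotype."""
    return sum(1 for ga, gb in zip(profile_a, profile_b) if ga == gb)


def pairwise_matches(database, min_loci):
    """Return index pairs whose profiles match at min_loci or more loci."""
    hits = []
    for (i, a), (j, b) in combinations(enumerate(database), 2):
        if matching_loci(a, b) >= min_loci:
            hits.append((i, j))
    return hits


# Toy three-locus database (made-up alleles, for illustration only).
toy_db = [
    [(12, 14), (8, 9), (15, 15)],
    [(12, 14), (8, 9), (15, 16)],
    [(11, 13), (7, 9), (15, 15)],
]

print(pairwise_matches(toy_db, 2))  # [(0, 1)]

# Number of comparisons in a full pairwise search of 65,493 profiles.
total_pairs = comb(65493, 2)
print(total_pairs)                  # 2144633778
```

A database of 65,493 profiles therefore involves more than two billion comparisons, which is why coincidental partial matches appear far more often than a single one-to-one comparison would suggest.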

  7. Simulation studies
     • Use FBI published Caucasian genotypes
     • Generate database of randomized (unrelated) individuals
     • Create and add pairs of related individuals
        • Siblings
        • Parent-child
        • Half-siblings
        • Cousins
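The steps on this slide can be sketched as follows. This is an illustration only, not the author's simulation code: the two-locus allele frequencies are made up (a real study would load the published frequency tables), only sibling pairs are generated, and all names are hypothetical.

```python
import random

# Hypothetical allele frequencies for two loci (NOT the FBI data).
ALLELE_FREQS = [
    {8: 0.2, 9: 0.3, 10: 0.5},
    {12: 0.25, 13: 0.25, 14: 0.5},
]


def random_allele(freqs, rng):
    """Draw one allele according to its population frequency."""
    alleles = list(freqs)
    return rng.choices(alleles, weights=[freqs[a] for a in alleles])[0]


def genotype(a, b):
    """Store a genotype as a sorted pair so allele order never matters."""
    return tuple(sorted((a, b)))


def random_individual(rng):
    """Unrelated individual: two independent allele draws per locus."""
    return [genotype(random_allele(f, rng), random_allele(f, rng))
            for f in ALLELE_FREQS]


def child_of(mother, father, rng):
    """Each parent passes on one of their two alleles per locus at random."""
    return [genotype(rng.choice(m), rng.choice(f))
            for m, f in zip(mother, father)]


def sibling_pair(rng):
    """Two children of the same pair of unrelated parents."""
    mother, father = random_individual(rng), random_individual(rng)
    return child_of(mother, father, rng), child_of(mother, father, rng)


def build_database(n_unrelated, n_sibling_pairs, seed=0):
    """Unrelated individuals followed by the requested sibling pairs."""
    rng = random.Random(seed)
    db = [random_individual(rng) for _ in range(n_unrelated)]
    for _ in range(n_sibling_pairs):
        db.extend(sibling_pair(rng))
    return db
```

Parent-child, half-sibling, and cousin pairs could be produced with the same `child_of` helper by changing which simulated parents are shared.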

  8. Unrelated individuals
     • Here, the database of 65k has 109 pairs of individuals matching at 9+ loci

  9. The size of the database vs. the number of matching profiles by alleles
     • Here, the database of 65k has 139 pairs of individuals matching at 21+ alleles
     • Florida’s threshold for a familial search

  10. Effect of adding siblings to a database of 10,000 individuals
     • Additional 35 matches at 9+ loci with ~1,000 sibling pairs

  11. The number of pairs of siblings vs. the number of matching profiles by alleles
     • Database of 10,000 individuals

  12. Regression analysis

  13. The number of 9+ locus matching profiles within databases containing different sibling ratios

  14. The number of 21+ allele matching profiles within databases containing different sibling ratios

  15. Effect of different degrees of related individuals (9+ locus matches)
     • Databases contain 10% related individuals
     • Results are averaged over 5 replicates of the database (average, standard deviation)

  16. Effect of different degrees of related individuals (21+ allele matches)
     • Databases contain 10% related individuals
     • Results are averaged over 5 replicates of the database (average, standard deviation)

  17. Theoretical model
     • Current work is to develop a mathematical model to estimate the number of sibling pairs found in a database
     • Use the repeat rate to determine the probability of siblings sharing the same genotype
     • Identity by descent and identity by state
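The slides do not show the model itself. As a point of reference for the identity-by-descent bullet, a standard population-genetics result gives the chance that two full siblings share the same genotype at one locus. It assumes unrelated parents, Hardy-Weinberg proportions, and allele frequencies p_i at the locus; the symbols are assumptions introduced here, and this may not be the formulation the author uses.

```latex
% Full siblings share 2, 1, or 0 alleles identical by descent (IBD)
% with probabilities 1/4, 1/2, and 1/4.
% Let S_2 = \sum_i p_i^2 and S_4 = \sum_i p_i^4.
\begin{aligned}
P(\text{same genotype} \mid \text{2 IBD}) &= 1, \\
P(\text{same genotype} \mid \text{1 IBD}) &= S_2, \\
P(\text{same genotype} \mid \text{0 IBD}) &= 2S_2^2 - S_4, \\
P(\text{same genotype}) &= \tfrac{1}{4}\left(1 + 2S_2 + 2S_2^2 - S_4\right).
\end{aligned}
```

The 0-IBD term is the probability that two unrelated genotypes coincide (identity by state only), so the expression is always above that unrelated baseline.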

  18. Estimating number of sib pairs
     • Number of sibling pairs needed to find an 11+ locus match
     • At least 5% chance: 460 pairs
     • Experimentally, we found an 11+ locus match with a database of 10,000 individuals and 500 pairs of siblings

  19. Estimating number of sib pairs
     • Number of sibling pairs needed to find a 12+ locus match
     • At least 5% chance: 696 pairs
     • Experimentally, we found one 12+ locus match with a database of 10,000 individuals and 1,000 pairs of siblings
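The slides give the pair counts but not the calculation behind them. One plausible reading, offered here as an assumption rather than the author's method, is that each sibling pair independently produces a match with some fixed probability p, so the chance of at least one match among n pairs is 1 - (1 - p)^n. The sketch below converts between p and n under that assumption; the function names are hypothetical.

```python
import math


def pairs_needed(p_match, target=0.05):
    """Smallest number of independent pairs giving at least `target`
    probability of one or more matches (assumes independent pairs)."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p_match))


def implied_pair_probability(n_pairs, target=0.05):
    """Per-pair match probability at which n_pairs pairs reach `target`
    (same independence assumption)."""
    return 1.0 - (1.0 - target) ** (1.0 / n_pairs)


# Per-pair probabilities implied by the slide figures under this reading.
print(implied_pair_probability(460))  # 11+ locus match
print(implied_pair_probability(696))  # 12+ locus match
```

Under this reading, 460 pairs for a 5% chance corresponds to a per-pair probability of roughly 1 in 9,000 for an 11+ locus match.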

  20. Questions?
     Jason Gilder
     Forensic Bioinformatics
     www.bioforensics.com
