Improving Forensic Identification in Bayesian Networks: Accounting for Population Substructure


Presentation Transcript


  1. Improving Forensic Identification in Bayesian Networks: Accounting for Population Substructure • Amanda B. Hepler

  2. Outline • Population Substructure (PS) • Bayesian Networks • Introduction • Incorporating PS into Paternity Networks • Example

  3. What Is Population Substructure? • Any deviation from random mating • Commonly due to geographical subdivision • Mating pairs often have remote relatives in common • Inbreeding coefficient (θ): measures the extent of common ancestry

  4. Why Should We Account for PS? • Ignoring PS “would unfairly overstate the strength of the evidence against the defendant.” (Balding & Nichols, 1995) • “If the allele frequencies for the subgroup are not available…[forensic] calculations should use the population-structure equations.” (1996 NRC Report)

  5. Assumptions • Population allele frequencies are known • Inbreeding coefficient is known • Loci are independent • Within a subpopulation: • Mating is random • Migration and mutation events independent and constant

  6. What is a Bayesian Network (BN)? • A graphical model that expresses probabilistic relationships among variables or events [1] • HUGIN was used to create the BNs; a free version is available at http://www.hugin.dk • Graphical portion: Hair Color → Eye Color • Numerical portion, P(Hair Color): Red 0.5, Brown 0.5 • Numerical portion, P(Eye Color | Hair Color): Blue is 0.2 given Red and 0.9 given Brown; Green is 0.8 given Red and 0.1 given Brown
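The hair and eye color example is small enough to evaluate by hand, which makes it a useful check on what a BN tool automates. The sketch below is not HUGIN code; it is plain Python that assumes the numbers on the slide are read as P(Hair Color) and P(Eye Color | Hair Color), and the function names are made up for illustration.

```python
# Toy two-node network from the slide: Hair Color -> Eye Color.
# Assumption: the numerical portion is P(hair) and P(eye | hair).
p_hair = {"Red": 0.5, "Brown": 0.5}
p_eye_given_hair = {
    "Red": {"Blue": 0.2, "Green": 0.8},
    "Brown": {"Blue": 0.9, "Green": 0.1},
}


def eye_marginal(eye):
    """P(eye), obtained by summing over the parent node (hair color)."""
    return sum(p_hair[h] * p_eye_given_hair[h][eye] for h in p_hair)


def hair_posterior(eye):
    """P(hair | eye) by Bayes' theorem, i.e. reasoning against the arrow."""
    z = eye_marginal(eye)
    return {h: p_hair[h] * p_eye_given_hair[h][eye] / z for h in p_hair}


if __name__ == "__main__":
    print("P(Blue) =", eye_marginal("Blue"))
    print("P(hair | Blue) =", hair_posterior("Blue"))
```

Observing blue eyes shifts most of the probability to brown hair, which is the kind of evidence propagation the later paternity networks rely on.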

  7. Why Use Bayesian Networks? BNs provide: • Simple representations of complex problems • Automation of complex algebraic manipulations • Communication aid

  8. Notation for Paternity Case • M = mother, C = child, PF = putative father • Hp: PF is the father of C • Hd: Some other man is the father of C • Likelihood ratio, or paternity index (PI): $PI = \Pr(E \mid H_p) / \Pr(E \mid H_d)$, where E is the observed genotype evidence • Interpretation: “The evidence is PI times more probable if PF is the father of C than if some other man is the father.”

  9. Genotype and Allele Nodes • One locus, two alleles: A1 and A2 • Observe genotypes of M, C, and PF • Genotype nodes have states A1A1, A1A2, A2A2 • Allele/gene nodes have states A1, A2 • (Figure: nodes for Mother, Putative Father, and Child)

  10. Probability Tables for Genotype and Allele Nodes • (Figure labels: PF’s Maternal Gene, PF’s Paternal Gene, PF’s Genotype)
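The probability tables on this slide did not survive in the transcript. As a rough sketch only, and assuming the usual construction in which a genotype node is a deterministic function of its two gene nodes and a founder gene is A1 with probability p1, the tables could be built as follows. All names here are hypothetical.

```python
from itertools import product

ALLELES = ("A1", "A2")
GENOTYPES = ("A1A1", "A1A2", "A2A2")


def genotype(maternal, paternal):
    """Unordered genotype label, e.g. ('A2', 'A1') -> 'A1A2'."""
    return "".join(sorted((maternal, paternal)))


def genotype_table():
    """P(genotype | maternal gene, paternal gene): every entry is 0 or 1."""
    return {
        (m, p): {g: 1.0 if genotype(m, p) == g else 0.0 for g in GENOTYPES}
        for m, p in product(ALLELES, repeat=2)
    }


def founder_gene_table(p1):
    """Prior for a founder gene when there is no population substructure."""
    return {"A1": p1, "A2": 1.0 - p1}


if __name__ == "__main__":
    for parents, row in genotype_table().items():
        print(parents, row)
    print(founder_gene_table(0.10))
```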

  11. Original Paternity Network [2] • Hypothesis node with states Yes, No • [2] A.P. Dawid, J. Mortera, V.L. Pascali, and D. Van Boxel. Probabilistic expert systems for forensic inference from genetic markers. Scandinavian Journal of Statistics, 29:577-595, 2002.

  12. Accounting for Population Substructure • Probability of allele Ai depends on how many Ai have already been observed • Modified allele frequencies [3]: $\Pr(A_i \mid n_i \text{ of } n \text{ observed}) = \frac{n_i\theta + (1-\theta)p_i}{1 + (n-1)\theta}$ • pi = frequency of the ith allele in the population • ni = number of observed alleles of type Ai • n = total number of alleles observed • [3] D.J. Balding and R.A. Nichols. DNA profile match probability calculation. Forensic Science International, 64(2-3):125-140, 1994.
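A minimal sketch of the modified allele frequency, assuming the commonly quoted Balding and Nichols form (n_i θ + (1 − θ) p_i) / (1 + (n − 1) θ). The function name is hypothetical, and the example values are the ones used in the hand calculation later in the talk (θ = 0.03, p1 = 0.10).

```python
def modified_allele_freq(p_i, n_i, n, theta):
    """Probability that the next sampled allele is A_i.

    p_i   : population frequency of allele A_i
    n_i   : number of A_i alleles already observed
    n     : total number of alleles already observed
    theta : inbreeding coefficient (assumed known, with 0 <= theta < 1)
    """
    if not 0 <= n_i <= n:
        raise ValueError("need 0 <= n_i <= n")
    return (n_i * theta + (1 - theta) * p_i) / (1 + (n - 1) * theta)


if __name__ == "__main__":
    # Nothing observed yet: the formula reduces to the population frequency.
    print(modified_allele_freq(0.10, 0, 0, 0.03))
    # Three A1 among four observed alleles makes another A1 more likely.
    print(modified_allele_freq(0.10, 3, 4, 0.03))
```

With θ = 0 the function returns p_i regardless of what has been observed, which is the no-substructure case.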

  13. New Network Nodes • Nodes for θ and p1 • Keep track of founder genes • Counting nodes: one node holds the value of n1 after founder 2, the next holds the value of n1 after founder 3, etc.

  14. Population Substructure Network

  15. Population Substructure Network

  16. Population Substructure Network

  17. New Probability Table now depends on

  18. Paternity Calculations By Hand • θ = 0.03, p1 = 0.10 • M = A1A1, C = A1A1, PF = A1A2 • PI for this case [4]: $PI = \frac{1 + 3\theta}{2[3\theta + (1-\theta)p_1]} = \frac{1.09}{0.374} \approx 2.91$ • [4] I.W. Evett and B.S. Weir. Interpreting DNA Evidence. Sinauer, Sunderland, MA, 1998.
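The hand calculation can be checked in a few lines. This sketch assumes the standard reasoning for this case: under Hp, PF (A1A2) passes A1 with probability 1/2; under Hd, the unknown father's A1 is drawn with the modified allele frequency after three A1 have been seen among the four alleles of M and PF. The function name is hypothetical.

```python
def paternity_index(theta, p1):
    """PI for M = A1A1, C = A1A1, PF = A1A2 at one two-allele locus."""
    # Hp: PF is the father and transmits A1 with probability 1/2.
    prob_under_hp = 0.5
    # Hd: another man's allele is A1, given n1 = 3 of n = 4 observed alleles.
    prob_under_hd = (3 * theta + (1 - theta) * p1) / (1 + 3 * theta)
    return prob_under_hp / prob_under_hd


if __name__ == "__main__":
    print(round(paternity_index(theta=0.03, p1=0.10), 2))  # 2.91
```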

  19. Paternity Calculations Using HUGIN This same result can be obtained using HUGIN:
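The HUGIN output is not reproduced in the transcript. As a stand-in, the sketch below does not use HUGIN at all: it enumerates every configuration of a small paternity network in plain Python. It assumes six founder genes (two each for M, PF, and an alternative father), drawn one after another with the modified allele frequency so that a running count of A1 plays the role of the counting nodes. The ordering of founders and all names are assumptions made for illustration.

```python
from itertools import product


def next_gene_prob(is_a1, n1, n, theta, p1):
    """P(next founder gene) given n1 copies of A1 among n genes seen so far."""
    p_a1 = (n1 * theta + (1 - theta) * p1) / (1 + (n - 1) * theta)
    return p_a1 if is_a1 else 1 - p_a1


def founders_prob(genes, theta, p1):
    """Joint probability of an ordered tuple of founder genes (1 = A1, 2 = A2)."""
    prob, n1 = 1.0, 0
    for n, g in enumerate(genes):
        prob *= next_gene_prob(g == 1, n1, n, theta, p1)
        n1 += g == 1  # running count, analogous to a counting node
    return prob


def evidence_prob(pf_is_father, theta, p1):
    """P(M = A1A1, PF = A1A2, C = A1A1 | hypothesis) by full enumeration."""
    total = 0.0
    for genes in product((1, 2), repeat=6):
        m, pf, alt = genes[0:2], genes[2:4], genes[4:6]
        if sorted(m) != [1, 1] or sorted(pf) != [1, 2]:
            continue  # inconsistent with the observed genotypes
        father = pf if pf_is_father else alt
        weight = founders_prob(genes, theta, p1)
        # The child receives one gene from each parent, each with prob. 1/2.
        for child_mat, child_pat in product(m, father):
            if child_mat == 1 and child_pat == 1:
                total += weight * 0.25
    return total


def paternity_index(theta, p1):
    return evidence_prob(True, theta, p1) / evidence_prob(False, theta, p1)


if __name__ == "__main__":
    print(round(paternity_index(0.03, 0.10), 2))  # 2.91
```

Brute-force enumeration is only practical for a network this small; a BN engine exploits the graph structure to avoid summing over every configuration.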

  20. Effect of Introducing θ • Assume no population substructure (θ = 0): $PI = 1/(2p_1) = 5.00$ • The PI of 2.91 obtained with θ = 0.03 is more “conservative” than 5.00

  21. Other Examples Considered • Multiple loci case: • Assume loci are independent • Multiply the per-locus PIs • Multiple allele case: • M and PF have at most four distinct alleles • Missing father case: • Brother’s genotype available

  22. Areas for Future Research • Apply same methodology to other BNs: • Mutation • Cross-Transfer Evidence • Mixtures • Remains Identification • Software improvements • Need software for the forensic scientist • Improvements needed for run time
