Lecture 16: Individual Identity and Paternity Analysis. March 7, 2014. Last Time. Interpretation of F-statistics More on the Structure program Principal Components Analysis Population assignment Individual identity (in lab). Today. Population assignment examples
Lecture 16: Individual Identity and Paternity Analysis March 7, 2014
Last Time • Interpretation of F-statistics • More on the Structure program • Principal Components Analysis • Population assignment • Individual identity (in lab)
Today • Population assignment examples • More on forensic evidence and individual identity • Introduction to paternity analysis
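Population Assignment: Likelihood • "Assignment tests" are based on allele frequencies in source populations and the genetic composition of individuals • Likelihood of the genotype at locus k given origin in population l (hypothesis $H_l$), where $p_{ilk}$ is the frequency of allele $A_i$ in population l at locus k: $P(G_k \mid H_l) = p_{ilk}^2$ for homozygote $A_iA_i$ in population l at locus k; $P(G_k \mid H_l) = 2p_{ilk}p_{jlk}$ for heterozygote $A_iA_j$ in population l at locus k • For m loci: $P(G \mid H_l) = \prod_{k=1}^{m} P(G_k \mid H_l)$ • Individual likelihood is often summarized for each hypothesis Hn as -log10(P(G|Hn)) • Low values mean the genotype is more probable under Hn • Likelihoods can be plotted against each other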
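The assignment likelihood can be sketched in a few lines. This is a minimal illustration, assuming Hardy-Weinberg genotype frequencies (p² for homozygotes, 2pq for heterozygotes) multiplied across independent loci. The function name, allele labels, and allele frequencies are all hypothetical, and alleles with zero frequency in a population are not handled.

```python
import math

def neg_log10_likelihood(genotype, freqs):
    """Return -log10 P(G | H) for one individual under one candidate population.

    genotype: list of (allele, allele) tuples, one per locus
    freqs:    list of dicts, one per locus, mapping allele -> frequency
    Assumes every observed allele has a nonzero frequency in `freqs`.
    """
    total = 0.0
    for (a, b), locus_freqs in zip(genotype, freqs):
        pa, pb = locus_freqs[a], locus_freqs[b]
        p = pa * pa if a == b else 2 * pa * pb  # HWE genotype frequency
        total += -math.log10(p)                 # summing logs = product over loci
    return total

# Hypothetical allele frequencies for two source populations at two loci
mainland = [{"A": 0.7, "B": 0.3}, {"C": 0.5, "D": 0.5}]
island = [{"A": 0.1, "B": 0.9}, {"C": 0.2, "D": 0.8}]
individual = [("B", "B"), ("C", "D")]

scores = {
    "mainland": neg_log10_likelihood(individual, mainland),
    "island": neg_log10_likelihood(individual, island),
}
# The lowest -log10 likelihood marks the most probable source population
assigned = min(scores, key=scores.get)
print(scores, assigned)
```

Plotting the two scores against each other for every sampled individual gives the kind of scatter used in assignment-test figures.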
Population Assignment Example: Wolf Populations in Northwest Territories • Wolves sampled from island and mainland populations in the Canadian Northwest Territories • Immigrants from Banks Island (white circles) detected on the mainland (black circles) [Figure: -log likelihood from Mainland plotted against -log likelihood from Banks Island] Carmichael et al. 2001 Mol Ecol 10:2787
Population Assignment Example: Fish Stories • Fishing competition on Lake Saimaa in southeast Finland • Contestant allegedly caught a 5.5 kg salmon, much larger than usual for the lake • Officials compared fish from the lake to fish from local markets (originating from Norway and the Baltic Sea) • 7 microsatellites • Based on likelihood analysis, the fish was purchased rather than caught in the lake [Figure labels: Lake Saimaa, Market]
Individual Identity: Likelihood • Assume you find skin cells and blood under fingernails of a murder victim • A hitman for the Sicilian mafia is seen exiting the apartment • You gather DNA evidence from the skin cells and from the suspect • They have identical genotypes • What is the likelihood that the evidence came from the suspect? • What is H1 and what is H2?
Match Probability • Probability of observing a genotype at locus k by chance in the population is a function of allele frequencies: homozygote $A_iA_i$: $P_k = p_i^2$; heterozygote $A_iA_j$: $P_k = 2p_ip_j$ • For m loci: $P = \prod_{k=1}^{m} P_k$ • Assumes unlinked (independent) loci and Hardy-Weinberg equilibrium
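As a rough illustration of the match probability, the sketch below multiplies Hardy-Weinberg genotype frequencies across loci, which is only valid under the stated assumptions of independent loci and Hardy-Weinberg equilibrium. The function names, allele labels, and frequencies are hypothetical.

```python
from math import prod

def locus_match_probability(a, b, freqs):
    """HWE frequency of genotype (a, b) at one locus: p^2 or 2pq."""
    pa, pb = freqs[a], freqs[b]
    return pa * pa if a == b else 2 * pa * pb

def profile_match_probability(profile, freqs_by_locus):
    """Product of per-locus genotype frequencies (assumes unlinked loci)."""
    return prod(
        locus_match_probability(a, b, freqs)
        for (a, b), freqs in zip(profile, freqs_by_locus)
    )

# Hypothetical allele frequencies and evidence profile at two loci
freqs_by_locus = [
    {"A1": 0.1, "A2": 0.9},
    {"B1": 0.2, "B2": 0.3, "B3": 0.5},
]
profile = [("A1", "A1"), ("B1", "B2")]

# Homozygote: 0.1^2 = 0.01; heterozygote: 2 * 0.2 * 0.3 = 0.12
match_p = profile_match_probability(profile, freqs_by_locus)
print(match_p)
```

Each added locus multiplies in another factor below 1, so the match probability shrinks quickly as loci are added.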
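Probability of Identity • Probability that 2 randomly selected individuals have the same profile at locus k: $P_{ID,k} = \sum_i p_i^4 + \sum_i \sum_{j>i} (2p_ip_j)^2$ (first term: homozygotes; second term: heterozygotes) • For m loci: $P_{ID} = \prod_{k=1}^{m} P_{ID,k}$ • Exclusion Probability (E): $E = 1 - P_{ID}$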
What if the slimy mob defense attorney argues that the most likely perpetrator is the mob hitman’s brother, who has conveniently “disappeared”? Does the general match probability apply to near relatives?
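Probability of Identity for Full Sibs • At a locus, full sibs share 2, 1, or 0 alleles identical by descent (IBD) with probabilities 1/4, 1/2, and 1/4 • 2 alleles IBD: genotypes match with probability 1 • 1 allele IBD: genotypes match with probability $\sum_i p_i^2$ • 0 alleles IBD: genotypes match with the probability for unrelated individuals, $\sum_i p_i^4$ (homozygotes) $+ \sum_i \sum_{j>i} (2p_ip_j)^2$ (heterozygotes) • General Probability of Identity for Full Sibs: $P_{IDsib} = 0.25 + 0.5\sum_i p_i^2 + 0.5\left(\sum_i p_i^2\right)^2 - 0.25\sum_i p_i^4$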
For a locus with 5 alleles, each at a frequency of 0.2: PID = 0.072 (probability of identity for unrelated individuals); PIDsib = 0.368 (probability of identity for full sibs)
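The two values for the 5-allele locus can be checked numerically. This is a small sketch, assuming Hardy-Weinberg proportions and the usual IBD weights for full sibs (1/4 share two alleles, 1/2 share one, 1/4 share none). The function names are illustrative.

```python
def pid_unrelated(p):
    """Probability two unrelated individuals share a genotype at one locus."""
    n = len(p)
    homozygotes = sum(x ** 4 for x in p)
    heterozygotes = sum(
        (2 * p[i] * p[j]) ** 2 for i in range(n) for j in range(i + 1, n)
    )
    return homozygotes + heterozygotes

def pid_sibs(p):
    """Probability two full sibs share a genotype at one locus.

    2 alleles IBD (1/4): always identical
    1 allele IBD  (1/2): identical if the other alleles match, sum(p^2)
    0 alleles IBD (1/4): same as unrelated individuals
    """
    s2 = sum(x ** 2 for x in p)
    return 0.25 + 0.5 * s2 + 0.25 * pid_unrelated(p)

freqs = [0.2] * 5  # 5 alleles, each at frequency 0.2
print(round(pid_unrelated(freqs), 3))  # 0.072
print(round(pid_sibs(freqs), 3))       # 0.368
```

Even for a locus with five equally common alleles, full sibs are about five times more likely than unrelated individuals to share a genotype, which is why the relatedness of alternative suspects matters.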
NRC (1996) recommendations • Use the population that provides the highest probability of observing the genotype (unless other information is known) • Correct homozygous genotypes for substructure within the selected population (e.g., Native Americans, Hispanics, African Americans, Caucasians, Asian Americans): homozygotes $P(A_iA_i) = p_i^2 + p_i(1-p_i)\theta$, where $\theta$ measures substructure (equivalent to $F_{ST}$) • No correction for heterozygotes: $P(A_iA_j) = 2p_ip_j$
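A minimal sketch of the substructure correction follows, assuming the form in which homozygote frequencies are inflated by p(1-p)θ and heterozygote frequencies are left at 2pq. The function name, allele labels, frequencies, and the value θ = 0.01 are illustrative assumptions, not values taken from the lecture.

```python
def corrected_genotype_frequency(a, b, freqs, theta=0.01):
    """Genotype frequency with a substructure correction for homozygotes only.

    theta is the substructure parameter (F_ST-like); 0.01 is illustrative.
    """
    pa, pb = freqs[a], freqs[b]
    if a == b:
        # Substructure creates excess homozygotes, so add p(1-p)*theta
        return pa * pa + pa * (1 - pa) * theta
    # Heterozygotes are left uncorrected
    return 2 * pa * pb

freqs = {"A1": 0.1, "A2": 0.9}  # hypothetical allele frequencies

uncorrected = freqs["A1"] ** 2
corrected = corrected_genotype_frequency("A1", "A1", freqs)
het = corrected_genotype_frequency("A1", "A2", freqs)
print(uncorrected, corrected, het)
```

The correction matters most for rare alleles: here the homozygote frequency rises from 0.01 to about 0.0109, a 9% relative increase.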
Why is it ‘conservative’ (from the standpoint of proving a match) to ignore substructure for heterozygotes?
Example: World Trade Center Victims • Match victims using DNA collected from toothbrushes, hair brushes, or relatives • Exact matches not guaranteed • Why not? • Use likelihood to match samples to victims
A series of little NBA prospects are born to ardent basketball fans in every city with an NBA team. The mothers regularly allege that the fathers are NBA stars from visiting teams. The “players” deny this allegation. Can this be resolved using molecular markers and population genetics methodologies?
Paternity Exclusion Analysis • Determine multilocus genotypes of all mothers, offspring, and potential fathers • Determine paternal gamete by “subtracting” maternal genotype from that of each offspring. • Infer paternity by comparing the multilocus genotype of all gametes to those of all potential males in the population • Assign paternity if all potential males, except one, can be excluded on the basis of genetic incompatibility with the observed pollen gamete genotype • Unsampled males must be considered
Paternity Exclusion • First step is to determine the paternal contribution based on seedling alleles that do not match the mother • Notice that at locus 3 both alleles match the mother, so there are two potential paternal contributions • Male 3 is the putative father because he is the only one that matches the paternal contributions at all loci [Table: match of Males 1 to 5 to the paternal contribution. Locus 1: NO, NO, YES, YES, YES. Locus 2: NO, NO, YES, NO, YES. Locus 3: YES, NO, YES, YES, NO.]
Paternity Exclusion Analysis • Possible outcomes and their consequences: • Only one male cannot be excluded: paternity is assigned • More than one male cannot be excluded: analyze more loci • All males are excluded: conclude there is migration from external sources [Figure: diagrams of a female and candidate males for each outcome]
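The exclusion procedure and its three outcomes can be sketched as follows. This is an illustration under simplifying assumptions (codominant markers, no mutation, no genotyping error, known mother). All function names, allele labels, and genotypes are hypothetical.

```python
def paternal_alleles(mother, offspring):
    """Alleles the father could have contributed at one locus.

    'Subtracts' the maternal genotype from the offspring genotype. If both
    offspring alleles occur in the mother, either could be paternal.
    An empty set means mother and offspring are incompatible at this locus.
    """
    a, b = offspring
    candidates = set()
    if a in mother:
        candidates.add(b)  # a from the mother, so b from the father
    if b in mother:
        candidates.add(a)  # b from the mother, so a from the father
    return candidates

def non_excluded_males(mother, offspring, males):
    """Names of males carrying a possible paternal allele at every locus."""
    return [
        name
        for name, genotype in males.items()
        if all(
            paternal_alleles(m, o) & set(f)
            for m, o, f in zip(mother, offspring, genotype)
        )
    ]

def interpret(remaining):
    """Map the number of non-excluded males to the consequence."""
    if len(remaining) == 1:
        return "assign paternity to " + remaining[0]
    if remaining:
        return "analyze more loci"
    return "all males excluded: father is outside the sample"

# Hypothetical three-locus genotypes
mother = [("A", "B"), ("C", "C"), ("E", "F")]
offspring = [("A", "D"), ("C", "G"), ("E", "F")]
males = {
    "male1": [("A", "B"), ("C", "C"), ("E", "E")],
    "male2": [("D", "D"), ("C", "G"), ("H", "H")],
    "male3": [("B", "D"), ("G", "G"), ("F", "H")],
}

remaining = non_excluded_males(mother, offspring, males)
print(remaining, "->", interpret(remaining))
```

At the third locus both offspring alleles match the mother, so a male carrying either E or F is compatible there.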
Probabilities of Paternity Exclusion, Single Locus, 2 alleles, codominant • The paternity exclusion probability is the sum of the probabilities of all exclusionary combinations (Hedrick 2005) • Probability that a falsely accused male fails to match for at least one of m loci, where $Pr_k$ is the exclusion probability at locus k: $Pr = 1 - \prod_{k=1}^{m} (1 - Pr_k)$ • See Chakraborty et al. 1988 Genetics 118:527 for a more general calculation of exclusion power
Alleles versus Loci • For a given number of alleles: one locus with many alleles provides more exclusion power than many loci with few alleles • 10 loci, 2 alleles, Pr = 0.875 • 1 locus, 20 alleles, Pr=0.898 • Uniform allele frequencies provide more power
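The two exclusion probabilities quoted above can be reproduced with a short sketch. It assumes equal allele frequencies at every locus and a known mother. The single-locus function uses a general closed form usually attributed to Jamieson and Taylor (1997), which is swapped in here as an assumption and is not necessarily the calculation used in the lecture. Loci are combined as 1 minus the product of the per-locus non-exclusion probabilities.

```python
def single_locus_exclusion(p):
    """Exclusion probability at one codominant locus, mother known.

    p: list of allele frequencies summing to 1.
    Closed form assumed from Jamieson and Taylor (1997).
    """
    s2, s3, s4, s5 = (sum(x ** k for x in p) for k in (2, 3, 4, 5))
    return 1 - 2 * s2 + s3 + 2 * s4 - 3 * s5 - 2 * s2 ** 2 + 3 * s2 * s3

def combined_exclusion(per_locus):
    """Probability a falsely accused male is excluded at one or more loci."""
    not_excluded = 1.0
    for pr in per_locus:
        not_excluded *= 1 - pr
    return 1 - not_excluded

two_allele_locus = single_locus_exclusion([0.5, 0.5])  # 0.1875
ten_loci = combined_exclusion([two_allele_locus] * 10)
one_locus_20_alleles = single_locus_exclusion([0.05] * 20)

print(round(ten_loci, 3))              # 0.875
print(round(one_locus_20_alleles, 3))  # 0.898
```

Both configurations involve 20 alleles in total, yet the single highly polymorphic locus gives slightly more exclusion power than ten biallelic loci.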
Characteristics of an ideal genetic marker for paternity analysis • Highly polymorphic (i.e., with many alleles) • Codominant • Reliable • Low cost • Easy to use for genotyping large numbers of individuals • Mendelian or paternal inheritance
Shortcomings of Paternity Exclusion • Requiring exact matches for potential fathers is excessively stringent: mutation and genotyping error can both cause false exclusions • Multiple males may match, but probability of match may differ substantially • No built-in way to deal with cryptic gene flow: the case in which a sampled male matches, but an unsampled male may also match (Type I error: wrong father assigned paternity)
Advantages and Disadvantages of Likelihood • Advantages: • Flexibility: can be extended in many ways • Compensates for errors in genotyping • Incorporates factors influencing mating success: fecundity, distance, and direction • Compensates for lack of exclusion power • Allows fractional paternity • Disadvantages: • Often results in ambiguous paternities • Difficult to determine the proper cutoff for the LOD score
Summary • Direct assessment of movement is the best way to measure gene flow • Parentage analysis is a powerful approach to track movements of mates retrospectively • Paternity exclusion is straightforward to apply but may lack power and is confounded by genotyping error • Likelihood-based approaches can be more flexible, but also provide ambiguous answers when power is lacking