Explore the advantages of CODIS and SNP genotyping in record linkage, addressing privacy concerns and backward compatibility. Learn about the implications and benefits of combining SNP and CODIS data for improved accuracy and linkage.
Record linkage of CODIS profiles with SNP genotypes. Doc Edge, February 20th, 2018, NIJ R&D seminar
Acknowledgments Other Collaborators: Jaehee Kim, Jun Li. Discussion: Arbel Harpak, Rosenberg Lab. Funding: National Institute of Justice, Stanford Graduate Fellowship. Bridget Algee-Hewitt, Noah Rosenberg
Overview CODIS The rest of the genome
Advantages of CODIS • Highly diverse, so low false match probability. • Possible to get (nearly) unique identifiers with ‘90s-era technology.
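The link between diversity and a low false match probability can be illustrated with a small calculation. This is a minimal sketch under assumptions that are not in the slides: loci are treated as independent, genotypes follow Hardy-Weinberg proportions, and the allele frequencies are made-up numbers rather than real CODIS values. The function names are hypothetical.

```python
from itertools import combinations_with_replacement
from math import prod


def genotype_match_probability(allele_freqs):
    """Probability that two unrelated people share a genotype at one locus.

    Assumes Hardy-Weinberg proportions (an assumption of this sketch).
    """
    total = 0.0
    indices = range(len(allele_freqs))
    for i, j in combinations_with_replacement(indices, 2):
        p, q = allele_freqs[i], allele_freqs[j]
        # Homozygote frequency is p^2, heterozygote frequency is 2pq.
        genotype_freq = p * p if i == j else 2 * p * q
        # Two independent people both carry this genotype.
        total += genotype_freq ** 2
    return total


def profile_match_probability(loci):
    """Multiply per-locus match probabilities, assuming independent loci."""
    return prod(genotype_match_probability(freqs) for freqs in loci)


# Hypothetical allele frequencies, for illustration only.
low_diversity_locus = [0.5, 0.5]    # two equally common alleles
high_diversity_locus = [0.1] * 10   # ten equally common alleles

per_locus_low = genotype_match_probability(low_diversity_locus)
per_locus_high = genotype_match_probability(high_diversity_locus)
profile_high = profile_match_probability([high_diversity_locus] * 13)
```

With these illustrative numbers, the two-allele locus gives a per-locus match probability of 0.375, while the ten-allele locus gives roughly 0.019. Multiplying across several such loci is what drives the profile-level probability toward zero.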
Disadvantages of STR genotyping • Few markers mean that it’s hard to • Resolve mixtures • Resolve familial relationships • SNP genotyping is now cheaper and more informative (hundreds of thousands of markers)
Major obstacle to switching to SNPs • Backward compatibility • Record linkage could solve this problem
Another reason to be interested • Privacy / Procedure • “The CODIS loci…do not reveal the genetic traits of the arrestee.” – Maryland v. King (2013)
Implications of linking Privacy concerns Backward compatibility Phenotype prediction
Recombination and linkage Descendants Ancestors Adapted from Li & Jiang, 2005
Imputation works for SNPs GATTACA TAGACAT GATTACA GATTACA TAGACAT GATTACA GATTACA GATTACA TAGACAT TAGACAT Reference Panel: GATTACA TAGACAT Study Sample: ??T???? ??T???? ??G???? Adapted from Edge, Gorroochurn, & Rosenberg, 2013, Figure 4
Imputation works for SNPs GATTACA TAGACAT GATTACA GATTACA TAGACAT GATTACA GATTACA GATTACA TAGACAT TAGACAT Reference Panel: GATTACA TAGACAT Study Sample (imputed): GATTACA GATTACA TAGACAT Adapted from Edge, Gorroochurn, & Rosenberg, 2013, Figure 4
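The toy example on these slides can be reproduced in a few lines. This is a minimal sketch, assuming exact matching against the reference panel at the typed sites. Real imputation software such as Beagle uses a statistical haplotype model rather than this lookup, and the function name `impute` is hypothetical.

```python
from collections import Counter

# Haplotypes taken from the slide's toy example.
REFERENCE_PANEL = ["GATTACA", "TAGACAT"]
STUDY_SAMPLE = ["??T????", "??T????", "??G????"]


def impute(partial, panel, missing="?"):
    """Fill untyped sites by copying reference haplotypes that agree
    with the partial haplotype at every typed site."""
    compatible = [
        ref for ref in panel
        if all(obs == missing or obs == allele
               for obs, allele in zip(partial, ref))
    ]
    if not compatible:
        # Nothing in the panel matches; leave the haplotype untouched.
        return partial
    filled = []
    for site, observed in enumerate(partial):
        if observed != missing:
            filled.append(observed)
        else:
            # Most common allele at this site among compatible haplotypes.
            counts = Counter(ref[site] for ref in compatible)
            filled.append(counts.most_common(1)[0][0])
    return "".join(filled)


imputed = [impute(hap, REFERENCE_PANEL) for hap in STUDY_SAMPLE]
```

Here the single typed site (T or G at the third position) is enough to pick out one reference haplotype, so the study sample is completed to GATTACA, GATTACA, and TAGACAT, as on the slide.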
Question 1 • Is imputation possible in STRs?
Our Data (N = 978) Human Genome Diversity Project (HGDP) Image: NA Rosenberg, 2011
Imputation for human STRs • Beagle 4.1 imputes multi-allelic markers using nearby SNPs. • Imputation accuracy is assessed on held-out data (654 individuals in the training set, 218 in the test set).
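A train/test assessment of this kind could be organized as below. This is a sketch under stated assumptions: the random split, the seed, and the concordance metric (fraction of individuals whose imputed genotype equals the true genotype) are illustrative choices, not necessarily what the authors used. The imputation step itself (Beagle) is not shown, and both function names are hypothetical.

```python
import random


def split_train_test(sample_ids, n_test, seed=0):
    """Randomly hold out n_test individuals; the rest form the training set."""
    rng = random.Random(seed)
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


def genotype_concordance(true_genotypes, imputed_genotypes):
    """Fraction of individuals whose imputed genotype matches the truth.

    Genotypes are unordered allele pairs, so (12, 14) equals (14, 12).
    """
    matches = sum(
        sorted(truth) == sorted(guess)
        for truth, guess in zip(true_genotypes, imputed_genotypes)
    )
    return matches / len(true_genotypes)


# 654 + 218 = 872 individuals, matching the split sizes on the slide.
train_ids, test_ids = split_train_test(range(872), n_test=218)

# Hypothetical STR genotypes (repeat counts) for two test individuals.
truth = [(12, 14), (10, 10)]
guess = [(14, 12), (10, 11)]
concordance = genotype_concordance(truth, guess)
```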
Question 2 • Can record linkage be performed by combining imputation information across sites?
A data linkage view of the problem SNP Genotypes CODIS genotypes at each locus Can we link SNP records with CODIS records?
Fellegi & Sunter (1969) for CODIS SNP haplotype CODIS allele Prob. of observing pair given NO MATCH Prob. of observing pair given MATCH:
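The slide's equations did not survive extraction, so the following is only a sketch of how a Fellegi-Sunter style comparison could be set up, not the authors' formulas. It assumes that, given NO MATCH, the CODIS allele and the SNP haplotype are independent, so the pair has probability P(haplotype) x P(allele), and that, given MATCH, the pair has probability P(haplotype) x P(allele | haplotype), with the conditional term supplied by imputation. The log-likelihood ratio per locus is then log[P(allele | haplotype) / P(allele)], summed over loci under an assumed independence between loci. All locus names, frequencies, and record IDs are hypothetical.

```python
from math import log


def match_score(imputed_probs, codis_alleles, allele_freqs, floor=1e-6):
    """Sum of per-locus log-likelihood ratios: match versus no match."""
    score = 0.0
    for locus, allele in codis_alleles.items():
        # P(allele | SNP haplotype), floored to avoid log(0).
        p_match = max(imputed_probs[locus].get(allele, 0.0), floor)
        # P(allele) in the population: the no-match model.
        p_no_match = allele_freqs[locus][allele]
        score += log(p_match / p_no_match)
    return score


def link_records(snp_records, codis_records, allele_freqs):
    """Link each SNP record to its highest-scoring CODIS record."""
    return {
        snp_id: max(
            codis_records,
            key=lambda cid: match_score(imputed, codis_records[cid], allele_freqs),
        )
        for snp_id, imputed in snp_records.items()
    }


# Hypothetical population allele frequencies at two loci.
allele_freqs = {"locus_1": {12: 0.5, 14: 0.5}, "locus_2": {8: 0.5, 9: 0.5}}
# Hypothetical imputation output: P(allele | SNP haplotype) per locus.
snp_records = {
    "snp_A": {"locus_1": {12: 0.9, 14: 0.1}, "locus_2": {8: 0.8, 9: 0.2}},
    "snp_B": {"locus_1": {12: 0.2, 14: 0.8}, "locus_2": {8: 0.3, 9: 0.7}},
}
# Hypothetical CODIS records, one allele per locus for simplicity.
codis_records = {
    "codis_1": {"locus_1": 12, "locus_2": 8},
    "codis_2": {"locus_1": 14, "locus_2": 9},
}

links = link_records(snp_records, codis_records, allele_freqs)
```

In this toy setup `snp_A` links to `codis_1` and `snp_B` to `codis_2`. The sketch picks the best-scoring record for each SNP profile independently; a fuller treatment would also apply score thresholds to separate links, non-links, and uncertain pairs, as in the original Fellegi-Sunter framework.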