How many markers are necessary to infer correct familial relationships in follow-up studies?

Silvano Presciuttini1,3, Chiara Toni2, Fabio Marroni1, Isabella Spinetti2, Ranieri Domenici2, and Joan E. Bailey-Wilson3
1 Center of Statistical Genetics, University of Pisa, Italy
2 Unit of Legal Medicine, University of Pisa, Italy
3 Inherited Disease Research Branch, NHGRI, NIH, Baltimore, USA

Background of this study
• Inferring the biological relationship between pairs of individuals from genetic markers plays a central role in many areas of human and applied genetics.
• In forensic science, deficiency paternity cases arise when the alleged parent of a claimant is not available; these cases often reduce to determining the likelihood ratio of two alternative hypotheses about the relationship between a single pair of individuals.
• In follow-up linkage studies, genome-wide scan data are usually not available, and relationships among relatives can be verified only by genotyping additional markers outside the candidate region.
We investigated, using computer simulations, the number of unlinked STR markers necessary to infer the true relationship between a pair of individuals against several alternative hypotheses.

Design of the study
• We focused on the number of markers (M) necessary to reach predefined levels of power in discriminating relationships (1 − β = 80%, 90%, 95%, and 99%) at various significance levels (α = 5%, 1%, and 0.1%).
• The following relationships were considered: 1) parent-child (PC); 2) full sibs (FS); 3) second degree (2D, including half-sibs, grandparent-grandchild, and avuncular pairs); 4) first cousins (FC); and 5) non-relatives (NR).
• We investigated:
  • The “exact” method for inferring relationships
  • An approximate method based on IBS allele sharing
  • The reduction in the total number of genotypes achieved by using a sequential test

The exact method of inferring relationships
The usual, long-established method of inferring relationships between individuals is based on the population frequencies of the observed alleles and on the conditional probabilities of the observed genotypes, given any two alternative hypothesized relationships.
Table: Probabilities of the seven possible combinations of genotypes for five common relationships
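The likelihood ratio described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes Hardy-Weinberg proportions, non-inbred pairs, and the standard identity-by-descent coefficients (k0, k1, k2) for each relationship. The function names, the `IBD` dictionary, and the three-allele example locus are all hypothetical.

```python
# Hypothetical sketch: per-locus likelihood ratio between two relationships.
# Standard IBD coefficients (k0, k1, k2) for non-inbred pairs (assumed here).
IBD = {
    "PC": (0.0, 1.0, 0.0),    # parent-child
    "FS": (0.25, 0.5, 0.25),  # full sibs
    "2D": (0.5, 0.5, 0.0),    # second degree
    "FC": (0.75, 0.25, 0.0),  # first cousins
    "NR": (1.0, 0.0, 0.0),    # non-relatives
}


def genotype_prob(g, freq):
    """Hardy-Weinberg probability of an unordered genotype (a, b)."""
    a, b = g
    return freq[a] ** 2 if a == b else 2 * freq[a] * freq[b]


def prob_given_shared(shared, g, freq):
    """P(genotype g | one allele is `shared`, the other is drawn at random)."""
    a, b = g
    if a == b:
        return freq[a] if shared == a else 0.0
    if shared == a:
        return freq[b]
    if shared == b:
        return freq[a]
    return 0.0


def pair_prob(g1, g2, rel, freq):
    """Joint probability of the two genotypes given relationship `rel`."""
    k0, k1, k2 = IBD[rel]
    g1, g2 = tuple(sorted(g1)), tuple(sorted(g2))
    p1 = genotype_prob(g1, freq)
    ibd0 = p1 * genotype_prob(g2, freq)
    ibd1 = p1 * 0.5 * (prob_given_shared(g1[0], g2, freq)
                       + prob_given_shared(g1[1], g2, freq))
    ibd2 = p1 if g1 == g2 else 0.0
    return k0 * ibd0 + k1 * ibd1 + k2 * ibd2


def likelihood_ratio(g1, g2, rel_a, rel_b, freq):
    """LR of relationship rel_a versus rel_b for one pair at one locus."""
    num = pair_prob(g1, g2, rel_a, freq)
    den = pair_prob(g1, g2, rel_b, freq)
    if den == 0.0:
        # rel_b is excluded by the genotypes (e.g. PC with no shared allele).
        return float("inf")
    return num / den


# Illustrative allele frequencies for a made-up locus.
example_freq = {"A": 0.5, "B": 0.3, "C": 0.2}
print(likelihood_ratio(("A", "B"), ("A", "B"), "FS", "2D", example_freq))
```

For unlinked markers, the per-locus LRs multiply, so the evidence over M markers is the sum of the per-locus log(LR) values.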

Computer simulations
• For each comparison, we simulated the genotypes of 10,000 pairs of relatives from the five relationships considered, and another set of 10,000 pairs of the “false” relationship to be tested.
• Simulations were carried out separately for 25 commonly used markers (including the 13 CODIS markers and a second set of 12 markers commonly used in forensic practice), and they were repeated a second time to reach a total of 50 markers, thus representing a possible future expansion of validated markers.
• The LR formula appropriate for each true relationship was applied to all 20,000 simulated pairs (including true and false relatives) for an increasing number of markers (1 to 50), and the resulting distributions of log(LR) were analyzed in terms of a standard power analysis.
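The simulation and power analysis outlined above could be sketched roughly as follows. This is an illustration under stated assumptions, not the study's code: the markers are hypothetical loci with ten equally frequent alleles (not the CODIS panel), the number of pairs is reduced for speed, and the cutoff is taken as an empirical quantile of the log(LR) distribution among false pairs. All function names are invented for this sketch.

```python
import math
import random

# Standard IBD coefficients (k0, k1, k2) for non-inbred pairs (assumed).
IBD = {"PC": (0.0, 1.0, 0.0), "FS": (0.25, 0.5, 0.25),
       "2D": (0.5, 0.5, 0.0), "FC": (0.75, 0.25, 0.0), "NR": (1.0, 0.0, 0.0)}


def draw_allele(rng, freq):
    """Draw one allele according to the population frequencies."""
    return rng.choices(list(freq), weights=list(freq.values()))[0]


def simulate_pair(rng, rel, freq):
    """Simulate the genotypes of one pair at one unlinked locus."""
    k0, k1, _ = IBD[rel]
    g1 = (draw_allele(rng, freq), draw_allele(rng, freq))
    u = rng.random()
    if u < k0:                       # no allele identical by descent
        g2 = (draw_allele(rng, freq), draw_allele(rng, freq))
    elif u < k0 + k1:                # one allele identical by descent
        g2 = (rng.choice(g1), draw_allele(rng, freq))
    else:                            # both alleles identical by descent
        g2 = g1
    return g1, g2


def genotype_prob(g, freq):
    a, b = g
    return freq[a] ** 2 if a == b else 2 * freq[a] * freq[b]


def prob_given_shared(shared, g, freq):
    a, b = g
    if a == b:
        return freq[a] if shared == a else 0.0
    return freq[b] if shared == a else freq[a] if shared == b else 0.0


def pair_prob(g1, g2, rel, freq):
    """Joint genotype probability of the pair under relationship `rel`."""
    k0, k1, k2 = IBD[rel]
    g1, g2 = tuple(sorted(g1)), tuple(sorted(g2))
    p1 = genotype_prob(g1, freq)
    ibd1 = p1 * 0.5 * (prob_given_shared(g1[0], g2, freq)
                       + prob_given_shared(g1[1], g2, freq))
    return (k0 * p1 * genotype_prob(g2, freq) + k1 * ibd1
            + k2 * (p1 if g1 == g2 else 0.0))


def log_lr(g1, g2, rel_a, rel_b, freq):
    num = pair_prob(g1, g2, rel_a, freq)
    den = pair_prob(g1, g2, rel_b, freq)
    if num == 0.0:
        return -math.inf
    if den == 0.0:
        return math.inf
    return math.log(num / den)


def estimate_power(true_rel, false_rel, n_markers, alpha, n_pairs, freq, rng):
    """Fraction of true pairs whose summed log(LR) exceeds the alpha cutoff."""
    def score(rel):
        return sum(log_lr(*simulate_pair(rng, rel, freq), true_rel, false_rel, freq)
                   for _ in range(n_markers))

    false_scores = sorted(score(false_rel) for _ in range(n_pairs))
    true_scores = [score(true_rel) for _ in range(n_pairs)]
    # Cutoff chosen so that at most a fraction alpha of false pairs exceed it.
    cutoff = false_scores[min(n_pairs - 1, math.ceil((1 - alpha) * n_pairs))]
    return sum(s > cutoff for s in true_scores) / n_pairs


# Hypothetical marker: ten equally frequent alleles (heterozygosity 0.9).
demo_freq = {f"a{i}": 0.1 for i in range(10)}
demo_rng = random.Random(1)
for m in (5, 15, 30):
    print(m, estimate_power("FS", "2D", m, 0.05, 500, demo_freq, demo_rng))
```

Repeating the estimate for M = 1 to 50 and reading off the smallest M that reaches a target power gives the kind of marker counts the slides report; real marker panels with unequal allele frequencies would change the numbers.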

Discriminating full sibs from half sibs
As an example of applying the exact method, we show the power in discriminating full sibs from 2nd degree relationships.
Figure: Number of markers needed to reach a given false-positive rate (α) at various percentages of true positives (1 − β) in discriminating 2D relatives from FS

Results of the exact method
Number of markers required to discriminate true relatives from false relatives at various combinations of α and 1 − β

The IBS method
• Instead of calculating an LR based on the genotypes of the individuals in a pair, it is possible to calculate an LR based on the number of alleles (0, 1, or 2) that the pair share identical by state.
• We have previously shown that the probabilities of sharing 0, 1, or 2 alleles (z0, z1, and z2) for a given relationship depend on locus heterozygosity (H), and are scarcely affected by variation in the distribution of allele frequencies.
• This allowed us to obtain empirical curves relating the zi values to H for a series of common relationships.
• This means that the LR of a pair of relationships between any two individuals, given their genotypes at a locus, is a function of a single parameter, H.
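To make the IBS-based LR concrete, the sketch below computes z0, z1, and z2 exactly from allele frequencies by enumeration and then takes their ratio for two relationships. Note the assumption: the slides use empirical curves in H, whose coefficients are not given here, so this sketch substitutes the exact allele-frequency calculation rather than the authors' approximation. Function names and the two-allele example are hypothetical.

```python
from itertools import product

# Standard IBD coefficients (k0, k1, k2) for non-inbred pairs (assumed).
IBD = {"PC": (0.0, 1.0, 0.0), "FS": (0.25, 0.5, 0.25),
       "2D": (0.5, 0.5, 0.0), "FC": (0.75, 0.25, 0.0), "NR": (1.0, 0.0, 0.0)}


def ibs_count(g1, g2):
    """Number of alleles (0, 1, or 2) two genotypes share identical by state."""
    if sorted(g1) == sorted(g2):
        return 2
    return 1 if set(g1) & set(g2) else 0


def heterozygosity(freq):
    """Expected heterozygosity H = 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freq.values())


def ibs_probs(rel, freq):
    """Exact (z0, z1, z2) for a relationship, by enumerating alleles."""
    k0, k1, k2 = IBD[rel]
    z = [0.0, 0.0, 0.0]
    alleles = list(freq)
    # No allele IBD: the pair carries four independently drawn alleles.
    for a, b, c, d in product(alleles, repeat=4):
        z[ibs_count((a, b), (c, d))] += k0 * freq[a] * freq[b] * freq[c] * freq[d]
    # One allele IBD: a shared allele s plus one independent allele each.
    for s, x, y in product(alleles, repeat=3):
        z[ibs_count((s, x), (s, y))] += k1 * freq[s] * freq[x] * freq[y]
    # Both alleles IBD: the genotypes are identical.
    z[2] += k2
    return tuple(z)


def ibs_lr(shared, rel_a, rel_b, freq):
    """LR of rel_a versus rel_b given the number of IBS alleles shared.

    Raises ZeroDivisionError if rel_b cannot produce `shared`
    (e.g. parent-child pairs never share zero alleles).
    """
    return ibs_probs(rel_a, freq)[shared] / ibs_probs(rel_b, freq)[shared]


# Illustrative locus with two equally frequent alleles (H = 0.5).
example_freq = {"A": 0.5, "B": 0.5}
print(heterozygosity(example_freq))
print(ibs_probs("FS", example_freq))
print(ibs_lr(2, "FS", "2D", example_freq))
```

Evaluating `ibs_probs` over loci with different heterozygosities would give the points to which curves in H could be fitted.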

Relationship between H and zi
The figure shows a plot of the exact probabilities of sharing 0, 1, or both alleles at 19 loci as a function of locus heterozygosity for three common relationships (full sibs, 2nd degree, and non-relatives). Lines represent third-order polynomial regression curves.

Power comparison of the exact and the IBS methods
When reliable allele frequency estimates are not available, the IBS method can be applied without losing much power.
Figure: Number of markers needed in the IBS method and in the exact method to discriminate full sibs from 2nd degree relatives

Usefulness of a sequential test
When genotyping is performed specifically to verify relationships, a sequential test can save substantial genotyping resources.
Table: Mean number of markers per individual needed to discriminate full sibs from 2nd degree relatives in a sequential test (α = 0.05)
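One way such a sequential procedure could work is sketched below. The slides do not state which stopping rule was used, so this example assumes Wald's sequential probability ratio test with its usual approximate bounds, applied after each batch of markers. The function name, the batch size, and the return labels are hypothetical.

```python
import math


def sequential_test(log_lrs, alpha=0.05, beta=0.20, batch_size=5):
    """Add markers in batches and stop once the evidence is decisive.

    log_lrs    : per-marker log(LR) values, in genotyping order
    alpha      : tolerated false-positive rate
    beta       : tolerated false-negative rate (power = 1 - beta)
    batch_size : number of markers typed at each step (assumed value)
    Returns (decision, number of markers used).
    """
    # Wald's approximate stopping bounds on the cumulative log(LR).
    upper = math.log((1 - beta) / alpha)
    lower = math.log(beta / (1 - alpha))
    total = 0.0
    used = 0
    for start in range(0, len(log_lrs), batch_size):
        batch = log_lrs[start:start + batch_size]
        total += sum(batch)
        used += len(batch)
        if total >= upper:
            return "accept_numerator", used     # relationship in the LR numerator
        if total <= lower:
            return "accept_denominator", used   # relationship in the LR denominator
    return "undecided", used                    # markers exhausted without a decision


# Illustrative per-marker log(LR) values, not real data.
print(sequential_test([0.9, 0.4, -0.2, 1.1, 0.7, 0.8], batch_size=2))
```

Pairs with strongly informative genotypes stop after the first batch or two, which is where the saving in the mean number of markers comes from.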

Conclusions
• Verifying reported relationships in follow-up familial studies is often based on typing a small number of markers in new families.
• Use of 15-50 unlinked markers of the type used in forensic science provides sufficient power to discriminate the most critical relationships.
• When no reliable estimate of allele frequencies is available, a method based on IBS may be used instead of the conventional exact method.
• A sequential test based on small sets of markers added at each step may make the task of verifying relationships highly efficient.
