
MEASURES OF DEPENDENCE

This article explores the application of copulas in measuring dependence between random variables. It discusses the limitations of linear correlation and introduces alternative measures such as Spearman's rank correlation and Kendall's rank correlation. The role of copulas in calculating these measures is also explained.


Presentation Transcript


  1. MEASURES OF DEPENDENCE: Application of copulas

  2. 1. Introduction • Correlation is one of the most widely used, and perhaps most misunderstood, concepts in statistics. • Advantages of linear correlation: • Straightforward to calculate. • Easy to manipulate under linear operations. • A natural measure of dependence in multivariate elliptical distributions.

  3. 1. Introduction (cont.) • Linear correlation ρ(X, Y): • -1 ≤ ρ(X, Y) ≤ 1. • X and Y independent ⟹ ρ(X, Y) = 0. • ρ(aX + b, cY + d) = sgn(ac) ρ(X, Y).

  4. 1. Introduction (cont.) • Fallacies about linear correlation: • Marginal distributions and correlation coefficients determine the joint distribution. • Given marginals F1 and F2, all linear correlations between -1 and 1 are attainable.

  5. 1. Introduction (cont.) • Shortcomings: • The variances of X and Y must be finite, so it is not ideal for heavy-tailed distributions. • Zero correlation does not imply stochastic independence, except in the multivariate normal case. • Not invariant under nonlinear strictly monotonic transformations.

  6. 1. Introduction (cont.) • Given two distributions F1 and F2, the attainable correlation is described by the following theorem: • Let (X, Y) be a random vector with marginals F1 and F2, unspecified dependence structure, and finite variances. Then: • The set of all possible correlations is a closed interval [ρmin, ρmax], and for the extremal correlations ρmin < 0 < ρmax. • ρmin is attained iff X and Y are perfectly negatively dependent (countermonotonic); ρmax is attained iff X and Y are perfectly positively dependent (comonotonic). • ρmin = -1 iff X and -Y are of the same type; ρmax = 1 iff X and Y are of the same type.

  7. 1. Introduction (cont.) • Remark: two random variables X and Y are said to be of the same type if there exist α > 0 and β ∈ R such that Y =d αX + β. • Conclusion: we need alternative measures of dependence.
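
The following numerical sketch is not from the slides: it illustrates the theorem by Monte Carlo, assuming NumPy/SciPy and picking lognormal marginals LN(0, 1) and LN(0, 4) purely as an example. Even the comonotonic and countermonotonic couplings produce correlations far from ±1, because X and Y (or X and -Y) are not of the same type.

```python
# Extremal correlations via the comonotonic coupling (F1^-1(U), F2^-1(U))
# and the countermonotonic coupling (F1^-1(U), F2^-1(1 - U)).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
u = rng.uniform(size=1_000_000).clip(1e-12, 1 - 1e-12)  # avoid ppf(0), ppf(1)

f1_inv = stats.lognorm(s=1.0).ppf   # F1 = LN(0, 1)  (assumed example)
f2_inv = stats.lognorm(s=2.0).ppf   # F2 = LN(0, 4)  (assumed example)

x = f1_inv(u)
rho_max = np.corrcoef(x, f2_inv(u))[0, 1]      # comonotonic: rho_max
rho_min = np.corrcoef(x, f2_inv(1 - u))[0, 1]  # countermonotonic: rho_min
print(f"attainable correlations lie roughly in [{rho_min:.3f}, {rho_max:.3f}]")
```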

  8. 2. Measures of dependence. • Definition 2.1. (concordant, discordant) Observations (x1, y1) and (x2, y2) are concordant if (x1 – x2)(y1 – y2) > 0. They are discordant if (x1 – x2)(y1 – y2) < 0. • Definition 2.2. (Spearman’s rank correlation) Let X and Y be random variables with distributions F1 and F2 and joint distribution F. The population Spearman’s rank correlation is given by ρS = ρ(F1(X), F2(Y)), where ρ is the usual linear correlation.

  9. 2. Measures of dependence. • The sample estimator of Spearman’s rank correlation (in the absence of ties) is given by rS = 1 - 6 Σi di² / (n(n² - 1)), where di is the difference between the ranks of xi and yi.
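
A minimal sketch of this estimator, assuming NumPy; ranks are computed directly and ties are not handled:

```python
import numpy as np

def spearman_rho(x, y):
    """Sample Spearman rank correlation, assuming no ties."""
    n = len(x)
    rank_x = np.argsort(np.argsort(x)) + 1   # ranks 1..n of x
    rank_y = np.argsort(np.argsort(y)) + 1   # ranks 1..n of y
    d = rank_x - rank_y                      # rank differences d_i
    return 1 - 6 * np.sum(d**2) / (n * (n**2 - 1))

rng = np.random.default_rng(1)
x = rng.normal(size=500)
y = x**3 + 0.1 * rng.normal(size=500)   # monotone signal plus small noise
print(spearman_rho(x, y))               # close to 1
```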

  10. 2. Measures of dependence. • Example 2.1.

  11. 2. Measures of dependence. • Example 2.2.

  12. 2. Measures of dependence. • Definition 2.3. Let {(Xi, Yi)}, 1 ≤ i ≤ n, be a random sample from a population (X, Y) with cdf F(x, y). There are C(n, 2) = n(n - 1)/2 distinct pairs of observations in the sample, and each pair is either concordant or discordant. Let c be the number of concordant pairs and d the number of discordant pairs. Then an estimate of Kendall’s rank correlation for the sample is given by t = (c – d) / (c + d). The population version of Kendall’s rank correlation is given by τ(X, Y) = P[(X1 – X2)(Y1 – Y2) > 0] - P[(X1 – X2)(Y1 – Y2) < 0], where (X1, Y1) and (X2, Y2) are independent copies of (X, Y).
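
A minimal sketch of the sample version of Definition 2.3, assuming NumPy; it counts pairs in O(n²) and ignores ties:

```python
import numpy as np
from itertools import combinations

def kendall_tau(x, y):
    """Sample Kendall tau: (concordant - discordant) / (concordant + discordant)."""
    c = d = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            c += 1   # concordant pair
        elif s < 0:
            d += 1   # discordant pair
    return (c - d) / (c + d)

rng = np.random.default_rng(2)
x = rng.normal(size=200)
print(kendall_tau(x, x + rng.normal(size=200)))   # positive dependence
```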

  13. 2. Measures of dependence. • Note: the generalization of ρS and τ to n > 2 is similar to the procedure for linear correlation. • Theorem 2.1. Let X and Y be random variables with cdf’s F1 and F2, and joint cdf F. Let δ denote either ρS or τ. The following hold: • δ is symmetric: δ(X, Y) = δ(Y, X). • If X and Y are independent, then δ = 0. • -1 ≤ δ ≤ 1. • If T is strictly monotonic on the range of X, then δ(T(X), Y) = δ(X, Y) if T is increasing, and δ(T(X), Y) = -δ(X, Y) if T is decreasing.

  14. 2. Measures of dependence. • Advantages of these measures: • Invariant under monotonic transformations (see the sketch below). • Perfect dependence corresponds to the measures being ±1; δ = 1 iff Y = T(X) for some monotone increasing T. • Robust against outliers. • Disadvantages: • Not easy to manipulate analytically; they are not moment-based correlations.
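
A quick illustration of the invariance claim, assuming SciPy’s built-in estimators; the linear-plus-noise model is made up for the demo. The strictly increasing map T(x) = exp(x) leaves Spearman and Kendall unchanged but shifts Pearson:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
x = rng.normal(size=2000)
y = 0.8 * x + 0.6 * rng.normal(size=2000)

for name, estimator in [("pearson", stats.pearsonr),
                        ("spearman", stats.spearmanr),
                        ("kendall", stats.kendalltau)]:
    before = estimator(x, y)[0]
    after = estimator(np.exp(x), y)[0]   # strictly increasing transform of X
    print(f"{name:8s}: {before:+.3f} -> {after:+.3f}")
```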

  15. 3. The role of copulas. • Theorem 3.1. Let X and Y be random variables with cdf’s F1(x) and F2(y) and joint cdf F(x, y). Let C(u, v) be the copula of X and Y. Then we have the following alternative formulas for Spearman’s and Kendall’s correlations respectively: • ρS = 12 ∫∫I2 C(u, v) du dv - 3 • τ = 4 ∫∫I2 C(u, v) dC(u, v) - 1
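
A sanity check of the ρS formula, assuming SciPy for the numerical integration, on the three basic copulas M, W and Π, where the answers must be 1, -1 and 0:

```python
from scipy.integrate import dblquad

copulas = {
    "M (comonotone)":      lambda u, v: min(u, v),
    "W (countermonotone)": lambda u, v: max(u + v - 1, 0.0),
    "Pi (independence)":   lambda u, v: u * v,
}
for name, C in copulas.items():
    integral, _ = dblquad(C, 0, 1, lambda u: 0, lambda u: 1)  # iint over I^2
    print(f"{name}: rho_S = {12 * integral - 3:+.4f}")
```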

  16. 3. The role of copulas. • Proof: we will use the following identity due to Hoeffding: Cov(X, Y) = ∫∫R2 [F(x, y) - F1(x) F2(y)] dx dy.

  17. 3. The role of copulas. • Proof (cont.): the proof of Kendall’s identity will be presented in a more general setting (Theorem 3.2 below). • Note: the proof of the last part of Theorem 2.1 now follows easily from this theorem.

  18. 3. The role of copulas. • Example 3.1. The Farlie-Gumbel-Morgenstern family of copulas: Cθ(u, v) = uv + θuv(1 - u)(1 - v), θ ∈ [-1, 1]. A direct computation gives τ = 2θ/9 and ρS = θ/3; note that τ ∈ [-2/9, 2/9]. Limited range of dependence!
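
A numerical confirmation of both FGM values, assuming SciPy; the dC integral uses the FGM density c(u, v) = 1 + θ(1 - 2u)(1 - 2v):

```python
from scipy.integrate import dblquad

theta = 0.7   # any value in [-1, 1]
C = lambda u, v: u * v + theta * u * v * (1 - u) * (1 - v)   # FGM copula
c = lambda u, v: 1 + theta * (1 - 2 * u) * (1 - 2 * v)       # its density

I1, _ = dblquad(C, 0, 1, lambda u: 0, lambda u: 1)
I2, _ = dblquad(lambda u, v: C(u, v) * c(u, v), 0, 1, lambda u: 0, lambda u: 1)
print(f"rho_S = {12 * I1 - 3:.4f}   (theta/3   = {theta / 3:.4f})")
print(f"tau   = {4 * I2 - 1:.4f}   (2*theta/9 = {2 * theta / 9:.4f})")
```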

  19. 3. The role of copulas. • Example 3.2. The Clayton-Cook-Johnson family: Cθ(u, v) = (u^(-θ) + v^(-θ) - 1)^(-1/θ), θ > 0, for which τ = θ/(θ + 2), so, unlike the FGM family, τ covers all of (0, 1).
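
A sketch that samples this family by the conditional-distribution method and compares the empirical Kendall’s tau with θ/(θ + 2), assuming NumPy/SciPy; the inversion formula comes from solving ∂C/∂u = w for v:

```python
import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(4)
theta, n = 2.0, 20_000
u = rng.uniform(size=n)
w = rng.uniform(size=n)
# Invert the conditional cdf C_{V|U=u}(v) = w for the Clayton copula:
v = (u**(-theta) * (w**(-theta / (1 + theta)) - 1) + 1)**(-1 / theta)

print(f"empirical tau   = {kendalltau(u, v)[0]:.3f}")
print(f"theta/(theta+2) = {theta / (theta + 2):.3f}")
```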

  20. 3. The role of copulas. • Theorem 3.2. Let (X1, Y1), (X2, Y2) be independent random vectors with joint cdf’s H1 and H2 and common marginals F (of X1 and X2) and G (of Y1 and Y2); let C1 and C2 denote the copulas of (X1, Y1) and (X2, Y2) respectively. Let K be defined by K = P[(X1 – X2)(Y1 – Y2) > 0] - P[(X1 – X2)(Y1 – Y2) < 0]. Then: K(C1, C2) = 4 ∫∫I2 C2(u, v) dC1(u, v) - 1. • In particular, τ = K(C, C).

  21. 3. The role of copulas. Proof:

  22. 3. The role of copulas. Proof (cont.) The result follows.

  23. 3. The role of copulas. • Corollary 3.2. Under the hypotheses of Theorem 3.2 the following hold: • K is symmetric in its arguments: K(C1, C2) = K(C2, C1). • K is non-decreasing in each argument: if C1(u, v) ≤ C1'(u, v) for all (u, v) in I2, then K(C1, C2) ≤ K(C1', C2). • Copulas can be replaced by survival copulas in K, i.e. K(C1, C2) = K(C1*, C2*).

  24. 3. The role of copulas. • Remark 3.2.1. If C has a singular component, the formula for τ cannot be computed by direct integration. For such copulas the following expression is used: τ = 1 - 4 ∫∫I2 (∂C/∂u)(∂C/∂v) du dv. This is a consequence of the following theorem (Li et al., 2002): • Theorem 3.3. Let C1 and C2 be copulas. Then ∫∫I2 C1(u, v) dC2(u, v) = 1/2 - ∫∫I2 (∂C1/∂u)(∂C2/∂v) du dv.

  25. 3. The role of copulas. • Example 3.3. Recall M(u, v) = min(u, v), W(u, v) = max(u + v – 1, 0), Π(u, v) = uv. The supports of M and W are the main and the secondary diagonals of I2, respectively. If g(u, v) is an integrable function with domain I2, then ∫∫I2 g(u, v) dM(u, v) = ∫0^1 g(t, t) dt and ∫∫I2 g(u, v) dW(u, v) = ∫0^1 g(t, 1 - t) dt.

  26. 3. The role of copulas. • Example 3.3 (cont.) It follows that: τ(M) = 4 ∫0^1 M(t, t) dt - 1 = 4 ∫0^1 t dt - 1 = 1; τ(W) = 4 ∫0^1 W(t, 1 - t) dt - 1 = 4·0 - 1 = -1; τ(Π) = 4 ∫∫I2 uv du dv - 1 = 0.

  27. 3. The role of copulas. • If C is an Archimedean copula, then Kendall’s tau can be evaluated directly from the generator. But first recall the theorem from last time: Let C be an Archimedean copula generated by φ. Then the distribution function of the random variable C(U, V) is given by KC(t) = t - φ(t)/φ'(t+) for 0 < t ≤ 1.

  28. 3. The role of copulas. • Theorem 3.4. Let X and Y be random variables with an Archimedean copula C generated by φ. Then the Kendall’s tau of X and Y is given by τ = 1 + 4 ∫0^1 φ(t)/φ'(t) dt. • Proof: see Nelsen.
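
A sketch applying Theorem 3.4 numerically to the Clayton and Gumbel generators, assuming SciPy; the closed forms θ/(θ + 2) and 1 - 1/θ are the known answers for these families:

```python
import numpy as np
from scipy.integrate import quad

theta = 3.0
# name: (generator phi, derivative phi', closed-form tau)
families = {
    "Clayton": (lambda t: (t**(-theta) - 1) / theta,
                lambda t: -t**(-theta - 1),
                theta / (theta + 2)),
    "Gumbel":  (lambda t: (-np.log(t))**theta,
                lambda t: -theta * (-np.log(t))**(theta - 1) / t,
                1 - 1 / theta),
}
for name, (phi, dphi, exact) in families.items():
    integral, _ = quad(lambda t: phi(t) / dphi(t), 0, 1)
    print(f"{name}: tau = {1 + 4 * integral:.4f}   (closed form {exact:.4f})")
```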

  29. 3. The role of copulas. • Example 3.4. For the Gumbel family, φ(t) = (-ln t)^θ, so φ(t)/φ'(t) = t ln(t)/θ. Therefore τ = 1 + (4/θ) ∫0^1 t ln t dt = 1 - 1/θ.

  30. 4. Tail dependence (Dr. Lin’s request). • Definition 4.1. Let X and Y be random variables with cdf’s F1 and F2. The coefficient of upper (lower) tail dependence of X and Y is λU = lim q→1- P[Y > F2^(-1)(q) | X > F1^(-1)(q)] (respectively λL = lim q→0+ P[Y ≤ F2^(-1)(q) | X ≤ F1^(-1)(q)]), provided the limit exists.

  31. 4. Tail dependence • Can we use copulas to understand tail dependence? Recall the definition of the survival copula: C*(u, v) = u + v – 1 + C(1 – u, 1 – v). We now have: P[X > F1^(-1)(q), Y > F2^(-1)(q)] = C*(1 - q, 1 - q).

  32. 4. Tail dependence • Therefore we get for the upper tail dependence: λU = lim q→1- C*(1 - q, 1 - q)/(1 - q) = lim q→1- (1 - 2q + C(q, q))/(1 - q). Likewise for the lower tail dependence we get: λL = lim q→0+ C(q, q)/q.

  33. 4. Tail dependence • Example 4.1. Gumbel copula: C(u, v) = exp(-[(-log u)^θ + (-log v)^θ]^(1/θ)). On the diagonal, C(q, q) = q^(2^(1/θ)), which gives λU = 2 - 2^(1/θ) and λL = 0. We see that for θ > 1, C has upper tail dependence.
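
A numeric look at the diagonal ratio from slide 32 for this copula, assuming NumPy; the ratio should approach 2 - 2^(1/θ) as q → 1:

```python
import numpy as np

theta = 2.0
C = lambda u, v: np.exp(-((-np.log(u))**theta
                          + (-np.log(v))**theta)**(1 / theta))  # Gumbel copula

for q in [0.9, 0.99, 0.999, 0.9999]:
    ratio = (1 - 2 * q + C(q, q)) / (1 - q)
    print(f"q = {q}: (1 - 2q + C(q,q))/(1 - q) = {ratio:.4f}")
print(f"limit 2 - 2^(1/theta) = {2 - 2**(1 / theta):.4f}")
```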

  34. 4. Tail dependence • Example 4.2. The Gaussian family of copulas. We need a different formula for λU. The following can be derived from the definition of conditional probability, and therefore the proof is not presented: for an exchangeable copula (C(u, v) = C(v, u)), λU = 2 lim q→1- P[V > q | U = q].

  35. 4. Tail dependence • Example 4.2. (cont.) It can now be shown that, if X and Y have a bivariate normal distribution with correlation ρ < 1, then λU = 2 lim x→∞ [1 - Φ(x √(1 - ρ)/√(1 + ρ))] = 0. So the Gaussian copula does not have upper tail dependence.
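
A sketch of this computation, assuming SciPy; it uses the exact conditional law of the bivariate normal, Y | X = x ~ N(ρx, 1 - ρ²), so 2 P[V > q | U = q] can be evaluated in closed form and watched decay to 0:

```python
import numpy as np
from scipy.stats import norm

rho = 0.7   # any correlation strictly below 1
for q in [0.9, 0.99, 0.999, 0.9999, 0.99999]:
    x = norm.ppf(q)
    # P[Y > x | X = x] for a standard bivariate normal with correlation rho:
    p = norm.sf((x - rho * x) / np.sqrt(1 - rho**2))
    print(f"q = {q}: 2*P[V > q | U = q] = {2 * p:.5f}")
# The values shrink toward 0: lambda_U = 0, no upper tail dependence.
```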

  36. 4. Tail dependence • Theorem 4.1. Let C be a strict Archimedean copula with generator φ. If (φ^(-1))'(0) is finite, then C(u, v) does not have upper tail dependence. If C has upper tail dependence, then (φ^(-1))'(0) = -∞, and the coefficient of upper tail dependence is given by λU = 2 - 2 lim s→0+ (φ^(-1))'(2s)/(φ^(-1))'(s). • Theorem 4.2. Let φ be a strict generator. The coefficient of lower tail dependence for the copula C is given by λL = 2 lim s→∞ (φ^(-1))'(2s)/(φ^(-1))'(s).

  37. 4. Tail dependence • Example 4.3. Consider the Clayton family: C(u, v) = (u^(-θ) + v^(-θ) - 1)^(-1/θ), θ > 0. Then λL = 2^(-1/θ). • Example 4.4. It can be shown that the Frank family does not have any tail dependence.
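
A final sketch, assuming NumPy, that checks λL for the Clayton family two ways: the diagonal limit C(q, q)/q from slide 32 and the generator limit of Theorem 4.2, using φ^(-1)(s) = (1 + θs)^(-1/θ) for this family. Both agree with 2^(-1/θ):

```python
import numpy as np

theta = 2.0
C_diag = lambda q: (2 * q**(-theta) - 1)**(-1 / theta)        # C(q, q)
inv_gen_deriv = lambda s: -(1 + theta * s)**(-1 / theta - 1)  # (phi^-1)'(s)

for q in [1e-2, 1e-4, 1e-6]:
    print(f"C(q,q)/q at q = {q:g}: {C_diag(q) / q:.4f}")
s = 1e8   # large s approximates the s -> infinity limit
print(f"generator limit: {2 * inv_gen_deriv(2 * s) / inv_gen_deriv(s):.4f}")
print(f"2^(-1/theta)   = {2**(-1 / theta):.4f}")
```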
