
Correlation, dependence and copulas



Presentation Transcript


  1. Correlation, dependence and copulas • Multivariate distributions • Covariance and correlation • Dependence vs. correlation • Copulas • Most slides borrowed from Chanyoung Park

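  2. Joint probability density function • With two random variables X and Y, the joint PDF f_XY(x, y) gives the probability density of the pair (X, Y) • The definition requires that the integral of the joint PDF from minus infinity to infinity over both variables is 1: ∫∫ f_XY(x, y) dx dy = 1.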

  3. Marginal PDF • Bivariate RVs • 6 randomly generated samples of (X1, X2): {(1,2) (2,4) (4,5) (2,2) (3,4) (3,2)} • Sampling only X1: {1,2,4,2,3,3} → A PDF for only X1? • Sampling only X2: {2,4,5,2,4,2} → A PDF for only X2? • Marginal PDF • A PDF for only one RV • A marginal distribution carries no information about whether the RVs are correlated.
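The marginal idea on this slide can be sketched in a few lines of Python (the slides themselves use MATLAB; the helper name `empirical_marginal` is made up for illustration). The sketch counts how often each value of one coordinate appears in the six samples and ignores the other coordinate entirely.

```python
from collections import Counter
from fractions import Fraction

# The six (X1, X2) samples listed on the slide.
samples = [(1, 2), (2, 4), (4, 5), (2, 2), (3, 4), (3, 2)]

def empirical_marginal(pairs, index):
    """Relative frequency of each value of one coordinate, ignoring the other."""
    counts = Counter(pair[index] for pair in pairs)
    n = len(pairs)
    return {value: Fraction(count, n) for value, count in sorted(counts.items())}

marginal_x1 = empirical_marginal(samples, 0)
marginal_x2 = empirical_marginal(samples, 1)
print("X1:", marginal_x1)
print("X2:", marginal_x2)
```

Shuffling the X2 values among the pairs would leave both marginals unchanged, which is why a marginal says nothing about correlation.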

  4. Marginal PDF does not reflect correlation • The same two marginal distributions can belong to correlated or uncorrelated RVs • Two RVs X1 and X2 and their marginal distributions • [Figure: scatter plots of correlated RVs (bivariate RVs) and uncorrelated RVs]

  5. Is there any correlation between economic freedom and income? • Note: I (Chanyoung) am not a supporter of free markets

  6. Correlated Variables • For the normal distribution, MATLAB’s mvnrnd can be used • R = MVNRND(MU,SIGMA,N) returns an N-by-D matrix R of random vectors chosen from the multivariate normal distribution with 1-by-D mean vector MU and D-by-D covariance matrix SIGMA.

  7. Example • mu = [2 3]; sigma = [1 1.5; 1.5 3]; • r = mvnrnd(mu,sigma,20); • plot(r(:,1),r(:,2),'+') • What is the correlation coefficient?
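One way to answer the question on this slide is to read the correlation coefficient off the covariance matrix: the off-diagonal entry divided by the product of the two standard deviations. The Python sketch below (standard library only, standing in for the MATLAB call) does that and also draws samples through a hand-written 2x2 Cholesky factor. The function names are illustrative and are not part of any toolbox.

```python
import math
import random

mu = [2.0, 3.0]
sigma = [[1.0, 1.5], [1.5, 3.0]]  # covariance matrix from the slide

def correlation_from_covariance(cov):
    """rho = cov(X1, X2) / (std(X1) * std(X2))."""
    return cov[0][1] / math.sqrt(cov[0][0] * cov[1][1])

def sample_bivariate_normal(mean, cov, n, rng):
    """Draw n pairs using the lower-triangular Cholesky factor of a 2x2 covariance."""
    l11 = math.sqrt(cov[0][0])
    l21 = cov[0][1] / l11
    l22 = math.sqrt(cov[1][1] - l21 ** 2)
    pairs = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        pairs.append((mean[0] + l11 * z1, mean[1] + l21 * z1 + l22 * z2))
    return pairs

def sample_correlation(pairs):
    """Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    return sxy / math.sqrt(sxx * syy)

rho = correlation_from_covariance(sigma)
r = sample_bivariate_normal(mu, sigma, 20, random.Random(0))
print("rho from covariance matrix:", rho)
print("rho estimated from 20 samples:", sample_correlation(r))
```

With only 20 samples the estimate can differ noticeably from 1.5/sqrt(3) ≈ 0.866; it settles as the sample size grows.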

  8. Correlation images • Two correlated random variables • Two RVs X1 and X2 • Plotting 1000 pairs of X1 and X2 • Correlation is not the slope of the linear regression of two RVs. Why? • [Figure: scatter plots of correlated RVs (bivariate RVs) and uncorrelated RVs]
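A small Python sketch may help with the "Why?" on this slide. The least-squares slope is cov(X1, X2)/var(X1), which carries the units and scale of the data, while the correlation coefficient is cov(X1, X2)/(std(X1) std(X2)), which is scale free. The data below are made-up values chosen so both lines are perfectly linear.

```python
import math

def pearson_and_slope(xs, ys):
    """Return (correlation coefficient, least-squares slope of y on x)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy), sxy / sxx

xs = [1.0, 2.0, 3.0, 4.0]
rho_flat, slope_flat = pearson_and_slope(xs, [0.1 * x for x in xs])
rho_steep, slope_steep = pearson_and_slope(xs, [10.0 * x for x in xs])
print("shallow line: rho =", rho_flat, "slope =", slope_flat)
print("steep line:   rho =", rho_steep, "slope =", slope_steep)
```

Both data sets are perfectly correlated even though their slopes differ by a factor of 100. For the covariance matrix [1 1.5; 1.5 3] the same formulas give a slope of 1.5 but a correlation of about 0.866.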

  9. Probability with bivariate RVs • Probability • How to calculate the probabilities P(X1<x1), P(X2<x2) and P(X1<x1, X2<x2) • Comparing correlated RVs (bivariate RVs) with independent RVs that have the same marginals: P(X1<x1) is the same in both cases, P(X2<x2) is the same in both cases, but P(X1<x1, X2<x2) is not
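The comparison on this slide can be checked numerically. The Python sketch below assumes standard normal marginals and a correlation of 0.8 for the correlated case (both choices are illustrative, not taken from the slides) and estimates the marginal and joint probabilities by Monte Carlo.

```python
import math
import random

def estimate_probabilities(rho, x1, x2, n, rng):
    """Estimate P(X1<x1), P(X2<x2) and P(X1<x1, X2<x2) for standard normals
    with correlation rho, by simple Monte Carlo counting."""
    scale = math.sqrt(1.0 - rho * rho)
    hits_1 = hits_2 = hits_joint = 0
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + scale * rng.gauss(0.0, 1.0)  # correlated with z1
        below_1, below_2 = z1 < x1, z2 < x2
        hits_1 += below_1
        hits_2 += below_2
        hits_joint += below_1 and below_2
    return hits_1 / n, hits_2 / n, hits_joint / n

n = 50000
correlated = estimate_probabilities(0.8, 0.0, 0.0, n, random.Random(1))
independent = estimate_probabilities(0.0, 0.0, 0.0, n, random.Random(2))
print("correlated:  P1, P2, Pjoint =", correlated)
print("independent: P1, P2, Pjoint =", independent)
```

The first two numbers in each tuple should agree closely between the two cases, while the joint probability is clearly larger for the correlated pair.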

  10. Statistical independence • Two events A and B are independent if the occurrence of one event does not change the probability of the other event. • The two events are independent if and only if P(A ∩ B) = P(A) P(B) • Similarly, two random variables X, Y are independent if and only if f_XY(x, y) = f_X(x) f_Y(y)

  11. Conditional probability • The PDF of X for a specified y is the conditional PDF of X given y: f(x|y) = f_XY(x, y) / f_Y(y) • If X and Y are independent, f(x|y) = f_X(x)

  12. Example of dependent but uncorrelated variables • Let X be N(0,1) and let • Then • Can you show with a simple example that these two variables are not independent?
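The formula for the second variable did not survive in this transcript. As a stand-in, the Python sketch below ASSUMES the common textbook choice Y = X^2, which may or may not be what the slide used. Under that assumption the sample correlation is close to zero, yet the event {X > 1, Y < 1} is impossible even though each event alone has positive probability, so the variables cannot be independent.

```python
import math
import random

rng = random.Random(0)
n = 100_000
xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
ys = [x * x for x in xs]  # ASSUMPTION: Y = X^2 (the slide's own formula is missing)

def correlation(a, b):
    """Pearson correlation of two equally long lists."""
    m = len(a)
    ma, mb = sum(a) / m, sum(b) / m
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    return sab / math.sqrt(saa * sbb)

rho = correlation(xs, ys)
p_x = sum(x > 1.0 for x in xs) / n                               # P(X > 1)
p_y = sum(y < 1.0 for y in ys) / n                               # P(Y < 1)
p_joint = sum(x > 1.0 and y < 1.0 for x, y in zip(xs, ys)) / n   # P(X > 1, Y < 1)
print("sample correlation:", rho)
print("P(X>1) * P(Y<1):", p_x * p_y)
print("P(X>1, Y<1):", p_joint)
```

Independence would require the last two printed numbers to match; here the joint probability is exactly zero because X > 1 forces X^2 > 1.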

  13. Describing joint distributions efficiently • When the joint distribution is normal, the means and covariance matrix do the job. • When the joint distribution is not normal, we look for other devices, and a copula function is the current favorite. • Note that the marginal distributions can be normal while the joint distribution is not normal. How can that happen?

  14. How to calculate probability of bivariate RVs • Copula • For independent RVs, P(X1<x1, X2<x2) = P(X1<x1) P(X2<x2) • For correlated RVs, P(X1<x1, X2<x2) = C(P(X1<x1), P(X2<x2), θ) • C is a copula function and θ is its parameter, set from a correlation coefficient, used to obtain P(X1<x1, X2<x2) • Correlation coefficient • Measures of correlation for bivariate RVs: linear correlation coefficient (Pearson’s rho), Kendall’s tau and Spearman’s rho • Linear correlation coefficient: only capable of measuring a linear relationship, and unduly influenced by outliers • Kendall’s tau: based on the probability of concordance and the probability of discordance • Spearman’s rho: the same concept as Pearson’s rho, but computed on the ranks of the data rather than on the data values.
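To make the role of C concrete, here is a Python sketch that evaluates a joint probability through a copula. It uses the Clayton copula because its CDF has a simple closed form, C(u, v) = (u^(-θ) + v^(-θ) - 1)^(-1/θ) for θ > 0. The choice of Clayton, the standard normal marginals and the point (0, 0) are assumptions made for illustration only.

```python
from statistics import NormalDist

def independence_copula(u, v):
    """C(u, v) = u * v, the copula of independent RVs."""
    return u * v

def clayton_copula(u, v, theta):
    """Clayton copula CDF, valid for theta > 0 and 0 < u, v <= 1."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)

std_normal = NormalDist()  # assumed marginal distribution for both RVs
x1, x2 = 0.0, 0.0
u, v = std_normal.cdf(x1), std_normal.cdf(x2)  # P(X1<x1), P(X2<x2)

theta = 2.0  # Clayton parameter for Kendall's tau = 0.5, from tau = theta / (theta + 2)
p_independent = independence_copula(u, v)
p_clayton = clayton_copula(u, v, theta)
print("P(X1<0, X2<0), independent:", p_independent)
print("P(X1<0, X2<0), Clayton:    ", p_clayton)
```

The marginal probabilities u and v are identical in both calls; only the copula that combines them changes.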

  15. Kendall’s tau coefficient • Let (x1, y1), (x2, y2), …, (xn, yn) be a set of joint observations from two random variables X and Y respectively. • A pair of observations (xi, yi) and (xj, yj) is concordant if the ranks for both elements agree: that is, if both xi > xj and yi > yj or if both xi < xj and yi < yj, e.g. (1,2) and (3,19). • They are discordant if xi > xj and yi < yj or if xi < xj and yi > yj, e.g. (1,2) and (2,1). • If xi = xj or yi = yj, the pair is neither concordant nor discordant. • The Kendall τ coefficient is defined as: τ = (number of concordant pairs − number of discordant pairs) / (n(n − 1)/2)
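The counting rules on this slide translate directly into code. The Python sketch below assumes the usual definition, τ = (concordant pairs − discordant pairs) / (n(n − 1)/2), with tied pairs counted in neither group. The function name is illustrative.

```python
from fractions import Fraction
from itertools import combinations

def kendall_tau(pairs):
    """Kendall's tau from concordant and discordant pair counts (ties count as neither)."""
    concordant = discordant = 0
    for (xi, yi), (xj, yj) in combinations(pairs, 2):
        sign = (xi - xj) * (yi - yj)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n = len(pairs)
    return Fraction(concordant - discordant, n * (n - 1) // 2)

print(kendall_tau([(1, 2), (3, 19)]))  # the concordant example pair
print(kendall_tau([(1, 2), (2, 1)]))   # the discordant example pair
# The six bivariate samples from the marginal PDF slide.
print(kendall_tau([(1, 2), (2, 4), (4, 5), (2, 2), (3, 4), (3, 2)]))
```

For the six samples, 8 of the 15 pairs are concordant, 1 is discordant and 6 are tied in one coordinate, so the result is 7/15.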

  16. Example • Consider X=U[0,1] and • What is Kendall’s tau? • What is the linear correlation coefficient?

  17. Copula models • Elliptical copulas • Student-t copula • Gaussian copula (linear correlation coefficient); copula function is implicit • Archimedean copulas • Explicit copula function • Clayton, Gumbel, Frank copula (Kendall’s tau) • Joint PDFs of copulas • Standard normal distributions are used as marginal PDFs • Kendall’s tau = 0.5 • [Figure: joint PDFs for the Gaussian, Clayton, Gumbel and Frank copulas]
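For three of the four families named here, the copula parameter that matches a given Kendall's tau has a closed form. The Python sketch below uses the standard relations ρ = sin(πτ/2) for the Gaussian copula, θ = 2τ/(1 − τ) for Clayton and θ = 1/(1 − τ) for Gumbel. The Frank copula has no closed-form inverse and is left out of this sketch. The function names are illustrative.

```python
import math

def gaussian_rho_from_tau(tau):
    """Linear correlation parameter of the Gaussian copula for a given Kendall's tau."""
    return math.sin(math.pi * tau / 2.0)

def clayton_theta_from_tau(tau):
    """Clayton parameter, for 0 < tau < 1."""
    return 2.0 * tau / (1.0 - tau)

def gumbel_theta_from_tau(tau):
    """Gumbel parameter, for 0 <= tau < 1."""
    return 1.0 / (1.0 - tau)

tau = 0.5  # the value used for the joint PDF plots on this slide
print("Gaussian rho: ", gaussian_rho_from_tau(tau))
print("Clayton theta:", clayton_theta_from_tau(tau))
print("Gumbel theta: ", gumbel_theta_from_tau(tau))
```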

  18. How to fit a copula? • From samples • Find marginal distributions: find the PDF of best fit for each RV using a goodness-of-fit (GOF) test such as the Kolmogorov-Smirnov (K-S) test • Calculate a correlation coefficient • Find the copula of best fit for the samples using a GOF test • When would you want to fit a copula?
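The first fitting step, checking a candidate marginal with a K-S statistic, can be sketched in Python as follows. The sketch fits a normal distribution to one column of data and computes the largest gap between the empirical CDF and the fitted CDF. The five data values are the first column of the X matrix printed on the final slide. Turning the statistic into a p-value is not shown.

```python
from statistics import NormalDist

def ks_statistic(sample, cdf):
    """Kolmogorov-Smirnov statistic: largest gap between the empirical CDF of the
    sample and a candidate CDF, checked just before and at each data point."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

data = [0.8954, 1.3153, -1.1408, 1.3618, 0.3381]
fitted = NormalDist.from_samples(data)   # candidate marginal: normal with sample mean and std
d_fitted = ks_statistic(data, fitted.cdf)
d_standard = ks_statistic(data, NormalDist().cdf)
print("K-S statistic vs fitted normal:  ", d_fitted)
print("K-S statistic vs standard normal:", d_standard)
```

A smaller statistic indicates a closer fit; repeating this for several candidate families and keeping the smallest is one simple way to carry out the "best fit" step.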

  19. How to generate bivariate RVs? • Generate bivariate RVs using MATLAB • Generate N samples • Frank copula with Kendall’s tau of 0.5 and two standard normal distributions for the marginal PDFs • N=5; family='Frank'; Ktau=0.5; alpha = copulaparam(family,Ktau); U = copularnd(family,alpha,N); X = norminv(U,0,1) • U = copularnd(FAMILY,ALPHA,N) returns N random vectors generated from the bivariate Archimedean copula determined by FAMILY, with scalar parameter ALPHA. FAMILY is 'Clayton', 'Frank', or 'Gumbel'. U is an N-by-2 matrix. Each column of U is a sample from a Uniform(0,1) marginal distribution. • X = [0.8954 -0.1639; 1.3153 0.5446; -1.1408 -0.7823; 1.3618 2.2497; 0.3381 1.6897]
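For readers without the MATLAB toolbox, the same three steps can be sketched in plain Python. This is not how copulaparam or copularnd are implemented. It is a minimal stand-in that assumes the standard Frank relation τ = 1 − (4/θ)(1 − D1(θ)), with D1 the first-order Debye function, solves it for θ by bisection, and samples by inverting the conditional CDF. All function names are made up for illustration.

```python
import math
import random
from statistics import NormalDist

def debye1(theta, steps=2000):
    """D1(theta) = (1/theta) * integral of t/(e^t - 1) from 0 to theta (Simpson's rule)."""
    def g(t):
        return 1.0 if t == 0.0 else t / math.expm1(t)
    h = theta / steps
    total = g(0.0) + g(theta)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3.0 / theta

def frank_tau(theta):
    """Kendall's tau of the Frank copula with parameter theta > 0."""
    return 1.0 - 4.0 / theta * (1.0 - debye1(theta))

def frank_theta_from_tau(tau, lo=1e-6, hi=100.0):
    """Bisection on frank_tau; covers positive tau up to roughly 0.96."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if frank_tau(mid) < tau:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def frank_copula_sample(theta, n, rng):
    """Draw n pairs (u, v): u is uniform, v inverts the conditional CDF dC/du at a uniform w."""
    pairs = []
    for _ in range(n):
        u, w = rng.random(), rng.random()
        a = w * math.expm1(-theta) / (w + (1.0 - w) * math.exp(-theta * u))
        pairs.append((u, -math.log1p(a) / theta))
    return pairs

def to_standard_normal(pairs, eps=1e-12):
    """Map uniform pairs to standard normal marginals; clamp away from 0 and 1."""
    nd = NormalDist()
    def clamp(p):
        return min(max(p, eps), 1.0 - eps)
    return [(nd.inv_cdf(clamp(u)), nd.inv_cdf(clamp(v))) for u, v in pairs]

ktau = 0.5
alpha = frank_theta_from_tau(ktau)
U = frank_copula_sample(alpha, 5, random.Random(0))
X = to_standard_normal(U)
print("alpha:", alpha)
for row in X:
    print(row)
```

The printed rows will not match the MATLAB output on the slide, since the random number streams differ; only the dependence structure is meant to be comparable.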
