Chapter Three: Characteristics of Probability Distributions
Summary Characteristics • Expected Value – Central Tendency • Variance – Dispersion • Covariance – do two variables move together? • Correlation Coefficient – linear association • Conditional Expectation & Variance • Skewness and Kurtosis
Expected Value • The expected value of a discrete r.v. X is given by • E(X) = ∑x x f(x) • The weighted average of the possible values of the r.v., where the probabilities are the weights • Sometimes called the average or mean, but technically the population mean value • Roll a die numbered 1 through 6. What is the expected value of the number shown? See Table 3-1 and the graph in Fig. 3-1, with a short computational check after the figure.
Table 3-1 The expected value of a random variable X, the number shown on a die.
Figure 3-1 The expected value, E(X), of a discrete random variable (Example 3.1).
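A minimal sketch, in Python, of the die calculation in Example 3.1; the PMF below is the fair-die distribution from Table 3-1.

```python
# Expected value of a fair six-sided die: E(X) = sum over x of x * f(x).
pmf = {x: 1/6 for x in range(1, 7)}  # each face has probability 1/6

expected_value = sum(x * p for x, p in pmf.items())
print(expected_value)  # ≈ 3.5, matching Table 3-1
```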
Properties of the Expected Value • The expected value of a constant is that constant • E(b) = b, e.g., E(2) = 2 • The expectation of a sum of r.v.s is the sum of the expectations of each r.v. • E(X + Y) = E(X) + E(Y) • E(X/Y) ≠ E(X)/E(Y) and E(XY) ≠ E(X)E(Y) • EXCEPT for independent r.v.s, where E(XY) = E(X)E(Y) • E(X²) ≠ [E(X)]² • For any constant a, E(aX) = aE(X) • For constants a and b, E(aX + b) = aE(X) + b (checked in the sketch below)
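A quick numeric check of the linearity property E(aX + b) = aE(X) + b on the die PMF; the constants a = 2 and b = 1 are arbitrary illustrative choices, not from the text.

```python
# Verify E(aX + b) = a*E(X) + b for the fair-die PMF.
pmf = {x: 1/6 for x in range(1, 7)}
a, b = 2, 1  # arbitrary illustrative constants

lhs = sum((a * x + b) * p for x, p in pmf.items())  # E(aX + b) directly
rhs = a * sum(x * p for x, p in pmf.items()) + b    # a*E(X) + b
print(lhs, rhs)  # both ≈ 8.0
```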
E.V. of a Multivariate PMF or PDF • E(XY) = ∑x ∑y x y f(x, y) • Multiply each pair of values by their joint probability and sum over all values of X and Y. • For continuous r.v.s, the summation signs are replaced by integral signs. • Example: What is E(XY) from Table 2-3? The answer should be 7.06 (see the sketch after Table 2-3).
Table 2-3: Measures of Joint Probabilities The bivariate probability distribution of number of PCs sold (X) and number of printers sold (Y).
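A sketch of the E(XY) computation. The entries of Table 2-3 are not reproduced in these notes, so the `joint_pmf` values below are stand-ins for illustration only; substituting the actual Table 2-3 probabilities should give 7.06.

```python
# E(XY) = sum over (x, y) of x * y * f(x, y) for a discrete joint PMF.
# joint_pmf maps (x, y) pairs to probabilities; these are placeholder
# values, NOT the actual Table 2-3 entries.
joint_pmf = {(0, 0): 0.10, (1, 1): 0.30, (2, 1): 0.20,
             (2, 2): 0.25, (3, 2): 0.15}  # probabilities sum to 1

e_xy = sum(x * y * p for (x, y), p in joint_pmf.items())
print(e_xy)
```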
Variance: Measure of Dispersion • var(X) = σx² = E(X – μx)², where E(X) = μx • The positive square root of σx² is the standard deviation, σx (the s.d.) • To compute the variance: var(X) = ∑x (x – μx)² f(x) • Find the variance and standard deviation for the example of rolling a die. See Tables 3-1 and 3-2 (and the sketch after Table 3-2).
Figure 3-2 Hypothetical PDFs of continuous random variables, all with the same expected value.
Table 3-2 The variance of a random variable X, the number shown on a die.
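A minimal check of the die variance, computed both from the definition and from the shortcut var(X) = E(X²) − [E(X)]² listed with the variance properties on the next slide.

```python
# Variance of a fair die: var(X) = sum of (x - mu)^2 * f(x),
# cross-checked against the shortcut var(X) = E(X^2) - [E(X)]^2.
pmf = {x: 1/6 for x in range(1, 7)}

mu = sum(x * p for x, p in pmf.items())                   # E(X) = 3.5
var_def = sum((x - mu) ** 2 * p for x, p in pmf.items())  # definition
var_short = sum(x**2 * p for x, p in pmf.items()) - mu**2 # shortcut
print(var_def, var_short)  # both ≈ 35/12 ≈ 2.9167
print(var_def ** 0.5)      # s.d. ≈ 1.7078
```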
Properties of Variance • The variance of a constant is zero. • If X and Y are independent: • var(X + Y) = var(X) + var(Y) • var(X – Y) = var(X) + var(Y) • If b is a constant, var(X + b) = var(X) • If a is a constant, var(aX) = a²var(X) • If a and b are constants, var(aX + b) = a²var(X) • If X and Y are independent, var(aX + bY) = a²var(X) + b²var(Y) • Another formula: var(X) = E(X²) – [E(X)]², where E(X²) = ∑x x²f(x) • For continuous r.v.s replace ∑ with ∫ • (A check of var(aX + b) = a²var(X) follows.)
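A short check of var(aX + b) = a²var(X) on the die PMF; again a = 2 and b = 1 are arbitrary illustrative constants.

```python
# Verify var(aX + b) = a^2 * var(X) for the fair-die PMF.
pmf = {x: 1/6 for x in range(1, 7)}
a, b = 2, 1  # arbitrary illustrative constants

def var(pmf):
    mu = sum(x * p for x, p in pmf.items())
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

transformed = {a * x + b: p for x, p in pmf.items()}  # PMF of aX + b
print(var(transformed), a**2 * var(pmf))  # both ≈ 11.667
```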
Chebyshev’s Inequality • How well do the expected value and variance of a r.v. describe its PF? • If X is a r.v. with mean μx and variance σx², then for any positive constant c, the probability that X lies inside the interval [μx – cσx, μx + cσx] is at least 1 – (1/c²): • P[|X – μx| < cσx] ≥ 1 – (1/c²) • Note that we do not need to know the actual PMF or PDF of the r.v. X. • The bound is informative only for c > 1.
Example • Suppose a donut shop sells 100 donuts on average between 8 and 9 a.m., with a variance of 25. What is the probability that the number of donuts sold between 8 and 9 a.m. on a given day lies between 90 and 110? • Use Chebyshev’s Inequality: P[|X – μx| < cσx] ≥ 1 – (1/c²) • Here σx = √25 = 5, and the interval 90 to 110 is μx ± 10 = μx ± 2σx, so c = 2 • The probability is at least 1 – (1/4) = 0.75 • (A one-line check appears below.)
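A one-line check of the donut arithmetic.

```python
# Chebyshev bound for the donut example: mu = 100, variance = 25,
# interval 90..110, so c = (110 - mu) / sigma.
mu, variance = 100, 25
sigma = variance ** 0.5   # 5.0
c = (110 - mu) / sigma    # 2.0
print(1 - 1 / c**2)       # 0.75: sales lie in (90, 110) w.p. >= 0.75
```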
Coefficient of Variation • The variance and s.d. depend on the units of measurement, making comparisons of two or more s.d.s difficult if the units differ. • The coefficient of variation (V), a measure of relative variation, solves the problem of units of measurement: • V = (σx/μx)(100) • That is, the ratio of the s.d. to the mean, times 100 • See Table I in Schooltrans.doc • (A small illustration follows.)
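A tiny illustration with made-up numbers (not from the text or from Schooltrans.doc): V makes spreads comparable across series measured in different units.

```python
# Coefficient of variation V = (sigma / mu) * 100 for two hypothetical
# series in different units; all values below are made up.
mu_height_cm, sd_height_cm = 170.0, 8.5
mu_weight_kg, sd_weight_kg = 70.0, 10.5

print(sd_height_cm / mu_height_cm * 100)  # V ≈ 5.0  (heights)
print(sd_weight_kg / mu_weight_kg * 100)  # V = 15.0 (relatively more spread)
```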
Covariance • A characteristic of multivariate PFs • cov(X, Y) = E[(X – μx)(Y – μy)] = E(XY) – μxμy • A measure of how two variables move together: • positive – same direction • negative – opposite directions • zero – no linear relationship • To compute the covariance: • cov(X, Y) = ∑x ∑y (x – μx)(y – μy) f(x, y) = E(XY) – μxμy • where E(XY) = ∑x ∑y x y f(x, y) • What is cov(X, Y) for Table 2-3? (See the sketch after Table 2-3.)
Table 2-3: Measures of Joint Probabilities The bivariate probability distribution of number of PCs sold (X) and number of printers sold (Y).
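A sketch of the covariance computation via cov(X, Y) = E(XY) − μxμy; as before, the `joint_pmf` values are placeholders, not the actual Table 2-3 entries.

```python
# cov(X, Y) = E(XY) - mu_x * mu_y for a discrete joint PMF.
# Placeholder probabilities, NOT the actual Table 2-3 entries.
joint_pmf = {(0, 0): 0.10, (1, 1): 0.30, (2, 1): 0.20,
             (2, 2): 0.25, (3, 2): 0.15}

mu_x = sum(x * p for (x, y), p in joint_pmf.items())   # marginal mean of X
mu_y = sum(y * p for (x, y), p in joint_pmf.items())   # marginal mean of Y
e_xy = sum(x * y * p for (x, y), p in joint_pmf.items())
print(e_xy - mu_x * mu_y)  # cov(X, Y)
```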
Properties of Covariance • The covariance of two independent r.v.s is zero, since E(XY) = μxμy in that case. • For constants a, b, c, d: cov(a + bX, c + dY) = bd cov(X, Y) • cov(X, X) = var(X) • If X and Y are not independent: • var(X + Y) = var(X) + var(Y) + 2cov(X, Y) • var(X – Y) = var(X) + var(Y) – 2cov(X, Y)
Correlation Coefficient • How strongly are two variables linearly related? • Coefficient of correlation (population): • ρ = cov(X, Y)/(σxσy) • Properties: • –1 ≤ ρ ≤ 1, and ρ takes the same sign as the covariance • ρ = 1: Y = B1 + B2X, perfectly positively linearly related • ρ = –1: perfectly negatively linearly related • ρ is a pure number, devoid of units • If two r.v.s are independent, their correlation coefficient is zero, BUT the converse is not true. If Y = X², ρ may be zero, but the two r.v.s are not independent (here they are nonlinearly related). • Correlation does not imply causality.
Figure 3-3 Some typical patterns of the correlation coefficient, ρ.
Example • Find the correlation coefficient for printer and PC sales from Table 2-3: • cov(X, Y) = 0.95, σx = 1.2649, σy = 1.4124 • ρ = (0.95)/[(1.2649)(1.4124)] = 0.5317 • Note: cov(X, Y) = ρσxσy • (The arithmetic is checked in the sketch after Table 2-3.)
Table 2-3: Measures of Joint Probabilities The bivariate probability distribution of number of PCs sold (X) and number of printers sold (Y).
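A check of the correlation arithmetic, using the moments reported for Table 2-3 in the example above.

```python
# rho = cov(X, Y) / (sigma_x * sigma_y), with the reported moments.
cov_xy, sd_x, sd_y = 0.95, 1.2649, 1.4124

rho = cov_xy / (sd_x * sd_y)
print(rho)                # 0.53175..., the 0.5317 reported above
print(rho * sd_x * sd_y)  # recovers cov(X, Y) = 0.95
```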
Conditional Expectation and Variance • E(X) is the unconditional expectation of X • The conditional expectation is the expectation of X given that Y equals some value: • E(X|Y = y) = ∑x x f(x|Y = y), and likewise • E(Y|X = x) = ∑y y f(y|X = x) • Compute E(Y|X = 2) from Table 2-3 • Note E(Y|X = 2) = ∑y y f(y|X = 2) = ∑y y [f(Y = y, X = 2)/f(X = 2)] • Conditional variance: • var(Y|X = x) = ∑y [y – E(Y|X = x)]² f(y|X = x) • (See the sketch below.)
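A sketch of the E(Y|X = x) computation; as before, the `joint_pmf` values are placeholders, not the actual Table 2-3 entries.

```python
# E(Y | X = x0) = sum over y of y * f(x0, y) / f_X(x0).
# Placeholder probabilities, NOT the actual Table 2-3 entries.
joint_pmf = {(0, 0): 0.10, (1, 1): 0.30, (2, 1): 0.20,
             (2, 2): 0.25, (3, 2): 0.15}

def cond_exp_y_given_x(joint_pmf, x0):
    fx = sum(p for (x, y), p in joint_pmf.items() if x == x0)  # marginal f(x0)
    return sum(y * p for (x, y), p in joint_pmf.items() if x == x0) / fx

print(cond_exp_y_given_x(joint_pmf, 2))  # E(Y | X = 2)
```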
Skewness and Kurtosis • Skewness: a measure of symmetry • S = E[(X – μx)³]/σx³ • i.e., S = (third moment about the mean)/(s.d. cubed) • For a symmetrical PDF, S = 0 (all odd-order moments about the mean are zero); S > 0, skewed right; S < 0, skewed left. • Kurtosis: a measure of tallness or flatness • K = E[(X – μx)⁴]/σx⁴ • K < 3: platykurtic (short-tailed); K > 3: leptokurtic (long-tailed); K = 3: mesokurtic • Moments • rth moment about the mean: E[(X – μx)^r], computed as ∑x (x – μx)^r f(x) • The 1st raw moment, E(X), is the mean; the 2nd moment about the mean is the variance. • (A sketch follows Figure 3-4b.)
Figure 3-4a Skewness
Figure 3-4b Kurtosis
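A minimal sketch computing S and K for the fair-die PMF: the die is symmetric, so S = 0, and the computed K ≈ 1.73 < 3 illustrates a platykurtic (short-tailed) distribution.

```python
# Skewness S = E[(X-mu)^3]/sigma^3 and kurtosis K = E[(X-mu)^4]/sigma^4
# for the fair-die PMF.
pmf = {x: 1/6 for x in range(1, 7)}

def central_moment(pmf, r):
    mu = sum(x * p for x, p in pmf.items())
    return sum((x - mu) ** r * p for x, p in pmf.items())

sigma = central_moment(pmf, 2) ** 0.5
print(central_moment(pmf, 3) / sigma**3)  # S = 0.0 (symmetric)
print(central_moment(pmf, 4) / sigma**4)  # K ≈ 1.73, platykurtic
```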
Sample Moments • Sample mean (average): X̄ = (∑X)/n • Sample variance: Sx² = [∑(X – X̄)²]/(n – 1) • Sample covariance: [∑(X – X̄)(Y – Ȳ)]/(n – 1)
Sample Moments • Sample correlation coefficient: • r = (sample covariance)/(SxSy), where Sx and Sy are the sample s.d.s • 3rd sample moment: [∑(X – X̄)³]/(n – 1) • 4th sample moment: [∑(X – X̄)⁴]/(n – 1) • X̄ is the sample mean • (See the sketch below.)
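A sketch computing the sample moments above; the `x` and `y` lists are made-up illustrative data, not from the text.

```python
# Sample moments for paired data, using the (n - 1) divisor as on the
# slides above; x and y are made-up illustrative values.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.5, 3.5, 3.0, 5.0]
n = len(x)

xbar, ybar = sum(x) / n, sum(y) / n
var_x = sum((xi - xbar) ** 2 for xi in x) / (n - 1)
var_y = sum((yi - ybar) ** 2 for yi in y) / (n - 1)
cov_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / (n - 1)

r = cov_xy / (var_x ** 0.5 * var_y ** 0.5)        # sample correlation
m3 = sum((xi - xbar) ** 3 for xi in x) / (n - 1)  # 3rd sample moment
m4 = sum((xi - xbar) ** 4 for xi in x) / (n - 1)  # 4th sample moment
print(xbar, var_x, cov_xy, r, m3, m4)
```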