TROUBLESOME CONCEPTS IN STATISTICS: r² AND POWER

N. Scott Urquhart
Director, STARMAP
Department of Statistics
Colorado State University
Fort Collins, CO 80523-1877

STARMAP FUNDING
Space-Time Aquatic Resources Modeling and Analysis Program
This research is funded by the U.S. EPA Science To Achieve Results (STAR) Program, Cooperative Agreement # CR-829095.
The work reported here today was developed under the STAR Research Assistance Agreement CR-829095 awarded by the U.S. Environmental Protection Agency (EPA) to Colorado State University. This presentation has not been formally reviewed by EPA. The views expressed here are solely those of the presenter and STARMAP, the Program he represents. EPA does not endorse any products or commercial services mentioned in this presentation.

INTENT FOR TODAY
• To discuss two topics which have given some of you a bit of confusion:
  • r² in regression
  • Power in the context of tests of hypotheses
• Thanks to Ann Brock and Harriett Bassett for suggesting these topics
• Approach: Visually illustrate the idea,
  • Then talk about the concepts illustrated
• The sequences of graphs are available on the internet right now (address is at the end of this handout)
• Questions are welcome

r² IN REGRESSION
• r² provides a summary of the strength of a (linear) regression which reflects:
  • The relative size of the residual variability,
  • The slope of the fitted line, and
  • How good the observed values of the predictor variable are for prediction
    • Mainly the range of the Xs
• Let’s see these features in action, then
• Look at the formulas

WHAT MAKES r² TICK?
Varying one thing, leaving the remaining things fixed:
• r² increases as residual variation decreases
• r² increases as the slope increases
• r² increases as the range of x increases
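The three effects can be reproduced with a small simulation. This is only a sketch, not the code behind the slide's graphs: the function name `sample_r2`, the intercept of 2, the evenly spaced x values, the baseline settings, and the seed are all assumptions chosen for illustration.

```python
import random

def sample_r2(slope, resid_sd, x_max, n=200, seed=1):
    """Return r^2 for simulated data y = 2 + slope*x + noise, x evenly spaced on [0, x_max]."""
    rng = random.Random(seed)  # fixed seed: the same random draws, scaled by resid_sd
    xs = [x_max * i / (n - 1) for i in range(n)]
    ys = [2.0 + slope * x + rng.gauss(0.0, resid_sd) for x in xs]
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy ** 2 / (sxx * syy)  # squared sample correlation

# Assumed baseline: slope 1, residual SD 2, x from 0 to 10. Change one setting at a time.
for sd in (4.0, 2.0, 1.0):
    print(f"residual SD {sd}: r^2 = {sample_r2(1.0, sd, 10.0):.2f}")
for slope in (0.5, 1.0, 2.0):
    print(f"slope {slope}: r^2 = {sample_r2(slope, 2.0, 10.0):.2f}")
for x_max in (5.0, 10.0, 20.0):
    print(f"x range 0 to {x_max}: r^2 = {sample_r2(1.0, 2.0, x_max):.2f}")
```

Each loop changes one ingredient and holds the other two at the baseline, which mirrors the "varying one thing" idea of the slide.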

WHAT IS r²?
• r² provides a measure of the fit of a line to a set of data which incorporates
  • The amount of residual variation,
  • The strength of the line (slope), and
  • How good the set of values of “x” are for estimating the line
• Some areas of endeavor tend to overuse it!

HOW DOES r² TELL US ABOUT VARIATION?
• The following graph illustrates this:
  • The data scatter has r² = 0.5 (approximately)
  • The red points have the same values, but all concentrated at X = 5.
• {Strictly speaking the above formulas apply only in the case of bivariate regression.}
• {Estimation formulas involve factors of n-1 and n-2.}

FORMULAS FOR r²
• But these have little intuitive appeal!
• We’ll decompose observations into parts:
  • Mean
  • Regression
  • Residual

DECOMPOSING REGRESSION
• This is really n equations
• Square each of these equations and add them up across i.
• The three cross product terms will each add to zero. (Try it!)

DECOMPOSING REGRESSION (continued)
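The slide's equations did not survive in this copy, so the sketch below assumes the usual least-squares decomposition y_i = ȳ + (ŷ_i - ȳ) + (y_i - ŷ_i), that is, mean + regression + residual. It checks numerically that the three cross product sums vanish and that the squared parts add up to the sum of the squared observations. The data values and the helper names `decompose` and `dot` are made up for illustration.

```python
def decompose(xs, ys):
    """Split each y into mean, regression, and residual parts using a least-squares line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * x for x in xs]
    mean_part = [mean_y] * n
    reg_part = [f - mean_y for f in fitted]
    res_part = [y - f for y, f in zip(ys, fitted)]
    return mean_part, reg_part, res_part

def dot(a, b):
    """Sum of elementwise products."""
    return sum(u * v for u, v in zip(a, b))

# Hypothetical data, not taken from the presentation.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 2.9, 4.2, 4.8, 6.5, 6.9, 8.4, 8.8]

mean_part, reg_part, res_part = decompose(xs, ys)

# The three cross product sums; each should be zero up to rounding error.
cross_sums = (dot(mean_part, reg_part), dot(mean_part, res_part), dot(reg_part, res_part))

ss_mean, ss_reg, ss_res = dot(mean_part, mean_part), dot(reg_part, reg_part), dot(res_part, res_part)
r2 = ss_reg / (ss_reg + ss_res)  # regression share of the variation about the mean

print("cross product sums:", cross_sums)
print("sum of y^2:", dot(ys, ys), "parts:", ss_mean + ss_reg + ss_res)
print("r^2:", r2)
```

Because the cross products drop out, r² can be read as the regression sum of squares divided by the regression plus residual sums of squares.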

POWER OF A TEST OF HYPOTHESIS
• Power = Prob(“Being right”) = Prob(Rejecting false hypothesis)
• Power depends on two main things:
  • The difference between the hypothesized and true situations, and
  • The strength of the information for making the test
    • Sample size is a very important factor
    • In regression it depends on the same factors as the ones which increase r².
• Again, see it, then talk about it
Power increases as Δ = μ₁ - μ₂ increases

POWER VARIES WITH DIFFERENCE (Δ = μ₁ - μ₂) AND SAMPLE SIZE (n)

ON TESTS OF HYPOTHESES (ON THE WAY TO POWER)
                                      TRUE SITUATION
ACTION                                HYPOTHESIS FALSE    HYPOTHESIS TRUE
FAIL TO REJECT THE NULL HYPOTHESIS    TYPE II ERROR       CORRECT ACTION
REJECT THE NULL HYPOTHESIS            CORRECT ACTION      TYPE I ERROR
Tests of hypotheses are designed to control α = Prob(Type I Error)
while getting Power = 1 - Prob(Type II Error) as large as possible
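The two error rates can be estimated by simulation. The sketch below borrows the single-observation test used later in the presentation (X ~ N(μ, 1), null value 4, reject when X ≤ 2.04 or X ≥ 5.96). The function name `rejection_rate`, the number of trials, and the seed are assumptions made for illustration.

```python
import random

def rejection_rate(true_mean, trials=20000, seed=7):
    """Fraction of simulated single observations X ~ N(true_mean, 1) that land in the critical region."""
    rng = random.Random(seed)
    lower, upper = 4 - 1.96, 4 + 1.96  # two-sided critical values for a null mean of 4
    rejects = sum(1 for _ in range(trials)
                  if not (lower < rng.gauss(true_mean, 1.0) < upper))
    return rejects / trials

alpha_hat = rejection_rate(4.0)   # null hypothesis true: every rejection is a Type I error
power_hat = rejection_rate(5.0)   # null hypothesis false: every rejection is a correct action
type2_hat = 1 - power_hat         # null hypothesis false: failing to reject is a Type II error

print("estimated Prob(Type I error):", alpha_hat)
print("estimated power at a true mean of 5:", power_hat)
print("estimated Prob(Type II error) at a true mean of 5:", type2_hat)
```

The first estimate should sit close to the 0.05 the test was designed for, while the other two depend on how far the true mean is from 4.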

ON TESTS OF HYPOTHESES (AN ASIDE)
• Which is worse,
  • a type I error, or
  • a type II error?
• It depends tremendously on perspective
• Consider the criminal justice system
  • Truth: Accused is innocent (H0) or guilty (HA)
  • Action: Accused is acquitted or convicted
  • Type I error = Convict an innocent person
  • Type II error = Acquit a guilty person
• Which is worse?
  • Consider the difference in view of the
    • Accused
    • Society, especially if the accused is a terrorist

COMPUTING THE CRITICAL REGION
• Consider a simple case: X ~ N(μ, 1)
• H0: μ = 4 versus HA: μ ≠ 4
• Critical Region (CR) is
  • X ≤ l or X ≥ u, so
  • 0.025 = P(X ≤ l) = P((X - 4)/1 ≤ (l - 4)/1) = P(Z ≤ -1.96)
  • l = 2.04; similarly,
  • u = 5.96
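The same limits can be computed with Python's standard library. This is a sketch: the function name `critical_region` and its parameters are assumptions, and the 0.05 significance level is inferred from the 0.025 placed in each tail.

```python
from statistics import NormalDist

def critical_region(mu0, sigma=1.0, n=1, alpha=0.05):
    """Lower and upper limits of a two-sided critical region for a normal mean."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    se = sigma / n ** 0.5                    # standard deviation of the mean of n observations
    return mu0 - z * se, mu0 + z * se

lower, upper = critical_region(4.0)
print(round(lower, 2), round(upper, 2))      # 2.04 5.96
```

Passing `n=2` gives the limits for a mean of two observations, where the standard deviation shrinks from 1 to about 0.707.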

COMPUTING POWER
• Consider a simple case: X ~ N(μ, 1)
• H0: μ = 4 versus HA: μ ≠ 4
• Power (at μ = 5) = ?
  • = Prob(X_A in CR | μ = 5)
  • X_A ~ N(5, 1)
  • Prob(X_A ≤ 2.04) + Prob(X_A ≥ 5.96)
  • = Prob(Z ≤ -2.96) + Prob(Z ≥ 0.96)
  • = 0.0015 + 0.1685 = 0.1700
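The two tail probabilities can be checked directly. A minimal sketch, assuming the critical values 2.04 and 5.96 from the previous slide; the function name `power_single_obs` is hypothetical.

```python
from statistics import NormalDist

def power_single_obs(true_mean, lower=2.04, upper=5.96):
    """Probability that one observation X ~ N(true_mean, 1) falls in the critical region."""
    dist = NormalDist(mu=true_mean, sigma=1.0)
    lower_tail = dist.cdf(lower)       # Prob(X <= lower)
    upper_tail = 1 - dist.cdf(upper)   # Prob(X >= upper)
    return lower_tail + upper_tail

print(power_single_obs(5.0))  # close to the slide's 0.17
print(power_single_obs(4.0))  # at the null value the "power" is just the 0.05 level
```

Almost all of the power at μ = 5 comes from the upper tail; the lower tail contributes only about 0.0015.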

COMPUTING POWER USING A MEAN BASED ON n = 2 OBSERVATIONS
• Consider a simple case:
  • When the mean of two observations follows: X̄ ~ N(μ, 1/2)
• H0: μ = 4 versus HA: μ ≠ 4
• Power (at μ = 5) = ?
• Critical Region (CR) is
  • X̄ ≤ l or X̄ ≥ u, so
  • 0.025 = P(X̄ ≤ l) = P((X̄ - 4)/0.707 ≤ (l - 4)/0.707) = P(Z ≤ -1.96)
  • So l = 4 - (1.96)(0.707) = 2.61; similarly, u = 5.39

COMPUTING POWER USING A MEAN BASED ON n = 2 OBSERVATIONS (continued)

COMPUTING POWER USING A MEAN BASED ON n = 4 OBSERVATIONS (continued)
(This page is not in the handout, so that it all would fit on one page)

DIRECTIONAL NOTE
• As the alternative has been two-sided throughout this presentation, the power curves are symmetric about the vertical axis.
• By examining only the positive side, we can see the curves twice as large.
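The n = 1, 2, and 4 cases and the symmetry noted here can be tabulated with one function. This is a sketch under the presentation's setup (two-sided test at the 0.05 level, observations with standard deviation 1); the name `power` and the grid of differences are assumptions, and `delta` stands for the true mean minus the hypothesized mean.

```python
from statistics import NormalDist

def power(delta, n, sigma=1.0, alpha=0.05):
    """Power of a two-sided z test when the true mean differs from the null value by delta."""
    std = NormalDist()
    z = std.inv_cdf(1 - alpha / 2)       # about 1.96 when alpha = 0.05
    shift = delta * n ** 0.5 / sigma     # the difference in standard-error units
    return std.cdf(-z - shift) + (1 - std.cdf(z - shift))

deltas = (-2.0, -1.0, 0.0, 1.0, 2.0)
for n in (1, 2, 4):
    row = [round(power(d, n), 3) for d in deltas]
    print(f"n = {n}: {row}")
```

Each printed row reads the same forwards and backwards, which is the symmetry about the vertical axis, and for any nonzero difference the power rises as n goes from 1 to 2 to 4.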

YOU HAVE ACCESS TO THESE PRESENTATIONS
• You can find each of the slide shows shown here today at:
  • http://www.stat.colostate.edu/starmap/learning.html
• Each show begins with authorship & funding slides
• You are welcome to use them, and adapt them
  • But, please always acknowledge source and funding
• You are free to reorder the graphs if it makes more sense for r² to decrease than increase.
• Urquhart is available to talk to AP Stat classes about statistics as a profession.
  • See content on the web site above.
