
Unit 4: Relationship between 2 Variables

Unit 4: Relationship between 2 Variables. By Tyler, Morgan, Jeff and Nick. Big Idea.


Presentation Transcript


  1. Unit 4: Relationship between 2 Variables By Tyler, Morgan, Jeff and Nick

  2. Big Idea • Two-variable statistics are used to see whether there is a relationship between two variables. This unit focuses on deciding whether x truly causes y or whether there is no real relationship between the two and, if there is a relationship, on discovering what kind it is. You use this kind of math when you want to describe the relationship between two variables and find a model for it. • There are three main types of relationships in this unit: exponential growth, logarithmic growth, and power regression. Each is tested for in a different way and describes how x and y relate to each other. The tests in this chapter, together with residual graphs, can also suggest whether a lurking or confounding variable is distorting the relationship between the two.

  3. Key Vocabulary • Exponential growth (including exponential decay)- when the growth rate of a function is proportional to the function's current value. • Logarithmic growth- growth described by a logarithm function of some input. Logarithmic growth is the inverse of exponential growth and is very slow. • Lurking variables- extraneous variables that can explain the observed relationship between the independent variable(s) and the dependent variable. • Power regression- built on linear regression; the difference is that you use the log of x and the log of y in the equation instead of x and y. • Causation- X directly causes Y. • Common response- X and Y both respond to another variable, Z. • Confounding variables- when you cannot tell which variable (X or Z) is causing the change in Y. • Remember! The best way to establish cause and effect is through experimentation. • Simpson's paradox- a paradox in which an association present in separate groups is reversed when the groups are combined. • r- a measure of the strength and direction of the linear correlation between two variables. r always lies between -1 and 1: an r of 1 is a perfect positive linear correlation and an r of -1 is a perfect negative linear correlation. • r^2- a measure of how well the LSRE (least-squares regression equation) fits the original data. It lies between 0 and 1 and gives the fraction of the variation in y that the LSRE explains, so the closer r^2 is to 1, the better the fit.
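To make r and r^2 concrete, here is a minimal sketch that computes them by hand. The function name `pearson_r` and both data sets are made up for illustration; they are not from the presentation.

```python
import math

def pearson_r(xs, ys):
    """Return the correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, invented for this example only.
xs = [1, 2, 3, 4, 5]
ys_up = [3, 5, 7, 9, 11]     # exactly y = 2x + 1
ys_down = [10, 8, 6, 4, 2]   # exactly y = -2x + 12

r_up = pearson_r(xs, ys_up)
r_down = pearson_r(xs, ys_down)
print("r =", r_up, " r^2 =", r_up ** 2)
print("r =", r_down, " r^2 =", r_down ** 2)
```

Both data sets lie exactly on a line, so r comes out at the two extremes (1 and -1) while r^2 is 1 in both cases.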

  4. Key Topics • Exponential growth- the y value increases by larger and larger amounts as x increases. • Logarithmic growth- the y value increases by smaller and smaller amounts as x increases. • Power regression- very similar to linear regression, except that you use the log of x and the log of y in the equation instead of just x and y. • Causation- this is very important in this unit. Correlation does not equal causation: a very high r or r^2 value for your LSRE does not mean that x actually causes y. The two could be related through some other variable. So, for every LSRE, you must check that the two variables are actually related and are not just reacting to other variables or related by coincidence.

  5. Key Formulas • Exponential test: used to see whether the relationship between x and y is linear or exponential. For equally spaced x values, compute the ratio of consecutive y values, y_n/y_(n-1). If the ratio stays about the same, the relationship is exponential: a constant ratio less than 1 is exponential decay, and a constant ratio greater than 1 is exponential growth. If instead the differences y_n - y_(n-1) stay about the same, the relationship is linear. • Exponential formula- y=ab^x • To test for this, take the log of y, keep the original x values, run a linear regression on them to get an LSRE, and then check it against a residual graph. • Power regression formula- y=a*x^b • To test for this, get the LSRE of the log x and log y values, then check whether the residual graph shows a pattern.
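The ratio test can be sketched in a few lines. This assumes the x values are equally spaced; the function names, the 5% tolerance, and the sample data are all hypothetical choices for illustration.

```python
def consecutive_ratios(ys):
    """Return y_n / y_(n-1) for each consecutive pair."""
    return [ys[i] / ys[i - 1] for i in range(1, len(ys))]

def consecutive_differences(ys):
    """Return y_n - y_(n-1) for each consecutive pair."""
    return [ys[i] - ys[i - 1] for i in range(1, len(ys))]

def roughly_constant(values, tolerance=0.05):
    """True when every value is within `tolerance` (relative) of the mean."""
    mean = sum(values) / len(values)
    return all(abs(v - mean) <= tolerance * abs(mean) for v in values)

def classify(ys):
    """Guess the pattern in y values measured at equally spaced x values."""
    if roughly_constant(consecutive_differences(ys)):
        return "linear"
    ratios = consecutive_ratios(ys)
    if roughly_constant(ratios):
        mean_ratio = sum(ratios) / len(ratios)
        return "exponential growth" if mean_ratio > 1 else "exponential decay"
    return "neither"

# Hypothetical y values, invented for this example only.
print(classify([5, 8, 11, 14]))        # constant difference of 3
print(classify([3, 6, 12, 24, 48]))    # constant ratio of 2
print(classify([80, 40, 20, 10]))      # constant ratio of 0.5
```

Real data will rarely give perfectly constant ratios, which is why the check allows a tolerance rather than demanding equality.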

  6. We transform data to see whether two variables whose values differ greatly in size truly have a relationship. The logs of the observations often carry the relationship, not the observations themselves. • log_b(MN) = log_b M + log_b N • log_b(M/N) = log_b M - log_b N • In an exponential growth model, we take the log of y only. In a power model, we take the logs of both x and y. • Calculator keystrokes- STAT, right arrow to CALC, then 9 for natural log regression (LnReg), 0 for exponential regression (ExpReg), or A for power regression (PwrReg). After you calculate your regression equation, go to STAT PLOT, select your plot, go to Ylist, press 2nd then STAT (LIST), and select RESID. • See pg. 274 and 275 for exponential prediction and pg. 281 and 282 for power prediction.
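The product rule for logs on this slide, together with the power rule log_b(M^k) = k log_b M (not listed on the slide), shows why taking logs straightens out both models. A short sketch of the reasoning, assuming a, b, y and (for the power model) x are positive:

```latex
\begin{align*}
y = ab^{x} \;&\Longrightarrow\; \log y = \log a + x \log b
  && \text{(linear in $x$, slope $\log b$)} \\
y = ax^{b} \;&\Longrightarrow\; \log y = \log a + b \log x
  && \text{(linear in $\log x$, slope $b$)}
\end{align*}
```

So an exponential relationship gives a straight line when log y is plotted against x, and a power relationship gives a straight line when log y is plotted against log x.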

  7. Calculator Key Strokes • You should mostly be using the STAT button for these types of problems. • Your friends here are 1-Var Stats and linear regression (LinReg). They should do the bulk of the work for you on these questions. • To graph your LSREs, just use Y=.

  8. Regression Equations • Steps for regression equations • Enter X in L1 and Y in L2 • Turn diagnostics on • Determine the best type of regression based on the r and r^2 values • Graph the residual plot; no pattern means the model fits • Helpful hints • For exponential growth, either take log(L2), the log of Y, and find LinReg(ax+b), or use 0:ExpReg • For power regression, either take log(L1) and log(L2), the logs of X and Y, and find LinReg(ax+b), or use A:PwrReg • The second option (the built-in regression) is quicker
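The first option in the hints (take logs, then run a linear regression) can be sketched away from the calculator as follows. This is an illustration under assumptions, not the calculator's own routine: the helper names are hypothetical, the data are generated from y = 2x^1.5 rather than taken from the slides, and base-10 logs are used (the base does not change the fitted a and b). The r^2 values are those of the straight-line fit to the transformed data.

```python
import math

def linreg(xs, ys):
    """Least-squares line ys = slope * xs + intercept; also returns r^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, sxy ** 2 / (sxx * syy)

def fit_exponential(xs, ys):
    """Fit y = a * b**x by regressing log10(y) on x."""
    slope, intercept, r2 = linreg(xs, [math.log10(y) for y in ys])
    return 10 ** intercept, 10 ** slope, r2

def fit_power(xs, ys):
    """Fit y = a * x**b by regressing log10(y) on log10(x)."""
    log_xs = [math.log10(x) for x in xs]
    log_ys = [math.log10(y) for y in ys]
    slope, intercept, r2 = linreg(log_xs, log_ys)
    return 10 ** intercept, slope, r2

# Hypothetical data generated from y = 2 * x**1.5 (not from the slides).
xs = [1, 2, 3, 4, 5, 6]
ys = [2 * x ** 1.5 for x in xs]

exp_a, exp_b, exp_r2 = fit_exponential(xs, ys)
pow_a, pow_b, pow_r2 = fit_power(xs, ys)
best = "power" if pow_r2 > exp_r2 else "exponential"
print("exponential: a=%.3f b=%.3f r^2=%.4f" % (exp_a, exp_b, exp_r2))
print("power:       a=%.3f b=%.3f r^2=%.4f" % (pow_a, pow_b, pow_r2))
print("higher r^2:", best)
```

Comparing r^2 only narrows the choice; the residual plot is still the final check.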

  9. Residuals • A residual is the difference between an observed y-value from the data and the y-value predicted by the regression equation (observed minus predicted) • Check whether the dots on the residual graph form a pattern • If there is a pattern, the LSRE is a bad fit, so try another type of equation • If there is no pattern, your LSRE is a good fit
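A small sketch of the residual check, using made-up data. The points follow y = x^2, and the line y = 6x - 7 is the least-squares line for exactly these five points; the names `residuals` and `line` are hypothetical.

```python
def residuals(xs, ys, predict):
    """Observed y minus predicted y for each data point."""
    return [y - predict(x) for x, y in zip(xs, ys)]

def line(x):
    """Least-squares line for the hypothetical data below."""
    return 6 * x - 7

# Hypothetical curved data (y = x**2), invented for this example only.
xs = [1, 2, 3, 4, 5]
ys = [1, 4, 9, 16, 25]

res = residuals(xs, ys, line)
print(res)  # [2, -1, -2, -1, 2]
```

The residuals go positive, negative, then positive again. That U shape is the kind of pattern that says a straight line is the wrong model for this data.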

  10. Exponential Regression • y=ab^x • Calculator Keystrokes • X in L1, Y in L2 • STAT, CALC, 0:ExpReg • Xlist, Ylist, Store RegEQ • L1, L2, Y1 • From Table • y=2.902(1.410)^x • a=2.902 • b=1.410 • r^2=.907 • r=.952 • Graph Residuals • Pattern from residuals? • Yes • Not the correct regression equation
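Once the calculator has returned a and b, predictions are just a matter of plugging in x. A minimal sketch using the coefficients from this slide; the function names and the x values tried are hypothetical, since the slide's underlying data table is not reproduced here.

```python
import math

A = 2.902  # a from the slide
B = 1.410  # b from the slide

def predict_exponential(x, a=A, b=B):
    """Evaluate the fitted model y = a * b**x."""
    return a * b ** x

def log_form(a=A, b=B):
    """Intercept and slope of the equivalent line log10(y) = intercept + slope * x."""
    return math.log10(a), math.log10(b)

# Hypothetical x values, chosen only to show the calls.
for x in [0, 1, 2, 5]:
    print(x, round(predict_exponential(x), 3))

intercept, slope = log_form()
print("log10(y) = %.4f + %.4f x" % (intercept, slope))
```

The second function shows the same model in its straight-line form, which is what the LinReg-on-log(y) route would report.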

  11. Power Regression • y=ax^b • Calculator Keystrokes • X in L1, Y in L2 • STAT, CALC, A:PwrReg • Xlist, Ylist, Store RegEQ • L1, L2, Y1 • From Table • y=2.024x^1.491 • a=2.024 • b=1.491 • r^2=.9995 • r=.9998 • Graph Residuals • Pattern from residuals? • No • Correct Regression Equation

  12. Relationship between 2 categorical variables • In a two-way table, the row variable runs horizontally and the column variable runs vertically. • Simpson's paradox- an association or comparison that holds within several groups can reverse when the data are combined into one group. • Conditional distribution- divide each count in a row (or column) by that row's (or column's) total; the totals themselves make up the marginal distribution. • Example: To make a little more sense of Simpson's paradox, let's look at the following example. In a certain hospital there are two surgeons. Surgeon A operates on 100 patients, and 95 survive. Surgeon B operates on 80 patients, and 72 survive. We are considering having surgery at this hospital, and surviving the operation matters to us, so we want to choose the better of the two surgeons. We use the data to calculate what percentage of each surgeon's patients survived. With surgeon A, 95 patients out of 100 survived, so 95/100 = 95% of them survived. With surgeon B, 72 patients out of 80 survived, so 72/80 = 90% of them survived. From this analysis, which surgeon should we choose? It would seem that surgeon A is the safer bet. But is this really true? Suppose we looked further into the data and found that the hospital had originally recorded two different types of surgery, but then lumped all of the data together to report on each surgeon. Not all surgeries are equal: some were high-risk emergency surgeries, while others were routine and had been scheduled in advance. Of the 100 patients that surgeon A treated, 50 were high risk, of which three died. The other 50 were routine, and of these 2 died. This means that for a routine surgery, a patient treated by surgeon A has a 48/50 = 96% survival rate. Now we look more carefully at the data for surgeon B and find that of 80 patients, 40 were high risk, of which seven died. The other 40 were routine, and only one died. This means that a patient has a 39/40 = 97.5% survival rate for a routine surgery with surgeon B. Now which surgeon seems better? If your surgery is to be a routine one, then surgeon B is actually the better surgeon. However, if we look at all surgeries performed by the surgeons, A is better. This is quite counterintuitive. In this case the lurking variable, the type of surgery, affects the combined data for the surgeons.
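The arithmetic in the surgeon example can be laid out as a small two-way table. The counts below come straight from the example (survivors are the group size minus the deaths stated); the variable and function names are hypothetical.

```python
# (survived, total) for each surgeon and surgery type, from the example.
data = {
    "A": {"high risk": (47, 50), "routine": (48, 50)},
    "B": {"high risk": (33, 40), "routine": (39, 40)},
}

def survival_rate(survived, total):
    """Conditional survival rate within one group."""
    return survived / total

def overall_rate(groups):
    """Survival rate after lumping all of a surgeon's groups together."""
    survived = sum(s for s, _ in groups.values())
    total = sum(t for _, t in groups.values())
    return survived / total

for surgeon, groups in data.items():
    print("Surgeon", surgeon, "overall:", overall_rate(groups))
    for kind, (survived, total) in groups.items():
        print("   ", kind, survival_rate(survived, total))
```

Running this reproduces the 95% and 90% overall rates and the 96% and 97.5% routine rates from the example. It also shows the high-risk rates, which the example does not state but which follow from its counts: 47/50 for surgeon A and 33/40 for surgeon B.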
