Transformations
Transformations to Linearity
• Many non-linear curves can be put into a linear form by appropriate transformations of either
  • the dependent variable Y, or
  • some (or all) of the independent variables X1, X2, ... , Xp.
• This leads to the wide utility of the Linear Model.
• We have seen that, through the use of dummy variables, categorical independent variables can be incorporated into a Linear Model.
• We will now see that, through the technique of variable transformation, many examples of non-linear behaviour can also be converted to linear behaviour.
Intrinsically Linear (Linearizable) Curves
1. Hyperbolas: y = x/(ax - b)
Linear form: 1/y = a - b(1/x), or Y = b0 + b1 X
Transformations: Y = 1/y, X = 1/x, b0 = a, b1 = -b
2. Exponential: y = a e^(bx) = a B^x
Linear form: ln y = ln a + b x = ln a + (ln B) x, or Y = b0 + b1 X
Transformations: Y = ln y, X = x, b0 = ln a, b1 = b = ln B
3. Power Functions: y = a x^b
Linear form: ln y = ln a + b ln x, or Y = b0 + b1 X
Transformations: Y = ln y, X = ln x, b0 = ln a, b1 = b
Logarithmic Functions: y = a + b ln x
Linear form: y = a + b ln x, or Y = b0 + b1 X
Transformations: Y = y, X = ln x, b0 = a, b1 = b
Other special functions: y = a e^(b/x)
Linear form: ln y = ln a + b(1/x), or Y = b0 + b1 X
Transformations: Y = ln y, X = 1/x, b0 = ln a, b1 = b
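The five linearizable curves above all reduce to fitting Y = b0 + b1 X after transforming y and/or x. The following is a minimal sketch of that idea in plain Python, not a fitting routine from the slides: the helper names (`ols`, `fit_linearized`, `TRANSFORMS`) and the data values are made up for illustration, and the data are noise-free so the original parameters are recovered exactly.

```python
import math

def ols(xs, ys):
    """Simple least squares: return (b0, b1) for Y = b0 + b1*X."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return my - b1 * mx, b1

# (transformation of y, transformation of x) for each curve on the slides
TRANSFORMS = {
    "hyperbola":   (lambda y: 1 / y, lambda x: 1 / x),  # y = x/(ax - b)
    "exponential": (math.log,        lambda x: x),      # y = a e^(bx)
    "power":       (math.log,        math.log),         # y = a x^b
    "logarithmic": (lambda y: y,     math.log),         # y = a + b ln x
    "exp_recip":   (math.log,        lambda x: 1 / x),  # y = a e^(b/x)
}

def fit_linearized(kind, xs, ys):
    """Transform the data, then fit the straight line Y = b0 + b1 X."""
    ty, tx = TRANSFORMS[kind]
    return ols([tx(x) for x in xs], [ty(y) for y in ys])

xs = [1.0, 2.0, 3.0, 4.0, 5.0]

# Power function with a = 3, b = 1.5 (illustrative values)
ys_power = [3.0 * x ** 1.5 for x in xs]
b0, b1 = fit_linearized("power", xs, ys_power)
a_hat, b_hat = math.exp(b0), b1          # back-transform: b0 = ln a, b1 = b

# Hyperbola with a = 2, b = 0.5 (illustrative values)
ys_hyp = [x / (2.0 * x - 0.5) for x in xs]
h0, h1 = fit_linearized("hyperbola", xs, ys_hyp)
a_hyp, b_hyp = h0, -h1                   # back-transform: b0 = a, b1 = -b
```

With real, noisy data the least squares criterion is applied on the transformed scale, so the back-transformed estimates generally differ from a direct non-linear fit on the original scale.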
Polynomial Models: y = b0 + b1 x + b2 x^2 + b3 x^3
Linear form: Y = b0 + b1 X1 + b2 X2 + b3 X3
Variables: Y = y, X1 = x, X2 = x^2, X3 = x^3
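Because the powers of x are simply treated as separate regressors X1, X2, X3, a polynomial can be fitted with ordinary multiple regression. The sketch below solves the normal equations (X'X) b = X'y in plain Python; the function names and the data are hypothetical, and a production analysis would normally use a numerically more stable method (for example QR decomposition) from a statistics package.

```python
def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]   # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        s = sum(M[r][k] * z[k] for k in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z

def least_squares(rows, ys):
    """Least squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    return solve(xtx, xty)

def cubic_row(x):
    # Intercept column, then X1 = x, X2 = x^2, X3 = x^3
    return [1.0, x, x ** 2, x ** 3]

# Noise-free illustrative data from y = 1 + 2x - 0.5x^2 + 0.25x^3
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [1.0 + 2.0 * x - 0.5 * x ** 2 + 0.25 * x ** 3 for x in xs]
coef = least_squares([cubic_row(x) for x in xs], ys)  # [b0, b1, b2, b3]
```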
Exponential Models with a Polynomial Exponent: y = e^(b0 + b1 x + b2 x^2 + b3 x^3 + b4 x^4)
Linear form: ln y = b0 + b1 X1 + b2 X2 + b3 X3 + b4 X4
Variables: Y = ln y, X1 = x, X2 = x^2, X3 = x^3, X4 = x^4
Trigonometric Polynomial Models: y = b0 + g1 cos(2πn1x) + d1 sin(2πn1x) + … + gk cos(2πnkx) + dk sin(2πnkx)
Linear form: Y = b0 + g1 C1 + d1 S1 + … + gk Ck + dk Sk
Variables: Y = y, C1 = cos(2πn1x), S1 = sin(2πn1x), … , Ck = cos(2πnkx), Sk = sin(2πnkx)
Note:
• b0, g1, d1, … , gk, dk are parameters that have to be estimated.
• n1, n2, n3, … , nk are known constants (the frequencies in the trig polynomial).
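Since the frequencies are known, the cosine and sine terms are ordinary regressors and the model is linear in its parameters. The sketch below uses a special case that is an assumption of this example, not something stated on the slides: the data are observed at N equally spaced points x = j/N covering whole periods, with integer frequencies below N/2. In that case the regressor columns are mutually orthogonal and the least squares estimates reduce to simple averages. For unequally spaced data the general least squares solution is needed instead. The function name `trig_fit` and the data are hypothetical.

```python
import math

def trig_fit(ys, freqs):
    """Least squares fit of a trig polynomial for y observed at x = j/N.

    Valid only for equally spaced x over whole periods and integer
    frequencies below N/2, where the regressors are orthogonal.
    """
    n = len(ys)
    xs = [j / n for j in range(n)]
    b0 = sum(ys) / n
    coefs = {}
    for nu in freqs:
        g = 2.0 / n * sum(y * math.cos(2 * math.pi * nu * x) for x, y in zip(xs, ys))
        d = 2.0 / n * sum(y * math.sin(2 * math.pi * nu * x) for x, y in zip(xs, ys))
        coefs[nu] = (g, d)   # (cosine coefficient, sine coefficient)
    return b0, coefs

def truth(x):
    # Illustrative noise-free signal with frequencies 1 and 3
    return (5.0 + 2.0 * math.cos(2 * math.pi * x) - 1.0 * math.sin(2 * math.pi * x)
            + 0.5 * math.cos(2 * math.pi * 3 * x))

N = 24
ys = [truth(j / N) for j in range(N)]
b0, coefs = trig_fit(ys, [1, 3])
```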
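Response Surface Models
Dependent variable Y and two independent variables x1 and x2. (These ideas are easily extended to more than two independent variables.)
The Model (a cubic response surface model):
Y = b0 + b1 x1 + b2 x2 + b3 x1^2 + b4 x1 x2 + b5 x2^2 + b6 x1^3 + b7 x1^2 x2 + b8 x1 x2^2 + b9 x2^3 + ε
or
Y = b0 + b1 X1 + b2 X2 + b3 X3 + b4 X4 + b5 X5 + b6 X6 + b7 X7 + b8 X8 + b9 X9 + ε
where X1 = x1, X2 = x2, X3 = x1^2, X4 = x1 x2, X5 = x2^2, X6 = x1^3, X7 = x1^2 x2, X8 = x1 x2^2, X9 = x2^3.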
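A cubic response surface in two variables has nine regressors besides the intercept: every product x1^i x2^j with total degree 1 to 3. The sketch below builds those regressors and evaluates the surface. The ordering of the nine terms is an assumption made for this example, and both function names are hypothetical.

```python
def cubic_surface_terms(x1, x2):
    """Nine regressors X1..X9 of a cubic response surface.

    The ordering (linear, quadratic, then cubic terms) is an assumption.
    """
    return [
        x1, x2,                                  # linear terms
        x1 ** 2, x1 * x2, x2 ** 2,               # quadratic terms
        x1 ** 3, x1 ** 2 * x2, x1 * x2 ** 2, x2 ** 3,  # cubic terms
    ]

def predict(b, x1, x2):
    """Evaluate Y = b0 + b1 X1 + ... + b9 X9 (error term omitted)."""
    terms = cubic_surface_terms(x1, x2)
    return b[0] + sum(bi * t for bi, t in zip(b[1:], terms))

terms = cubic_surface_terms(2, 3)
y_hat = predict([1] * 10, 2, 3)   # all coefficients set to 1 for illustration
```

Once the nine columns are built, the model is fitted exactly like any other multiple linear regression.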
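The Bulging Rule
[Figure: four quadrants showing which way to transform according to the direction in which the curve bulges. Upper left: x down, y up. Upper right: x up, y up. Lower left: x down, y down. Lower right: x up, y down.]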
Non-Linear Models
Nonlinearizable models
• Many models cannot be transformed into a linear model.
Non-Linear Growth Models
The Mechanistic Growth Model
Equation: Y = α(1 − β e^(−kx)) + ε
or (ignoring ε) "rate of increase in Y" = dY/dx = k(α − Y)
The Logistic Growth Model
Equation: Y = α / (1 + β e^(−kx)) + ε
or (ignoring ε) "rate of increase in Y" = dY/dx = kY(α − Y)/α
The Gompertz Growth Model
Equation: Y = α exp(−β e^(−kx)) + ε
or (ignoring ε) "rate of increase in Y" = dY/dx = kY ln(α/Y)
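Each growth model can be described either by its curve or by its "rate of increase in Y". The sketch below checks numerically that the two descriptions agree, assuming the commonly used textbook parametrizations Y = α(1 − βe^(−kx)), Y = α/(1 + βe^(−kx)) and Y = α exp(−βe^(−kx)). These forms, the parameter values, and the helper names are assumptions of this example.

```python
import math

def mechanistic(x, a, b, k):
    return a * (1.0 - b * math.exp(-k * x))

def logistic(x, a, b, k):
    return a / (1.0 + b * math.exp(-k * x))

def gompertz(x, a, b, k):
    return a * math.exp(-b * math.exp(-k * x))

# Rate of increase dY/dx written as a function of the current level Y
RATES = {
    mechanistic: lambda y, a, k: k * (a - y),
    logistic:    lambda y, a, k: k * y * (a - y) / a,
    gompertz:    lambda y, a, k: k * y * math.log(a / y),
}

def slope(f, x, a, b, k, h=1e-6):
    """Central-difference approximation to dY/dx."""
    return (f(x + h, a, b, k) - f(x - h, a, b, k)) / (2 * h)

a, b, k, x = 10.0, 2.0, 0.7, 1.3   # illustrative parameter values
checks = {
    f.__name__: (slope(f, x, a, b, k), RATES[f](f(x, a, b, k), a, k))
    for f in RATES
}
```

In all three models the rate falls to zero as Y approaches α, which is why α is interpreted as the limiting size. None of the three can be made linear in all of its parameters by transforming Y or x, so they are fitted by non-linear least squares.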
Example: daily auto accidents in Saskatchewan from 1984 to 1992
Data collected:
• Date
• Number of accidents
Factors we want to consider:
• Trend
• Yearly cyclical effect
• Day of the week effect
• Holiday effects
Trend
• This will be modeled by a linear function: Y = b0 + b1 X
• (more generally, a polynomial): Y = b0 + b1 X + b2 X^2 + b3 X^3 + …
Yearly Cyclical Trend
• This will be modeled by a trig polynomial, that is, sin and cos functions with differing frequencies (periods):
Y = d1 sin(2πf1X) + g1 cos(2πf1X) + d2 sin(2πf2X) + g2 cos(2πf2X) + …
Day of the Week Effect
• This will be modeled using "dummy" variables: a1D1 + a2D2 + a3D3 + a4D4 + a5D5 + a6D6
• Di = 1 if day of week = i, 0 otherwise
Holiday Effects
• These will also be modeled using "dummy" variables.
Independent variables: X = day, D1, D2, D3, D4, D5, D6, S1, S2, S3, S4, S5, S6, C1, C2, C3, C4, C5, C6, NYE, HW, V1, V2, cd, T1, T2
• Si = sin(0.017202423838959*i*day)
• Ci = cos(0.017202423838959*i*day)
Dependent variable: Y = daily accident frequency
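The constant 0.017202423838959 equals 2π/365.25, so S1 and C1 complete one cycle per year and Si, Ci complete i cycles per year. The sketch below builds the trend, day-of-week and yearly-cycle regressors for a single date. The function name, the 1 to 7 weekday coding with day 7 as the all-zero baseline, and the example inputs are assumptions; the holiday dummies (NYE, HW, V1, V2, cd, T1, T2) are left out because their definitions are not given here.

```python
import math

OMEGA = 0.017202423838959  # constant from the slide; equals 2*pi/365.25

def design_row(day, weekday):
    """Regressors for one date: day, D1..D6, S1..S6, C1..C6.

    `day` is the running day number; `weekday` is coded 1..7, with
    day 7 as the all-zero baseline (an assumption of this sketch).
    """
    d = [1 if weekday == i else 0 for i in range(1, 7)]   # day-of-week dummies
    s = [math.sin(OMEGA * i * day) for i in range(1, 7)]  # S1..S6
    c = [math.cos(OMEGA * i * day) for i in range(1, 7)]  # C1..C6
    return [day] + d + s + c

row = design_row(day=100, weekday=3)
baseline = design_row(day=1, weekday=7)
```

Only six dummies are used for seven days because a seventh would be an exact linear combination of the intercept and the other six.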
ANALYSIS OF VARIANCE
              SUM OF SQUARES     DF    MEAN SQUARE    F RATIO
REGRESSION         976292.38     18       54238.46     114.60
RESIDUAL           1547102.1   3269       473.2646

VARIABLES IN EQUATION FOR PACC
                                STD. ERROR    STD REG                  F
VARIABLE        COEFFICIENT     OF COEFF      COEFF     TOLERANCE  TO REMOVE  LEVEL
(Y-INTERCEPT       60.48909)
day      1      0.11107E-02     0.4017E-03     0.038     0.99005       7.64     1
D1       9          4.99945     1.4272         0.063     0.57785      12.27     1
D2      10          9.86107     1.4200         0.124     0.58367      48.22     1
D3      11          9.43565     1.4195         0.119     0.58311      44.19     1
D4      12         13.84377     1.4195         0.175     0.58304      95.11     1
D5      13         28.69194     1.4185         0.363     0.58284     409.11     1
D6      14         21.63193     1.4202         0.273     0.58352     232.00     1
S1      15         -7.89293     0.5413        -0.201     0.98285     212.65     1
S2      16         -3.41996     0.5385        -0.087     0.99306      40.34     1
S4      18         -3.56763     0.5386        -0.091     0.99276      43.88     1
C1      21         15.40978     0.5384         0.393     0.99279     819.12     1
C2      22          7.53336     0.5397         0.192     0.98816     194.85     1
C3      23         -3.67034     0.5399        -0.094     0.98722      46.21     1
C4      24         -1.40299     0.5392        -0.036     0.98999       6.77     1
C5      25         -1.36866     0.5393        -0.035     0.98955       6.44     1
NYE     27         32.46759     7.3664         0.061     0.97171      19.43     1
HW      28         35.95494     7.3516         0.068     0.97565      23.92     1
T2      33        -18.38942     7.4039        -0.035     0.96191       6.17     1

VARIABLES NOT IN EQUATION
                PARTIAL                     F
VARIABLE        CORR.      TOLERANCE    TO ENTER   LEVEL
IACC     7      0.49837     0.78647      1079.91     0
Dths     8      0.04788     0.93491         7.51     0
S3      17     -0.02761     0.99511         2.49     1
S5      19     -0.01625     0.99348         0.86     1
S6      20     -0.00489     0.99539         0.08     1
C6      26     -0.02856     0.98788         2.67     1
V1      29     -0.01331     0.96168         0.58     1
V2      30     -0.02555     0.96088         2.13     1
cd      31      0.00555     0.97172         0.10     1
T1      32      0.00000     0.00000         0.00     1

***** F LEVELS( 4.000, 3.900) OR TOLERANCE INSUFFICIENT FOR FURTHER STEPPING
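Several entries in the stepwise output can be reproduced from the others, which is a useful check when reading such a table. The sketch below recomputes the mean squares, the overall F ratio and R² from the sums of squares, and uses the fact that the F-to-remove for a single variable is the square of its coefficient divided by its standard error. All numbers are copied from the output above; the variable and function names are hypothetical.

```python
# Values copied from the ANALYSIS OF VARIANCE table
ss_reg, df_reg = 976292.38, 18
ss_res, df_res = 1547102.1, 3269

ms_reg = ss_reg / df_reg             # regression mean square
ms_res = ss_res / df_res             # residual mean square
f_ratio = ms_reg / ms_res            # overall F ratio
r_squared = ss_reg / (ss_reg + ss_res)

def f_to_remove(coef, se):
    """Partial F for dropping one variable: (coefficient / std. error)^2."""
    return (coef / se) ** 2

f_day = f_to_remove(0.11107e-02, 0.4017e-03)  # table reports 7.64
f_d5 = f_to_remove(28.69194, 1.4185)          # table reports 409.11
```

Small differences from the printed values come from the rounding of the coefficients and standard errors in the output. The computed R² is roughly 0.39, so the 18 selected variables account for a little under 40% of the day-to-day variation in accident counts.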