130 likes | 398 Views
Data Analysis Using R: Multiple linear regression analysis. Tuan V. Nguyen Garvan Institute of Medical Research, Sydney, Australia. Data. Effects of temperature (x1), time (x2) on CO2 production.
E N D
Data Analysis Using R:Multiple linear regression analysis Tuan V. Nguyen Garvan Institute of Medical Research, Sydney, Australia
Data • Effects of temperature (x1), time (x2) on CO2 production. Chú thích: y = sản lượng CO2; X1 = thời gian (phút); X2 = nhiệt độ (C); X3 = phần trăm hòa tan; X4 = lượng dầu (g/100g); X5 = lượng than đá; X6 = tổng số lượng hòa tan; X7 = số hydrogen tiêu thụ.
Read data into R y <- c(36.98,13.74,10.08, 8.53,36.42,26.59,19.07, 5.96,15.52,56.61, 26.72,20.80, 6.99,45.93,43.09,15.79,21.60,35.19,26.14, 8.60, 11.63, 9.59, 4.42,38.89,11.19,75.62,36.03) x1 <- c(5.1,26.4,23.8,46.4, 7.0,12.6,18.9,30.2,53.8,5.6,15.1,20.3,48.4, 5.8,11.2,27.9,5.1,11.7,16.7,24.8,24.9,39.5,29.0, 5.5, 11.5, 5.2,10.6) x2 <- c(400,400, 400, 400, 450, 450, 450, 450, 450, 400, 400, 400, 400, 425, 425, 425, 450, 450, 450, 450, 450, 450, 450, 460, 450, 470, 470) x3 <- c(51.37,72.33,71.44,79.15,80.47,89.90,91.48,98.60,98.05,55.69, 66.29,58.94,74.74,63.71,67.14,77.65,67.22,81.48,83.88,89.38, 79.77,87.93,79.50,72.73,77.88,75.50,83.15) x4 <- c(4.24,30.87,33.01,44.61,33.84,41.26,41.88,70.79,66.82, 8.92,17.98,17.79,33.94,11.95,14.73,34.49,14.48,29.69,26.33, 37.98,25.66,22.36,31.52,17.86,25.20, 8.66,22.39) x5 <- c(1484.83, 289.94, 320.79, 164.76, 1097.26, 605.06, 405.37, 253.70, 142.27,1362.24, 507.65, 377.60, 158.05, 130.66, 682.59, 274.20, 1496.51, 652.43, 458.42, 312.25, 307.08, 193.61, 155.96,1392.08, 663.09,1464.11, 720.07) x6 <- c(2227.25, 434.90, 481.19, 247.14,1645.89, 907.59, 608.05, 380.55, 213.40,2043.36, 761.48, 566.40, 237.08,1961.49,1023.89, 411.30,2244.77, 978.64, 687.62, 468.38, 460.62, 290.42, 233.95,2088.12, 994.63,2196.17,1080.11) x7 <- c(2.06,1.33,0.97,0.62,0.22,0.76,1.71,3.93,1.97,5.08,0.60,0.90, 0.63,2.04,1.57,2.38,0.32,0.44,8.82,0.02,1.72,1.88,1.43, 1.35,1.61,4.78,5.88) REGdata <- data.frame(y, x1,x2,x3,x4,x5,x6,x7)
Results of analysis > reg <- lm(y ~ x1+x2+x3+x4+x5+x6+x7, data=REGdata) > summary(reg) Residuals: Min 1Q Median 3Q Max -20.035 -4.681 -1.144 4.072 21.214 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 53.937016 57.428952 0.939 0.3594 x1 -0.127653 0.281498 -0.453 0.6553 x2 -0.229179 0.232643 -0.985 0.3370 x3 0.824853 0.765271 1.078 0.2946 x4 -0.438222 0.358551 -1.222 0.2366 x5 -0.001937 0.009654 -0.201 0.8431 x6 0.019886 0.008088 2.459 0.0237 * x7 1.993486 1.089701 1.829 0.0831 . --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 10.61 on 19 degrees of freedom Multiple R-Squared: 0.728, Adjusted R-squared: 0.6278 F-statistic: 7.264 on 7 and 19 DF, p-value: 0.0002674
Stepwise regression reg <- lm(y ~ ., data=REGdata) step(reg, direction=”both”)
Bayesian Model Average (BMA) analysis library(BMA) xvars <- REGdata[,-1] co2 <- REGdata[,1] bma <- bicreg(xvars, co2, strict=FALSE, OR=20) summary(bma)
Bayesian Model Average (BMA) analysis 16 models were selected Best 5 models (cumulative posterior probability = 0.6599 ): p!=0 EV SD model 1 model 2 model 3 model 4 model 5 Intercept 100.0 5.75672 14.6244 2.5264 6.1441 8.6120 7.5936 7.3537 x1 12.4 -0.01807 0.1008 . . . -0.1393 . x2 10.4 -0.00075 0.0282 . . . . . x3 10.7 0.00011 0.0791 . . . . -0.0572 x4 20.2 -0.03059 0.1020 . . -0.1419 . . x5 10.5 -0.00023 0.0030 . . . . . x6 100.0 0.01815 0.0040 0.0185 0.0193 0.0164 0.0162 0.0179 x7 73.7 1.60766 1.2821 2.1857 . 2.1628 2.1233 2.2382 nVar 2 1 3 3 3 r2 0.700 0.636 0.709 0.704 0.701 BIC -25.8832 -24.0238 -23.4412 -22.9721 -22.6801 post prob 0.311 0.123 0.092 0.072 0.063
Bayesian Model Average (BMA) analysis imageplot.bma(bma)