
USE OF GENERALIZED LINEAR MODEL IN FORECASTING OF AIR PASSENGERS CONVEYANCES FROM EU COUNTRIES



  1. USE OF GENERALIZED LINEAR MODEL IN FORECASTING OF AIR PASSENGERS CONVEYANCES FROM EU COUNTRIES
     Catherine Zhukovskaya, Faculty of Transport and Mechanical Engineering, Riga Technical University

  2. Outline
     • Introduction
     • Informative base
     • Used models for analyzing and forecasting of the air passengers’ conveyances
     • Elaboration of linear models
     • Elaboration of generalized linear models
     • Conclusion
     • References

  3. 1. Introduction
     • Most of the literature devoted to forecasting transport flows contains only simple forecasting models, based either on time series methods [Hünt (2003)] or on linear regression with a small number of explanatory variables [Butkevičius, Vyskupaitis (2005), Šliupas (2006)].
     • This investigation considers two approaches to forecasting air passenger conveyances from EU countries:
       • the classical method of linear regression;
       • the generalized linear model (GLM).
     • The aim of the investigation is to illustrate the advantage of the GLM over simple linear regression models.
     • Verification of the models and estimation of the unknown parameters are included as well.
     • All calculations were done with Statistica 6.0 and with purpose-built software written in MathCad 12.

  4. 2. Informative base
     Factors
     • The forecast variable was the number of air passengers carried, expressed in millions of passengers.
     • t1 - total population of the country (TP), millions of inhabitants;
     • t2 - area of the country (AREA), thousands of km²;
     • t3 - population density of the country (PD), number of inhabitants per km²;
     • t4 - monthly labour costs (MLC), thousands of euros;
     • t5 - gross domestic product per capita in Purchasing Power Standards (GDP_PPS);
     • t6 - gross domestic product (GDP), billions of euros;
     • t7 - comparative price level (CPL);
     • t8 - inflation rate (IR);
     • t9 - unemployment rate (UR);
     • t10 - labour productivity per hour worked (LPHW).

  5. The following 25 EU countries were selected: Belgium, Czech Republic, Denmark, Germany, Estonia, Greece, Spain, France, Ireland, Italy, Cyprus, Latvia, Lithuania, Luxembourg, Hungary, Malta, Netherlands, Austria, Poland, Portugal, Slovenia, Slovakia, Finland, Sweden and United Kingdom.
     • The period considered was 1996 to 2005.
     • All data for this investigation were taken from the electronic database of the Statistical Office of the European Communities (EUROSTAT), http://epp.eurostat.ec.europa.eu
     • The final number of observations was 161:
       • data for 1996 to 2004 were used for estimation and forecasting: 140 observations;
       • data for 2005 were used to check the quality of forecasting, the so-called cross-validation (CV): 21 observations.
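The split described on this slide can be sketched as follows. This is only an illustration under stated assumptions: the record layout (`country`, `year`, `passengers`) and the function name `split_by_year` are hypothetical, and the sample records are placeholders rather than EUROSTAT values. Note that what the slides call cross-validation is, in this sketch, a single hold-out year.

```python
def split_by_year(observations, holdout_year=2005):
    """Split country-year records into an estimation set and a hold-out set.

    Records before `holdout_year` are used for estimation; records from
    `holdout_year` itself are kept aside to check forecast quality.
    """
    estimation = [row for row in observations if row["year"] < holdout_year]
    holdout = [row for row in observations if row["year"] == holdout_year]
    return estimation, holdout


# Placeholder records (hypothetical values, not EUROSTAT data).
records = [
    {"country": "A", "year": 2003, "passengers": 1.0},
    {"country": "A", "year": 2004, "passengers": 1.2},
    {"country": "A", "year": 2005, "passengers": 1.5},
    {"country": "B", "year": 2004, "passengers": 7.0},
    {"country": "B", "year": 2005, "passengers": 7.4},
]
estimation, holdout = split_by_year(records)
```

Applied to the real data set, the same rule would be expected to yield the 140 estimation observations and 21 hold-out observations quoted on the slide.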

  6. 3. Used models for analyzing and forecasting of the air passengers’ conveyances
     Main notions
     • Each observation consists of the data for one country in one year.
     • The main object of consideration was air passenger conveyances from EU countries.
     • All the models considered were group models [Andronov (1983)].
     • Classification of regression models according to their mathematical form:
       • linear regression models;
       • generalized linear regression models (GLM).

  7. The linear regression model [Härdle (2004)]:
     $E(Y^{(k)}(x)) = x^T \beta$, (1)
     where:
     • $Y^{(k)}$ is the dependent variable for the k-th considered model;
     • $x = (x_1, x_2, \dots, x_d)^T$ is the d-dimensional vector of explanatory variables;
     • $\beta = (\beta_0, \beta_1, \beta_2, \dots, \beta_d)^T$ is the coefficient vector, with intercept $\beta_0$, that has to be estimated from observations of $Y^{(k)}$ and $x$.
     The generalized linear regression model:
     $E(Y^{(k)}(x)) = G\{x^T \beta\}$, (2)
     where $G(\cdot)$ is a known function of a one-dimensional variable.
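The difference between equations (1) and (2) can be illustrated with a minimal sketch. The function names are hypothetical, and the choice G = exp is an assumption made only for illustration; the slide does not state which function G is used at this point.

```python
import numpy as np


def linear_mean(x, beta):
    """Equation (1): E(Y) = beta_0 + beta_1*x_1 + ... + beta_d*x_d."""
    x = np.asarray(x, dtype=float)
    beta = np.asarray(beta, dtype=float)
    return beta[0] + x @ beta[1:]


def glm_mean(x, beta, G=np.exp):
    """Equation (2): E(Y) = G(x^T beta). G = exp is an illustrative assumption."""
    return G(linear_mean(x, beta))


beta = [1.0, 2.0, -1.0]   # hypothetical coefficients (intercept first)
x = [0.5, 1.0]            # hypothetical explanatory values
eta = linear_mean(x, beta)   # linear predictor
mu = glm_mean(x, beta)       # the same predictor passed through G
```

The only change from the linear model to the GLM is the outer function G applied to the same linear predictor.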

  8. 4. Elaboration of linear models
     • The basic criteria for choosing the best model:
       • multiple coefficient of determination ($R^2$);
       • Fisher criterion (F);
       • sum of the squares of the residuals (SSRes);
       • sum of the squares of the residuals for the cross-validation (CV SSRes).
     • All statistical hypotheses were tested at the significance level $\alpha = 0.05$.
     MODEL #1
     $Y^{(1)} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_{10} x_{10}$,
     where $Y^{(1)}$ is the total number of air passengers carried and $x_j = t_j$ for $j = 1, 2, \dots, 10$.
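The selection criteria listed above can be computed as in the following sketch. It assumes ordinary least squares with an intercept and the usual overall F statistic, F = (R²/d) / ((1 − R²)/(n − d − 1)); the function names and the synthetic data are hypothetical and are not the presentation's data.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares estimate of (beta_0, ..., beta_d) with an intercept column."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta


def f_statistic(r2, n, d):
    """Overall F statistic for a model with d regressors and n observations."""
    return (r2 / d) / ((1.0 - r2) / (n - d - 1))


def fit_criteria(X, y, beta):
    """Return R^2, F and SSRes for the fitted coefficients."""
    n, d = X.shape
    y_hat = beta[0] + X @ beta[1:]
    ss_res = float(np.sum((y - y_hat) ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    return {"R2": r2, "F": f_statistic(r2, n, d), "SSRes": ss_res}


# Synthetic data for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 2.0 + 3.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.1, size=50)
beta = fit_ols(X, y)
criteria = fit_criteria(X, y, beta)

# Consistency check against the slides: R^2 = 0.829 with n = 140, d = 5.
f_model2 = f_statistic(0.829, 140, 5)
```

With n = 140 and d = 5, the rounded R² = 0.829 reported for Model #2 gives F of about 129.9, close to the reported 129.85. CV SSRes is the same sum of squared residuals evaluated on the 2005 observations with the coefficients fixed.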

  9. Results for the MODEL #1
     $\hat{E}(Y^{(1)}(x)) = 14 - 0.77x_1 + 0.16x_2 + 185.8x_3 - 2.44x_4 + 0.53x_5 + 0.07x_6 + 0.05x_7 + 0.32x_8 - 1.2x_9 - 1.03x_{10}$
     Table 1 (not preserved in the transcript)
     $R^2 = 0.831$; Fisher criterion F = 63.49

  10. New factor
      t11 (ON) = 0, if the considered country is an old member of the EU; 1, if the considered country is a new one.
      MODEL #2
      $Y^{(2)} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5$,
      where $Y^{(2)} = Y^{(1)}$; $x_1 = t_2$, $x_2 = t_3$, $x_3 = t_6$, $x_4 = t_{10}$, $x_5 = t_{11}$.
      Results for the MODEL #2
      $\hat{E}(Y^{(2)}(x)) = 13.56 + 0.09x_1 + 134.01x_2 + 0.05x_3 - 0.68x_4 + 29.36x_5$
      Table 2 (not preserved in the transcript)
      $R^2 = 0.829$; Fisher criterion F = 129.85

  11. Modifications of factors
      MODEL #3
      $Y^{(3)} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5$,
      where $Y^{(3)} = Y^{(1)}$.
      Results for the MODEL #3
      $\hat{E}(Y^{(3)}(x)) = -6.34 + 113.26x_1 + 0.14x_2 - 0.52x_3 - 0.03x_4 + 3.03x_5$
      Table 3 (not preserved in the transcript)
      $R^2 = 0.867$; Fisher criterion F = 174.08

  12. Analysis of observed and predicted values for the MODEL #3
      Figure 1. Plot of observed and predicted values.
      Figure 2. Plot of observed and predicted values for the CV.

  13. MODEL #4
      $Y^{(4)} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_9 x_9$,
      where $Y^{(4)} = Y^{(1)} / t_1$ is the ratio between the total number of air passengers carried and the number of inhabitants of the country.
      Results for the MODEL #4
      $\hat{E}(Y^{(4)}(x)) = 0.56 + 2.33x_1 - 1.04x_2 - 0.02x_3 + 0.001x_4 + 1.76x_5 - 0.0004x_6 + 0.04x_7 + 0.17x_8$
      Table 4 (not preserved in the transcript)
      $R^2 = 0.760$; Fisher criterion F = 45.81

  14. New factor
      t12 (HL) = 0, if the value $y / t_1$ for the considered country is small (less than 2); 1, if the value $y / t_1$ is larger than 2.
      MODEL #5
      $Y^{(5)} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_8 x_8$,
      where $Y^{(5)} = Y^{(4)}$.
      Results for the MODEL #5
      $\hat{E}(Y^{(5)}(x)) = 0.99 - 0.46x_1 - 0.02x_2 - 0.02x_3 - 0.02x_4 + 0.01x_5 + 1.27x_6 + 1.15x_7 + 0.07x_8$
      Table 5 (not preserved in the transcript)
      $R^2 = 0.864$; Fisher criterion F = 104.174
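The indicator factor t12 (HL) defined on this slide can be encoded as in the sketch below. The function name is hypothetical, and assigning 0 when the ratio equals exactly 2 is an assumption, since the slide only covers values below and above the threshold.

```python
def high_low_indicator(passengers_millions, population_millions, threshold=2.0):
    """t12 (HL): 1 if passengers carried per inhabitant exceed the threshold, else 0.

    A ratio exactly equal to the threshold is mapped to 0 (an assumption;
    the slide does not define this case).
    """
    ratio = passengers_millions / population_millions
    return 1 if ratio > threshold else 0


# Hypothetical values: 10 million passengers, 4 million inhabitants.
hl_high = high_low_indicator(10.0, 4.0)   # ratio 2.5
hl_low = high_low_indicator(3.0, 4.0)     # ratio 0.75
```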

  15. Summary results for the linear regression models
      Table 6 (not preserved in the transcript)

  16. Analysis of observed and predicted values for the MODEL #5
      Figure 3. Plot of recalculated observed and predicted values.
      Figure 4. Plot of recalculated observed and predicted values for the CV.

  17. 5. Elaboration of generalized linear models
      • The best linear regression model (Model #5) was chosen for further investigation.
      • Two different GLMs were considered. In both of them the value of the regressand $Y^{(GLM)} = Y^{(5)} / t_1$ and the collection of regressors are the same as for Model #5.
      GLM1: equation (3) (formula not preserved in the transcript),
      where $h_i$ is the total population, $x_i$ is the column vector of independent variables, and $i$ is the observation number, $i = 1, 2, \dots, n$.
      GLM2: equation (4) (formula not preserved in the transcript),
      where $a$ is an additional parameter (a constant).

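  18. • To estimate the unknown parameter vector $\beta$ we used the least squares criterion
      $R_0(\beta) = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$, (5)
      where $Y_i$ and $\hat{Y}_i$ are the observed and calculated values of $Y$.
      1. Linearization
      LM1: equation (6) (formula not preserved in the transcript)
      LM2: equation (7) (formula not preserved in the transcript)
      where $Y^* = Y / h$.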

  19. The models LM1 and LM2 give estimates for E(Y) (formulas not preserved in the transcript).
      • The values of SSRes and CV SSRes for Model #5 and the LM models: Table 7 (not preserved in the transcript).
      • Linearization alone gives poor results. To improve them, a two-stage estimation procedure was developed.
      • The first stage is the linearization described above. The second stage is a calibration in which the estimates obtained are refined with the well-known gradient method.
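The two-stage procedure can be sketched as follows. Because equations (3) and (6) are not preserved in the transcript, the model form used here, E(Y_i) = h_i · exp(x_i^T β), is an assumption chosen only to make the sketch concrete; the function names, the backtracking step-size rule and the synthetic data are likewise hypothetical and do not claim to reproduce the author's MathCad implementation.

```python
import numpy as np


def predict(X1, h, beta):
    """Assumed GLM form: E(Y_i) = h_i * exp(x_i^T beta)."""
    return h * np.exp(X1 @ beta)


def ss_res(X1, h, y, beta):
    """Least squares criterion: sum of squared residuals."""
    with np.errstate(over="ignore"):
        return float(np.sum((y - predict(X1, h, beta)) ** 2))


def linearize(X1, h, y):
    """Stage 1: regress ln(Y / h) on x, which is linear under the assumed form."""
    beta, *_ = np.linalg.lstsq(X1, np.log(y / h), rcond=None)
    return beta


def calibrate(X1, h, y, beta, steps=200):
    """Stage 2: gradient descent on SSRes, halving the step until it improves."""
    beta = beta.copy()
    current = ss_res(X1, h, y, beta)
    for _ in range(steps):
        y_hat = predict(X1, h, beta)
        grad = -2.0 * X1.T @ ((y - y_hat) * y_hat)   # d SSRes / d beta
        step = 1.0
        while step > 1e-12:
            candidate = beta - step * grad
            value = ss_res(X1, h, y, candidate)
            if value < current:
                beta, current = candidate, value
                break
            step *= 0.5
        else:
            break   # no improving step found; stop
    return beta


# Synthetic data for illustration only.
rng = np.random.default_rng(1)
n = 60
X1 = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
h = rng.uniform(1.0, 50.0, size=n)
beta_true = np.array([0.2, 0.5, -0.3])
y = h * np.exp(X1 @ beta_true) * np.exp(rng.normal(scale=0.2, size=n))

beta_lin = linearize(X1, h, y)
beta_cal = calibrate(X1, h, y, beta_lin)
```

The linearized fit minimizes squared error on the log scale, not on the original scale, which is one plausible reason the linearized estimates score poorly on SSRes; starting the gradient iterations from them can only lower SSRes in this sketch, because a step is accepted only when it improves the criterion.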

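  20. 2. Calibration
      • Gradients for the least squares criterion:
      GLM1: equation (8) (formula not preserved in the transcript)
      GLM2: equation (9) (formula not preserved in the transcript)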

  21. For GLM2, the criterion $R_0$ was minimized with respect to both the coefficient vector $\beta$ and the additional parameter $a$.
      GLM1 and GLM2 give estimates for E(Y) (formulas not preserved in the transcript).
      Table 8 (not preserved in the transcript)

  22. Analysis of observed and predicted values for the GLM
      Figure 5. Plot of observed and predicted values.
      Figure 6. Plot of observed and predicted values for the CV.

  23. Dependence of SSRes and CV SSRes on the value of the parameter $a$ for GLM2
      Figure 7. The values of SSRes and CV SSRes as a function of the parameter $a$ for GLM2.
      • The best SSRes was obtained when $a = 2$.
      • The best CV SSRes was obtained when $a = 6$.

  24. 6. Conclusion
      • Linear and generalized linear regression models for forecasting air passenger conveyances from EU countries were considered. These models contain a large number of explanatory factors and their combinations.
      • The unknown parameters of the linear regression models were estimated with standard procedures. For the unknown parameters of the GLM, a special two-stage procedure was developed.
      • Cross-validation was the main procedure for checking the adequacy of all the models considered and for choosing the best model for forecasting.
      • The advantage of applying the GLM has been shown.

  25. 7. References
      • Andronov A.M. et al. Forecasting of air passengers conveyances on the transport. Transport, Moscow, 1983. (In Russian).
      • Butkevičius J., Vyskupaitis A. Development of passenger transportation by Lithuanian sea transport. In: Proceedings of International Conference RelStat’04, Transport and Telecommunication, Vol. 6, No 2, 2005.
      • Härdle W., Müller M., Sperlich S., Werwatz A. Nonparametric and Semiparametric Models. Springer, Berlin, 2004.
      • Hünt U. Forecasting of railway freight volume: approach of Estonian railway to arise efficiency. In: TRANSPORT – 2003, Vol. XXVIII, No 6, pp. 255-258.
      • Šliupas T. Annual average daily traffic forecasting using different techniques. In: TRANSPORT – 2006, Vol. XXI, No 1, pp. 38-43.
      • EUROSTAT YEARBOOK 2005. The statistical guide to Europe. Data 1993–2004. EU, EuroSTAT, 2005. URL: http://epp.eurostat.ec.europa.eu

  26. THANK YOU FOR YOUR ATTENTION
