Modeling and Forecasting Household and Person Level Control Input Data for Advance Travel Demand Modeling. By Xiaoyu Zhu (University of Maryland), Sabyasachee (Sabya) Mishra (University of Memphis), Timothy Welch (University of Maryland), Birat Pandey (Baltimore Metropolitan Council), Charles Baber (Baltimore Metropolitan Council). Presentation at 14th TRB Planning Applications Conference, May 6, 2013, Columbus, Ohio
Background (1) • Socio-economic and demographic data • Crucial in transportation planning • Generally developed by local agencies • Number of variables included • Current focus • Macro- to micro-level • Disaggregate and activity-based • Improvements to past studies
Background (2) • Population synthesis methods • Detailed socio-demographic characteristics • Households and employment • Households • # of workers, size • Cross-classified by income • Similarly other variables • Employment • Vehicle ownership • Occupation
Background (3) • Limitations of controlled attributes used as input to these synthesis models: • Commonly applied at more macro levels of geography • county • region • state • Forecasting poses a major problem • Iterative proportional fitting (IPF) and/or iterative proportional updating (IPU) are commonly used • Marginal and joint distribution accuracy is disrupted • Not enough detail available for each variable for synthesis • For accurate inputs, other approaches are needed • Consider the evolution process • Macro-level: less accuracy -> data available • Micro-level: higher accuracy -> data inefficiency
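For context, iterative proportional fitting (IPF) rescales a seed cross-tabulation until its row and column sums match target marginals. The Python sketch below is only an illustration: the 2x2 seed table and the marginals are made-up numbers, not data from the study, and `ipf` is a hypothetical helper name.

```python
def ipf(seed, row_targets, col_targets, tol=1e-9, max_iter=1000):
    """Scale a positive seed table until row and column sums match the targets.

    Assumes every seed cell is positive and both target lists have the same total.
    """
    table = [[float(v) for v in row] for row in seed]
    n_rows, n_cols = len(table), len(table[0])
    for _ in range(max_iter):
        # Row step: scale each row to its target total.
        for i in range(n_rows):
            factor = row_targets[i] / sum(table[i])
            table[i] = [v * factor for v in table[i]]
        # Column step: scale each column to its target total.
        for j in range(n_cols):
            col_sum = sum(table[i][j] for i in range(n_rows))
            factor = col_targets[j] / col_sum
            for i in range(n_rows):
                table[i][j] *= factor
        # Columns now match exactly; stop once the rows match as well.
        if all(abs(sum(table[i]) - row_targets[i]) < tol for i in range(n_rows)):
            break
    return table


# Made-up example: households by size class (rows) and income class (columns).
seed = [[40.0, 10.0], [20.0, 30.0]]
fitted = ipf(seed, row_targets=[60.0, 40.0], col_targets=[50.0, 50.0])
```

The fitted table reproduces the marginals but inherits its internal structure from the seed, which is one reason forecast-year joint distributions can drift when only marginals are projected.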
Objective • To develop a meso-level modeling framework • control variables at zonal level suitable for population synthesis • captures historical population evolution trend • Variables can be estimated and forecasted, such as • housing type, • householder age, • population age, • number of vehicles, • employment type, and • workers by occupation.
Methodology Framework (flowchart) • Step 1: Coefficient Estimation (logistic regression estimation): explanatory input variables from 1990 Census data; response variables for 2000 • Step 2: Forecast: base year 2000, forecasting 2010, 2020, and 2030 • Step 3: Quality Control with Validation: validate with 2010 Census; 2030 prediction at county level • Variables shown: HH Type, HH Age, Person Age, Employment, Child, Occupation, HH Size, HH Inc, # of Workers, Location Characteristics
Coefficient Estimation Step 1: Coefficient Estimation • The probability of the population falling in each age group j is modeled, where j=1 for age 0-4; j=2 for 5-14; j=3 for 15-17; j=4 for 18-24; j=5 for 25-34; j=6 for 35-44; j=7 for 45-64; j=8 for 65 and over. The age group j=7 (45-64) is selected as the reference category. • The model is a logistic regression in which X is the vector of explanatory input variables, containing the major variables (1990 population by age cohort) and secondary variables such as median income in 2000.
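A sketch of the model form, assuming the standard multinomial logit with age group 7 (45-64) as the reference category. The symbols $P_j$ (probability of age group $j$) and $\beta_j$ (coefficient vector for group $j$) are assumed notation, not taken from the slides:

```latex
\ln\!\left(\frac{P_j}{P_7}\right) = X\beta_j, \qquad j \in \{1,\dots,6,8\}

P_j = \frac{\exp(X\beta_j)}{1 + \sum_{k \neq 7} \exp(X\beta_k)} \quad (j \neq 7),
\qquad
P_7 = \frac{1}{1 + \sum_{k \neq 7} \exp(X\beta_k)}
```

Under this form the eight probabilities sum to one for every zone, and each coefficient is read relative to the 45-64 group.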
Forecasting Process (1) Step 2: Forecast • Assumption: the evolution trend from 2000 to 2030 is consistent with the trend from 1990 to 2000. • First, the probability of the 2010 population falling in each age group is calculated using 2000 as the base year.
Forecasting Process (2) Step 2: Forecast • Then the population in each age group, Age10, is calculated from the total population Pop10 in each TAZ in 2010 and the predicted age-group probabilities. • As in the previous step, the probability of the population falling in each age group in 2020 can then be calculated. • Iterating once more yields the target population by age group in 2030.
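The iterative forecast can be sketched in Python for a single TAZ. This is an illustration under stated assumptions, not the authors' implementation: it uses three age groups instead of eight, made-up coefficients and population totals, and only the previous decade's age shares as explanatory variables. `predict_shares` and `forecast` are hypothetical helper names.

```python
import math


def predict_shares(prev_shares, coefs, ref):
    """Multinomial-logit age shares; group `ref` is the reference category."""
    utils = []
    for j, (intercept, betas) in enumerate(coefs):
        if j == ref:
            # The reference group's utility is fixed at zero.
            utils.append(0.0)
        else:
            utils.append(intercept + sum(b * x for b, x in zip(betas, prev_shares)))
    denom = sum(math.exp(u) for u in utils)
    return [math.exp(u) / denom for u in utils]


def forecast(base_shares, totals_by_year, coefs, ref):
    """Roll shares forward one decade at a time and scale by total population."""
    shares = base_shares
    result = {}
    for year, total in totals_by_year.items():
        # Shares for this decade depend on the shares of the previous one.
        shares = predict_shares(shares, coefs, ref)
        # Population by age group = share * total zone population.
        result[year] = [total * s for s in shares]
    return result


# Made-up coefficients: (intercept, weights on the previous decade's shares).
coefs = [
    (-0.2, [1.0, 0.0, 0.0]),
    (0.0, [0.0, 0.0, 0.0]),   # reference group, ignored by predict_shares
    (-0.5, [0.0, 0.5, 1.0]),
]
pop = forecast(
    base_shares=[0.3, 0.5, 0.2],                      # assumed 2000 shares
    totals_by_year={2010: 1000.0, 2020: 1100.0, 2030: 1200.0},  # assumed totals
    coefs=coefs,
    ref=1,
)
```

Because the same coefficients are reused for every decade, the sketch encodes the stated assumption that the 2000 to 2030 trend follows the 1990 to 2000 trend.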
Validation Step 3: Validation • Mean Absolute Percentage Error (MAPE) and Median Absolute Percentage Error (MedAPE) are the two indicators used in the validation. • Year 2000 • The fitted values and observations for 2000 at the TAZ level • Year 2010 • The forecast result for 2010 and the 2010 Census at the county level • Year 2030 • The forecast result for 2030 and projected county control totals provided by the local agency (in our case study, MDP)
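The two indicators can be computed as below. This is a minimal Python sketch with made-up observed and predicted values; skipping zero observations is an assumption of this sketch, since the slides do not say how such cases are handled.

```python
from statistics import mean, median


def absolute_percentage_errors(observed, predicted):
    """Absolute percentage error per pair, skipping zero observations (assumption)."""
    return [100.0 * abs(p - o) / abs(o) for o, p in zip(observed, predicted) if o != 0]


def mape(observed, predicted):
    """Mean Absolute Percentage Error, in percent."""
    return mean(absolute_percentage_errors(observed, predicted))


def medape(observed, predicted):
    """Median Absolute Percentage Error, in percent."""
    return median(absolute_percentage_errors(observed, predicted))


# Made-up values for three zones.
observed = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 300.0]
mape_value = mape(observed, predicted)
medape_value = medape(observed, predicted)
```

MedAPE is less sensitive than MAPE to a few zones with very large errors, which is why the two are reported together.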
Geography and Data Source • Area • Baltimore MPO (except Baltimore City) • 814 TAZs • Data Source • Census 1990 and 2000 • Baltimore Metropolitan Council (BMC) • Maryland Department of Planning (MDP)
Estimation Result • Sample size: 763 TAZs. • Dependent variable: classified by 8 age cohorts. • The explanatory variables include the • historical age distribution in 1990, • median income, population density, employment density and group quarter density in 2000. • Other types of variables (distribution of household size, income, and number of workers) are not highly correlated with age distribution.
Estimation Result • A zone with more children aged 0-4 in 1990 will still have more children of this age in 2000. • A zone with more residents aged 5-14 in 1990 will have more residents aged 15-17 in 2000.
Estimation Result (1) • A zone with more children aged 0-4 in 1990 will still have more children of this age in 2000. • TAZs with more children aged 5-14 and 15-17 will have relatively more population aged 15-17 and 25-34, respectively. • Zones with many people aged 18 to 34 remain high in the same age group, because of the high mobility of young people and some fixed educational locations, such as colleges. • Residential locations of the 35-44 and 65+ groups in 1990 are more likely to be taken over by the young population aged 18-24 in 2000.
Estimation Result (2) • TAZs with high median income are more attractive • to the population aged 35-44 • to households with children aged 0-14. • High housing density is • more likely to have older residents (65+) • less likely to attract households with adults aged 35-44 and children aged 15-17. • Young populations • aged 25-34 prefer to live in areas with high employment density • older residents follow the opposite trend • Higher group quarter density indicates more persons aged 25-34 or 65+, as either college students or retirees.
Forecasting Result: Year 2010 (MAPE = 10.2%; MedAPE = 6.2%)
Forecasting: 2030 (MAPE = 16.7%; MedAPE = 12.0%)
Evolution Pattern Comparison between percentages of population in 2000 and 2030 for each TAZ (0-4)
Evolution Pattern Comparison between percentages of population in 2000 and 2030 for each TAZ (5-14)
Evolution Pattern Comparison between percentages of population in 2000 and 2030 for each TAZ (65+)
Conclusions • The framework is applied to forecast several age cohorts in Baltimore • The model evaluation and validation of prediction results are reasonable • The model provides good estimation and prediction for the age groups 0-24 and 35-64, but not for the 25-34 and 65+ groups. • The final prediction for 2030 • underestimates the population aged 65+ • overestimates teenagers compared with the projection data (consistent with the synthesis outcome).
Future Work • To apply the framework • to other demographic variables • incorporating more land use variables into the model • Variable accuracy in 2020 and 2030 needs to be evaluated at the TAZ level • More land use data can be incorporated • transit service • number of schools • recreation centers • as they are highly related to planning decisions
Acknowledgement • Contact Information: Sabya Mishra, Assistant Professor, Department of Civil Engineering, University of Memphis, Memphis, TN • Phone: 901-678-5043 • E-mail: smishra3@memphis.edu