India Geospatial Forum, Gurgaon, Haryana, 8th February, 2012

Evaluating district-level income distribution for India using nighttime satellite imagery and other datasets. Tilottama Ghosh, CIRES, University of Colorado, Boulder, USA, and Indicus Analytics Private Limited, New Delhi, India; Mayuri Chaturvedi, Indicus Analytics Private Limited,


Presentation Transcript


  1. Evaluating district-level income distribution for India using nighttime satellite imagery and other datasets
     Tilottama Ghosh, CIRES, University of Colorado, Boulder, USA, and Indicus Analytics Private Limited, New Delhi, India
     Mayuri Chaturvedi, Indicus Analytics Private Limited, New Delhi, India
     Laveesh Bhandari, Indicus Analytics Private Limited, New Delhi, India
     Chris Elvidge, NOAA National Geophysical Data Center (NGDC), Boulder, Colorado, USA
     Kim Baugh, CIRES, University of Colorado, Boulder, USA
     India Geospatial Forum, Gurgaon, Haryana, 8th February, 2012

  2. Overview
     • Introduction
     • Research objective
     • Methods: data used
     • Analysis
       • Step 1: State-level graphical analysis
       • Step 2: Model 1
       • Step 3: Model 2
     • Results
     • Discussion
     • Conclusion and future considerations

  3. Introduction: Why use nightlights to study income distribution?
     • Inclusive growth is one of the major policy thrust areas in the current as well as the next Five-Year Plan
     • Income distribution data are not easy to come by
     • Limitations of existing data include:
       • Under-reporting, over-reporting, misreporting
       • Inappropriate sampling and/or weighting
       • Lack of standardization across sampling organizations
       • Enormous expense involved in data collection
       • Political and economic situations in some areas inhibiting data collection
       • Huge time lags between collection and publication, and low frequency of data collection
       • Coarse spatial resolution, Modifiable Areal Unit Problem
     • Nightlights (NL) can help circumvent these problems

  4. Research objective
     • We examine the relationship between nighttime lights and income distribution, as captured by the number of households in different income brackets, and then include other datasets to improve the estimation
     • Use multinomial regression techniques to study the statistical relationship
     • Map the prediction errors to identify the regions with the largest estimation errors
     • Use socio-economic insights to understand the probable reasons behind the errors

  5. Methods: Data used
     • Radiance-calibrated nighttime image of India, 2004. Source: NOAA NGDC
     • LandScan population data, 2004. Source: Oak Ridge National Laboratory, with the United States Department of Energy
     • State and district shapefiles of India. Source: Indicus Analytics Pvt. Ltd.

  6. Methods: Data used
     • Three categories of households defined on the basis of annual household income:
       • Upper income households (earning more than Rs 10 lakh per annum)
       • Middle income households (earning Rs 3-10 lakh per annum)
       • Lower income households (earning less than Rs 3 lakh per annum)
     • Sum of lights extracted for the states and the districts
     • Area calculated for the districts
     • Total population extracted for the districts
     • Percentage of rural population in each district calculated from Indicus' data repository, which comprises urban, rural, and total population
     • Sum of lights and number of households in each income category graphed at the state level
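The bracket definitions and the "sum of lights" extraction on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the toy grid, and the use of `None` for pixels outside any district are hypothetical, and the slide does not say which bracket the exact boundary values (Rs 3 lakh and Rs 10 lakh) belong to, so they are placed in the middle bracket here.

```python
from collections import defaultdict

LAKH = 100_000  # 1 lakh = 100,000 rupees


def income_category(annual_income_rs):
    """Assign a household to one of the three brackets from the slide.

    Assumption: incomes of exactly Rs 3 lakh or Rs 10 lakh count as
    "middle"; the slide leaves the boundaries unspecified.
    """
    if annual_income_rs > 10 * LAKH:
        return "upper"
    if annual_income_rs >= 3 * LAKH:
        return "middle"
    return "lower"


def sum_of_lights(radiance, district_ids):
    """Zonal sum: add up pixel radiance within each district.

    radiance and district_ids are equally shaped 2-D lists. A district id
    of None marks a pixel outside every district (hypothetical convention).
    """
    totals = defaultdict(float)
    for rad_row, id_row in zip(radiance, district_ids):
        for value, district in zip(rad_row, id_row):
            if district is not None:
                totals[district] += value
    return dict(totals)


# Toy 2x3 grid, not real data.
radiance = [[1.0, 2.0, 0.5], [4.0, 0.0, 3.0]]
district_ids = [["A", "A", "B"], ["A", None, "B"]]
lights = sum_of_lights(radiance, district_ids)  # {"A": 7.0, "B": 3.5}
```

In practice the district ids would come from rasterizing the district shapefile onto the nightlights grid; that step is omitted here.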

  7. Analysis – Step 1: State-level graphical analysis. Lower income households, R² = 0.61

  8. Analysis – Step 1: State-level graphical analysis. Middle income households, R² = 0.81

  9. Analysis – Step 1: State-level graphical analysis. Upper income households, R² = 0.77
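For readers who want to reproduce an R² value like those quoted for the state-level graphs, the sketch below computes the coefficient of determination for a one-predictor least-squares line. It assumes a straight-line fit of household counts on sum of lights; the slides do not state the functional form behind the plotted R² values, and the function name is hypothetical.

```python
def r_squared(x, y):
    """R^2 of the ordinary least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Toy numbers, not the state-level data from the slides.
print(r_squared([1, 2, 3], [1, 3, 2]))
```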

  10. Analysis – Step 1: State-level graphical analysis, inferences
     • Lights are clearly related to the number of households in each income category, but cannot capture the entire picture at the state level
     • Examples highlight the need for analysis at a finer spatial resolution:
       • Maharashtra and Andhra Pradesh (similar lights, dissimilar incomes)
       • Madhya Pradesh and Rajasthan (similar incomes, dissimilar lights)
       • Uttar Pradesh in the graph and in the NL image (variegated lighting pattern)
     • The complex role of population is highlighted

  11. Analysis – Step 2: Developing Model 1
     Model 1: Using nighttime lights and dummy variables
     The relationship between nighttime lights and household income appeared to be logarithmic

  12. Analysis – Step 2: Developing Model 1
     Dummy variables were created for commercially and administratively important districts, which are also high-population zones

  13. Analysis – Model 1: Hypotheses of the model
     Figure labels: Contribution to nightlights; Number of households
     • While we can have data on households in different income brackets, we can obtain information only on the total sum of lights in a region
     • Hypothesis One: NL should be more closely associated with the richer households in any given region than with the poorer
     • Hypothesis Two: NL will most likely tend to under-estimate the number of poor households and over-estimate the number of rich households
     • A logarithmic multivariate regression model is used for all three income categories, with the same predictor variables

  14. Analysis – Model 1: Model coefficients
     ln Y = α + β1 ln(X1) + β2 X2 + β3 X3 + β4 X4 + β5 X5
     Significance markers: * significant at the 99% confidence level, $ significant at the 95% confidence level, # significant at the 90% confidence level
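A regression of the Model 1 form (log of household count on log of sum of lights plus dummy variables) could be fitted roughly as follows. This is a minimal sketch, not the authors' procedure: the slides do not say which estimator or software was used, so plain ordinary least squares via the normal equations is assumed, the function names are hypothetical, and the data are synthetic with a single dummy instead of the four in the slide.

```python
import math


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_log_model(households, lights, dummies):
    """OLS fit of ln Y = alpha + beta1 * ln X1 + sum of beta_k * dummy_k.

    Returns [alpha, beta1, beta2, ...] in that order.
    """
    rows = [[1.0, math.log(x1)] + list(d) for x1, d in zip(lights, dummies)]
    target = [math.log(y) for y in households]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic districts generated from known coefficients (not real data).
lights = [10.0, 50.0, 200.0, 800.0, 3000.0, 12000.0]
dummies = [(0,), (0,), (1,), (0,), (1,), (1,)]  # e.g. a metropolitan flag
households = [math.exp(2.0 + 0.8 * math.log(x) + 0.5 * d[0])
              for x, d in zip(lights, dummies)]
coef = fit_log_model(households, lights, dummies)
```

Because the synthetic data are generated without noise, `coef` recovers the generating values (2.0, 0.8, 0.5) up to rounding. The normal equations are used only to keep the sketch dependency-free.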

  15. Analysis – Model 1: Inferences
     • The relationship between NL and household counts tightens as income goes up, as seen in the higher adjusted R² values for the middle and upper income category models
     • Magnitude of the coefficient for NL (β1) increases as we move from the lower to the higher income segments
     • Most of the predictor variables are significant at the 99% level
     • Coefficients of all dummy variables go up monotonically for higher income groups
     • Lights are better able to estimate households in more affluent categories (Hypothesis One)
     • β's are consistently highest for the Metropolitan dummy, followed by the dummy for Suburbs of Metros, in all three models

  16. Analysis – Model 1: Discussion
     • Error maps were created to study the pattern of the relationship between nighttime lights and the number of households in each income category
     • Under-estimation of the number of households was observed in the lower income category for highly populated states with over 80% rural population
     • Under-estimation of upper income households by NL was observed in the high population density states of UP, Bihar and Kerala
     • Under-estimation was smaller for upper- and middle-income households
     • Over-estimation of lower income households in border districts of Rajasthan
     • Over-estimation of lower income households in the agriculturally rich states of Punjab and Haryana
     • Thus, both Hypothesis One and Hypothesis Two held true

  17. Analysis – Step 3: Developing Model 2
     Model 2: Using nighttime lights, population density data, and one more dummy variable
     • A dummy variable was created for districts with a rural population share greater than 80%
     • Population density was calculated at the district level
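The two additions described on this slide amount to building two extra predictor columns per district. The sketch below shows one way to do that, assuming area is measured in square kilometres and that the rural dummy is placed last among the dummies; the function name, argument names, and column order are hypothetical, since the slides do not say which of X3 to X7 is the rural indicator.

```python
import math


def model2_features(sum_lights, population, area_km2, rural_pct, other_dummies):
    """Build one district's predictor row for a Model 2 style regression.

    Columns: intercept, ln(sum of lights), ln(population density),
    the Model 1 dummies, and a rural dummy (1 if rural share > 80%).
    """
    density = population / area_km2           # persons per square kilometre
    rural_dummy = 1 if rural_pct > 80 else 0   # strictly greater than 80%
    return [1.0, math.log(sum_lights), math.log(density),
            *other_dummies, rural_dummy]


# Hypothetical district: 2 million people on 4,000 sq km, 85% rural.
row = model2_features(1500.0, 2_000_000, 4000.0, 85.0, [0, 1, 0, 0])
```

Rows built this way can be stacked into a design matrix and fitted against ln Y with any least-squares routine.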

  18. Analysis – Model 2: Model coefficients
     ln Y = α + β1 ln(X1) + β2 ln(X2) + β3 X3 + β4 X4 + β5 X5 + β6 X6 + β7 X7
     Significance markers: * significant at the 99% confidence level, $ significant at the 95% confidence level, # significant at the 90% confidence level

  19. Analysis – Model 2: Inferences
     • Inclusion of population density and the dummy variable for districts with rural population greater than 80% increases the R² for all three income categories
     • The highest percentage increase in R² (about 13%) is seen for households in the lowest income category
     • Magnitude of the coefficient for NL (β1) is highest for the higher income group
     • Magnitude of the coefficient for population density (β2) is lowest for the higher income group
     • The rural population indicator is most significant for the lowest income group
     • In fact, the rural indicator is negatively correlated with the middle and upper income households
     • Coefficients of all other dummy variables go up monotonically for higher income groups

  20. Results: Comparing error maps of Model 1 and Model 2. Error maps, lower income households (panels: Model 1, Model 2)

  21. Results: Comparing error maps of Model 1 and Model 2. Error maps, middle income households (panels: Model 1, Model 2)

  22. Results: Comparing error maps of Model 1 and Model 2. Error maps, upper income households (panels: Model 1, Model 2)

  23. Discussion
     • A good relationship exists between nighttime lights and income distribution at the district level, and it is stronger for households in the highest income category
     • Inclusion of population density and the dummy variable for districts with rural population greater than 80% causes the greatest improvement in the estimates of the lower income households
     • A study of the error maps shows that, in general, Model 2 expands the yellow areas in the maps (-5 to +5% error), which we consider 'acceptable' percentage errors, across all the income groups
     • Districts with anomalous estimates of economic activity by nightlights share some of these characteristics: high population density in urban areas, a large share of rural population, large expanses of unlit cultivated land, lack of government provision of public amenities, presence of affluent farmers, and presence of military bases along border areas
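The "acceptable" error band mentioned above can be made concrete with a small sketch. It assumes percentage error is defined as (predicted minus actual) divided by actual, so that positive values are over-estimates, and that the band limits of -5% and +5% are inclusive; neither convention is stated in the slides, and the function names are hypothetical.

```python
def percent_error(predicted, actual):
    """Signed percentage error; positive means over-estimation (assumed)."""
    return 100.0 * (predicted - actual) / actual


def error_class(pct, band=5.0):
    """Classify a percentage error against a symmetric acceptable band."""
    if pct < -band:
        return "under-estimate"
    if pct > band:
        return "over-estimate"
    return "acceptable"


def share_acceptable(predicted, actual, band=5.0):
    """Fraction of districts whose error falls inside the band."""
    classes = [error_class(percent_error(p, a), band)
               for p, a in zip(predicted, actual)]
    return classes.count("acceptable") / len(classes)


# Toy household counts for four districts, not real data.
predicted = [105.0, 90.0, 100.0, 120.0]
actual = [100.0, 100.0, 100.0, 100.0]
share = share_acceptable(predicted, actual)  # 0.5
```

Comparing this share between two models is one simple way to quantify how much the "yellow" area of an error map grows.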

  24. Conclusion and future considerations
     • Analysis of nightlights at a finer spatial resolution is more effective for understanding and using this remotely sensed spatial data as a proxy for economic activity
     • The same holds true for spatial population data
     • The developed models (with further improvements) can be used to estimate households in different income categories for years when such data are not available
     • These models can be useful in studying income inequality
     • Data such as land use, land cover, and vegetation cover are among the variables that could be added to improve the model

  25. Thank You!! Questions?
