Mapping Poverty and Inequality: A Small Area Estimation Approach

Mapping Poverty and Inequality:A Small Area Estimation Approach Peter Lanjouw, DECRG, The World Bank Washington D.C., January 25, 2007

Introduction • Project within World Bank’s research department in collaboration with Chris Elbers (Free University, Amsterdam) and Jean Lanjouw (UC Berkeley) • Numerous other collaborators. • Goal is to produce (and use) estimators of welfare that are accurate and easily calculated. Termed “poverty maps”.

What are Poverty Maps? • Not necessarily “Maps”; rather, • highly disaggregated databases of welfare • Poverty • Inequality • Calorie intake • Under-nutrition • Other indicators (health outcomes? life-expectancy?) • disaggregation may, but need not, be spatial • Poverty of “statistically invisible” groups

Example: Yunnan Province (China)

County Level Poverty Incidence Estimates <2% 2%-6% 6%-12% 12%-18% 18%-30%>30% >30%

Example: Yunnan Province (China)Township-level poverty

Why is there demand for these? • Geographic targeting of anti-poverty programs • Decentralization and Public Policy (fiscal, evidence-based policy) • Political economy of local decision-making • Within-country determinants of welfare outcomes.

What is the problem? • Main source of information on distributional outcomes - household surveys - permit only limited disaggregation. • Very large data sources (e.g. census) typically collect very limited information on welfare outcomes.

What are the options? • Collect larger samples - expensive - some kind of data compromise 2. Combine the limited information available in data sources like the census, into some proxy of welfare (e.g. “basic needs index”) - ad-hoc - often widely disputed (multiple maps) - limited usually to a notion of poverty - how to interpret? (poverty=low income?)

Options, continued • Impute a preferred measure of welfare (e.g. comprehensive real consumption) from household survey into census, using statistical prediction methods Small Area Estimation • Approach developed by a DECRG team

Our Solution: Combine Census and Survey • Impute a measure of welfare from household survey into census, using statistical prediction methods. • Produces readily interpretable estimates: Works with exactly the same concept of welfare as traditional survey-based analysis. • Statistical precision can be gauged • Encouraging results to date • But, non-negligible data requirements

Methodology ELL (2002, 2003) • Estimate a model of, for example, per-capita consumption, yh, using sample survey data. • Restrict explanatory variables to those that can be linked to households in survey and census. • Estimate expected level of poverty or inequality for a target population using its census-based characteristics and the estimates from the model of y.

Three Basic Stages • Zero stage: establish comparability of data sources; identify/merge common variables; understand sampling strategy. • First stage: estimate model of consumption. • Second stage: take parameter estimates to census, predict consumption, and estimate poverty and inequality.

Methodology • Let W(m, y) be a welfare measure based on a vector of household per-capita expenditures, y, and household sizes, m. • We want to estimate W for a target population (say a village, v) where y is unknown. • Typical Estimations to date have used a log-linear model of consumption: • where ηc is a cluster random effect allowing for a locational influence on consumption.(Can be more than one level.)

Estimation Details • Estimate separate regressions per stratum • Use cluster weights where significant • Allow for non-normality of disturbances (parametric/non-parametric), and • Heteroskedasticity in individual-specific component of disturbances. Logistic model of the variance of εch conditional on zch, bounding the prediction between zero and a maximum, A, set equal to (1.05)*max{ech²}:

Simulation of Welfare Measures We are interested in estimating W(m, Z, ζ, u) where m, Z, and u are conformable arrays of household size, observables and disturbances. The expected value of W is: • where ζ is the vector of model parameters • We replace the unknown vector ζ with consistent • estimators from the first stage, and use simulation to obtain our estimator,

Prediction Error The error in the estimator can be decomposed as: • Idiosyncratic error – increases with smaller populations. • Model error – not related to size of target population. • Other elements can include: • Computation error – part of model error, can be negligible. • Sample error – when large dataset is also a sample vs. census.

Albania Poverty Map

Validation • In Mexico, PROGRESA is a rural health, education and nutrition project. • As part of the PROGRESA evaluation, a census was taken of all households in 496 rural communities in 1998. • 20,544 households were covered • This census included detailed information on consumption expenditures.

Testing the Poverty Mapping Methodology • A random sample of 50 communities was drawn from the full 496 communities. • From each of the 50 communities, 10 households were randomly sampled. • The data from these households serve as a “pseudo” survey.

Testing the Poverty Mapping Methodology • 20 “target populations” were constructed from the full set of 496 localities, by grouping together 24-25 localities at a time. • Each target population covers about 1000 households. • Central Question: How well does poverty mapping methodology applied to the “pseudo survey” predict poverty in the target populations?

Precision of estimates

Caveats • Mexican localities do not vary much in terms of poverty. • Haven’t been able to test the methodology in a setting where localities vary dramatically from each other. • Test is almost a laboratory set-up: • Survey and census refer explicitly to exactly the same time period • Variables are fully comparable, by construction. • The fact that results are good in Mexico does not mean they will be good everywhere. • Mexican experience shows that method matters.

Countries Being Mapped • Latin America: Mexico, Guatemala, Honduras, Nicaragua, Panama, Ecuador, Bolivia • Asia: China, Indonesia, Papua New Guinea, Laos, Cambodia, Vietnam, Thailand, Philippines, Bangladesh, Sri Lanka, India. • Africa: South Africa, Mozambique, Madagascar, Kenya, Uganda, Zambia, Malawi, Morocco • Eastern Europe and FSU: Albania, Azerbaijan, Bulgaria, Kazakhstan

What have we learned so far? • The exercise itself can have a huge impact on the debate on poverty within a country and the Bank’s relationship with the client government (e.g. Kenya, Morocco, among many others) • Methods borrowed from ‘poverty mapping’ can improve the quality of the debate and inform policy.

What have we learned so far? • There should be no presumption that small, poor, rural villages are homogenous in terms of income or consumption

What have we learned so far? • Geographic targeting of transfers can generate very large gains for poverty reduction

What have we learned so far? • Inequality is negatively correlated with many social outcomes we care about. • It can also influence the functioning of projects at the local level.

Research and Future Directions: • Methodology • Mapping other indicators (e.g. nutrition) • Survey to Survey • Updating Poverty maps • Research • Should we target using maps? How? • “Impact” Mapping (ex-ante policy impact analysis) • Small area analysis of poverty, inequality and: • Growth • Environment • Crime • Health • Infrastructure • City size • Electoral Participation

Cambodia Stunting Map

Poverty and Stunting in Cambodia

Survey to Survey Application: Introduction • India in the 1990s: Acceleration of economic growth • Economic reforms and liberalization • What has happened to Poverty? • Has poverty reduction accelerated?

Survey to Survey cont. • Poverty Trends • Analysis has traditionally been based on National Sample Survey (NSS) data. • Two “thick” rounds in the 1990s: 1993/4 (50th) and 1999/0 (55th) • “Thick” rounds are designed to be representative at the state and even, sub-state, level (NSS-regions).

Survey to Survey, cont. • At first glance analysis of NSS data suggest impressive declines in poverty: • “Official” headcounts (published by Planning Commission) 1987/8 1993/4 1999/0 • Rural India: 39.4 37.3 27.1 • Urban India: 39.1 32.4 23.6 • These trends suggest an acceleration of poverty decline

Survey to Survey, cont. • Are these estimates credible? • Serious concern regarding comparability of the 50th and 55th round NSS consumption data • 55th round applies different recall periods across expenditure items, while 50th applies only one (30 days). • 7-day recall period for frequency items like food, 365-day recall period for low-frequency items • Households might try to reconcile their answer to questions that refer to different recall periods, it is likely to boost the expenditure estimates based on 30-day recall data.

Survey to Survey : Methodology • Based on Elbers, Lanjouw, Lanjouw (2001, 2003) We estimate in the 50th round: ln yh = xhβ + εh yh: per capita exp for household h, x: 30-day exp, or a set of observed household characteristics, ε: disturbance term • Model is estimated with weights. • We estimate separate regressions per state and again per region. • We allow for: • intra-cluster correlation in disturbances • heteroskedasticity in individual-specific component of disturbances. • non-normal and non-parametric disturbances • report estimated standard errors alongside point estimates.

Adjustment Methodologies • Deaton and Dreze (2002)identify one consumption component in 55th round questionnaire which has not been altered. • 30-day “intermediate goods” consumption • Comprises fuel and light, rent, non-institutional medical miscellaneous goods and services. • Assess the relationship between 30-day comparable consumption and probability of being poor in 50th round, and then predict 55th round headcounts • Key Assumptions: • (1) reported 30-day expenditure are unaffected by the changes in questionnaire elsewhere • (2) relationship between 30-day consumption and full consumption is much the same in 55th as in 50th round.

Adjustment Methodologies • Deaton and Dreze find a slightly less rapid decline in rural poverty during the 90s. • Kijima and Lanjouw worry about some of the regional patterns emerging from Deaton and Dreze results. • Is it possible that “comparable” 30-day expenditure has been contaminated by questionnaire changes elsewhere? • Does the stability assumption underpinning D&D fail to hold in places?

Adjustment Methodologies • Kijima and Lanjouw propose another specification: • model replaces 30-day intermediate goods consumption with a set of other household variables that have clearly not been redefined between 50th and 55th rounds: “multivariate model” • Assumption: relationship between household’s characteristics and log per capita expenditure is stable in each region between 1993/4 and 1999/0 • How reasonable? • Note: we side-step inflation adjustments

Selected Region-Level Results Rural 50th 55th (single)55th (multiple) Andhra South 22% 32% 22% (3.1) Bihar Southern 53% 44% 46% (2.5) Gujarat Dry Areas 38% 23% 35% (4.9) Karnataka Eastern 22% 6% 15% (3.3) Karnataka Southern 40% 22% 30% (2.9) MP Central 46% 20% 37% (4.7) MP Western 65% 52% 66% (3.7) Maharashtra North 53% 38% 47% (3.7) Tamil Nadu Southern 42% 21% 37% (2.6) UP Southern: 51% 16% 45% (5.8) • Note: Standard errors are not small • Partly sampling, and partly prediction error

Adjusted Rural Poverty Estimates % point decline 50th -55th Rounds Rural India UNADJUSTED 10 Sundaram and Tendulkar (2003a) 9 Deaton and Dreze (2002) 7 Sundaram and Tendulkar (2003b) 5 Datt, Kozel, and Ravallion (2003) 4 Sen and Himanshu (2004) 3 Kijima and Lanjouw (2003) 3 Kijima and Lanjouw (2005) 2 • Note: different poverty lines! • Rate of poverty decline depends on adjustment method

Concluding remarks • “Poverty Mapping” has been widely implemented. • As “snap shots” become more common, demand is shifting increasingly towards updates. • Poverty monitoring at the local level is a, key, policy imperative in many countries. • Critical step is to validate methods outlined here

Mapping Poverty and Inequality: A Small Area Estimation Approach