Aggregate Governance Indicators Aart Kraay The World Bank Presentation at World Bank Conference “The Empirics of Governance” May 1-2 2008
Why Aggregate Indicators? • synthesize information about governance from a diversity of viewpoints • particularly useful for advocacy (look at success of TI-CPI) • achieve greater country coverage than individual indicators • enable comparisons across different sources • smooth out idiosyncrasies of many individual sources • better measure of broad concepts of governance • but at the cost of specificity • generate explicit measures of the imprecision of aggregate (and individual) indicators • they can (almost) always be disaggregated!
Plan for Talk • How to construct and use aggregate indicators • define what you want to measure • select sources and combine them • compute (and use!) margins of error • discuss only “multisource aggregates” • Details, details, details..... • balanced or unbalanced comparisons? • independence of errors? • relative or absolute changes? • complexity? • scaring the jello? • Summing up
Defining Topics for Aggregate Indicators • governance is hard to define sharply • but ... easy to overstate lack of definitional consensus around governance (see next slide) • some areas of governance have particularly clear and unambiguous definitions: • “Corruption is the use of public office for private gain” • lack of definitional agreement (at the margin?) should not paralyze measurement efforts • proponents of alternative definitions can feel free to construct their own indicators • are the resulting country rankings different? • what do we learn from the differences?
What do we mean by “Governance”? • World Bank (1992): "Governance is the manner in which power is exercised in the management of a country's economic and social resources for development" • World Bank (2007) definition: "...the manner in which public officials and institutions acquire and exercise the authority to shape public policy and provide public goods and services" • WGI Definition (1999): "...the traditions and institutions by which authority in a country is exercised. This includes the process by which governments are selected, monitored and replaced; the capacity of the government to effectively formulate and implement sound policies; and the respect of citizens and the state for the institutions that govern economic and social interactions among them."
Six Dimensions of Governance in the WGI • The process by which those in authority are selected and replaced • VOICE AND ACCOUNTABILITY • POLITICAL STABILITY & ABSENCE OF VIOLENCE/TERRORISM • The capacity of government to formulate and implement policies • GOVERNMENT EFFECTIVENESS • REGULATORY QUALITY • The respect of citizens and state for institutions that govern interactions among them • RULE OF LAW • CONTROL OF CORRUPTION
Selecting Individual Indicators For Use As Ingredients of Aggregate Indicators • view individual indicators as imperfect or noisy proxies for broader concepts of governance, e.g.: • control of corruption: proxies include: • is corruption widespread? • percent of contract value demanded in bribes? • risk that value of FDI adversely affected by bribes? • Crucial observation: proxies do not need to be perfect to be useful! Indicator = Signal + Noise • ideally want indicators with low Noise/Signal... but Noise/Signal=0 is unattainable • as long as Noise/Signal is finite (the indicator contains some signal), indicator is a useful ingredient for aggregate indicator • aggregation can be used to downweight indicators with high Noise/Signal (to come....)
More Examples of Ingredients for Aggregate Governance Indicators • Rule of Law (WGI) • enforceability of private contracts (DRI) • fairness/speed of judicial process (EIU) • confidence in police (GWP) • property rights over rural land (IFD) • many more.... • Sustainable Economic Opportunity (Ibrahim Index of African Governance) • GDP/Capita, Growth, Inflation, Budget Deficit • Days to start a business • Contract-intensive money • Road density, computer and internet density
Placing Indicators in Common Units • Trivial to rescale data to 0-1 scale • More subtle issue: how do we compare a 7/10 score in a source that covers mostly developed countries with a 7/10 score in a source that covers mostly developing countries? • Option 1: percentile matching (TI-CPI) • Source 1: A > B > C • Source 2: C > D • Aggregate: A > B > C > D • Option 2: elaboration on unobserved components model (WGI). Details in KKZ (1999). • Useful byproduct of aggregate indicators is that they allow comparisons based on dissimilar sources (in example above you can now compare country A and country D)
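The "trivial" rescaling step can be sketched as follows. This is a minimal illustration only: the function name, the two sources and their scores are hypothetical, and it does not implement percentile matching or the unobserved components model. It also shows the subtle issue from the slide: country C lands on different 0-1 values in the two sources, so rescaling alone does not make sources comparable.

```python
def rescale_unit(scores, low, high):
    """Map raw scores from a source's own [low, high] scale onto 0-1."""
    if high <= low:
        raise ValueError("high must exceed low")
    return {country: (value - low) / (high - low)
            for country, value in scores.items()}


# Hypothetical sources: one scored 0-10, the other scored 1-6.
source_1 = {"A": 9.0, "B": 7.0, "C": 4.0}
source_2 = {"C": 4.5, "D": 3.0}

rescaled_1 = rescale_unit(source_1, 0, 10)
rescaled_2 = rescale_unit(source_2, 1, 6)

# Same country, different rescaled values: units agree, meaning may not.
print(rescaled_1["C"], rescaled_2["C"])
```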
Weight A Minute – All Aggregate Indicators Require Decisions on Weights! • Option 1: Arbitrarily assign weights • equal weights (e.g. TI-CPI, most others) • different weights based on views of what matters more (e.g. Ibrahim Index of African Governance) • decision to exclude a source implies setting a zero weight (e.g. TI-CPI excludes all household surveys) • Option 2: Let the data choose the weights (logic of unobserved components model underlying WGI) y1=g+e1 y2=g+e2 y3=g+e3 • if CORR(y1,y2)>>CORR(y1,y3), y1 and y2 are more informative about g (if errors are independent) • Option 3: Regression-based weights to capture importance of each indicator for outcomes • not widely (ever?) used • in principle is appealing, in practice virtually impossible
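The logic of Option 2 can be made explicit in a simplified version of the model. The sketch below assumes g is standard normal and the errors are independent normals with variances \sigma_k^2; these distributional assumptions and the symbols \sigma_k and w_k are added here for illustration, and the source-specific intercept and scale parameters of the full model in KKZ (1999) are omitted. Under these assumptions a higher pairwise correlation implies smaller error variances, and sources with smaller error variances receive larger weights.

```latex
% Simplified sketch: y_k = g + e_k, g ~ N(0,1), e_k ~ N(0, \sigma_k^2), errors independent
\begin{align*}
\operatorname{Corr}(y_k, y_l) &= \frac{1}{\sqrt{(1+\sigma_k^2)(1+\sigma_l^2)}}, \qquad k \neq l \\
\mathbb{E}[g \mid y_1,\dots,y_K] &= \sum_{k=1}^{K} w_k\, y_k, \qquad
w_k = \frac{\sigma_k^{-2}}{1 + \sum_{j=1}^{K} \sigma_j^{-2}}
\end{align*}
```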
Does Weighting Matter? • Depends crucially on the extent to which the underlying data sources are correlated with each other • if correlations are high, weighting matters little • if correlations are low, weighting matters a lot • Example: Two robustness checks on WGI weighting scheme for Control of Corruption • Option 1: equally-weighted • Option 2: aggregate 4 types of sources (commercial, NGO, public sector, and surveys) • Very highly correlated with baseline WGI indicator • Option 1: correlation with baseline = 0.998 • Option 2: correlation with baseline = 0.959 • conclude that “ingredients” of WGI-CC are quite highly correlated – so details of weighting don’t matter much
Spot the Difference: Alternative Aggregations of WGI-Control of Corruption Indicators Using Different Weights
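The claim that weighting matters little when sources are highly correlated can be checked on simulated data. The sketch below is illustrative only: it uses made-up noise levels and a simplified y_k = g + e_k model with independent normal errors, not the WGI data or the WGI estimation procedure. It compares an equal-weighted average with a precision-weighted average for a low-noise set of sources (highly correlated with each other) and a high-noise set (weakly correlated).

```python
import random
from math import sqrt


def pearson(x, y):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def weighting_gap(noise_sds, n_countries=2000, seed=0):
    """Correlation between equal-weighted and precision-weighted aggregates."""
    rng = random.Random(seed)
    precisions = [1 / s ** 2 for s in noise_sds]
    equal, weighted = [], []
    for _ in range(n_countries):
        g = rng.gauss(0, 1)                        # unobserved governance
        ys = [g + rng.gauss(0, s) for s in noise_sds]  # noisy sources
        equal.append(sum(ys) / len(ys))
        weighted.append(sum(p * y for p, y in zip(precisions, ys)) / sum(precisions))
    return pearson(equal, weighted)


high = weighting_gap([0.3, 0.4, 0.5])   # low noise: sources highly correlated
low = weighting_gap([0.5, 2.0, 4.0])    # mixed noise: sources weakly correlated
print(round(high, 3), round(low, 3))
```

In the low-noise case the two aggregates are nearly identical, while in the mixed-noise case they diverge noticeably, which is the pattern the slide describes.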
Margins of Error • margins of error summarize the degree of disagreement across sources in their assessment of governance • two ways to construct them: • standard deviation across sources • estimate based on a structural statistical model (e.g. WGI uses unobserved components model) • precision-weighting of sources in WGI (modestly) reduces margins of error • aggregation reduces measurement error about broad concepts (smooths out idiosyncrasies of individual sources) • essential to use them to assess significance of cross-country differences or changes over time
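The first construction, the standard deviation across sources, can be sketched in a few lines. The scores below are hypothetical and assumed to be already in common units; reporting the standard error of the mean alongside the raw spread is a choice made for this illustration, not something the slide specifies.

```python
from math import sqrt
from statistics import mean, stdev


def aggregate_with_margin(scores):
    """Return (mean, sample std dev across sources, std error of the mean)."""
    if len(scores) < 2:
        raise ValueError("need at least two sources to measure disagreement")
    spread = stdev(scores)
    return mean(scores), spread, spread / sqrt(len(scores))


# Hypothetical scores for one country from four sources, in common units.
estimate, spread, std_error = aggregate_with_margin([0.2, 0.5, 0.1, 0.4])
print(round(estimate, 3), round(spread, 3), round(std_error, 3))
```

A country covered by a single source gets no margin at all under this approach, which is one reason a model-based estimate can be preferable.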
Margins of Error Decline With the Number (and Quality) of Data Sources
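Why the margin shrinks with more (and better) sources can be seen in a simplified model. The sketch assumes governance g is standard normal and each source equals g plus independent normal noise; under those assumptions the standard deviation of g given the sources is (1 + sum of 1/sigma_k^2)^(-1/2). The noise levels are hypothetical, and the full WGI model has additional parameters not shown here.

```python
from math import sqrt


def posterior_sd(noise_sds):
    """Std dev of g given sources y_k = g + e_k, with g ~ N(0,1) and
    independent e_k ~ N(0, sd_k^2). Simplified model, for illustration."""
    return 1 / sqrt(1 + sum(1 / s ** 2 for s in noise_sds))


# More sources of the same (hypothetical) quality: margin declines.
for k in (1, 3, 6, 12):
    print(k, round(posterior_sd([0.6] * k), 3))

# Same number of sources, better quality (less noise): margin is smaller.
print(posterior_sd([0.3] * 3) < posterior_sd([0.6] * 3))
```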
[Figure: Control of Corruption, Selected Countries, 2006. Axis: Governance Level, from Poor Governance to Good Governance, with Margins of Error shown for each country.] Source for data: 'Governance Matters VI: Governance Indicators for 1996-2006', by D. Kaufmann, A. Kraay and M. Mastruzzi, June 2007, www.govindicators.org. Colors are assigned by percentile rank: Dark Red: bottom 10th percentile ('governance crisis'); Light Red: 10th to 25th; Orange: 25th to 50th; Yellow: 50th to 75th; Light Green: 75th to 90th; Dark Green: 90th to 100th (exemplary governance). Estimates subject to margins of error. DISCLAIMER: The data and research reported here do not reflect the official views of the World Bank, its Executive Directors, or the countries they represent. The WGI are not used by the World Bank Group to allocate resources or for any other official purpose.
Margins of Error: A Little Perspective • do not confuse absence of explicit margins of error with absence of measurement error – present in all governance indicators • margins of error are not unique to subjective- or perceptions-based aggregate indicators • can infer them based on inter-correlations of any type of indicator • keep the baby, ditch the bathwater! • 2/3 of pairwise comparisons on WGI are significant (at 90% level) • 1/3 of countries show a significant (at the 90% level) change in at least one of the six WGI between 1996 and 2006
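One simple way to judge whether a pairwise comparison is significant at the 90% level is to check whether the two 90% confidence intervals overlap. The sketch below uses that rule of thumb with hypothetical estimates and standard errors; it is an assumption made for illustration and not necessarily the exact test behind the 2/3 figure above.

```python
Z_90 = 1.645  # two-sided 90% critical value of the standard normal


def interval(estimate, std_error, z=Z_90):
    """Confidence interval as (lower, upper)."""
    return estimate - z * std_error, estimate + z * std_error


def significantly_different(a, b, z=Z_90):
    """True if the intervals of a and b do not overlap.
    a and b are (estimate, std_error) pairs."""
    lo_a, hi_a = interval(a[0], a[1], z)
    lo_b, hi_b = interval(b[0], b[1], z)
    return hi_a < lo_b or hi_b < lo_a


# Hypothetical (estimate, standard error) pairs for three countries.
country_x = (1.2, 0.2)
country_y = (0.3, 0.2)
country_z = (0.5, 0.2)

print(significantly_different(country_x, country_y))  # far apart
print(significantly_different(country_y, country_z))  # close together
```

Non-overlapping intervals are a conservative criterion: two estimates can differ significantly under a formal test of the difference even when their intervals overlap slightly.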
Details 1: Don’t Lose Your Balance! • comparisons of aggregate indicators across countries and over time are often “unbalanced” – different set of sources underlying the two comparators • the alternative (strictly balanced) is far too restrictive • balanced WGI-CC based on top five sources would cover just 117 countries, not 207 • much less diverse set of sources as well • “unbalancedness” is not as bad as you think! • 60% of pairwise comparisons in WGI involve 5 or more common sources • just 7% of variation in large changes due to changes in composition of sources • can always go back to the source data!
Details 2: I Think You Think I Think You Think I Think You Think Bangladesh is Corrupt • Correlated perception errors are potentially an important issue, as they could: • reduce the information content of aggregate indicators • distort weighting scheme • First-order issue: single- versus multiple-source aggregate indicators • single-source aggregates average responses of the same experts to many questions (CPIA, GII, DB, etc) • almost by definition have strongly correlated perception errors across components • multiple-source aggregates are less subject to this problem • unless perception errors perfectly correlated, still can get efficiency gains from aggregation
Evidence on Correlated Perception Errors? • easy to assert, but hard to test y1=g+e1 y2=g+e2 • all we observe is CORR(y1, y2) – is it because: • CORR(e1, e2) is high? • VAR(e1) and VAR(e2) are small? • need an identification strategy (not storytelling) • Example: expert assessments more likely to make correlated perception errors than survey respondents • are expert assessments more correlated with each other than with surveys? Not necessarily • average pairwise correlation of 5 expert assessments of corruption = 0.80 • average correlation of each with a firm survey = 0.82 • correlations among expert assessments don’t increase over time
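The identification problem can be written out under simple assumptions added here for illustration: VAR(g) = 1, errors uncorrelated with g, VAR(e_k) = \sigma_k^2 and CORR(e1, e2) = \rho. The observed correlation then mixes the error correlation and the error variances, so very different stories produce the same number. The parameter values in the two cases are hypothetical.

```latex
% Assumptions (illustrative): Var(g) = 1, Cov(g, e_k) = 0,
% Var(e_k) = \sigma_k^2, Corr(e_1, e_2) = \rho
\begin{align*}
\operatorname{Corr}(y_1, y_2)
  &= \frac{1 + \rho\,\sigma_1 \sigma_2}{\sqrt{(1+\sigma_1^2)(1+\sigma_2^2)}} \\[4pt]
\text{independent errors, small variances: } \rho = 0,\ \sigma_1 = \sigma_2 = 0.5
  &\;\Rightarrow\; \frac{1}{1.25} = 0.80 \\
\text{correlated errors, large variances: } \rho = 0.6,\ \sigma_1 = \sigma_2 = 1
  &\;\Rightarrow\; \frac{1.6}{2} = 0.80
\end{align*}
```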
Details 3: Everything is Relative .... Or Is It? • All indicators require choice of units • 0-10 (TI-CPI), 1-6 (CPIA), A-D (PEFA) • WGI has particularly nerdy choice (are you surprised?): • standard normal distribution • forces world average = 0 in each period • most other indicators also implicitly make choices about averages (e.g. CPIA grade inflation) • Does this confuse relative and absolute changes? Δy(j) = Δ(y(j)-average) + Δaverage • absolute changes and relative changes coincide if no changes in the world average • look for evidence in individual sources whether world averages change – answer is a resounding “no”!
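The decomposition Δy(j) = Δ(y(j)-average) + Δaverage can be illustrated with made-up numbers. The function name and the scores below are hypothetical; the example shows a country whose raw score improves only because the world average moved, so its relative change is zero.

```python
def decompose_change(country_before, country_after, world_before, world_after):
    """Split a country's absolute change into a relative change and the
    shift in the world average: absolute = relative + average_shift."""
    absolute = country_after - country_before
    average_shift = world_after - world_before
    relative = absolute - average_shift
    return absolute, relative, average_shift


# Hypothetical scores on a 0-10 scale.
absolute, relative, average_shift = decompose_change(4.0, 4.5, 5.0, 5.5)
print(absolute, relative, average_shift)

# With no change in the world average, absolute and relative changes coincide.
abs_2, rel_2, shift_2 = decompose_change(4.0, 4.5, 5.0, 5.0)
print(abs_2 == rel_2)
```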
Details 4: It’s Just Too Haaaaard.....! • common critique is that aggregate indicators are too complicated and non-transparent • same is true for all kinds of things (national income accounts, PPP adjustments, poverty measures, the NFL draft, the engine under the hood of my car) • better to be complicated (and a bit closer to right) than naive (and a bit further from right)
Details 5: Scaring the Plants (and the Jello) • how big are the risks of measurement ahead of theory? • risk of inaction until we all agree on a theory is worse • how do we verify indicators (and ingredients of indicators)? • are different indicators of a core dimension of governance correlated? • exactly what unobserved components model does • are they uncorrelated with other core dimensions of governance? • pretty much a hopeless question since core dimensions of governance are correlated • more or less “free entry” in the market for indicators • more interesting to show the quantitative relevance of critiques than to simply speculate
Summing Up • aggregate indicators can (for some purposes) serve as a useful summary of large numbers of indicators • but no reason to be wedded to any particular aggregate • we learn a lot from cross-referencing alternative related indicators as part of process of building aggregates • why are they similar, why are there outliers? • formally can construct margins of error • crucial for policy dialogue (and sensible use) • lots of potential for argument over the “nitty-gritty details” • less clear that these are first-order concerns
Bottom Line • differences between: • alternative aggregates, • aggregate versus individual, • subjective vs objective • ‘actionable’ versus ‘whatever the antonym is’ are minor compared to difference between having data and not having it at all