
Measurement

Measurement. Denis Cogneau Delphine Roy. Constructing “social facts” (1).


Presentation Transcript


  1. Measurement Denis Cogneau Delphine Roy

  2. Constructing “social facts” (1) • “Fact”: should not require inter-subjective agreement (or a poll) to be “true”. Its truth is conditional on a method or procedure of data collection and aggregation: one may like this procedure or not, but the procedural link between the data and the fact remains. Ex.: food price ≠ tastiness of a given food product • “Social”: we are not talking about facts that are particular to one individual or to very small groups of individuals (even if there may be only one poor person or none, anyone can be poor; likewise, Durkheim does not treat suicide as a personal distress but as a widespread and regular phenomenon and as a social symptom)

  3. Constructing social facts (2) Social facts are counts, even in anthropology: “is the prohibition of incest general?”, … or even “this group of people has a very peculiar mythology (as distinct from other groups)”. The same holds for most psychological facts, which must occur many times.

  4. Constructing social facts (3) A social fact is a construct depending on a methodology of accounting that has 4 components: • What is to be measured? Definition • What is to be computed? Axiomatics • How to measure variables? Measure • How to count individuals? Sampling

  5. 1 – Definition (a) Concepts come from theory: - of growth, international economics - of development - of justice - of social integration - etc. Before being measured, concepts should be made precise in terms of the population covered and the dimensions concerned: …

  6. 1- Definition (b) Theory should define the concepts to be measured: • national wealth: non-market activities, waste of natural resources? • access to goods: is it income or consumption? full time income? income per adult equivalent? • poverty: do differences between the poor matter? • income inequality: relative or absolute income differences? • health inequality: does it matter per se or only correlation with other inequalities? • unemployment or migration: voluntary vs. involuntary, transient vs. permanent • trade openness: trade intra-firms ≠ extra-firms? • education: market returns or cognitive achievements?

  7. 2 - Axiomatics Quantities observed in “nature”: price of a Big Mac, quantity of rice, height, number of people in a place at a given time… Not so “natural” but elementary: skin color, mark on an exam… Measuring concepts requires an aggregation of “natural quantities”: - in the space of variables: cost of living, agricultural production, individual income… literacy… - in the space of individuals: mean income, stunting / overweight, poverty, inequality…

  8. 2- Ex.: Basic axioms of inequality Individuals identical in needs, with heterogeneous incomes. Axiom 1: Inequality should decrease when income is transferred from a rich person to a poor one (variance of logs: no; Gini: yes) Axiom 2a: Inequality is unchanged if all incomes are doubled (Gini) Axiom 2b: Inequality is unchanged if a given amount is added to all incomes (Kolm; Gini index not divided by the mean)

  9. At date t: n individuals indexed by i (or j); income $y_i(t)$; mean income $\mu_y(t)$ • Traditional “relative” Gini: $G = \frac{1}{2n^2\mu_y(t)} \sum_i \sum_j |y_i(t) - y_j(t)|$ • “Absolute” Gini: $\text{Abs-}G = \frac{1}{2n^2} \sum_i \sum_j |y_i(t) - y_j(t)|$ [In the figure, divided by the 1992 median, i.e. a time-invariant factor, just for normalization]
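
The two Gini formulas and the axioms of the previous slide can be checked numerically. The following is only a minimal sketch: the function names and the four incomes are invented for illustration and do not come from the presentation. It computes both indexes from the double sum over pairs, then doubles all incomes (Axiom 2a), adds a constant to all incomes (Axiom 2b), and transfers income from the richest to the poorest individual (Axiom 1).

```python
from itertools import product


def absolute_gini(incomes):
    """Abs-G = 1/(2 n^2) * sum_i sum_j |y_i - y_j| (double sum over ordered pairs)."""
    n = len(incomes)
    total = sum(abs(a - b) for a, b in product(incomes, repeat=2))
    return total / (2 * n * n)


def relative_gini(incomes):
    """Traditional Gini: the absolute Gini divided by mean income."""
    mean = sum(incomes) / len(incomes)
    return absolute_gini(incomes) / mean


# Hypothetical incomes, chosen only to keep the arithmetic easy to follow.
incomes = [10.0, 20.0, 30.0, 40.0]
doubled = [2 * y for y in incomes]        # Axiom 2a: scale all incomes
shifted = [y + 100 for y in incomes]      # Axiom 2b: add the same amount to all
transferred = [15.0, 20.0, 30.0, 35.0]    # Axiom 1: richest gives 5 to poorest

for label, ys in [("base", incomes), ("doubled", doubled),
                  ("shifted", shifted), ("transfer", transferred)]:
    print(label, relative_gini(ys), absolute_gini(ys))
```

In this toy example the relative Gini is unchanged by doubling but falls when a constant is added, while the absolute Gini behaves the other way round; both fall after the rich-to-poor transfer.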

  10. 3 – Measure How to ask questions in a census or a survey: formulation, order, items… Should one rely on self-declarations? On interviewer observation? When to use direct measurements: literacy, health? How to mimic real life: buying products to learn prices, administering real exams? How to reach the underground: informal trade flows, unregistered economy, capital flight? How to avoid refusals to answer?

  11. 3- Measurement errors “Classical” (white noise): $Y = Y^* + u$, with $E(u) = 0$ and $E(uY^*) = \mathrm{Cov}(u, Y^*) = 0$ • Increases variance and decreases correlations: $V(Y) = V(Y^*) + V(u) = V(Y^*)(1+\theta)$, with $\theta = V(u)/V(Y^*) > 0$, and $\mathrm{Corr}(Y, X) = \mathrm{Corr}(Y^*, X)/\sqrt{1+\theta}$ • Hence, it enhances the impression of inequality, mobility, or multidimensionality. Computing the sensitivity of indexes to measurement errors may help (assume θ = 20% and see what comes out) Non-classical: error correlated with the true value. Bounded variables like dummies: negative correlation, since Y and Y* = 0 or 1, so u = -1, 0 or 1 Comparisons between direct data and self-declared data on income: θ around 20-30% on US data (Handbook of Econometrics, vol. 5, measurement errors in survey data)

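  12. $V(Y) = V(Y^*)[1 + V(u)/V(Y^*)] = V(Y^*)(1+\theta)$ X correlated with Y* but not with u: $\mathrm{Cov}(X,Y) = \mathrm{Cov}(X,Y^*) + \mathrm{Cov}(X,u) = \mathrm{Cov}(X,Y^*) + 0$, so $\mathrm{Corr}(X,Y) = \mathrm{Cov}(X,Y)/(V(X)V(Y))^{1/2} = \mathrm{Corr}(X,Y^*)/(1+\theta)^{1/2}$ If Y and Y* are dummies: E(u)=0 imposes the same number of observations with u=-1 and with u=+1, i.e. $n_{01}$; then $\mathrm{Cov}(Y^*,u) = -n_{01}/n < 0$, $\mathrm{Var}(u) = 2n_{01}/n$, $\mathrm{Var}(Y^*) = (n_{01}+n_{10})/n$, etc. (same thing when Y and Y* are bounded) Let $\rho = -\mathrm{Corr}(Y^*,u) > 0$; then $V(Y) = V(Y^*) + V(u) + 2\,\mathrm{Cov}(Y^*,u) = V(Y^*)[1 + \theta - 2\rho\sqrt{\theta}]$, which exceeds $V(Y^*)$ iff $\sqrt{\theta} > 2\rho$. Etc.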
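
The effect of classical measurement error can be illustrated with a small simulation. This is a sketch under assumptions that are not in the slides: normally distributed variables, a hypothetical covariate X with Corr(X, Y*) > 0, θ = 20%, and helper names (`variance`, `correlation`, `simulate`) chosen for illustration. Dividing Cov(X, Y) = Cov(X, Y*) by (V(X)V(Y))^{1/2} with V(Y) = V(Y*)(1+θ) implies that the variance is inflated by about 1+θ and the correlation is attenuated by about 1/√(1+θ).

```python
import math
import random


def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def correlation(xs, ys):
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / math.sqrt(variance(xs) * variance(ys))


def simulate(theta=0.2, n=50_000, seed=0):
    """Return (V(Y)/V(Y*), Corr(X,Y)/Corr(X,Y*)) under classical error."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # True value Y*: correlated with X by construction (illustrative choice).
    y_true = [xi + rng.gauss(0.0, 1.0) for xi in x]
    # White-noise error u, independent of X and Y*, with V(u) = theta * V(Y*).
    sd_u = math.sqrt(theta * variance(y_true))
    y_obs = [yt + rng.gauss(0.0, sd_u) for yt in y_true]
    var_ratio = variance(y_obs) / variance(y_true)
    corr_ratio = correlation(x, y_obs) / correlation(x, y_true)
    return var_ratio, corr_ratio


var_ratio, corr_ratio = simulate()
print("variance ratio:", var_ratio, "expected about", 1.2)
print("correlation ratio:", corr_ratio, "expected about", 1 / math.sqrt(1.2))
```

The same function can be rerun with other values of `theta` to gauge how sensitive a given index is to measurement error.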

  13. 4 – Sampling How to obtain an unbiased and precise image of the desired population (of products, of firms, of people)? • avoiding selection or attrition • minimizing confidence intervals • with a given hierarchical structure: clusters, networks, biographies… → Sampling theory as a branch of statistics

  14. 4 – The revolution of probabilistic samples Law of large numbers: n independent and identically distributed (i.i.d.) random variables $X_1, \dots, X_n$, with finite expected value $E(X_i) = \mu$ and finite variance $V(X_i) = \sigma^2$. Let $M_n = \frac{1}{n}\sum_{i=1}^{n} X_i$ (empirical mean). X can be either discrete (frequencies) or continuous. Weak version: for all $\varepsilon > 0$, $\lim_{n \to +\infty} P(|M_n - \mu| \ge \varepsilon) = 0$, i.e. the distribution of $M_n$ concentrates around μ. Comes from the Bienaymé-Chebyshev inequality: $P(|M_n - \mu| \ge \varepsilon) \le \sigma^2/(n\varepsilon^2)$ Coin tossing: you may draw samples with many heads or tails, but the relative share of these samples decreases as n increases Strong version: the probability of samples ω that are systematically drawn away from μ becomes negligible: $P(\{\omega : \lim_{n \to +\infty} M_n(\omega) = \mu\}) = 1$ Coin tossing: drawing samples whose empirical mean does not converge to ½ becomes less and less probable as n increases

  15. Bienaymé-Chebyshev: Proof Case of X a discrete random variable with probability mass function m(x): $P(|X-\mu| \ge \varepsilon) = \sum_{|x-\mu| \ge \varepsilon} m(x)$ and $V(X) = \sum_x (x-\mu)^2 m(x) \ge \sum_{|x-\mu| \ge \varepsilon} (x-\mu)^2 m(x) \ge \sum_{|x-\mu| \ge \varepsilon} \varepsilon^2 m(x) = \varepsilon^2 P(|X-\mu| \ge \varepsilon)$ So: $P(|X-\mu| \ge \varepsilon) \le V(X)/\varepsilon^2$ X continuous: same proof with the density function f(x) instead of m(x) and integrals ∫ instead of sums Σ
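
The inequality can be verified exactly on a small discrete distribution. The sketch below uses a fair six-sided die and ε = 2, which are illustrative choices and not taken from the slides, and exact fractions to avoid rounding issues. The function name `chebyshev_check` is hypothetical.

```python
from fractions import Fraction


def chebyshev_check(pmf, eps):
    """Return (P(|X - mu| >= eps), V(X) / eps^2) for a discrete pmf {x: m(x)}."""
    mu = sum(x * p for x, p in pmf.items())
    var = sum((x - mu) ** 2 * p for x, p in pmf.items())
    tail = sum(p for x, p in pmf.items() if abs(x - mu) >= eps)
    return tail, var / eps ** 2


# Fair die: m(x) = 1/6 for x = 1..6, so mu = 7/2 and V(X) = 35/12.
die = {x: Fraction(1, 6) for x in range(1, 7)}
tail, bound = chebyshev_check(die, Fraction(2))
print("tail probability:", tail, "Chebyshev bound:", bound)
```

Only the outcomes 1 and 6 lie at least 2 away from the mean, so the tail probability is 1/3, well below the bound of 35/48. The bound is valid but often loose.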

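  16. Weak law of large numbers $X_i$ i.i.d.: $V(X_1 + \dots + X_n) = n\sigma^2 \Rightarrow V(M_n) = \sigma^2/n$, and $E(M_n) = \mu$ Then, applying Bienaymé-Chebyshev to $M_n$: $P(|M_n - \mu| \ge \varepsilon) \le \sigma^2/(n\varepsilon^2)$ Without replacement (n units drawn from a population of N), slightly more complicated calculations give: $V(M_n) = \frac{N-n}{N-1}\,\frac{\sigma^2}{n}$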
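
For coin tossing, the weak law can be checked without simulation, because the number of heads follows a binomial distribution. The sketch below assumes a fair coin (μ = 1/2, σ² = 1/4) and ε = 1/10; the function names are illustrative. It compares the exact share of samples with |M_n − 1/2| ≥ ε to the Chebyshev bound σ²/(nε²) for growing n.

```python
from fractions import Fraction
from math import comb


def tail_probability(n, eps):
    """Exact P(|M_n - 1/2| >= eps) for n tosses of a fair coin."""
    half = Fraction(1, 2)
    favourable = sum(comb(n, k) for k in range(n + 1)
                     if abs(Fraction(k, n) - half) >= eps)
    return favourable / 2 ** n


def chebyshev_bound(n, eps):
    """sigma^2 / (n eps^2) with sigma^2 = 1/4 for a fair coin."""
    return float(Fraction(1, 4) / (n * eps ** 2))


eps = Fraction(1, 10)
for n in (10, 100, 1000):
    print(n, tail_probability(n, eps), chebyshev_bound(n, eps))
```

The exact tail probability shrinks much faster than the bound, which is why Chebyshev-based intervals are conservative.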

  17. Poll survey: n = 1000 respondents; Obama = $M_n(\omega)$ = 54%; true share p, unknown. $P(|M_n - p| \ge t) \le p(1-p)/(nt^2) \Rightarrow P(M_n - t < p < M_n + t) \ge 1 - 1/(4nt^2)$, since $p(1-p) \le 1/4$. $M_n - t < p < M_n + t$: confidence interval at level $\alpha = 1 - 1/(4nt^2)$ To have P ≥ α = 0.95, the minimum t is t = 0.07, so 47% < p < 61% This does not depend much on the sampling rate: modify the variance formulas by a (N-n)/(N-1) factor (sampling without replacement)
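
The half-width t of the poll example follows from solving 1 − 1/(4nt²) = α for t, which gives t = 1/√(4n(1−α)). A minimal sketch of that arithmetic, with a hypothetical function name:

```python
import math


def chebyshev_half_width(n, level):
    """Smallest t such that 1 - 1/(4 n t^2) >= level (worst case p = 1/2)."""
    return math.sqrt(1.0 / (4.0 * n * (1.0 - level)))


n, observed_share, level = 1000, 0.54, 0.95
t = chebyshev_half_width(n, level)
low, high = observed_share - t, observed_share + t
print(f"t = {t:.3f}, interval = [{low:.2f}, {high:.2f}]")
```

With n = 1000 and α = 0.95 this reproduces t ≈ 0.07 and the interval from 47% to 61%. The interval is wide because the Chebyshev bound makes no assumption about the shape of the distribution.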

  18. Samples without replacement are the real-world samples. Formulas should hence be modified a little to take into account the violation of the independence hypothesis. The change is very simple: multiply variances by (N-n)/(N-1), which is approximately (1-n/N); usually n/N is very small, so that almost nothing changes • The sample size n is (usually) more important than the sampling rate n/N • However, precision only increases with √n (root-n samples), so that to double precision you have to multiply the sample size by four
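
Both points can be made concrete with the standard error of the empirical mean, √V(M_n), using the (N-n)/(N-1) correction for sampling without replacement. The sketch below is illustrative only: the function name, σ = 0.5 and the population sizes are assumptions chosen for the example.

```python
import math


def standard_error(sigma, n, population=None):
    """sqrt(V(M_n)); applies the (N-n)/(N-1) factor when a population size N is given."""
    var = sigma ** 2 / n
    if population is not None:
        var *= (population - n) / (population - 1)
    return math.sqrt(var)


sigma = 0.5  # worst case for a share: sqrt(p(1-p)) with p = 1/2

# Same sample size, very different sampling rates: almost the same precision.
print(standard_error(sigma, 1000, population=10**6))
print(standard_error(sigma, 1000, population=10**8))

# Quadrupling the sample size halves the standard error (root-n rule).
print(standard_error(sigma, 1000), standard_error(sigma, 4000))

# The correction only matters when n is a large share of N.
print(standard_error(sigma, 1000, population=2000))
```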

  19. Back to concepts Regarding society (or nature), a concept that cannot be measured is like a theoretical proposition that cannot be empirically identified through statistical analysis: both are empty. How long should a concept or a theory survive without empirical content?

  20. Facts and counterfactuals (1) A social fact is an aggregation of natural quantities grounded in theory. Many questions are factual. Other questions are counterfactual. The difference can be large: • What are the differences between Côte d’Ivoire and Ghana, in terms of economic and social structures? • What would Côte d’Ivoire have looked like if it had been colonized by the British instead of the French?

  21. Facts and counterfactuals (2) Or the difference can look thinner: • Do richer people have taller children? (Not much) • When income increases, do children grow taller as a consequence? (Yes indeed) • Do income differentials between social origins represent a large share of total income inequality in Brazil? • What income inequality would be observed if social mobility were maximal?

  22. Facts and counterfactuals (3) Facts are often too quickly interpreted as counterfactuals, especially when they produce a rich description, for instance through decomposition techniques (for growth, inequality, mobility, etc.) Facts rule out the theories that cannot account for them, but they remain compatible with many others.

  23. A very brief history of stat(e)-istics • Land property (writing), then cadastre • Fiscal income based on population (censuses), trade (customs), production (agricultural censuses in Rome)… • (Control of) Prices & wages (Diocletian edict) • National accounts (Quesnay) • Probability theory and sample surveys • Internet data

  24. Where to get some data on the Web? More and more data, macro and micro, becomes freely available on the Web. Suggestions will be given within each session. • UN, IMF, WB, WTO, ILO, Eurostat… • Pop. censuses: IPUMS initiative • Survey series: DHS, LSMS (WB), General Values Surveys, Rand Corporation surveys…

  25. Some free micro data on the Web • IPUMS international (census data): https://international.ipums.org/international/ • Demographic and Health Surveys: http://www.measuredhs.com/ • World Values Surveys: http://www.worldvaluessurvey.org/ • Rand Corporation surveys: http://www.rand.org/labor/FLS/ • Some LSMS surveys (World Bank): http://iresearch.worldbank.org/lsms/lsmssurveyFinder.htm • African Poverty Databank (World Bank): http://www4.worldbank.org/afr/poverty/databank/cdroms/default.cfm • Other free data on the web pages of academics who produced data…
