Comparative Living Standards Project Kinnon Scott, Diane Steele DECPI, April 27, 2010
Two Products • Meta Data Describing Content of LSMS Surveys • Comparative Data Base of LSMS actual data (variables/indicators)
Why? • Increase the use of LSMS data • Meet expressed demand from • Existing users • Potential users
What are LSMS surveys? • Multi-topic Household Surveys • Relationships between/among topics • Strong money-metric welfare measure • Demand driven • relevant to a country at given time (comparability issue) • Coverage has large gaps • Timing is not consistent • Designed for policy analysis and research
Getting Data Used • Document and archive the 60+ LSMS survey data bases • Improvements in data access policies/agreements • Provide data and documentation to researchers • Each data set has • Data set (3 formats) • Basic information document • Questionnaire • Additional Documentation • All in electronic format (and hardcopy) • In-country activities (collaboration, training)
Key problems in further dissemination/use of data • 1. No easy way to determine the content of all the surveys • 2. Not accessible to non-specialists (trained in micro-data analysis) • 3. Start up costs for doing cross-country analysis So how to meet the needs of these users, researchers and non-researchers?
Problem 1: • Researchers need to know which surveys have the topics they need • There is no source for this • Need to go through all questionnaires (or consult ‘institutional memory’)
Solution 1: Meta Data of LSMS Surveys • Create web-based tool containing meta data describing the contents of existing LSMS data sets • Searchable Data Base • Update continually • May need to add new details (LSMS-ISA)
Key Decisions: Content • Topics to include • Identify the universe • Level of disaggregation • Module (Education) • Submodule (preschool, general, training) • Topics (preschool costs, type, distance) • Variables (cost of supplies, cost of transport, cost of food) • Interlinking • (ed->level->costs) vs. (exp.->education level
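A minimal sketch of how the four levels of disaggregation (module, submodule, topic, variable) could be stored and searched across surveys. The survey names, the sample entries, the `MetaEntry` class, and the `find_surveys` helper are all hypothetical illustrations, not the project's actual schema or interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetaEntry:
    """One variable in one survey, located by its full path in the hierarchy."""
    survey: str      # hypothetical survey label
    module: str
    submodule: str
    topic: str
    variable: str


# Hypothetical entries; the labels follow the education example on the slide.
ENTRIES = [
    MetaEntry("Survey A", "Education", "preschool", "preschool costs", "cost of supplies"),
    MetaEntry("Survey A", "Education", "preschool", "preschool costs", "cost of transport"),
    MetaEntry("Survey B", "Education", "preschool", "preschool costs", "cost of food"),
    MetaEntry("Survey B", "Education", "general", "type", "type of school"),
]


def find_surveys(entries, **criteria):
    """Return sorted names of surveys with at least one entry matching all criteria.

    Criteria are MetaEntry field names (module, submodule, topic, variable),
    compared case-insensitively.
    """
    matches = set()
    for entry in entries:
        if all(getattr(entry, field).lower() == value.lower()
               for field, value in criteria.items()):
            matches.add(entry.survey)
    return sorted(matches)


# Which surveys ask anything about preschool within the education module?
print(find_surveys(ENTRIES, module="education", submodule="preschool"))
```

Storing the full path on every entry keeps the search simple at any level of the hierarchy; a real tool would also need to handle the interlinking question raised above, where the same variable can be reached from more than one module.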
Key Decisions: Search Results • Actual question vs Questionnaire? • Depends on purpose • ADP, IHSN question banks • Consistency in survey design • Questionnaire development • LSMS- research data sets • Context matters • Need to know respondent, ages, additional information
Development Path • Drafted list of topics (subtopics) • Created first web interface • Tested • Substantially revised the interface • Revised and expanded the list of topics • ‘Populated’ data base
Problem 2: • Many potential users do not have skills to analyze micro-data • Many potential users do not have time to analyze multiple data bases • Under-utilization of the data
Solution 2: Comparative Data Base (CLSP) • Database of a subset of variables/indicators from LSMS Surveys • Focus is on comparability across countries • Detailed documentation • Allow ‘on-the-fly’ tables/statistics within and among countries • Respecting sampling (weights, representativeness) • Respecting confidentiality
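A rough sketch of what an ‘on-the-fly’ table that respects sampling weights and confidentiality might compute. The field names, the `weighted_means` helper, and the minimum cell size of 30 are assumptions made for illustration; the slides do not state the project's actual rules.

```python
from collections import defaultdict

MIN_CELL_SIZE = 30  # assumed confidentiality threshold, not taken from the project


def weighted_means(records, group_key, value_key, weight_key="weight",
                   min_cell_size=MIN_CELL_SIZE):
    """Weighted mean of value_key per group.

    Groups with fewer than min_cell_size observations are suppressed (None)
    so that small cells are not published.
    """
    sums = defaultdict(float)     # sum of weight * value per group
    weights = defaultdict(float)  # sum of weights per group
    counts = defaultdict(int)     # unweighted number of observations per group
    for rec in records:
        group = rec[group_key]
        sums[group] += rec[weight_key] * rec[value_key]
        weights[group] += rec[weight_key]
        counts[group] += 1

    result = {}
    for group in sums:
        if counts[group] < min_cell_size or weights[group] == 0:
            result[group] = None
        else:
            result[group] = sums[group] / weights[group]
    return result


# Tiny hypothetical data set; a low threshold is used only to show both outcomes.
records = [
    {"region": "urban", "consumption": 100.0, "weight": 1.0},
    {"region": "urban", "consumption": 200.0, "weight": 3.0},
    {"region": "rural", "consumption": 50.0, "weight": 2.0},
]
print(weighted_means(records, "region", "consumption", min_cell_size=2))
```

The sketch covers only point estimates. Representativeness also depends on the level at which each survey's sample was designed to be representative, which a real tool would have to check before allowing a breakdown.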
Key Decisions: Content • List of variables • Needs vs Comparability • Present vs Future • Define ‘Comparable’ • Standard Definitions for Indicators • When not to include a survey (100% of all variables, 80%, 10%?) • Test set of data (issues in certain regions, multi-year surveys)
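The inclusion question (100% of all variables, 80%, 10%?) can be made concrete as a coverage rule. The sketch below is one possible formulation, assuming each survey is described by the set of target variables it can supply; the variable names, survey names, and both helpers are hypothetical.

```python
def coverage_share(available, target):
    """Share of the target variable list that a survey can supply (0 to 1)."""
    target = set(target)
    if not target:
        raise ValueError("target variable list is empty")
    return len(target & set(available)) / len(target)


def select_surveys(surveys, target, threshold):
    """Keep surveys whose coverage of the target list is at least threshold."""
    return sorted(name for name, available in surveys.items()
                  if coverage_share(available, target) >= threshold)


# Hypothetical target list and survey contents.
TARGET = ["age", "sex", "consumption", "schooling", "employment"]
SURVEYS = {
    "Survey A": ["age", "sex", "consumption", "schooling", "employment"],
    "Survey B": ["age", "sex", "consumption", "schooling"],
    "Survey C": ["age"],
}

for threshold in (1.0, 0.8, 0.1):
    print(threshold, select_surveys(SURVEYS, TARGET, threshold))
```

Running the rule at several thresholds shows the trade-off on the slide: a strict threshold keeps the data base fully comparable but drops surveys, while a loose one widens country coverage at the cost of many missing variables.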
Evolution • Consumption Aggregates • Best possible, best comparable, existing • Completely non-intuitive to users • Requires redefinition of poverty lines • Stick with existing consumption aggregates (well documented) • Use existing poverty measures
Evolution • On-the-fly analysis • Basic statistics can be constructed by user • Need for advanced statistical ability • Using public domain statistical software, all on our server (Qinghua Zhao’s adaptation of R) • Need for very straightforward abilities • Created some ‘canned variables’ • Commonly used/mis-used • Documentation • Tie to output
Evolution • Platform to build on: • RIGA: with FAO, collaborated in the construction of income aggregates and variables • LMD: with PREM and DEC integrating labor variables • Integrate or stand alone
Development Path • Built on • Sub-national data base • Africa Standardized files • DDP • Not interactive • Costly to user • Not maintained • Created new interface completely • Iterative process
Lessons learned • Search engine for data sets very- maintaining/updating needs to be done • Time and resource costs (LIS example) • Comparability/harmonization is easier said than done • Learning curve • Documentation of process, decisions • Funding from KCP and GAP