Sampling Stratification in Practice: An Example from a Rural Investment Climate Survey in Bangladesh. Mikhail Bontch-Osmolovski, DECRG, 29 January 2008
The Bangladesh Rural Investment Climate Assessment, 2007 • Objectives • Measure investment climate conditions • Analyze enterprise performance and start-up decisions • Identify and prioritize areas of policy actions in support of a stronger private sector • Innovations in the 2007 ICA for Bangladesh • Focus on rural-urban linkages • Secondary towns • Service sector
Urban sample • City corporations of Dhaka, Chittagong, Khulna, Barisal, Rajshahi and Sylhet • Large manufacturing enterprises • 2006 Census of large enterprises as a sampling frame Rural sample • What is rural? • Informal enterprises • No sampling frame
Rural sample objectives • Non-farm enterprises in “rural” areas, but close to cities: sampling design • Manufacturing, trade, and services: need to stratify by sector • 90% of enterprises have fewer than 5 workers: need to stratify by size • Also need households with no enterprises
Administrative Divisions • 6 divisions: Barisal, Chittagong, Dhaka, Khulna, Rajshahi, Sylhet • 64 districts: Zila • Sub-districts: 500 Upazila/Thana • Union/Wards • Mahallas/villages (7500 in Ec. Census)
Stylized Description [Figure: schematic showing metropolitan areas, peri-metro area, sadar upazilas, and villages]
Sampling Strategy • The strategy was to select the following random samples: • 50 mahallas located in business-intensive peri-metropolitan areas and in sadar upazilas (the base mahallas, BMs). • 3 villages in the immediate neighborhood of each base mahalla (150 satellite villages, SVs, in total). • 2,500 non-farm enterprises in the BMs and SVs. • 4 households without enterprises in each satellite village (600 households without enterprises in total).
Selecting Base Mahallas • Stratification: 50 base mahallas were allocated to 7 strata. • 2 mahallas in the Barisal peri-metropolitan area. • 5 mahallas in the Chittagong peri-metropolitan area. • 7 mahallas in the Dhaka peri-metropolitan area. • 3 mahallas in the Khulna peri-metropolitan area. • 6 mahallas in the Rajshahi peri-metropolitan area. • 2 mahallas in the Sylhet peri-metropolitan area. • 25 mahallas in the sadar upazilas. • Mahallas were selected with probability proportional to size (PPS) within each stratum. • The number of enterprises from the 2006 Business Census was used as the measure of size.
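The slide says only that mahallas were drawn "with pps" within each stratum. It does not say which PPS scheme was used. As a rough illustration, the sketch below uses systematic PPS, which is one common choice and is an assumption here. The function name, the seed, and the enterprise counts are made up for the example.

```python
import random

def pps_systematic(sizes, k, seed=0):
    """Pick k unit indices with probability proportional to size.

    Systematic PPS: lay the sizes end to end, then take k equally
    spaced points starting from a random offset. A unit larger than
    the sampling step can be hit more than once.
    """
    total = sum(sizes)
    step = total / k
    start = random.Random(seed).random() * step  # offset in [0, step)
    selected, cum, idx = [], 0.0, 0
    for j in range(k):
        target = start + j * step
        # advance until the unit whose size interval contains the target
        while idx < len(sizes) - 1 and cum + sizes[idx] <= target:
            cum += sizes[idx]
            idx += 1
        selected.append(idx)
    return selected

# Hypothetical stratum: enterprise counts per mahalla (not survey data)
enterprise_counts = [120, 40, 300, 80, 60, 200]
chosen = pps_systematic(enterprise_counts, k=2)

# First-stage inclusion probability of each mahalla: k * size / total
probs = [min(1.0, 2 * s / sum(enterprise_counts)) for s in enterprise_counts]
print(chosen, probs)
```

The inclusion probabilities computed on the last lines are the P(BM) values that later weight calculations rely on.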
Selecting Satellite Villages • The sample frame for the selection of the 3 satellite villages (for each selected mahalla) is the list of all villages that satisfy the following conditions: • Be located in the same zila as the base mahalla; • Be accessible from the base mahalla in one hour or less, using the most common means of public transportation; • Be within a 30 km radius of the base mahalla; • Be located outside the Union/Ward of the base mahalla. • 3 satellite villages were selected by systematic, equal-probability sampling from the list of all “qualifying” villages. Implicit stratification by travel time.
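A minimal sketch of systematic, equal-probability selection with implicit stratification: sorting the qualifying list by travel time before taking every N/3-th village spreads the picks across near, middle, and far villages. The field names, the seed, and the travel times below are hypothetical.

```python
import random

def select_satellite_villages(villages, n=3, seed=0):
    """Systematic equal-probability sample of n villages.

    Sorting by travel time first gives implicit stratification:
    one pick falls in each consecutive block of the ordered list.
    """
    ordered = sorted(villages, key=lambda v: v["travel_min"])
    step = len(ordered) / n
    start = random.Random(seed).random() * step  # offset in [0, step)
    picks = [min(int(start + j * step), len(ordered) - 1) for j in range(n)]
    return [ordered[i] for i in picks]

# Hypothetical qualifying villages around one base mahalla
travel_times = [50, 10, 35, 20, 60, 15, 45, 25, 30]
qualifying = [{"name": f"village_{i}", "travel_min": t}
              for i, t in enumerate(travel_times)]

sample = select_satellite_villages(qualifying)
print([v["travel_min"] for v in sample])
```

With 9 qualifying villages and 3 picks, each village has selection probability 3/9, which is the 3/N_sv term that appears in the weight formulas.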
Selecting Satellite Villages: Implementation No villages outside the upazila of the selected base mahalla were present in the list of qualifying satellite villages, i.e. the SV listing rule was modified by the data collection firm.
Selecting Enterprises • The listing exercise: get employment and sector information for each address. • Stratification decision: • By sector: 1,500 manufacturing, 500 trade, 500 services. • By size: P = 1 for enterprises with 10+ workers; P proportional to employment for smaller enterprises.
Selecting Households • No stratification • Select 4 households with no enterprises in each of 150 satellite villages. • Equal probability of selection within the villages.
Calculating weights: households • P(H) = P(H|SV)*P(SV) • P(H|SV) = 4/N_hh • P(SV) = P(BM)*3/N_sv • P(BM) we know from the first stage. • W_01=1/P(H) ????
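The chain of probabilities on this slide can be written out as a small function. This is only a sketch of the first-attempt formula as stated. It assumes N_hh is the number of households without enterprises in the village and N_sv is the number of qualifying villages. The input values are invented for illustration.

```python
def naive_household_weight(p_bm, n_sv, n_hh):
    """First-attempt weight W = 1 / P(H), with P(H) = P(H|SV) * P(SV).

    p_bm : first-stage selection probability of the base mahalla
    n_sv : number of qualifying satellite villages (assumed meaning of N_sv)
    n_hh : households without enterprises in the village (assumed meaning of N_hh)
    """
    p_sv = p_bm * 3 / n_sv      # 3 villages drawn per base mahalla
    p_h_given_sv = 4 / n_hh     # 4 households drawn per village
    return 1 / (p_h_given_sv * p_sv)

# Invented inputs, not survey values
w = naive_household_weight(p_bm=0.05, n_sv=30, n_hh=200)
print(w)
```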
Calculating weights: households WRONG! With those weights we got an estimate of 60 million households in our non-metro sample, which is more than the total number of households in Bangladesh.
Calculating weights: households • Problem: The areas of eligible villages for two base mahallas may intersect, so a satellite village could be selected into the sample in more than one way. The actual probability of selecting an SV is therefore higher than the formula assumes, and the weights are too high. • For each selected SV, we need to calculate the probability of selection, taking into account how many BMs were in its neighborhood.
Calculating weights: households • Assume that once a BM is selected, all the villages in its upazila are eligible as satellite villages. Upazilas are small, so each base mahalla in an upazila would have the same list of villages. • P(SV) = P(Upazila at 1st stage)*3/N_sv • P(Upazila) is proportional to the “size” of the upazila.
Calculating weights: households • P(Upazila) = K*(Nent_Upaz)/Nent_stratum • K = number of base mahallas to be selected in the stratum: 2, 5, 7, 3, 6, 2, 25 • If P > 1, the upazila is selected with certainty (P = 1). • New weights: estimate of 1.7 mln households.
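A sketch of the revised calculation, under the slide's simplifying assumption that every village in the upazila is eligible. The cap at 1 handles the "selected with certainty" case. Function names and all numbers are illustrative, not survey values.

```python
def upazila_prob(k, nent_upazila, nent_stratum):
    """P(Upazila) = K * Nent_Upaz / Nent_stratum, capped at 1 (certainty)."""
    return min(1.0, k * nent_upazila / nent_stratum)

def household_weight(k, nent_upazila, nent_stratum, n_sv, n_hh):
    """Weight 1 / P(H), with P(SV) = P(Upazila) * 3 / N_sv and P(H|SV) = 4 / N_hh."""
    p_sv = upazila_prob(k, nent_upazila, nent_stratum) * 3 / n_sv
    return 1 / (p_sv * 4 / n_hh)

# Invented inputs: a large upazila is selected with certainty
print(upazila_prob(k=7, nent_upazila=500, nent_stratum=2000))

# Invented inputs for a smaller upazila
print(household_weight(k=2, nent_upazila=100, nent_stratum=1000,
                       n_sv=30, n_hh=200))
```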
Probability of enterprises • Enterprises were not selected within villages, so the formula P(Ent) = P(Ent|SV)*P(SV) is an approximation. • Formally: P(ent i) = 1500*n_i/N_man, where n_i is the employment of enterprise i. • N_man comes from the first stage, so it would be different if other satellite villages had been selected, and P(ent) would then be different. • Assuming total employment in the selected villages is constant: P(ent) = P(ent|SV_1)*P(SV_1) = 1500*n_i/N_man*P(SV)
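The approximation can be sketched as below for the manufacturing stratum. The cap at 1 is an added assumption so that the result stays a valid probability. Reading N_man as total listed manufacturing employment is also an assumption. The inputs are invented.

```python
def enterprise_prob(n_i, n_man, p_sv, target=1500):
    """Approximate P(ent) = (target * n_i / N_man) * P(SV).

    n_i    : employment of enterprise i
    n_man  : total manufacturing employment in the listing (assumed meaning of N_man)
    p_sv   : selection probability of the enterprise's village
    target : planned manufacturing sample size (1500 in the slides)
    """
    p_given_sv = min(1.0, target * n_i / n_man)  # cap is an assumption
    return p_given_sv * p_sv

# Invented inputs, not survey values
p = enterprise_prob(n_i=3, n_man=90000, p_sv=0.1)
print(p, 1 / p)  # probability and the implied weight
```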