Imputation of agricultural production in South Africa’s Census 2002

Phuti Malebana, Statistics South Africa, July 2008


  1. Imputation of agricultural production in South Africa’s Census 2002 • Phuti Malebana • July 2008 • Statistics South Africa

  2. Content • Background • Current situation • Imputation method • Results • Way forward

  3. Background • Agricultural statistics are vital for understanding the socio-economic conditions of society • Among other things, they are used by government for: - Gauging the performance of economic activity - Food security • Mainly used for policy formulation • “The Food and Agricultural Organisation (FAO) (200-) defines data quality and quality of agricultural statistics as relevance, accuracy, timeliness, punctuality, accessibility, clarity, comparability, coherence, completeness, and sound metadata, and Paradata”

  4. Background • Agricultural statistics are important for the country at large and worldwide • Data should be “suitable for use” • Respondents may fail to provide a data item, or may not participate at all • Non-response may diminish the representativeness of the sample and thus lead to bias • Remedies are needed to improve the quality of the data

  5. Current situation • Aggregated data • Disaggregated data • Failure to meet some of the “user needs”

  6. Imputation method • One of the methods for improving data quality; it originated and has been developed in statistical agencies • Edit/imputation model - allows filling in missing values or replacing contradictory values

  7. Imputation method • Yost et al. (2000) identify five categories of automated imputation: - Deterministic imputation - Model-based imputation - Deck imputation - Mixed imputation - Use of expert systems • Many systems make imputations based on a specified hierarchy of methods

  8. Imputation method • Research done so far includes use of: - Historical data - Nearest neighbor method - Average method • Historical data • Nearest neighbor method

  9. Imputation method • Average method - The mean unit values (R/ton, R/LSU) of the frequent (selected) products are derived from the 2002 raw data at provincial/MGD level - Using the reported VAT turnover (2002) of the non-responding enterprise, production values are derived as follows: Production value (in tons or LSU) = Turnover (R) / mean unit value (R/ton or R/LSU)
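
The average method on this slide can be sketched roughly as follows. This is only an illustration, not the Statistics South Africa implementation: the record layout (`district`, `product`, `income_rand`, `quantity`), the function names, and the choice to average per-enterprise unit values (rather than dividing total income by total quantity) are all assumptions, and the figures are made up.

```python
from collections import defaultdict
from statistics import mean


def mean_unit_values(responses):
    """Mean unit value (R/ton or R/LSU) per (district, product).

    Assumption: each responding record carries income in rand and a
    production quantity; the mean is taken over per-record ratios.
    """
    ratios = defaultdict(list)
    for r in responses:
        if r["quantity"] > 0:  # skip records that cannot yield a ratio
            key = (r["district"], r["product"])
            ratios[key].append(r["income_rand"] / r["quantity"])
    return {key: mean(values) for key, values in ratios.items()}


def impute_production(turnover_rand, district, product, unit_values):
    """Production (tons or LSU) = turnover / mean unit value.

    Returns None when no usable mean unit value exists for the cell.
    """
    unit_value = unit_values.get((district, product))
    if unit_value is None or unit_value <= 0:
        return None
    return turnover_rand / unit_value


# Made-up responding enterprises in a hypothetical district "A".
responses = [
    {"district": "A", "product": "maize", "income_rand": 100_000.0, "quantity": 100.0},
    {"district": "A", "product": "maize", "income_rand": 240_000.0, "quantity": 200.0},
]
unit_values = mean_unit_values(responses)   # mean of 1000 and 1200 R/ton
imputed = impute_production(550_000.0, "A", "maize", unit_values)
print(imputed)  # 500.0 tons
```

Returning `None` for empty cells leaves room for a fallback to a coarser level (for example province instead of MGD), which the slide hints at with "provincial/MGD level".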

  10. Imputation method • Product determination • National, provincial and MGD product distribution (farmers, production volume and production income) • The more frequent a product is within an MGD, the more likely it is that a non-responding enterprise farms that product • This is supported by the SIC • Production values are derived using the average method
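
Product determination as described above could look something like the sketch below. It assumes that "frequency" means the number of responding farmers per product within an MGD; the slide also mentions distributions by production volume and income, which are not modelled here. Field and function names are hypothetical.

```python
from collections import Counter, defaultdict


def most_frequent_product(responses):
    """Return the most commonly farmed product for each district (MGD).

    Assumption: one record per responding farmer/product pair, so a
    simple count approximates the farmer distribution.
    """
    counts = defaultdict(Counter)
    for r in responses:
        counts[r["district"]][r["product"]] += 1
    return {district: c.most_common(1)[0][0] for district, c in counts.items()}


# Made-up responses for two hypothetical districts.
responses = [
    {"district": "A", "product": "maize"},
    {"district": "A", "product": "maize"},
    {"district": "A", "product": "wheat"},
    {"district": "B", "product": "cattle"},
]
likely_product = most_frequent_product(responses)
print(likely_product["A"])  # maize
```

A non-responding enterprise in district "A" would then be assigned maize, and its production derived from turnover with the average method. Where the enterprise's SIC code is known, the candidate products could first be restricted to those consistent with it.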

  11. Results • Distribution of the agric frame for the census and surveys so far, and common enterprises across the surveys • Correlation analysis between income and turnover • Ratios between turnover, income and total income • Movement of income, total income and turnover (% change across the surveys) • Overall, the above results are used to assess the validity of the historical data for use in the imputation process
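
The kind of checks listed on this slide can be illustrated with a small sketch. The numbers are invented purely to show the calculation and are not results from the census or surveys; the function names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def pct_change(previous, current):
    """Percentage change between two survey periods."""
    return (current - previous) / previous * 100.0


# Invented income and turnover figures (rand) for four enterprises.
income = [1.0, 2.0, 3.0, 4.0]
turnover = [2.0, 4.0, 6.0, 8.0]
r = pearson(income, turnover)      # perfectly linear toy data
change = pct_change(200.0, 250.0)  # 25.0 percent
```

A correlation close to 1 and a stable income-to-turnover ratio across surveys would support using turnover (or historical values) as a basis for imputation.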

  12. Challenges • Two rules of thumb (Hartley (1962, 1974)): - Frame errors due to omissions and duplication can yield greater errors in data than all other sources of error combined - Edit/imputation error can yield greater errors in data than all other sources of error combined • Historical data • Nearest neighbor method - The agriculture frame includes enterprises registered under accountants/bookkeepers

  13. Way forward • The average method has been tested using responding enterprises • Historical data will be our first priority, followed by the other two methods • Currently working on the imputation using the common products within an MGD • This imputation process will be implemented for the 2007 census, which is underway, to populate the production figures
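
The priority order stated above (historical data first, then the other two methods) amounts to a fallback chain. A minimal sketch follows, assuming each method is a function that returns an imputed value or `None` when it cannot impute; the three toy methods, the record fields, and the 1100 R/ton figure are placeholders, not the actual procedures.

```python
from typing import Callable, Optional, Sequence, Tuple

Imputer = Callable[[dict], Optional[float]]


def impute_with_hierarchy(record: dict, methods: Sequence[Tuple[str, Imputer]]):
    """Try each method in priority order; return (method name, value).

    Returns (None, None) if no method can produce a value.
    """
    for name, method in methods:
        value = method(record)
        if value is not None:
            return name, value
    return None, None


# Placeholder methods (assumptions for illustration only).
def historical(record):
    return record.get("previous_production")   # None if no history exists

def nearest_neighbor(record):
    return None                                # stub: no donor found

def average(record):
    return record["turnover_rand"] / 1100.0    # hypothetical mean R/ton

methods = [
    ("historical", historical),
    ("nearest_neighbor", nearest_neighbor),
    ("average", average),
]

with_history = impute_with_hierarchy(
    {"previous_production": 320.0, "turnover_rand": 550_000.0}, methods)
without_history = impute_with_hierarchy({"turnover_rand": 550_000.0}, methods)
print(with_history)     # ('historical', 320.0)
print(without_history)  # ('average', 500.0)
```

Recording which method produced each value makes it possible to flag imputed figures and to report how much of the production total rests on each method.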

  14. Is there a need to impute? • Many thanks
