Explore the methodology and challenges faced during the processing of Census 2009 data, along with proposed solutions. Learn about the steps involved in error detection, scanning, verification, coding, and checks. Discover the need for manual intervention and quality control measures to improve data accuracy. Find out about the planned steps to enhance the data processing workflow and adapt to new technologies.
HOPS MEETING - PRESENTATION VNSO – Census 2009 DATA PROCESSING
Background • Procedural method used during the data processing of the Census 2009 enumeration forms • Problems encountered during the exercise • Proposed procedure we could have followed to avoid some of the questionable data gaps.
Census 2009 Structure Data processing Mechanism for detecting errors on forms
Data Processing Procedural Steps • All enumeration forms were sent back in envelopes, with one or two envelopes holding the forms for each EA. • The envelopes were registered manually as being received. • The forms were then sorted by EA, EA Split and HH number, and the forms were counted. • The Industry & Occupation fields were manually coded. • A Control form per EA was filled in, after which the batch was ready to be scanned.
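The sorting and counting step above can be sketched in a few lines of Python. This is only an illustration: the `FormId` record, its field names and the sample values are assumptions, not part of the VNSO system.

```python
from collections import Counter
from typing import NamedTuple


class FormId(NamedTuple):
    """Hypothetical identifier for one enumeration form."""
    ea: int        # enumeration area number
    ea_split: int  # split within the EA (0 if the EA is not split)
    hh: int        # household number


def sort_and_count(forms):
    """Return the forms in (EA, EA split, HH) order and a form count per EA."""
    ordered = sorted(forms)  # NamedTuples compare field by field, in order
    counts = dict(Counter(f.ea for f in ordered))
    return ordered, counts


# Made-up sample batch, deliberately out of order.
forms = [FormId(102, 0, 3), FormId(101, 1, 2), FormId(101, 0, 5), FormId(101, 0, 1)]
ordered, counts = sort_and_count(forms)
```

The per-EA count is the figure that would be written on the Control form before the batch goes to the scanner.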
Scanning • Scanning is done in sets. • Every third form that goes through the scanner is checked for errors. • If no error is found, the set is saved; otherwise it is rescanned.
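The every-third-form check can be expressed as a small Python sketch. The function name `scan_set` and the `has_error` callback are hypothetical; the real check was done by an operator at the scanner.

```python
def scan_set(forms, has_error, sample_every=3):
    """Return True if the set can be saved, False if it must be rescanned.

    Only every `sample_every`-th form (3rd, 6th, ...) is inspected.
    """
    for position, form in enumerate(forms, start=1):
        if position % sample_every == 0 and has_error(form):
            return False  # an inspected form failed: rescan the whole set
    return True


# Hypothetical error test: a form is just a label here.
is_bad = lambda form: form == "bad"

caught = scan_set(["ok", "ok", "bad", "ok"], is_bad)   # the bad form is the 3rd, so it is inspected
missed = scan_set(["bad", "ok", "ok", "ok"], is_bad)   # the bad form is the 1st, so it is never inspected
```

In this sketch `caught` is `False` and `missed` is `True`: an error on an uninspected form passes through, which is why the later verification step still matters.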
Verification • All scanned forms are interpreted by the Forms Interpreter Module. • The interpreted forms are held in the SQL database, ready for verification. • Verification is done against the scanned form image.
MS Access – Coding / Checks • After scanning and verification, the data is transferred out of Forms using the Transfer module, which creates text files. • The text files are imported into an MS Access database to code the open-ended questions and to make a few more basic checks. • An export script is created in MS Access to export the edited data to one text file per EA for CSPro batch editing.
Problems we encountered • The major problem we encountered was the poor quality (in terms of clarity) of pencil marks and writing on the forms. (Slide images: an example of good-quality marks and an example of poor-quality marks.)
Problems we encountered Cont… • Since Vanuatu is a bilingual country, style of writing and spelling became a great challenge as well. • We were overconfident in the OCR system and so neglected to develop control systems (field/office) to double-check the accuracy of the data it produced. • A control system was developed late, so it was not fully populated with the necessary enumeration statistics.
Proposed Steps we could’ve taken • Receive forms from field • Register forms (EA) as being RECEIVED • Sort the forms by EA, EA_Split, HH_ID, GPS_No, and check geo-ids are unique • Register num forms, EA, num HH, num people • Manual coding of the open-ended questions • Update control system (EA CODED) • Scan EA • Update control system (EA SCANNED) • EA verified (verifier forced to confirm all ID fields) • Update control system (EA VERIFIED) • Transfer EA and check that all HH are transferred • Update control system (EA TRANSFERRED) • NB: Do a weekly/fortnightly transfer & check
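One way to picture the proposed control system is as a per-EA status that may only move forward one stage at a time, with a uniqueness check on registration and a completeness check on transfer. The Python below is a minimal sketch under those assumptions; the class `ControlSystem`, its method names and the sample household ids are all hypothetical and do not describe software VNSO actually built.

```python
STAGES = ["RECEIVED", "CODED", "SCANNED", "VERIFIED", "TRANSFERRED"]


class ControlSystem:
    """Hypothetical tracker for the status of each EA batch."""

    def __init__(self):
        self.status = {}         # EA -> index into STAGES
        self.registered_hh = {}  # EA -> set of registered household ids

    def receive(self, ea, household_ids):
        """Register an EA as RECEIVED, rejecting duplicate household ids."""
        ids = list(household_ids)
        if len(ids) != len(set(ids)):
            raise ValueError(f"EA {ea}: household ids are not unique")
        self.status[ea] = 0
        self.registered_hh[ea] = set(ids)

    def advance(self, ea, stage):
        """Move an EA to the next stage; skipping or repeating a stage fails."""
        nxt = self.status[ea] + 1
        if nxt >= len(STAGES) or STAGES[nxt] != stage:
            current = STAGES[self.status[ea]]
            raise ValueError(f"EA {ea}: cannot move from {current} to {stage}")
        self.status[ea] = nxt

    def transfer(self, ea, transferred_ids):
        """Mark the EA TRANSFERRED only if every registered household arrived.

        Returns the sorted list of missing household ids (empty on success).
        """
        missing = sorted(self.registered_hh[ea] - set(transferred_ids))
        if not missing:
            self.advance(ea, "TRANSFERRED")
        return missing


cs = ControlSystem()
cs.receive("EA01", [1, 2, 3])
for stage in ("CODED", "SCANNED", "VERIFIED"):
    cs.advance("EA01", stage)
first_try = cs.transfer("EA01", [1, 2])      # household 3 has not arrived yet
second_try = cs.transfer("EA01", [1, 2, 3])  # all households present
```

Because the status only changes when a check passes, the weekly or fortnightly transfer check reduces to listing the EAs that are not yet TRANSFERRED.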
Manual vs Automated VNSO is finishing off data checks and should commence batch editing in about a week.
Topics for Discussion • SPC, together with the countries adopting the new technology, should define a clearer standard procedure if we wish to advance into OCR technology. • Remote support is readily available from SPC to assist countries using the OCR system.