DATA CAPTURE – PROCESSING 2006 POPULATION & HOUSING CENSUS OF NIGERIA. Presented at UN Regional Workshop on Census Data Processing by Adesola Fatilewa, NATIONAL POPULATION COMMISSION, at Dar-es-Salaam, Tanzania, 9th-13th June 2008.
ABOUT NIGERIA • NIGERIA IS THE MOST POPULOUS COUNTRY ON THE AFRICAN CONTINENT AND THE 10th BIGGEST IN THE WORLD. • AN AREA OF ABOUT 0.92 MILLION SQ. KM. • POPULATION OF 140.2 MILLION BY THE 2006 CENSUS • COMPRISES 36 STATES AND THE FEDERAL CAPITAL TERRITORY • 774 LOCAL GOVERNMENT AREAS - LGA (DISTRICTS) • DELINEATED INTO OVER 662,000 ENUMERATION AREAS
Background • Since the late nineties, NPopC had been inundated with proposals on various document scanning systems. • As late as 2005, statements were still being made suggesting that the idea of using scanning technology was utopian.
Processing Pre-test and Trial Census • A scanning system was used to process the second pre-test of April 2004. • About 100,000 forms were processed, as the survey covered one local government area (LGA) in each of the 36 States of the country and the Federal Capital Territory. • The forms were only optical mark readable, and editing was mainly to correct alignment errors.
Processing Pre-test and Trial Census Continued • Another solution provider supplied five scanners along with two servers for the processing of the Trial Census. • The Trial Census, which took place in April 2005, covered about 5% of the country, which translated to a population of about 10 million. • Processing was distributed between two data processing centres (DPCs): Lagos and Kano.
Lessons learnt • Staff were identified for suitable roles in data processing of the main census • Staff gained experience with the new technology • Alignment and recognition problems were detected and rectified • A decision was taken on an appropriate archiving system for storage and retrieval of documents • Various reports were needed to enable management to follow the progress of processing • A decision was taken to completely eliminate manual coding and editing
Data capture 2006 census • Scanning technology was fully deployed in processing Nigeria's 2006 Population and Housing Census. • This was achieved with 21 scanners distributed across 7 DPCs located strategically around the country. • Immediately after the census, the OMR/ICR forms (questionnaires) used to collect data started arriving at the DPCs. • Inventory control was done using an EA tracking system.
Data capture 2006 census • Documents were enveloped by EA, tied in convenient batches and stacked on labelled shelves • At the end of the receiving/archiving exercise, batches were retrieved for data capture
Paper Preparation before Scanning • Bring the envelopes with the questionnaires from the archive room • Remove the envelopes • Cut the paper with the cutting machine; otherwise the paper is damaged, dirt is introduced on the scanned image, and rejects increase • Print and add the batch header • Jog the paper with the supplied jogger • (Diagram labels: Archive, Envelope, Batch Header, NPC0x, STORE IN program)
Data Processing Steps at DPCs • Schematic diagram (labels: Jog Docs, Scanner, Server, Edit Stations)
Scanner Views • Figures: scanner feeder; questionnaire processing
Scanning • Sheets were loaded on the feeder in batches, separated by batch headers, and went through the transport system of the HR80 SC scanners. • Scanner speed was 8,000 sheets/hr, barring jams and other loading difficulties. • Scanning was carried out by the ProScan software, and scanned documents were collected at the output tray. • The sheets were returned to their envelopes and sent back to the archive.
SC80HC + ProSort + kEOPs (workflow diagram) • 1. Preparation for scanning: cut & jog (manual work) • 2. Scanner • 3. Paper archive • 4. Data + images • 5. kEOPs recognition • 6. Correction, balancing • 7. Export • 8. DVD • 8. Local reports • 9. Tape • (Other diagram labels: Work Data Storage, Archive Data Storage, CS Pro, HQ, Carto, Batch Header, NPC0x)
Editing • Two levels of editing: • First level at the DPC • Second level at the DVU at NPopC headquarters in Abuja
First Level Editing • Data in XML format was stored in a SAN on servers networked to the scanners • Forms in XML were loaded onto the edit stations • The editing system, called KEOPs, was designed to check geographic IDs against the batch headers and to check 'mandatory fields' • Transactions or whole batches could be passed to the 'balancing correction level', which was handled by more experienced staff designated 'Supervisor'
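The two first-level checks named on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the KEOPs implementation: the field names (`state`, `lga`, `ea`, `sex`, `age`), the `BatchHeader` class and the rule that any problem sends the whole batch to the supervisor are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical field names; the real form schema is not given in the slides.
GEO_FIELDS = ("state", "lga", "ea")
MANDATORY_FIELDS = ("sex", "age")


@dataclass
class BatchHeader:
    state: str
    lga: str
    ea: str


def check_form(form: dict, header: BatchHeader) -> list[str]:
    """Return a list of problems found on one scanned form."""
    problems = []
    # Geographic ids on the form must agree with the batch header.
    for name in GEO_FIELDS:
        if form.get(name) != getattr(header, name):
            problems.append(f"geo mismatch: {name}")
    # Mandatory fields must be present and non-blank.
    for name in MANDATORY_FIELDS:
        value = form.get(name)
        if value is None or str(value).strip() == "":
            problems.append(f"missing mandatory field: {name}")
    return problems


def route_batch(forms: list[dict], header: BatchHeader) -> str:
    """Assumed routing rule: any problem sends the batch to the supervisor level."""
    has_problems = any(check_form(form, header) for form in forms)
    return "supervisor" if has_problems else "passed"
```

A batch whose forms all match the header and carry every mandatory field returns "passed"; a single mismatch or blank mandatory field returns "supervisor".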
Second Level Editing • Data in ASCII was encrypted, backed up on CDs at the DPC and sent to NPopC headquarters, Abuja • Data was decrypted, validated, collated and further edited at Abuja • Data was then checked for completeness to ensure that each delineated EA in every local government area had data associated with it • The CSPro package was then used to edit the data and aggregate it appropriately
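The completeness check described above amounts to a set difference between the EAs delineated for each LGA and the EAs for which data arrived. A minimal Python sketch, assuming both lists are available as mappings from an LGA code to a set of EA codes (the function name and data layout are illustrative, not taken from the presentation):

```python
def missing_eas(
    delineated: dict[str, set[str]],
    received: dict[str, set[str]],
) -> dict[str, list[str]]:
    """Map each LGA to the sorted list of delineated EAs that have no data."""
    gaps = {}
    for lga, eas in delineated.items():
        # EAs expected for this LGA minus EAs actually present in the data.
        absent = eas - received.get(lga, set())
        if absent:
            gaps[lga] = sorted(absent)
    return gaps
```

An empty result means every delineated EA has data; any other result is a follow-up list for the DPC concerned.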
Second Level Editing Continued • Structure checks • Range checks • Skip pattern checks • Inter-record and intra-record consistency checks • Imputation methods applied for missing or invalid values: 'hot deck' and 'cold deck', or a combination of both
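To illustrate how a range check and the two imputation methods can combine, here is a small Python sketch of a sequential hot deck with a cold-deck fallback. It is not the CSPro edit program used by the Commission: the variable (`age`), the imputation class (`sex`), the valid range and the cold-deck values are all assumptions chosen for the example.

```python
def in_range(value, low, high):
    """Range check: the value must be present and within [low, high]."""
    return value is not None and low <= value <= high


def impute_age(records, cold_deck, low=0, high=120):
    """Replace missing or out-of-range ages.

    Hot deck: reuse the last valid age seen for the same sex.
    Cold deck: if no donor has been seen yet, use a fixed external value.
    """
    last_valid = {}  # most recent valid age per imputation class (sex)
    result = []
    for rec in records:
        rec = dict(rec)  # copy so the input records are left unchanged
        key = rec["sex"]
        if in_range(rec.get("age"), low, high):
            last_valid[key] = rec["age"]
        else:
            rec["age"] = last_valid.get(key, cold_deck[key])
            rec["age_imputed"] = True
        result.append(rec)
    return result
```

Flagging imputed values (`age_imputed`) keeps the edited file auditable, which matters when the original forms remain in the archive for reference.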
Occupational Coding • The only data that was not coded in the field was occupation • The occupational coding was carried out automatically using a computer-assisted coding system • 'Exceptional Coding' was applied where the coding clerk could not find an appropriate occupation code for an occupation
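The split between automatic coding and 'Exceptional Coding' can be pictured as a dictionary lookup with a fallback queue. The Python sketch below is an assumption about the general approach, not the Commission's system; the code book entries and codes are invented for illustration.

```python
def normalise(text: str) -> str:
    """Lower-case the response and collapse repeated whitespace."""
    return " ".join(text.lower().split())


def code_occupations(responses, code_book):
    """Code each response automatically where the code book has a match.

    Returns (coded, exceptional): a dict of response -> code, and a list
    of responses left for a coding clerk to resolve by hand.
    """
    coded = {}
    exceptional = []
    for response in responses:
        key = normalise(response)
        if key in code_book:
            coded[response] = code_book[key]
        else:
            exceptional.append(response)
    return coded, exceptional


# Hypothetical code book; real occupation codes are not given in the slides.
CODE_BOOK = {"farmer": "X01", "primary school teacher": "X02"}
coded, exceptional = code_occupations(
    ["Farmer", "primary  school teacher", "palm wine tapper"], CODE_BOOK
)
```

Here the first two responses are coded automatically and "palm wine tapper" lands in the exceptional list.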
Challenges • Ensuring that documents for particular geographic locations were archived in the sections of the archive and on the shelves designated for them • Ensuring that all forms were separated before they were taken for scanning • Breakdown of the jogger • The rate of getting documents ready for scanning was slower than the rate of scanning • Difficulty in maintaining belts and fixing them over the pulleys • Ensuring that correct batch headers were properly placed on EA batches and that, after scanning, EAs were correctly returned to their marked envelopes
Challenges Continued • Instances of poor field work, which resulted in 'missing values' for 'mandatory fields' and outright wrong values for other fields • Difficulty in linking forms for households of more than 8 persons • Integration of the two solution providers: form design on the one hand, and equipment and software solutions on the other, were provided by two different companies • Cleaning blank records of the data associated with them at data capture • Dealing with the sensitivity of Nigerians to census figures • Lack of a reliable and uninterrupted power supply
Conclusion • The Commission was proud that the decision to deploy a new technology for part of the processing of Nigeria's 2006 Population and Housing Census was a success • About 35 million forms were scanned and edited using 21 scanners and over 220 edit stations, with data in XML and ASCII formats stored in about 76 TB of SANs. All scanning and first level editing was completed within nine months of the enumeration period. • About 1,000 Nigerians were trained and gained expertise in various aspects of the scanning technology • There is a need for intensive training in the areas of OMR/OCR forms design and the development of appropriate scanning software.
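As a rough cross-check of these figures, the nominal scanning time follows from the form count, the number of scanners and the rated speed of 8,000 sheets/hr quoted on the Scanning slide. The Python sketch below assumes one sheet per form, which the presentation does not state, and ignores jams, loading and paper preparation.

```python
FORMS = 35_000_000        # about 35 million forms (from the conclusion)
SCANNERS = 21             # scanners across the 7 DPCs
SHEETS_PER_HOUR = 8_000   # rated speed per scanner


def nominal_scan_hours(forms, scanners, sheets_per_hour, sheets_per_form=1):
    """Hours of pure scanning if every scanner ran at its rated speed."""
    return forms * sheets_per_form / (scanners * sheets_per_hour)


hours = nominal_scan_hours(FORMS, SCANNERS, SHEETS_PER_HOUR)
print(f"{hours:.1f} scanner-hours of wall time at rated speed")
```

Under that assumption the result is roughly 208 hours per scanner, a small fraction of the nine months reported. That is consistent with the challenge noted earlier that documents could not be prepared as fast as they could be scanned.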
End • Thank you for your attention