Census Data Archiving: Experience of the Central Statistical Agency (CSA) of Ethiopia. Presented at the United Nations Regional Seminar on Census Data Archiving for Africa, Addis Ababa, Ethiopia, 20-23 September 2011.
OUTLINE
• Background Information
• Census Data Maintenance
• Census Data Archiving
• Type of Census Data Storage Device
• Data Storage Methodology
• Procedures for Safeguarding the Security of the Census Data
• Challenges
1. BACKGROUND INFORMATION: CENSUS UNDERTAKING EXPERIENCE IN ETHIOPIA
• Ethiopia has conducted only three national Population and Housing Censuses (PHCs).
• The first-ever PHC was conducted in May 1984 (39.9 million, excluding Eritrea).
• The second PHC was conducted in 1994 (53.5 million).
• The third and latest PHC was conducted in May and November 2007 (73.9 million).
• These censuses produced millions of paper copies of completed questionnaires, containing data for each member of each household, and millions of individual records in soft-copy format.
• Archiving of the hard and soft copies of the census documents began in 1984.
1. BACKGROUND … CENSUS UNDERTAKING EXPERIENCE …
• The completed questionnaires of the previous two censuses were archived until enumeration for the subsequent census began.
• Because the documents collected from the field were voluminous and required a large storage space, the questionnaires of the preceding census had to be disposed of under secure conditions before the next census was conducted, to free adequate space for the new ones.
• The questionnaires of the latest PHC (2007) are being archived in two forms: as hard copies and as images captured during scanning.
• The hard copies of the completed questionnaires are stored in the warehouse, whereas the images of the questionnaires are stored on the servers.
1. BACKGROUND … CENSUS PROCLAMATIONS
• The country has had three Census Proclamations, a separate one for each PHC.
• For the 1984 and 1994 PHCs, the Proclamations were enacted before the preparatory activities of the respective census began and were temporary in nature, whereas the third is permanent and was established in accordance with the Constitution of the country.
• Each Census Proclamation focuses mainly on the conduct of preparatory activities, field enumeration, and approval of the census results. It also stipulates the establishment of the Census Commission, the highest body responsible for guiding, coordinating, and overseeing the overall census operations, and determines the composition of its membership.
1. BACKGROUND … CENSUS PROCLAMATIONS …
• The Census Proclamations define the duties and responsibilities of each entity involved in the operations, such as the Census Commission and the Central Statistical Agency, and also determine the obligation of residents of the country to provide correct information, the confidentiality of individual data, etc.
• However, they do not explicitly address the archiving of the documents.
2. CENSUS DATA MAINTENANCE: DATA BACKUP POLICY
• Backup is the key to recovering files after a disaster or a loss from any other cause.
• CSA has a backup policy, which is embedded in its ICT Policy. Some guidelines to keep in mind:
• IT will replace or reinstall lost or damaged system files and standard applications on users' hard drives.
• The user should keep the original operating system or application media along with the licensing information.
• IT performs centralized backups for systems residing on CSA's servers. Departments and services using these systems have their data and files backed up routinely. ICT administers and maintains these backups.
• Users are responsible for backing up the files stored on the computers issued to them by the Agency.
2. CENSUS DATA MAINTENANCE: DATA BACKUP POLICY
• The user should keep all documents in a single documents folder for easy backup (best practice).
• The user should back up the entire documents folder to removable media at least once a week, or daily if documents are frequently created or changed.
• The user should maintain at least two backup sets and alternate their use, so that if one backup goes bad the other is still available.
• Users must store their backup media in a safe place.
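The user-side guideline of keeping two backup sets and alternating between them can be illustrated with a short Python sketch. It is not CSA's tooling: the set names `set_a` and `set_b`, the function names, and the idea of numbering backup runs are assumptions made only for illustration.

```python
import shutil
from pathlib import Path

# Hypothetical names for the two alternating backup sets.
BACKUP_SETS = ("set_a", "set_b")


def pick_backup_set(run_number: int) -> str:
    """Alternate between the sets: run 0 -> set_a, run 1 -> set_b, run 2 -> set_a, ..."""
    return BACKUP_SETS[run_number % len(BACKUP_SETS)]


def backup_documents(documents_dir: Path, media_root: Path, run_number: int) -> Path:
    """Copy the whole documents folder into the backup set chosen for this run."""
    target = media_root / pick_backup_set(run_number)
    if target.exists():
        # Only the set being reused is overwritten; the other set stays intact.
        shutil.rmtree(target)
    shutil.copytree(documents_dir, target)
    return target
```

Because each run overwrites only one of the two sets, a failed or corrupted backup still leaves the previous run's copy untouched on the other set.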
2. CENSUS DATA MAINTENANCE
A. Method of Securing Data Backup
• Use high-quality backup media.
• Restrict access to backup media. Keep backups in a locked area, because they may contain large amounts of confidential data in a form that easily fits into an attacker's pocket or briefcase.
• Sort and label backup media appropriately.
• Verify the integrity of the backup media by performing a test restore of the data.
B. Backup Type
• CSA uses three backup types, depending on the data: full, incremental, and differential.
• Full: a complete backup of all data in a given folder, volume, or drive.
• Incremental: only files that have changed in a given period, based on the file's date/time stamp.
• Differential: only files that have changed, based on file size and CRC (cyclic redundancy check).
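As a rough illustration of how the three backup types differ in which files they select, the Python sketch below follows the definitions given on this slide: incremental selection by modification time, differential selection by file size and CRC. The function names and the use of a stored baseline of (size, CRC-32) pairs are assumptions; the slides do not say what a file is compared against or which backup software CSA uses.

```python
import zlib
from pathlib import Path


def file_crc(path: Path) -> int:
    """CRC-32 of a file, read in chunks so large files need little memory."""
    crc = 0
    with path.open("rb") as handle:
        while chunk := handle.read(1 << 20):
            crc = zlib.crc32(chunk, crc)
    return crc


def fingerprint(folder: Path) -> dict:
    """Map each file (path relative to folder) to its (size, CRC-32) pair."""
    return {
        p.relative_to(folder).as_posix(): (p.stat().st_size, file_crc(p))
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def select_full(folder: Path) -> list:
    """Full backup: every file in the folder."""
    return sorted(fingerprint(folder))


def select_incremental(folder: Path, since: float) -> list:
    """Incremental: files modified after `since` (a POSIX timestamp)."""
    return sorted(
        p.relative_to(folder).as_posix()
        for p in folder.rglob("*")
        if p.is_file() and p.stat().st_mtime > since
    )


def select_differential(folder: Path, baseline: dict) -> list:
    """Differential: files that are new or whose (size, CRC-32) differs from the baseline."""
    current = fingerprint(folder)
    return sorted(name for name, sig in current.items() if baseline.get(name) != sig)
```

In this sketch the baseline would be the `fingerprint` taken at an earlier backup; comparing a fingerprint of restored files against the original is also one way to check a test restore.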
2. CENSUS DATA MAINTENANCE: Data Backup Procedure and Technologies
• CSA has two data backup procedures:
• Disk-to-disk (D-to-D) backup, in which data are backed up from the working system to a storage server; and
• Backup from a computer (server) to other storage media, such as:
  - CD or DVD for application software
  - Plug-in external tape drive (50 to 500 GB) for data
  - Networked (iSCSI) HP 1/8 G2 Tape Autoloader for data
  - Networked Dell PowerVault MD3000i for data storage
• In the near future (within three months), CSA plans to establish a data backup infrastructure as part of improving its ICT system infrastructure.
2. CENSUS DATA MAINTENANCE: Data Maintenance
Usually, when data are missing, corrupted, infected by a virus, or found to have an incorrect size, they are recovered by restoring from the latest backup using a disaster recovery system.
3. CENSUS DATA ARCHIVING: Directive for Census Data Archiving
• The CSA has issued a directive to every Directorate responsible for conducting a sample survey or census, requiring it to send the clean data, with complete documentation, to the Information Systems Technology Directorate, which is responsible for archiving and electronically disseminating the data, including the metadata.
3. CENSUS DATA ARCHIVING: Procedures for Archiving Census Micro-data
• Collect the micro-data from the directorate (Population Statistics);
• Create an appropriate directory and file structure;
• Archive using the micro-data management toolkit of the World Bank and store in the right folder;
• Disseminate the metadata and report on the web.
4. Type of Census Data Storage Device
• The Agency has three storage devices: one with a capacity of 6 terabytes and two with a capacity of 3 terabytes each.
• The census data (images and micro-data) are stored on the 6-terabyte server.
• The same data are also stored on tapes of 400 GB on each side.
5. Data Storage Methodology
• After the official dissemination of the census data, all relevant documents, including metadata, collected from the Population Statistics Directorate (department) are kept in the data bank in the following structure:
C:\ETH-POP-YY\ where YY stands for the year in which the census/survey was conducted
Under this folder there are sub-folders:
\DATA
\DOCS
\PROGRAMMS
\WORK
Under the sub-folder DATA:
DATA\SPSS
DATA\ASCII
5. Data Storage Methodology…
Under the sub-folder DOCS:
DOCS\Report
DOCS\Questionnaires
DOCS\Technical
Under the sub-folder Program:
Program\ All programs that help with editing, tabulation… (Plan) are contained;
Under the sub-folder Work:
Work\ Intermediate work done during archiving is kept.
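The folder layout described in this section could be created programmatically along the following lines. This is a minimal sketch, not CSA's procedure: the function name, the `base` parameter (used instead of hard-coding `C:\`), and passing the year as a string are assumptions. The slides spell the programs folder both `PROGRAMMS` and `Program`; the sketch uses the first spelling.

```python
from pathlib import Path

# Sub-folders named on the slides, relative to the ETH-POP-YY root.
SUBFOLDERS = (
    "DATA/SPSS",
    "DATA/ASCII",
    "DOCS/Report",
    "DOCS/Questionnaires",
    "DOCS/Technical",
    "PROGRAMMS",
    "WORK",
)


def create_archive_tree(base: Path, year: str) -> Path:
    """Create ETH-POP-<year> under `base` with the sub-folders listed above."""
    root = base / f"ETH-POP-{year}"
    for sub in SUBFOLDERS:
        # parents=True also creates DATA and DOCS; exist_ok makes reruns harmless.
        (root / sub).mkdir(parents=True, exist_ok=True)
    return root
```

For example, assuming YY is a two-digit year, `create_archive_tree(Path("C:/"), "07")` would build the tree for the 2007 census under `C:\ETH-POP-07`.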
6. Procedures for Safeguarding the Security of the Census Data
CSA ensures the security of the census data through the following measures:
• All staff performing data processing are required to make a statement undertaking to keep the data confidential;
• All completed paper questionnaires are processed and stored in an area designated for processing census data only, and detailed records of document movements are maintained;
• All completed paper questionnaires will be destroyed 10 years after the commencement of the Census;
• The reference link between each record and the address of the unit of quarters is deleted when the data set is constructed for subsequent tabulations; and
• All published tables are scrutinized to ensure that no small values appear in them for small geographical units, from which personal particulars might be derived through complicated deduction.
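The last measure, screening published tables for small values, can be sketched as a simple cell-suppression pass in Python. The threshold of 5, the `*` marker, and the choice to leave zero counts visible are all assumptions for illustration; the slides do not state the rule CSA applies.

```python
def suppress_small_cells(table: dict, threshold: int = 5, marker: str = "*") -> dict:
    """Return a copy of `table` with counts between 1 and threshold-1 replaced by `marker`.

    `table` maps a cell key, e.g. (area, category), to a non-negative count.
    The input table is left unchanged.
    """
    return {
        key: marker if 0 < count < threshold else count
        for key, count in table.items()
    }


# Hypothetical counts for two small geographical units.
example = {
    ("Area A", "Male"): 120,
    ("Area A", "Female"): 3,
    ("Area B", "Male"): 0,
}
published = suppress_small_cells(example)
```

This covers only direct suppression of small cells. Guarding against values that can be derived from row or column totals, the "complicated deduction" mentioned above, would need additional checks that this sketch does not attempt.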
7. Challenges
• Lack of an off-site backup in case of any damage to the data storage room(s);
• Use of tapes for backup, as tapes sometimes fail;
• Because of the limited number of tapes currently available, only two copies are taken.