Archiving Census Documentation and Microdata: Preserving Memory, Increasing Stakeholders * * * Wendy L. Thomas and Robert McCaa, Minnesota Population Center, http://www.ipums.org/International. IPUMS International, funded by The National Science Foundation of the United States. Census 2000 symposium, session 4 paper 26
Subtext: Preserving census metadata and microdata enhances the value of the census and increases stakeholders. Microcomputer revolution --> new uses for census data, specifically microdata. Effective use of microdata requires systematic preservation of metadata. Availability of microdata --> enhances the value of censuses and increases stakeholders. The IPUMS International consortium promotes preservation and use of census microdata.
16th century Aztec census (in Nahuatl, 1530s): “Here is the home of...” (from Museum of Anthropology, Mexico City). Original ms., transcribed, translated, digitized.
Census microdata of the 21st (and late 20th) century: Who will preserve them? Will they be made usable? Labeled fields: person number, age, sex. Census microdata: 12100102600700720000011210000104 22200202600700720000011210000104 32300100600700720000012123000000 42300200400700000000000000000000 52300200200700000000000000000000 62300200000700000000000000000000. Censuses are costly. Public goods should be used. Where microdata are available, they are used.
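The point that raw microdata are unusable without their documentation can be made concrete with a minimal sketch of reading the six records above. The column positions in `LAYOUT` are an assumption guessed from the sample itself (person number in column 1, sex in column 6, age in columns 7-9); they are not taken from any codebook, and the function name `parse_record` is hypothetical. The real layout would have to come from the preserved metadata.

```python
# Hypothetical fixed-width layout: field name -> (start, end) as Python slice
# indices. These positions are guessed from the sample records, NOT from a
# codebook; the preserved census documentation would define the real ones.
LAYOUT = {
    "person_number": (0, 1),
    "sex": (5, 6),
    "age": (6, 9),
}

RECORDS = [
    "12100102600700720000011210000104",
    "22200202600700720000011210000104",
    "32300100600700720000012123000000",
    "42300200400700000000000000000000",
    "52300200200700000000000000000000",
    "62300200000700000000000000000000",
]


def parse_record(line, layout=LAYOUT):
    """Slice one fixed-width record into integer fields using the layout."""
    return {name: int(line[start:end]) for name, (start, end) in layout.items()}


people = [parse_record(record) for record in RECORDS]
for person in people:
    print(person)
```

Under this assumed layout the household reads as two 26-year-olds followed by four young children. A different layout would give a different, equally confident-looking answer, which is why the codebook has to be preserved alongside the data.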
…official statistics that meet the test of practical utility are to be compiled and made available on an impartial basis by official statistical agencies to honor citizens’ entitlement to public information. -- UN Statistical Commission, 1994
How anonymized census samples became a standard statistical product: • USA: 1960, 1970, 1980, 1990: varying densities; gaining on CPS as most widely used demographic microdata • Canada: - 1971, 1976, 1981, 1986, 1991, 1996: varying designs - 1996: Data Liberation Initiative led to an explosion of usage in research and teaching • UK: - 1991: 2% individuals, 0.5% households; hundreds of publications, thousands of users - 2001: double the densities.
IPUMSi (Integrated Public Use Microdata Series - International) helps five ways: • 1. Inventory the world’s census microdata • 2. Preserve endangered microdata and documentation * * * • 3. Anonymize census microdata to preserve statistical confidentiality, using highest standards (Statistics Netherlands) • 4. Integrate datasets of selected countries using UN, Eurostat and other standards • 5. Disseminate database free with complete copies to all partners
IPUMSi PAYS National experts in each country are contracted to: • Assemble microdata and documentation • Develop samples to minimize confidentiality risks and maximize robustness • Design national integration plan: census-by-census, concept-by-concept, code-by-code • Write integrated documentation
IPUMSi INVENTORIES • Microdata... for any population or administrative division: nation, province, district, city, ethnic group, etc. • Example: Latin America - 20 countries - 67 censuses inventoried - 1% to 100% sample densities - 100,000 to 150 million cases - 19th century: 2 censuses; 1960s: 14; 1970s: 17; 1980s: 16; 1990s: 17 • Found: complete census data for Colombia 1973 and 16 other countries
IPUMSi PRESERVES UN Demographic Center for Latin America (CELADE, Santiago, Chile): ~3000 microdata tapes and their metadata (documentation) to be preserved
Preserve against accident, deterioration and technological obsolescence • Microdata: - transfer to stable media - use standard data storage protocols - entrust copies with at least two depositories • Metadata: collect, catalogue, and reproduce - Enumeration forms (preserve all versions used) - Enumerator and data processing instructions - Codebooks (photocopies and scanned images) - Technical studies, evaluations, reports • UN Statistics Division: entire archive to be preserved and catalogued
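One way to check that copies held by separate depositories are still identical after transfer to new media is a checksum comparison. The sketch below is an illustration only, not a procedure described in the presentation: the function names `sha256_of` and `copies_match` are hypothetical, and SHA-256 is simply one commonly used digest.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(paths):
    """True if every listed copy has the same digest (False for an empty list)."""
    return len({sha256_of(path) for path in paths}) == 1


if __name__ == "__main__":
    # Stand-in files; real use would point at the copies held by each depository.
    with tempfile.TemporaryDirectory() as tmp:
        first = Path(tmp) / "depository_a.dat"
        second = Path(tmp) / "depository_b.dat"
        first.write_bytes(b"12100102600700720000011210000104\n")
        second.write_bytes(b"12100102600700720000011210000104\n")
        print(copies_match([first, second]))
```

Recording the digest in the catalogue entry for each file would let later custodians detect deterioration even when only one copy is at hand.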
IPUMSi ANONYMIZES Using the highest standards currently available: technical (Eurostat workshops), administrative (license agreement). Imagine a new statistical product: a scientifically anonymized census microdata sample made up of unidentifiable individuals...
Anonymized census microdata samples available for European countries (* = in IPUMSi consortium or negotiating) • 16 countries available via PAU, 1990 round (3 in IPUMSi, 4 negotiating): • Belgium, Czech Republic, Estonia, *Finland, *Hungary, *Italy, Latvia, Lithuania, *Norway, Poland, *Spain, Sweden, Switzerland, *Russia, Turkey, *UK • 11 countries not available via PAU (2 in IPUMSi): • *Austria, Croatia, Denmark, *France, Germany, Iceland, Ireland, *Netherlands, Portugal, Slovak Republic, Slovenia
International Monetary Fund’s General Data Dissemination System: 52 countries with uniform standards • All embrace strict standards of statistical confidentiality • Prohibit disclosure of information which may identify individuals or entities • 37 of 52 countries distribute anonymized census microdata samples • Microdata samples are becoming standard statistical products
IPUMSi INTEGRATES Census documentation compiled for Colombian microdata. Standard: UN/Eurostat Principles & Recs... Photos from Colombia integration project, February-March 2000: 4 experts from DANE (census office) + 7 academics (3 universities)
IPUMSi DISSEMINATES International web-based access system. End-user license agreement protects privacy and confidentiality and assures proper use. User selects countries, cases, variables, and samples, which makes chronological and/or cross-national research possible using census microdata. Open architecture software and mirror sites available to all partners.
Population censuses became universal in the 20th century. Will census microdata ... in the 21st? 153 countries with 1 million + pop. in 2000. 2000 round figures are provisional.
Additional information at: http://www.ipums.org/international * * * Thank you
Preserving Memory, Increasing Stakeholders • 1. Introduction: Well-preserved documentation and data --> effective data collection, dissemination, use • 2. Long-term preservation of documentation and data • 3. Determining What to Preserve • 4. Assessing Future Value • 5. Inventory of available technology/personnel/knowledge • 6. Conclusion: Preserve and make accessible census microdata to enhance value of census (IPUMSi)