Learn about accessing individual data files in France including public use, scientific use, and secure use files, with specific topics covered in the joint UNECE/Eurostat Work Session on Statistical Data Confidentiality in Ottawa 2013.
Individual data files in France • Public Use Files • Scientific Use Files • Secure Use Files • Specific topics Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality – Ottawa 2013
Public Use Files • On Insee’s website • “Households” data • Labour force survey • Census data: 2 files • One with location at regional level (27 regions in France) and detailed social variables • One with location at municipality level and variables with aggregated categories • Some register files • http://www.insee.fr/fr/bases-de-donnees/fichiers-detail.asp (in French)
Scientific Use Files • For researchers, with specific documentation for researchers • But: • Who is a researcher? And who is not? • What kind of documentation do they need? • Statisticians need some help
Scientific Use Files (2) • Réseau Quetelet: French Data Archives • Formally created in 2001 • But the result of a longer cooperation between Insee and some researchers • Disseminates Insee (and other) SUFs to French and foreign researchers • Therefore determines who is a researcher and who is not • Helps Insee create documentation usable by researchers • http://www.reseau-quetelet.cnrs.fr/spip/
Confidential Files • Long history in France for business data • Since 1984 • More recent for household data • Since 2008 • Procedure: • Opinion by an external committee: the Statistical Confidentiality Committee • Chaired by a judge • Participation of representatives of business unions, worker unions and researchers • Agreement of Insee • Decision by the National Archives
Confidential Files (2) • Longer procedure than in other countries • But probably more acceptable • 200 access requests a year • Access through GENES’s CASD • http://www.casd.eu
How to get data? • First stop: Réseau Quetelet • If an SUF is enough, get the data • Second stop: • Confidentiality Committee secretary and the data producer • To see whether confidential data will solve the problem • Third stop: • Confidentiality Committee • Fourth stop: • CASD
Specific topics • Output checking: MY OWN PERSONAL OPINION • Is it useful? Enough? Efficient? • Will only deal with remote access
Output checking in remote access • Some preliminary remarks: • An output file can’t be more informative than the confidential file the researcher is allowed to browse • A researcher has already signed a confidentiality clause and could be, depending on national laws, subject to criminal liability • A researcher could easily remember the value of some specific variable and so take it out of the safe centre • Who is responsible if there is a confidentiality breach? The NSI? The researcher?
Output checking in remote access • OC can’t be effective: • If a researcher wants to smuggle ONE specific piece of information out of the secure centre, the NSI can’t check. He/she just has to remember it! • He/she could also run specific operations to learn confidential data about a group of units • Checking all of a researcher’s output thoroughly and being sure it contains no confidentiality breach is not enough • You also have to check it against every published output made from the same data
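The point about checking against every published output can be illustrated with a differencing example. This is only a minimal sketch: the figures are invented, and the minimum-cell-count rule, the threshold of 3 and all function names are assumptions made for illustration, not rules stated in the presentation. Two tables each pass the rule on their own, yet subtracting one from the other isolates a single unit.

```python
# Hypothetical illustration: every figure and name here is invented.
MIN_CELL_COUNT = 3  # assumed threshold rule, not taken from the presentation


def passes_threshold(table, min_count=MIN_CELL_COUNT):
    """Return True if every cell covers at least min_count units."""
    return all(cell["count"] >= min_count for cell in table.values())


def difference(table_a, table_b):
    """Subtract table_b from table_a cell by cell (same keys assumed)."""
    return {
        key: {
            "count": table_a[key]["count"] - table_b[key]["count"],
            "total": table_a[key]["total"] - table_b[key]["total"],
        }
        for key in table_a
    }


def disclosed_cells(residual, min_count=MIN_CELL_COUNT):
    """List cells of the residual that describe fewer than min_count units."""
    return [key for key, cell in residual.items() if 0 < cell["count"] < min_count]


# Already published: turnover by region, all firms.
published = {
    "North": {"count": 12, "total": 5400},
    "South": {"count": 9, "total": 3100},
}
# New output: the same table restricted to smaller firms.
new_output = {
    "North": {"count": 11, "total": 3900},
    "South": {"count": 9, "total": 3100},
}

residual = difference(published, new_output)
risky = disclosed_cells(residual)
```

Here both `published` and `new_output` pass `passes_threshold`, but `residual["North"]` describes exactly one firm. Catching this requires comparing each new output with everything already released from the same data, which is what makes thorough checking so demanding.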
Output checking in remote access • OC could be very expensive: • Of course, we could have the researchers pay for OC, but is that a long-term solution? • Especially if 99% of researchers strictly follow confidentiality rules • OC is very dangerous for NSIs: • If an individual or a business happens to learn of a confidentiality breach, the NSI in charge of the OC could be blamed and trust could be lost • But we need protection against a complete download of the data: • Look at the size of the output • Check its form
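The lighter protection suggested in the last two bullets (look at the size of the output, check its form) could be sketched roughly as follows. Everything here is an assumption made for illustration: the thresholds, the identifier column names and the `OutputSummary` structure are hypothetical and do not describe any actual NSI or CASD procedure.

```python
from dataclasses import dataclass

# Hypothetical limits, chosen only for the example.
MAX_OUTPUT_BYTES = 1_000_000   # assumed size cap for an output file
MAX_ROW_RATIO = 0.05           # assumed cap on output rows / source rows
IDENTIFIER_COLUMNS = {"unit_id", "firm_id", "household_id"}  # assumed names


@dataclass
class OutputSummary:
    """Basic facts about an output file a researcher wants to export."""
    n_rows: int
    n_bytes: int
    column_names: list


def screening_flags(output, source_rows):
    """Return reasons why an output looks like a bulk extract (empty if none)."""
    flags = []
    # Size check: a very large file is unlikely to be an aggregated result.
    if output.n_bytes > MAX_OUTPUT_BYTES:
        flags.append("size")
    # Form check 1: row count close to the number of units in the source.
    if source_rows > 0 and output.n_rows / source_rows > MAX_ROW_RATIO:
        flags.append("row_count")
    # Form check 2: a unit identifier column suggests unit-level records.
    if IDENTIFIER_COLUMNS & {name.lower() for name in output.column_names}:
        flags.append("identifier_column")
    return flags


summary_table = OutputSummary(n_rows=40, n_bytes=6_000,
                              column_names=["region", "mean_income"])
raw_dump = OutputSummary(n_rows=95_000, n_bytes=30_000_000,
                         column_names=["household_id", "income"])

table_flags = screening_flags(summary_table, source_rows=100_000)
dump_flags = screening_flags(raw_dump, source_rows=100_000)
```

A screen of this kind does not try to detect disclosure cell by cell. It only flags outputs whose size or shape suggests a copy of the microdata, which is the limited goal named on the slide.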