Istat’s new strategy and tools for enhancing statistical utilization of the available administrative databases
Giovanna D’Angiolini, Pierina De Salvo, Andrea Passacantilli
Speaker: Andrea Passacantilli
Q2014, Vienna, 2-5 June 2014
Italian National Institute of Statistics, Vienna, 2-5 June 2014
Administrative data sources: a resource for official statistics
• The administrative data sources owned by public institutions constitute an important information asset for official statistics
• By properly exploiting these data repositories it is possible to:
  • reduce the costs of producing statistical data
  • reduce the need for surveys and therefore the burden on respondents (people, firms, other organizations)
  • improve data quality, in particular the timeliness of the data and the coverage of the populations of interest
The purpose of Istat’s strategy
DPR 7 September 2010, n. 166: new Istat regulation
ISTAT provides “for defining methods and formats to be used by the public administration for exchanging and using statistical and financial information in the web, as well as for harmonizing changes, enhancements and new design initiatives which concern the sets of forms and the information systems which are used by the public administration for collecting information which is used or may be used for statistical purposes”
The aim of the project is to exploit administrative data sources for statistical purposes, by working on and studying the administrative forms, but not only those. It refers to ALL the administrative data sources: “…which are used or MAY BE USED for statistical purposes”
CHANGING OR RE-DESIGNING AN ADMINISTRATIVE FORM: AN OCCASION FOR NEGOTIATING CHANGES IN ORDER TO MAKE THE RELATED ARCHIVES MORE USEFUL FOR STATISTICAL PURPOSES
Committee for harmonizing administrative forms: NEW ACTIVITIES
COMMITTEE FOR HARMONIZING ADMINISTRATIVE FORMS
Members are nominated by Istat and by the institutions that own administrative data archives
• launches INVESTIGATIONS ON ADMINISTRATIVE DATA ARCHIVES and their related ADMINISTRATIVE FORMS
  • An INVESTIGATION is
    • an analysis and documentation activity concerning the CONTENT and the QUALITY of the archive
    • jointly undertaken by ISTAT and the OWNER INSTITUTION
    • by means of standard methodological and information management tools
• releases RECOMMENDATIONS on INNOVATION PROJECTS concerning ADMINISTRATIVE DATA ARCHIVES and their related ADMINISTRATIVE FORMS
  • the OWNER INSTITUTION informs ISTAT about the innovation project
  • ISTAT evaluates the project and releases recommendations
Committee for harmonizing administrative forms: THE IT TOOLS
• DARCAP (Documenting ARChives of Public Administration) is a web-based information management system which:
  • supports Istat experts in producing STRUCTURED DOCUMENTATION of the CONTENT of the existing administrative data archives and of their main features (such as the owner institutions), which is collected by INVESTIGATIONS on the archive
  • supports the owner institutions in sending Istat, via web, their COMMUNICATIONS concerning INNOVATION INITIATIVES on administrative forms and archives
  • supports Istat experts in producing STRUCTURED DOCUMENTATION of the NEW CONTENT of the administrative data archives which are involved in innovation projects, and in defining ISTAT RECOMMENDATIONS
STRUCTURED DOCUMENTATION OF THE CONTENT OF THE ADMINISTRATIVE DATA SOURCES: THE ONTOLOGY
• SET (collection of observable items), two types:
  • POPULATION (subset of reference populations such as families, persons, businesses, organizations)
    EXAMPLES: Student, Degree course, Employer, Employee, Patient, Hospital, Business, Local unit
  • SET OF EVENTS which occur in time, two types:
    • INSTANT EVENT
      EXAMPLES: University enrollment, Examination, Hiring, Hospitalization, Hospital discharge
    • DURABLE EVENT
      EXAMPLES: Employment contract, Hospital stay, Current degree course
• A SET has QUANTITATIVE OR QUALITATIVE FEATURES, VARIABLES
  EXAMPLES: Sex, Age, Turnover, Type of employment contract, Date of university enrollment, Duration of hospitalization
• FEATURES get a qualitative classification item or a value taken from FEATURE CLASSIFICATIONS (sets of admissible items for a qualitative feature), DOMAINS OF VALUES
  EXAMPLES: Sex classification, Age classes, Amount of turnover
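The ontology on this slide can be read as a small data model. The following is only a minimal sketch in Python, not DARCAP's actual schema: the class names `SetKind`, `Feature` and `ObservedSet`, their fields, and the "number of days" domain are assumptions made for illustration, while the set, feature and classification names come from the slide's examples.

```python
from dataclasses import dataclass, field
from enum import Enum


class SetKind(Enum):
    """The kinds of SET named in the ontology (hypothetical encoding)."""
    POPULATION = "population"
    INSTANT_EVENT = "instant event"
    DURABLE_EVENT = "durable event"


@dataclass
class Feature:
    """A quantitative or qualitative feature (variable) of a set."""
    name: str
    quantitative: bool
    # Classification (qualitative) or domain of values (quantitative)
    # from which the feature takes its value.
    value_source: str


@dataclass
class ObservedSet:
    """A collection of observable items documented for an archive."""
    name: str
    kind: SetKind
    features: list[Feature] = field(default_factory=list)


# Examples taken from the slide; the pairing of sets and features is assumed.
student = ObservedSet(
    "Student", SetKind.POPULATION,
    [Feature("Sex", quantitative=False, value_source="Sex classification")],
)
business = ObservedSet(
    "Business", SetKind.POPULATION,
    [Feature("Turnover", quantitative=True, value_source="Amount of turnover")],
)
hospital_stay = ObservedSet(
    "Hospital stay", SetKind.DURABLE_EVENT,
    # "number of days" is a hypothetical domain of values.
    [Feature("Duration of hospitalization", quantitative=True,
             value_source="number of days")],
)

documented_sets = [student, business, hospital_stay]
populations = [s.name for s in documented_sets if s.kind is SetKind.POPULATION]
```

Keeping the kind of set explicit makes it possible to choose different indicators for populations, instant events and durable events later on.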
STRUCTURED DOCUMENTATION OF THE CONTENT OF THE ADMINISTRATIVE DATA ARCHIVES: WHICH INFORMATION AN ADMINISTRATIVE DATA ARCHIVE COLLECTS - QUALITY EVALUATION
• The content of the observed SETS evolves in time
  • Elements of reference statistical populations such as families, persons, businesses, organizations, with their features, enter into or exit from those particular POPULATIONS which are observed by the administrative archive (population coverage)
    • Example: a person enrolls in a university and becomes a student; a student gets a degree and is no longer a student
  • New instant events with their features occur and enter into SETS OF INSTANT EVENTS
  • New durable events with their features begin and enter into SETS OF DURABLE EVENTS, or finish
• The administrative data source observes such an evolution through a CONTINUOUS DATA COLLECTION activity (coverage problems)
Committee for harmonizing administrative forms: THE METHODOLOGICAL TOOLS
• To set up a framework of quality indicators, classified by quality hyperdimension and dimension, for assessing and documenting the overall quality of any administrative data archive
  • We understand quality as quality for statistical usage, but irrespective of any particular statistical usage
  • We take into account the existing work, particularly the BLUE-ETS experience
  • We work in accordance with the Statistical Network on Administrative Data (UN-ECE, OECD, Eurostat)
• To give methodological directions for using the structured documentation of the content of the archive as a guide in specifying proper indicators for any archive, as well as in interpreting the meaning of the obtained indicators
  • Which are the proper indicators for each observed SET with its related FEATURES/VARIABLES
  • For each obtained indicator, which kinds of error it measures
FRAMEWORK FOR QUALITY INDICATORS: THE STRUCTURE
Hyperdimensions, Dimensions, Quality indicators
Hyperdimensions: Source, Metadata, Data
• Source
  Dimensions: 1. Keeper 2. Relevance and utilizations 3. Privacy and security 4. Availability
  Quality indicators:
  • Regulatory reference of the archive
  • Potential utilization
  • Data release
  • ….
• Metadata
  Dimensions: 1. Clearness and standardization of the data documentation 2. Comparability of the data documentation 3. Identification code and linking variables 4. Documentation of the collection procedures and use of personal data
  Quality indicators:
  • Documentation availability of the archive’s content
  • Existing identification code, for every collective
  • ….
• Data (in progress)
  Dimensions: Coverage and identifications, Integrability, Accuracy, Time-related dimension
  Quantitative indicators CALCULATED ON DATA
Quantitative indicators and possible errors
• In order to define such quantitative indicators, we first distinguished between possible errors, on the one hand, and ways of checking for them, on the other.
• The possible errors are defined in terms of those objects that may be present in an administrative data source’s ontology…an example!
STUDENTS REGISTRY
[Diagram: the STUDENTS population observed over time t, with elements such as Student (Rossi 1/1/2012_t), Student (Verdi 1/1/2010_t) and Student (Bianchi 1/1/2012_t), illustrating the two coverage errors: wrong inclusion and wrong exclusion. Related sets of events: ENROLLMENTS and their features, e.g. Enrollment (Enrollment_i 1/1/2012); Careers and their features, e.g. Career (Career_m 1/1/2012_t); Examinations and their features, e.g. Examination (Examination_j 1/1/2013); Degrees and their features, e.g. Degree (Degree_k 1/1/2012).]
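The two coverage errors named in this example, wrong inclusion and wrong exclusion, can be expressed as set differences between the units recorded in the registry and the units that truly belong to the population at time t. The sketch below assumes that such a reference set is available (in practice it would have to be approximated, for instance by linking with another source); the function name `coverage_errors` and the identifiers S1 to S3 are hypothetical and do not reproduce the slide's cases.

```python
def coverage_errors(registry_ids, reference_ids):
    """Compare the units in a registry with a reference population.

    wrong_inclusion: recorded in the registry but not in the reference.
    wrong_exclusion: in the reference but missing from the registry.
    """
    registry = set(registry_ids)
    reference = set(reference_ids)
    return {
        "wrong_inclusion": registry - reference,
        "wrong_exclusion": reference - registry,
    }


# Hypothetical identifiers of students at a given time t.
students_in_registry = ["S1", "S2"]
students_in_reference = ["S1", "S3"]

errors = coverage_errors(students_in_registry, students_in_reference)

# Simple coverage rates that could serve as quantitative indicators.
over_coverage_rate = len(errors["wrong_inclusion"]) / len(set(students_in_registry))
under_coverage_rate = len(errors["wrong_exclusion"]) / len(set(students_in_reference))
```

The two rates are one possible way of turning the counts into indicators; the framework described in the talk may define them differently.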
Quality check methods
Quality indicators’ frame concerning the collectives’ coverage and the elements’ identification
• Searching for evident errors (duplicate identification codes)
• Linking with other data sources
• Using logical constraints
• Calculating time lags
Two other quality indicators’ frames: the possible errors on characteristics and relationships
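Three of the check methods listed above (duplicate identification codes, logical constraints, time lags) can be sketched on a toy set of enrollment records. This is an illustration under stated assumptions, not the indicators actually specified by Istat: the record layout, the field names `code`, `enrollment_date`, `degree_date` and `registered_on`, and the specific constraint are all hypothetical. Linking with other data sources is not shown here.

```python
from collections import Counter
from datetime import date


def duplicate_codes(records):
    """Return the identification codes that occur more than once."""
    counts = Counter(r["code"] for r in records)
    return sorted(code for code, n in counts.items() if n > 1)


def violates_constraint(record):
    """Assumed logical constraint: a degree cannot precede the enrollment."""
    degree = record["degree_date"]
    return degree is not None and degree < record["enrollment_date"]


def registration_lag_days(record):
    """Days between the event and its registration in the archive."""
    return (record["registered_on"] - record["enrollment_date"]).days


# Hypothetical records; every value is invented for the example.
records = [
    {"code": "A1", "enrollment_date": date(2012, 1, 1),
     "degree_date": None, "registered_on": date(2012, 1, 11)},
    {"code": "A1", "enrollment_date": date(2012, 1, 1),
     "degree_date": None, "registered_on": date(2012, 1, 11)},
    {"code": "B2", "enrollment_date": date(2012, 1, 1),
     "degree_date": date(2011, 6, 30), "registered_on": date(2012, 3, 1)},
]

duplicates = duplicate_codes(records)
inconsistent = [r["code"] for r in records if violates_constraint(r)]
lags = [registration_lag_days(r) for r in records]
```

Each function returns a raw count or list; dividing by the number of records would give a rate that is comparable across archives.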
Conclusions
• We are now ready to start, at operating speed, the investigation activity and the supervision activity on innovation projects concerning administrative data sources
• We are continuing our work of specifying indicators in the Data hyperdimension
• We plan to integrate the existing indicators, such as the BLUE-ETS indicators, into our indicators’ framework
• We aim to provide foundations for future research on building a generalized probabilistic framework for quality assessment