Primary and Secondary Use of EHR Systems: Meeting the Technical Security Needs. Filip De Meyer, 12-10-2007
Content • Custodix: Company Introduction • Concepts & Terminology • From Concept to Technical Solutions • Example: The Custodix Anonymisation Tool (“CAT”) (screen shots)
About Custodix: in a few words… • Established in 2000 as a spin-off company of the University of Ghent, Belgium • Providing privacy protection services, mainly in healthcare: Trusted Third Party services, customized privacy-enhanced data collection solutions, secure storage, privacy consultancy, … • “One stop shop” for privacy/data protection • Involved in European research since the start • Operating in Europe, Australia and Asia
Commercial & Research Activities (figure: research programs and commercial activities)
Scope of Activities (figure: countries involved, as sources of data, in Custodix-protected data flows)
Background/History of Activities • Data protection legislation examples: • Europe: European Directive 95/46/EC (accepted as one of the world’s highest privacy standards) and its member state implementations • Other: Health Insurance Portability and Accountability Act (HIPAA); Ontario Freedom of Information and Protection of Privacy Act in Canada; …
From EHR Sources to Research Use (diagram) • Sources: various EHR sources (care/diagnostic purposes), Personal Health Records (e.g. personal diaries) and other sources • Trusted Third Party: link, protect privacy • Research Data Repositories • Additionally collected data (for research purposes)
Reduce Identifying Information Content (diagram) • Risk analysis • Reduction of identifying information, from personal data to de-identified data: delete identifier, transform date, produce nym, delete data items, encrypt data items, …
Starting Point: Definition of Personal Data “'personal data' shall mean any information relating to an identified or identifiable natural person ('data subject'); an identifiable person is one who can be identified, directly or indirectly, in particular by reference to an identification number or to one or more factors specific to his physical, physiological, mental, economic, cultural or social identity.” (Directive 95/46/EC, the “DPD”)
Concept of Identification (diagram: a set of characteristics linked to a set of data subjects, labelled a to h) • A data subject is identified (within a set of data subjects) if it can be singled out among other data subjects. • Some associations between characteristics and data subjects are more persistent in time (e.g. a national security number, date of birth) than others (e.g. an e-mail address).
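The "singled out" definition above can be sketched as a simple matching test. This is a minimal illustration only: the population, the attribute names and the helper functions `matching_subjects` and `is_identified` are assumptions made for the example, not part of the presentation.

```python
from typing import Dict, List

Subject = Dict[str, str]

def matching_subjects(population: List[Subject], characteristics: Dict[str, str]) -> List[Subject]:
    """Return every data subject whose attributes match all given characteristics."""
    return [
        s for s in population
        if all(s.get(name) == value for name, value in characteristics.items())
    ]

def is_identified(population: List[Subject], characteristics: Dict[str, str]) -> bool:
    """A data subject is identified when exactly one subject in the set matches."""
    return len(matching_subjects(population, characteristics)) == 1

# Hypothetical set of data subjects (illustrative values only).
population = [
    {"label": "a", "birth_year": "1970", "postcode": "9000"},
    {"label": "b", "birth_year": "1970", "postcode": "9820"},
    {"label": "c", "birth_year": "1982", "postcode": "9820"},
]

year_only = is_identified(population, {"birth_year": "1970"})
year_and_postcode = is_identified(population, {"birth_year": "1970", "postcode": "9820"})
print(year_only, year_and_postcode)
```

In this toy population the birth year alone matches two subjects, while birth year combined with postcode singles out one of them, which is why combinations of harmless-looking characteristics matter.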
The Concept of Anonymisation (diagram: a set of characteristics linked to a data subject within the set of data subjects a to h) • Anonymisation is the process that removes the association between the identifying data set and the data subject. This can be done in two different ways: • by removing or transforming characteristics in the associated characteristics-data-set so that the association is not unique anymore and relates to more than one data subject. • by increasing the population in the data subjects set so that the association between the data set and the data subject is not unique anymore.
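Both ways of removing uniqueness can be illustrated with a small counting exercise. The sketch below is an assumption-laden example (invented records, a hypothetical `generalise_birth` transformation, and group size as the measure of uniqueness); it is not taken from the presentation.

```python
from collections import Counter

def smallest_group(records, keys):
    """Size of the smallest group of records that share the same values for `keys`."""
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    return min(counts.values())

def generalise_birth(record):
    """Transform a characteristic: keep only the year of an ISO birth date."""
    out = dict(record)
    out["birth"] = record["birth"][:4]
    return out

keys = ["birth", "sex"]

# Way 1: transform characteristics so that they relate to more than one subject.
records = [
    {"birth": "1970-03-14", "sex": "F"},
    {"birth": "1970-11-02", "sex": "F"},
    {"birth": "1982-06-30", "sex": "M"},
    {"birth": "1982-01-09", "sex": "M"},
]
before = smallest_group(records, keys)
after = smallest_group([generalise_birth(r) for r in records], keys)

# Way 2: increase the population so that the same characteristics occur more often.
site_a = [{"birth": "1970", "sex": "F"}, {"birth": "1982", "sex": "M"}]
site_b = [{"birth": "1970", "sex": "F"}, {"birth": "1982", "sex": "M"}]
alone = smallest_group(site_a, keys)
pooled = smallest_group(site_a + site_b, keys)

print(before, after, alone, pooled)
```

A smallest group of size 1 means at least one record is still unique; after generalisation or pooling, every combination of characteristics here relates to two subjects.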
Terminology: Pseudonymisation (diagram: a pseudonym linked to a set of characteristics, with the link to data subjects a to h removed) • Pseudonymisation is a particular type of anonymisation that, after removal of the association with a data subject, adds an association between a particular set of characteristics relating to a data subject and one or more pseudonyms. • The pseudonym may be unique in a domain. • In irreversible pseudonymisation, the conceptual model does not contain a method to derive the association between the data subject and the set of characteristics from the pseudonym. • Note that the terminology “pseudonymisation” and “anonymisation” is not universal.
The Conceptual vs. Real Life Model “To determine whether a person is identifiable, account should be taken of all the means likely reasonably to be used either by the controller or by any other person to identify the said person; whereas the principles of protection shall not apply to data rendered anonymous in such a way that the data subject is no longer identifiable; whereas codes of conduct within the meaning of Article 27 may be a useful instrument for providing guidance as to the ways in which data may be rendered anonymous and retained in a form in which identification of the data subject is no longer possible”. (Recital 26 of the DPD) • Refine the concept of identifiability/anonymity. • Take into account “means likely reasonably to be used” and “any other person” through re-identification risk analysis.
Levels of De-identification (ISO/IEC DTS25237) • Level 1: removal of clearly identifying data (“rules of thumb”) • Level 2: static, model based re-identification risk analysis • Level 3: continuous re-identification risk analysis of live databases Targets for de-identification can be set and liabilities better defined in risk analysis and policies.
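A static, model-based risk analysis (Level 2) can be illustrated with one simple and commonly used measure: the re-identification risk of a record taken as one divided by the number of records sharing its quasi-identifying values. The measure, the threshold and all names below are assumptions chosen for illustration; the slide does not state which model the specification or Custodix uses.

```python
from collections import Counter

def record_risks(records, quasi_identifiers):
    """Per-record risk, modelled here as 1 / size of the record's equivalence class."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    sizes = Counter(keys)
    return [1.0 / sizes[k] for k in keys]

def meets_target(records, quasi_identifiers, max_risk):
    """Check a de-identification target: no record may exceed `max_risk`."""
    return max(record_risks(records, quasi_identifiers)) <= max_risk

# Hypothetical data set (illustrative values only).
records = [
    {"birth_year": "1970", "postcode": "9000"},
    {"birth_year": "1970", "postcode": "9000"},
    {"birth_year": "1970", "postcode": "9820"},
    {"birth_year": "1970", "postcode": "9820"},
    {"birth_year": "1982", "postcode": "9820"},
]

full = meets_target(records, ["birth_year", "postcode"], max_risk=0.5)
reduced = meets_target(records, ["postcode"], max_risk=0.5)
print(full, reduced)
```

With both quasi-identifiers released, the last record is unique and the target is missed; releasing the postcode alone brings every record within the assumed threshold. A Level 3 analysis would have to repeat such a check as the live database changes.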
ISO TC215 / WG 4: ISO/IEC DTS25237 (Approved T.S.) • Health Informatics: Pseudonymisation • Result of work in ISO/TC 215/WG4 • Based on conceptual model as explained in this presentation • Lists a number of healthcare scenarios • clinical trials • clinical research • public health monitoring • patient safety reporting (adverse drug events) • Current status: Approved Technical Specification
Common Healthcare Requirements • Disease management, clinical trials, … requirements • Dynamic data collection of individual line data: longitudinal studies, processing data of individual patients • Protection of data subjects towards the data collector: data must be stored in protected form; different from disclosure control • Requires de-identified individual line data: pseudonymisation / anonymisation; no protection through aggregation, data swapping, … • A priori estimation of privacy risks and required data protection measures: privacy risk based on statistical models (cf. re-identification theory); protection of the “context” in which data is considered anonymous
Pseudonymisation • Goal: protection of identity and privacy of individuals or organizations, while allowing linkage of data associated with pseudo-IDs irrespective of the collection time (cf. longitudinal studies) and collection place (cf. multi-center studies) • Simplified: translating a given identifier into a pseudo-identifier by using secure, dynamic and (preferably ir-)reversible cryptographic techniques • Tricky part: making sure that data is truly de-identified (within a predefined context), i.e. removing “indirectly identifying” content
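The translation of an identifier into a pseudo-identifier can be sketched with a keyed hash. The example below uses HMAC-SHA256 from the Python standard library as one possible irreversible technique; the choice of HMAC, the normalisation step, the `domain` parameter and the demo key are all assumptions for illustration and do not describe the Custodix implementation.

```python
import hashlib
import hmac

def pseudonym(identifier: str, key: bytes, domain: str) -> str:
    """Derive a pseudo-identifier from an identifier with a secret key.

    Without the key the mapping cannot be recomputed, and the hash itself
    offers no method to go back from pseudonym to identifier.
    """
    # Normalise so that the same patient yields the same pseudonym regardless
    # of when or where the identifier was typed in (assumed normalisation rule).
    normalised = identifier.strip().upper()
    message = f"{domain}:{normalised}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()

# Hypothetical secret, assumed to be held only by the trusted third party.
key = b"demo-key-for-illustration-only"

visit_2005 = pseudonym("ab123456", key, domain="study-1")
visit_2007 = pseudonym(" AB123456 ", key, domain="study-1")
other_study = pseudonym("ab123456", key, domain="study-2")
print(visit_2005 == visit_2007, visit_2005 == other_study)
```

The same identifier collected at different times or sites maps to the same pseudo-ID within a domain, which is what makes longitudinal and multi-center linkage possible, while a different domain yields an unrelated pseudo-ID. As the slide notes, this covers only the direct identifier; indirectly identifying content still has to be removed separately.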
Batch Data Collection (diagram: Sources, Trusted Third Party, Data Collection Site) • Build custom solutions using standard components • Integrate security & privacy components into existing and new projects
Interactive Pseudonymisation • The “interactive pseudonymisation system”: reconciling the concept of a “central anonymous database” with “nominative access” • Privacy Protection Gateway
Web Enabled Implementation of Privacy Enhanced Storage Framework (diagram: Sources, Browser, PESF Service, Data Collection Site) • Data Protection Service (acting as reverse proxy) • Non-intrusive to the application (transparent) • Key Management Service • Secured Search Service • Provides authentication and user management to the application • API available as Flash or Java/JavaScript toolkit
Case: Combined Trust Services (diagram; vertical labels not recoverable) • Secure Communication, Anonymous Data Collection, Secured Repository • Custodix PKI, providing: authentication, addressing, directory services, account management • Direct Messaging • Custodix Policy Controlled SIX Environment • Custodix Module: encryption, anonymisation, communication • Custodix EHR Data Repository: export and access according to a strict policy • SIX: Secure Information eXchange • State-of-the-art implementation based on innovative security technology
Developing a Biomedical GRID infrastructure for sharing Clinical and Genomic expertise Core Activities • Integration … of clinical history, medical imaging and genetic data. • Knowledge Grid … distributed mining for knowledge extraction. • Clinical Trials … breast cancer & pediatric nephroblastoma
Center for Data Protection • Act as "data controller" or assist "data controllers" in the sense of the European Directive 95/46/EC on the protection of individuals with regard to the processing of personal data and on the free movement of such data; • Be a think-tank for everyone professionally involved or interested in practical data protection; • Promote the application of novel technology in the context of data protection (ePrivacy , eSecurity), and act as a dissemination point for practical solutions; • Get involved with the development and promotion of standards and certification related to privacy protection; • Provide assistance in dealing with complex data protection issues on an international level by offering access to a multidisciplinary pool of expertise.
Generate privacy protection profiles that can be run on heterogeneous data. • Create (profile) once, run many times....
CAT: Variable Mappings Editor, XML • Variable mappings (DICOM, XML, CSV, custom) • Define a privacy type per variable • Identifier • Free text • Undefined • ...
CAT: Transformation Editor • Operands • named variable (e.g. patientID) • privacy type • Flexible and detailed configuration • simple nym transformation • secure vaults (single or multiple argument) • random • replace with value • clear • make date relative • ...
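The idea of a profile that is created once and run many times can be sketched as a mapping from variable names to transformations. The code below is a hypothetical illustration of the listed transformation types (nym, clear, replace with value, make date relative); the function names, profile structure, key and record layout are assumptions and do not reflect CAT's actual profile format or API.

```python
import hashlib
import hmac
from datetime import date

def nym(value: str, key: bytes) -> str:
    """Simple nym transformation: keyed hash, truncated to 16 hex characters."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]

def clear(value: str) -> str:
    """Remove the content of a variable."""
    return ""

def replace_with(fixed: str):
    """Build a transformation that replaces any value with `fixed`."""
    return lambda value: fixed

def make_date_relative(reference: str):
    """Build a transformation that turns an ISO date into days since `reference`."""
    ref = date.fromisoformat(reference)
    return lambda value: str((date.fromisoformat(value) - ref).days)

def apply_profile(record: dict, profile: dict) -> dict:
    """Apply each transformation to its named variable; other variables pass through."""
    out = dict(record)
    for variable, transform in profile.items():
        if variable in out:
            out[variable] = transform(out[variable])
    return out

KEY = b"demo-key-for-illustration-only"  # hypothetical key

# The profile is defined once and can then be run on any number of records.
profile = {
    "patientID": lambda v: nym(v, KEY),
    "lastname": clear,
    "hospital": replace_with("SITE"),
    "visit_date": make_date_relative("2007-10-01"),
}

record = {"patientID": "12345", "lastname": "Example", "hospital": "Site A",
          "visit_date": "2007-10-12", "diagnosis": "C50"}
result = apply_profile(record, profile)
print(result)
```

Keeping the profile separate from the data is what allows the same privacy rules to be reused across heterogeneous sources once their variables have been mapped to common names.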
CAT XML Example: Result (screenshots: before and after) • “firstname” replaced by calculated nym • “last name” cleared
CAT: Key Handling • generate keys • store keys • import/export • ...
CAT: DICOM Examples (screenshots: original values, values replaced by nym, values cleared)
Custodix NV, Verlorenbroodstr. 120, B-9820 Merelbeke, Belgium • http://www.custodix.com/ or info@custodix.com • Thank you for your attention! Any questions?