Selection of Data Sources for Observational Comparative Effectiveness Research Prepared for: Agency for Healthcare Research and Quality (AHRQ) www.ahrq.gov
Outline of Material This presentation will: • Propose data source(s) that include data required to address primary and secondary research questions • Describe details of data source(s) selected for the study • Describe validation or other quality assessments that have been conducted on the data source that are relevant to the data elements required for the study • Describe what patient identifiers are necessary for the research purpose, how they will be protected, and the permissions/waivers required • Provide details on the data-linkage approach and the quality/accuracy of the linkage, if applicable
Introduction • The research question dictates the type of data required. • The data source must be chosen carefully to ensure it: • Can address the study question • Has a sufficient number of observations • Contains key variables including potential confounders • Allows for sufficient length of followup
Primary Data • Primary data are collected by the investigator directly from study participants to address a specific question or hypothesis. • The observational research designs that often require primary data collection are described below: • Prospective observational studies: • Subjects are selected on the basis of specific characteristics, and their progress is monitored. • The investigator does not assign the exposure of interest. • Registries: • Registries use an observational study design to collect data and do not specify treatments or require therapies intended to change patient outcomes. • Registry data can be used for public health surveillance, to determine incidence rates, to perform risk assessment, and to monitor progress.
Secondary Data (1 of 3) • Secondary data are data collected for other purposes that can be used to answer the research question. • Electronic medical record data • The method of data collection is not standardized nationally, and the intervals between visits vary from patient to patient. • Patients may receive care at different facilities not captured in the electronic medical record. • Paper-based records • The exclusion of sites without electronic records may bias results. • These records may be valuable as a source of primary data for validating data available elsewhere (e.g., administrative claims data).
Secondary Data (2 of 3) • Administrative data • Typically generated as part of the process of obtaining insurance reimbursement • Coded using International Classification of Diseases (ICD) and Current Procedural Terminology (CPT) codes • Pharmacy data • Claims submitted to insurance companies for payments, as well as pharmacy dispensing records • National Drug Code (NDC): a unique 10-digit identifier that indicates active ingredient(s), dosage and strength, route of administration, and package size and type
Secondary Data (3 of 3) • Regulatory data • The U.S. Food and Drug Administration (FDA) has a vast store of data from regulatory approval submissions. • The FDA is converting its older datasets into research-appropriate data to be stored in its clinical trial repository. • These data, however, typically come from efficacy trials rather than effectiveness trials. • Repurposed trial data or data from completed observational studies • Investigators who conduct clinical research studies funded by the Federal Government are mandated to make their data publicly available. • The available data are limited to what was collected.
Considerations for Selecting Data • Required data elements • Identify the minimum data requirements a priori • Supplement datasets by predicting absent data variables or by linking to available data • Time period and duration of followup • Essential that the data capture the time period of relevance • Inadequate length of followup often makes data unusable • Followup should take the natural history of the outcomes into account
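The a priori checks above (required data elements, relevant time period, adequate followup) can be sketched as a small screening function. This is only an illustration: the `CandidateSource` fields, the variable names, and the 730-day followup requirement are hypothetical and do not come from the presentation.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class CandidateSource:
    """Hypothetical description of a candidate data source."""
    name: str
    variables: set        # names of the data elements the source contains
    first_record: date    # earliest date covered by the source
    last_record: date     # latest date covered by the source


def screen_source(source, required_variables, study_start, min_followup_days):
    """Return the reasons a source fails screening (empty list if it passes)."""
    problems = []
    missing = sorted(set(required_variables) - source.variables)
    if missing:
        problems.append("missing variables: " + ", ".join(missing))
    if source.first_record > study_start:
        problems.append("data begin after the study period starts")
    available_days = (source.last_record - study_start).days
    if available_days < min_followup_days:
        problems.append(f"only {available_days} days of followup available")
    return problems


# Made-up example: a claims source that lacks one required confounder.
claims = CandidateSource(
    name="claims",
    variables={"age", "sex", "ndc", "icd_code"},
    first_record=date(2010, 1, 1),
    last_record=date(2012, 12, 31),
)
problems = screen_source(
    claims,
    required_variables={"age", "sex", "smoking_status"},
    study_start=date(2011, 1, 1),
    min_followup_days=730,
)
print(problems)
```

A source that fails only on missing variables may still be usable if the absent elements can be obtained by linkage, as the slide notes.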
Ensuring Quality Data (1 of 2) • Missing data • Can compromise validity of the study, depending on whether there is a pattern in the type and extent of “missingness” • Define a priori an acceptable percent of missing data for key data elements required for analysis • Use imputation techniques for missing data • Changes that may alter data availability and consistency over time • May occur as a single event or as a gradual shift over time • Evaluate points in time where the database is discontinuous (trend breaks)
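Checking key data elements against a prespecified missingness limit might look like the following minimal sketch. The records, variable names, and the 30 percent threshold are invented for illustration; the presentation does not recommend a specific limit.

```python
def percent_missing(records, variables):
    """Percent of records in which each variable is absent or None."""
    total = len(records)
    if total == 0:
        raise ValueError("no records to assess")
    return {
        var: 100.0 * sum(1 for r in records if r.get(var) is None) / total
        for var in variables
    }


def exceeds_threshold(missing_pct, max_pct):
    """Variables whose missingness exceeds the prespecified limit."""
    return sorted(v for v, pct in missing_pct.items() if pct > max_pct)


# Made-up patient records; None marks a missing value.
records = [
    {"age": 64, "sex": "F", "bmi": None},
    {"age": 71, "sex": "M", "bmi": 27.4},
    {"age": None, "sex": "F", "bmi": None},
    {"age": 58, "sex": "M", "bmi": 31.0},
]
missing_pct = percent_missing(records, ["age", "sex", "bmi"])
flagged = exceeds_threshold(missing_pct, max_pct=30.0)
print(missing_pct, flagged)
```

A count like this shows only the extent of missingness. Whether the missingness follows a pattern still has to be examined separately before choosing an imputation approach.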
Ensuring Quality Data (2 of 2) • Validity of key data definitions • Assess the general definition of key variables and their reliability and validity in the chosen database. • Commonly used data resources may have validated outcomes of health events (e.g., heart attack, mortality). • Creating best definitions of key variables often requires involvement of clinicians. • Compare study data to a “gold standard,” such as paper-based medical records if available.
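Comparing a database definition with a "gold standard" is commonly summarized with sensitivity, specificity, and positive predictive value. The sketch below assumes each patient has a true/false flag from the database definition and from chart review; the function name and the example flags are hypothetical.

```python
def validation_metrics(database_flags, gold_standard_flags):
    """Agreement of a database definition with a gold standard, per patient."""
    if len(database_flags) != len(gold_standard_flags):
        raise ValueError("both lists must describe the same patients")
    tp = fp = fn = tn = 0
    for db, gold in zip(database_flags, gold_standard_flags):
        if db and gold:
            tp += 1
        elif db and not gold:
            fp += 1
        elif not db and gold:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # None when the measure is undefined (no cases in the denominator).
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
    }


# Made-up flags for seven patients: database definition vs. chart review.
database = [True, True, False, False, True, False, False]
chart_review = [True, False, False, False, True, True, False]
metrics = validation_metrics(database, chart_review)
print(metrics)
```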
Emerging Issues and Opportunities (1 of 2) • Data outside the United States • It may be easier to obtain complete, long-term data from countries with single-payer systems that have regional or national electronic medical record systems. • Consider possible regional variations in exposure, availability, cost, and health care systems. • Point-of-care data collection and interactive voice response/other technologies • The advantages of a new and timely data stream include more detailed data that can be used to spot trends or patterns. • Data streams are often specialized.
Emerging Issues and Opportunities (2 of 2) • Data pooling and networking • Data pooling: combining data, at the level of the unit of analysis, from several sources into a single cohort • Networking: common protocols, data definitions, and programming developed for several data resources • Personal health records • Electronically stored health records initiated by the patient • Patient-reported outcomes (PROs) • Increased availability of PROs in administrative data is of value to comparative effectiveness research
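As a rough illustration of the data pooling described above, the sketch below maps each source's field names onto common data definitions and combines patient-level records into a single cohort. The source names, field names, and mappings are all hypothetical, and real pooling also has to deal with governance, de-duplication, and differences in coding.

```python
def pool_sources(sources, field_maps):
    """Combine patient-level records from several sources into one cohort.

    sources:    source name -> list of record dicts
    field_maps: source name -> {source field name: common field name}
    """
    cohort = []
    for name, records in sources.items():
        mapping = field_maps[name]
        for record in records:
            # Rename each field to the shared definition; absent fields become None.
            row = {common: record.get(local) for local, common in mapping.items()}
            row["source"] = name  # keep provenance for source-level checks
            cohort.append(row)
    return cohort


# Made-up records from two hypothetical sources with different field names.
sources = {
    "plan_a": [{"pt_age": 64, "dx": "I21"}, {"pt_age": 71, "dx": "I50"}],
    "plan_b": [{"age_years": 58, "diagnosis_code": "I21"}],
}
field_maps = {
    "plan_a": {"pt_age": "age", "dx": "diagnosis"},
    "plan_b": {"age_years": "age", "diagnosis_code": "diagnosis"},
}
cohort = pool_sources(sources, field_maps)
print(cohort)
```

In a networked approach, the same common definitions would be applied within each data resource and only the results shared, rather than moving the patient-level rows into one place.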