Applying Linked Data Principles to Represent Patient’s Electronic Health Records at Mayo Clinic: A Case Report

Jyotishman Pathak, PhD; Richard C. Kiefer; Christopher G. Chute, MD, DrPH
Division of Biomedical Statistics and Informatics, Department of Health Sciences Research
Mayo Clinic, Rochester, MN

Background and Aims

• Patient recruitment is a major bottleneck in conducting clinical trials and research studies
    • 50% of time is spent in recruitment
    • Low participation rates (~5%)
    • Clinicians lack resources to help patients find appropriate studies; patients have difficulty locating appropriate studies
• Electronic Medical Records (EMRs) with clinical information provide new opportunities for rapid cohort identification
    • Historical, longitudinal data
    • Diagnoses, procedures, labs, drugs, etc.
• Using EMR data, however, has challenges
    • Non-standardized
    • Semantically heterogeneous
    • Largely unstructured
• Specific Aims:
    • Investigate ontology-based techniques for representing and encoding phenotype data
    • Develop a framework for ontology-based phenotype data integration and federated querying
    • Develop semantic reasoning techniques for cohort identification in cardiovascular diseases and pharmacogenomics

Semantic Web

• Ontologies provide a formal specification of how to represent objects, concepts, and the relationships among them
• Ontologies can be used for:
    • Naming “things” (annotation)
    • Modeling a domain of interest
    • Computational reasoning over data
    • Driving Natural Language Processing
    • Semantic information integration
• Ontologies in the biomedical domain:
    • Genotype: Gene Ontology
    • Diseases/Findings: SNOMED-CT, ICD
    • Laboratory Measurements: LOINC
    • Drugs: RxNorm, NDF-RT
• The Semantic Web is a Web of Data. It provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries
• Several technologies enable the Semantic Web

LCD: Linked Clinical Data

• The Mayo Clinic Life Sciences System (MCLSS) is a collection of data from operational, research, and external databases
• It offers single-point access to multiple data sources in a common format for centralized querying
    • Billing and diagnoses
    • Pathology
    • Medical procedures
    • Demographics
    • Orders and medications

From Relational Data Model to RDF Mapping to Querying via SPARQL

1. Use an ontology to describe the columns of the relational database. The raw column names could be used instead, but doing so would not promote Linked Data.
2. Follow the syntax of the relational-to-RDF mapping tool to express the relationship between the columns in the database and the terms in the ontology. Prefixes are partial URLs that serve as shorthand for the location of the mapped data.
3. Write a SPARQL query using the terms described in the mapping. Federated queries, introduced in SPARQL 1.1, allow data to be queried from multiple endpoints.

    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX sider: <http://www4.wiwiss.fu-berlin.de/sider/resource/sider/>
    PREFIX semr: <http://edison.mayo.edu/schemas/lss1p/>
    PREFIX rxnorm: <http://link.informatics.stonybrook.edu/rxnorm/>

    SELECT DISTINCT ?MCLSS_KEY
    {
      # SIDER: side effects recorded for the drug "Prandin"
      {
        SERVICE <http://www4.wiwiss.fu-berlin.de/sider/sparql> {
          SELECT ?mySideEffect ?mySideEffectLabel WHERE {
            ?x rdf:type sider:drugs ;
               rdfs:label "Prandin" ;
               sider:sideEffect ?mySideEffect .
            ?mySideEffect rdfs:label ?mySideEffectLabel .
          }
        }
      }
      # RxNorm: codes whose label matches "Prandin"
      {
        SERVICE <http://link.informatics.stonybrook.edu/sparql/> {
          SELECT DISTINCT ?rxnormCode WHERE {
            ?rxAUIUrl rxnorm:hasRXCUI ?rxCUIUrl ;
                      rdfs:label ?rxnormLabel .
            ?rxCUIUrl rxnorm:RXCUI ?rxnormCode .
            FILTER(regex(str(?rxnormLabel), "Prandin", "i")) .
          }
        }
      }
      # MCLSS: patients on that drug whose diagnosis matches a side effect
      {
        SERVICE <http://edison.mayo.edu/lss1p#> {
          SELECT DISTINCT ?MCLSS_KEY WHERE {
            ?icd9Url semr:dx_code ?icd9Code ;
                     semr:dx_abbrev_desc ?diagnosis .
            FILTER(regex(str(?diagnosis), str(?mySideEffectLabel), "i")) .
            ?patientUrl semr:whkey ?MCLSS_KEY ;
                        semr:diagnosis ?diagnosisCode ;
                        semr:concept_id ?rxnormCode .
            FILTER(regex(str(?icd9Code), str(?diagnosisCode), "i")) .
          }
        }
      }
    }

Current Technical Architecture

[Architecture diagram; labeled components: Mayo Clinic Life Sciences System, Biomedical Ontologies]

Using Virtuoso, the patient data stored in MCLSS is exposed through a SPARQL endpoint. Because the query concepts are mapped to the columns of the database tables, SPARQL queries are automatically translated into SQL statements, and the results are returned through the endpoint. Client applications send query requests such as “Find patients with diabetes who have side effects from Prandin”. Using the Linked Data API, the request is translated into a federated SPARQL query that pulls data from the SIDER, RxNorm, and MCLSS endpoints.
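
The client side of this flow can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the poster's implementation: the endpoint address `ENDPOINT` and the helper names `sparql_string_literal`, `build_request`, and `extract_column` are hypothetical, and only the SIDER portion of the query is reproduced. The sketch relies on the standard SPARQL 1.1 Protocol (the query is passed in the `query` parameter of a GET request) and the SPARQL JSON results format. It builds the request without sending it.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical address of a federating SPARQL endpoint (assumption: the
# poster does not give a public URL for its own endpoint).
ENDPOINT = "http://localhost:8890/sparql"

# SIDER portion of the federated query; the drug label is filled in per request.
QUERY_TEMPLATE = """\
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX sider: <http://www4.wiwiss.fu-berlin.de/sider/resource/sider/>
SELECT DISTINCT ?mySideEffectLabel WHERE {
  SERVICE <http://www4.wiwiss.fu-berlin.de/sider/sparql> {
    ?x rdf:type sider:drugs ;
       rdfs:label %(drug)s ;
       sider:sideEffect ?mySideEffect .
    ?mySideEffect rdfs:label ?mySideEffectLabel .
  }
}
"""


def sparql_string_literal(text):
    """Quote text as a SPARQL string literal, escaping special characters."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )
    return '"' + escaped + '"'


def build_request(endpoint, drug_name):
    """Build (but do not send) a SPARQL 1.1 Protocol GET request."""
    query = QUERY_TEMPLATE % {"drug": sparql_string_literal(drug_name)}
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def extract_column(results_json, variable):
    """Return the values bound to one variable in a SPARQL JSON result set."""
    document = json.loads(results_json)
    return [
        binding[variable]["value"]
        for binding in document["results"]["bindings"]
        if variable in binding
    ]
```

To run the query, the request would be passed to `urllib.request.urlopen` and the response body handed to `extract_column`. Whether the endpoints named on the poster are still reachable is not something this sketch assumes.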