Semantic Models for CDISC Based Standards and Metadata Management. Presented by Kerstin Forsberg, R&D, AstraZeneca, and Frederik Malfait, IMOS Consulting and Hoffmann-La Roche.
Key Message • Things converge to create new and unique opportunities. • The coverage and maturity of existing CDISC standards. • The establishment of these standards within the industry. • The use of these standards as a foundation for metadata driven systems. • The upcoming role of semantic web standards and linked data principles.
Two real-world uses of semantic web standards and linked data principles
Today’s Situation • “Not if and when, but how” to best adopt CDISC based data standards is becoming the leading question. • We see a variety of CDISC standards at different levels of maturity, not linked together and published in different formats. • Sponsors are faced with challenges on all levels: architecture, process, and application.
An Emerging Insight • The CDISC standards are all about the meaning of what is studied in the biological and clinical reality (often referred to as concepts). • How these concepts are represented as data elements from protocol to submission, and beyond. • We are dealing with semantics and metadata for biomedical and clinical research knowledge and data. • “Put semantic into the semantic”: use semantic web standards and linked data principles.
RDF Triples • Resource Description Framework (RDF): a general model of how any piece of data, and any representation of knowledge, can be expressed as so-called triples.
subject | predicate | object (or value)
Stockholm | type | place
Stockholm | capital | Sweden
Stockholm | subject | Port cities in Sweden
Stockholm | areaCode | “+46-8”
“http://en.wikipedia.org/wiki/Stockholm” | primaryTopic | Stockholm
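The triple model on this slide can be sketched in plain Python, with each statement held as a (subject, predicate, object) tuple. This is a minimal illustration only: the names `triples` and `objects_of` are hypothetical, and bare strings stand in for the full URIs and typed literals that a real RDF store would use.

```python
# Each RDF statement is a (subject, predicate, object) triple.
# Plain strings stand in for URIs and literals (an assumption for brevity).
triples = [
    ("Stockholm", "type", "place"),
    ("Stockholm", "capital", "Sweden"),
    ("Stockholm", "subject", "Port cities in Sweden"),
    ("Stockholm", "areaCode", "+46-8"),
    ("http://en.wikipedia.org/wiki/Stockholm", "primaryTopic", "Stockholm"),
]


def objects_of(subject, predicate):
    """Return every object asserted for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


print(objects_of("Stockholm", "areaCode"))
```

Because every statement has the same three-part shape, a single lookup function covers any question of the form "what is the X of Y".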
RDF Triples • Triples can be aggregated into graphs with subjects and objects as nodes, and predicates as arcs. [Figure: graph with Stockholm as the central node, linked by type to City, by capital to Sweden, by subject to Port cities in Sweden, and by areaCode to “+46-8”; “http://en.wikipedia.org/wiki/Stockholm” is linked to Stockholm by primaryTopic.]
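One way to picture the aggregation into a graph is as an adjacency map, sketched below. The helper names `build_graph` and `nodes` are hypothetical, and strings again stand in for URIs; this is an illustration of the idea rather than how any particular triple store works.

```python
from collections import defaultdict

triples = [
    ("Stockholm", "type", "City"),
    ("Stockholm", "capital", "Sweden"),
    ("Stockholm", "subject", "Port cities in Sweden"),
    ("Stockholm", "areaCode", "+46-8"),
    ("http://en.wikipedia.org/wiki/Stockholm", "primaryTopic", "Stockholm"),
]


def build_graph(triples):
    """Map each subject node to its outgoing (predicate, object) arcs."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return dict(graph)


def nodes(triples):
    """Every subject or object is a node in the graph."""
    return {s for s, _, _ in triples} | {o for _, _, o in triples}


graph = build_graph(triples)
for predicate, obj in graph["Stockholm"]:
    print("Stockholm", "--" + predicate + "-->", obj)
```

Note that Stockholm appears both as a subject and as an object, which is what joins the Wikipedia page to the rest of the graph.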
RDF Triples • Graphs of triples can be extended across different sources and for different purposes. [Figure: the Stockholm graph extended with further nodes (Country, CDISC, CDISC Interchange EU 2012, Gothenburg) connected through additional type and subject arcs.]
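Extending a graph across sources amounts to taking the union of the triples from each source; nodes with the same identifier are then shared. The sketch below assumes two small sources. The statements in `source_b`, including the Gothenburg arc, are an assumed reading of the figure, and `subjects_with` is a hypothetical helper.

```python
source_a = {
    ("Stockholm", "type", "City"),
    ("Stockholm", "capital", "Sweden"),
    ("Stockholm", "subject", "Port cities in Sweden"),
}
# Assumed statements from a second source (illustrative only).
source_b = {
    ("Sweden", "type", "Country"),
    ("Gothenburg", "subject", "Port cities in Sweden"),
}

# Merging graphs is a set union of their triples; identical
# identifiers in both sources become shared nodes.
merged = source_a | source_b


def subjects_with(predicate, obj, triples):
    """Find all subjects that have the given predicate and object."""
    return sorted(s for s, p, o in triples if p == predicate and o == obj)


print(subjects_with("subject", "Port cities in Sweden", merged))
```

Neither source alone lists both port cities; the answer only appears once the graphs are merged, which is the point of linking data across sources.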
RDF Triples • RDF Schema and the RDF-based Web Ontology Language (OWL) add a typing mechanism to classify subjects and objects into hierarchies. [Figure: the Stockholm graph topped by a class hierarchy (Thing, Place, Organization, Event, Adm.Area, City, Country, BusinessEvent) connected by subClass arcs, with type arcs linking the instances to their classes.]
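The effect of such a typing mechanism can be sketched as a walk up the subclass chain: a resource typed as City is also, by inference, a member of every superclass. The parent assignments in `SUBCLASS_OF` and the instance types in `TYPE_OF` are an assumed reading of the flattened figure, and the function names are hypothetical.

```python
# Assumed reading of the class hierarchy: child -> parent.
SUBCLASS_OF = {
    "Place": "Thing",
    "Organization": "Thing",
    "Event": "Thing",
    "Adm.Area": "Place",
    "City": "Adm.Area",
    "Country": "Adm.Area",
    "BusinessEvent": "Event",
}

# Assumed instance typing (illustrative only).
TYPE_OF = {"Stockholm": "City", "Sweden": "Country", "CDISC": "Organization"}


def superclasses(cls):
    """Walk the subclass chain upward, nearest ancestor first."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def inferred_types(resource):
    """Asserted type plus every class it inherits from."""
    asserted = TYPE_OF[resource]
    return [asserted] + superclasses(asserted)


print(inferred_types("Stockholm"))
```

A query for every Place would therefore return Stockholm and Sweden even though neither is typed as Place directly.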
RDF Triples • Google, Bing (Microsoft) and Yahoo use OWL to publish a joint vocabulary. [Figure: class hierarchy of Thing, Place, Organization, Event, Adm.Area, City, BusinessEvent, and Country, connected by subClass arcs.] Example: http://schema.org/City
RDF Triples • NCI uses OWL to publish the NCI Thesaurus (the source for CDISC’s controlled terminologies) in RDF/XML format. [Figure: the concept Hemoglobin Measurement, with the definition “A quantitative measurement of the amount of hemoglobin present in a sample.”, linked by subClass, Concept inSubset, and Has NCIHDParent arcs to Hematology Test, Laboratory Procedure, CDISC Laboratory Test Name Terminology, and CDISC Laboratory Test Terminology.] NCI Thesaurus: http://ncicb.nci.nih.gov/download/evsportal.jsp
Linked Open Data Cloud: http://lod-cloud.net/ (Richard Cyganiak and Anja Jentzsch)
Real world use • Two examples of how sponsors have started to use semantic web standards and apply linked data principles. • AstraZeneca: • Integrative Informatics (i2) program establishing the components to let a Linked Data cloud grow across AstraZeneca R&D • Roche • Implementing an internally built MDR.
AZ R&D Linked Data cloud http://research.data.astrazeneca.com/id/clinicalstudy/D5890C00003 http://research.vocab.astrazeneca.com/uDisease/DOID/2841
Roche Biomedical MDR Schema Architecture [Figure: schema architecture covering CDISC Standards, Metadata Management, and Knowledge Management; legend: Production, Partial / Future.]
Roche Biomedical MDR Content • External content • SDTM 1.2, SDTMIG 3.1.2 • NCI Thesaurus, CDISC Controlled Terminology • Integrated Data Standards, Roche and Genentech • Safety and every Roche TA, ~ 2000 data elements • Data Collection and Data Tabulation • Value level metadata • Lab measurements, Unit conversions, Questionnaires • Looking at metadata for • SDTM Conformance Checking, Biomarker (HGNC), …
Roche Biomedical MDR Information Architecture [Figure: layered information architecture. Layers: Transformation Models; Study & Project Level Metadata; Roche Global Data Standards; CDISC Data Standards (ADaM, PRM, CDASH, SDTM, Define); BRIDG, SHARE, NCI Thesaurus, Data Element Concepts, Biomedical Domain Model. Process stages: Study Design, Data Collection, Data Tabulation, Data Analysis, Regulatory Submission. Legend: Production, Partial, Future.]
Roche Biomedical MDR System Architecture [Figure: system components Content Management, Content Publishing, Metadata Repository, and Single Point of Access.]
Roche Biomedical MDR Value Proposition • Current • Integrated knowledge, metadata, and data standards management • System independent information asset • Single point of access • Future • Leverage the SOA interface to create a framework for integrated metadata driven workflow • Integrate MDR and Component Based Authoring capabilities (study design, protocol, CSR)
Key Message • We now see all of these things converge to create new and unique opportunities. • The coverage and maturity of existing CDISC standards. • The establishment of these standards within the industry at large. • The use of these standards as a foundation for metadata driven systems. • The upcoming role of semantic web standards and linked data principles.