OASIS Electronic Trial Master File Standard Technical Committee Metadata Component Layer Discussion. February 3, 2014 9:00 – 10:00 AM PST. Agenda. Roll Call. Meeting Etiquette. Announce your name prior to making comments or suggestions Keep your phone on mute when not speaking (#6)
OASIS Electronic Trial Master File Standard Technical Committee Metadata Component Layer Discussion February 3, 2014 9:00 – 10:00 AM PST
Meeting Etiquette
• Announce your name prior to making comments or suggestions
• Keep your phone on mute when not speaking (#6)
• Do not put your phone on hold
• Hang up and dial in again when finished with your other call
• Hold = Elevator Music = very frustrated speakers and participants
• Meetings will be recorded and posted
• Another reason to keep your phone on mute when not speaking!
• Use the join.me “Chat” feature for questions / comments / votes
• We will follow Robert’s Rules of Order
From eTMF Std TC to Participants: Hi everyone: remember to keep your phone on mute
NOTE: This meeting is being recorded and minutes will be posted on the TC page after the meeting
Outreach Subcommittee
• Status – New Members:
• Joined: HL7
• In Progress: EMC, Shire, Kaiser Permanente
• Activities / Milestones
Tech Discussion
• Address open issues on Content Classification component discussion
• Begin Metadata Component discussion
Metadata Component
• Metadata (‘Tags’)
• Characterizes content
• Allows users to precisely search for information, create reports, share data online
• Use of standards-based terms is critical for interoperability between systems
Metadata Component Example
• Each Content Type contains metadata that describes it:
Metadata Component
• Metadata is used to tag or index digital content items
• Two primary types of metadata:
• Data properties: Describe the content type (Study ID, Site ID, Org, etc.)
• Annotation properties: Describe attributes of content classifications and attributes of data properties
• All content types are required to have Core metadata, similar to file properties: Content Type Name, URI, Date Created, Date Modified, etc.
• Core is to include ‘Business Process Metadata’ to support business process models using BPMN 2.0 terms and digital signatures for clinical process automation
• Organizations can add their own metadata (Org specific)
• Metadata can be described, edited and validated using an OWL editor (such as the open source editor Protégé)
• Machine readable using W3C OWL 2 and RDF/XML
Diagram labels: Classification Categories Hierarchy; Metadata Component (a.k.a. Data Properties); Study; Digital Content
Metadata Classes - Summary
• Core: Always included. File Properties, Classification, Audit Trail, Business Process, Digital Signatures
• Domain-specific: Metadata for a domain in life sciences such as eTMF. In addition to the eTMF domain, future domains could be added in many areas (not this TC’s charter), for example finance, legal administration, healthcare. Domains should use standards-based terms from groups like W3C, NIH NCIt, HL7
• Org-specific: Metadata that meets an organization’s needs; not standards-based
• General: Non-core terms that are relevant to the content type. Terms are obtained from public standards-based vocabulary and terminology resources like Dublin Core, DICOM
Annotation Properties
• Metadata about classification categories and metadata
• Core, Org-specific metadata
Core Metadata Example – File Properties
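The four metadata classes and the split between data properties and annotation properties can be sketched as a small data structure. This is only an illustration under stated assumptions: the class names, field names, and the "source" annotation key are hypothetical and do not come from the specification.

```python
from dataclasses import dataclass, field
from enum import Enum


class MetadataClass(Enum):
    """The four metadata classes named on the summary slide."""
    CORE = "core"
    DOMAIN = "domain-specific"
    ORG_SPECIFIC = "org-specific"
    GENERAL = "general"


@dataclass
class DataProperty:
    """A metadata term that tags a content item (for example, Site ID)."""
    name: str
    metadata_class: MetadataClass
    # Annotation properties describe the term itself. The keys used here
    # are hypothetical, not taken from the specification.
    annotations: dict = field(default_factory=dict)


@dataclass
class ContentItem:
    """One piece of digital content plus its metadata values."""
    content_type: str
    values: dict = field(default_factory=dict)  # property name -> value

    def tag(self, prop: DataProperty, value: str) -> None:
        """Attach a value for the given data property to this item."""
        self.values[prop.name] = value


site_id = DataProperty("Site ID", MetadataClass.DOMAIN, {"source": "hypothetical"})
item = ContentItem("Example Content Type")
item.tag(site_id, "001")
```

Keeping annotations on the property, not on the content item, mirrors the slide's distinction: data properties describe content, while annotation properties describe the terms and categories themselves.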
Metadata – Term Sources
Term Sourcing Concepts:
• Terms adopted by standards bodies should be used first in the eTMF model
Primary Term Sources for eTMF Metadata:
• Internet Standards Dev Orgs: W3C, IETF, ISO, etc.
• Required for interoperability of machine code
• NIH NCI Thesaurus: Term database for FDA, CDISC, HL7, other orgs
• Required for interoperability of clinical / health sciences data
Secondary, Tertiary Term Sources for eTMF Metadata:
• Medical & Published Standards metadata: DICOM (med imaging); Dublin Core
• Industry sources: widely used terms in enterprise content mgmt software, TMF RM
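One way to read the sourcing order is as a priority lookup: when the same concept is available from several sources, prefer the highest tier. The sketch below assumes that published medical standards are tier 2 and industry sources tier 3; the slide groups these as "Secondary, Tertiary" without saying which is which, so that split is an assumption, as are the function and table names.

```python
# Lower number = preferred source. The tier 2 / tier 3 split is an assumption.
SOURCE_PRIORITY = {
    "SDO": 1,          # Internet standards bodies: W3C, IETF, ISO, etc.
    "NCIt": 1,         # NIH NCI Thesaurus
    "DICOM": 2,
    "Dublin Core": 2,
    "Industry": 3,     # widely used ECM software terms, TMF RM
}


def pick_term(candidates):
    """Return the (term, source) pair from the most preferred known source.

    `candidates` is a list of (term, source) tuples. Pairs whose source is
    not in SOURCE_PRIORITY are ignored; None is returned if none remain.
    """
    known = [c for c in candidates if c[1] in SOURCE_PRIORITY]
    if not known:
        return None
    return min(known, key=lambda c: SOURCE_PRIORITY[c[1]])
```

For example, `pick_term([("title", "Industry"), ("Title", "Dublin Core")])` selects the Dublin Core candidate.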
Metadata Example: eTMF Domain
http://purl.bioontology.org/ontology/CareLex/
• Example: eTMF Content Model
• Published at National Center for Biomedical Ontology (NIH funded)
• Each Content Type has core metadata (Data Properties)
• Each Content Type has eTMF domain metadata (Data Properties)
• All Content Types, Categories have Annotation Properties
Diagram labels: Content Type; Metadata (Data Properties); Annotation Properties
Metadata Term (Data Properties) Modification Rules*
1. Core Metadata Terms: Cannot be modified. Domain-specific (e.g., eTMF), General, and Organization-specific metadata terms: Can be added to content models:
• Domain specific: Terms sourced from NCI Thesaurus or a standards development organization (SDO)
• Org specific: When possible, new org-specific terms should be sourced from NCI Thesaurus or another SDO for interoperability; helpful but not required
• General: Sourced from Dublin Core, DICOM, SDO
2. To ensure interoperability, the unique code value assigned to each metadata term cannot be modified.
3. Core and Business Process Metadata Properties can be reserved/unreserved. Other types of Metadata Properties can be deleted.
4. Only certain Annotation Properties' values can be modified for different types of Metadata Properties. See CareLex section 2.1.2 for details.
• Metadata Terms (data properties) can be modified and edited using the open source Protégé editor in OWL format and saved as RDF/XML
*Per CareLex Spec section 2.1.2
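Rules 1 to 3 above can be sketched as guards on a small in-memory model. This is an illustration only: the class names, the string labels for metadata classes, and the example code values are hypothetical, and "reserved" is modeled as a plain boolean flag because the slide does not define it further.

```python
from dataclasses import dataclass, replace

# Classes whose terms may only be reserved/unreserved, never deleted (rule 3).
PROTECTED = {"core", "business-process"}


@dataclass(frozen=True)  # frozen: the code value cannot be reassigned (rule 2)
class Term:
    code: str             # unique code value
    name: str
    metadata_class: str   # "core", "business-process", "domain", "general", "org"
    reserved: bool = False


class ContentModel:
    def __init__(self, protected_terms):
        # Core and business process terms are supplied up front.
        self._terms = {t.code: t for t in protected_terms}

    def get(self, code):
        return self._terms[code]

    def add(self, term):
        """Rule 1: only domain, general, and org-specific terms can be added."""
        if term.metadata_class in PROTECTED:
            raise PermissionError("core terms are fixed by the model")
        if term.code in self._terms:
            raise ValueError("code value already in use")
        self._terms[term.code] = term

    def delete(self, code):
        """Rule 3: protected terms cannot be deleted."""
        if self._terms[code].metadata_class in PROTECTED:
            raise PermissionError("term can only be reserved/unreserved")
        del self._terms[code]

    def set_reserved(self, code, reserved):
        """Rule 3: protected terms can be reserved or unreserved."""
        term = self._terms[code]
        if term.metadata_class not in PROTECTED:
            raise PermissionError("only core/business process terms are reservable")
        self._terms[code] = replace(term, reserved=reserved)
```

A short usage example: `ContentModel([Term("C001", "Content Type Name", "core")])` creates a model where `delete("C001")` raises `PermissionError`, while `set_reserved("C001", True)` succeeds.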
Metadata Term (Annotation Properties) Modification Rules*
Rules to Modify Annotation Properties:
1. Core Annotation properties can neither be deleted nor reserved. However, Organization-specific annotation properties can be deleted.
2. Only certain Annotation properties' values can be modified for different types of metadata properties (see CareLex section 8.2 for further details)
• Annotation properties can be modified and edited using the open source Protégé editor in OWL format and saved as RDF/XML
*Per CareLex Spec section 2.1.2
Metadata Editing Tool – Free, Open Source Protégé (from Stanford University: http://protege.stanford.edu/)
Protégé Editor:
- Edits metadata:
- Annotation Properties
- Data Properties
- Validates metadata relationships and W3C term name compliance
- Creates valid machine readable RDF/XML ontology
*Spec, Table 6, p21
Core Metadata Terms
Note: Core metadata terms should be included for each content item. Terms with required data values are marked *
*For additional info, see Spec, Appendix 8
Core Metadata Terms, Continued *For additional info, see Spec, Appendix 8
eTMF Domain Metadata Terms
Note: Study ID and Country metadata terms should be included for each content item in the eTMF Domain and are marked *
All other terms are optional and are assigned to content types based on the published domain content model. For example, ‘Site ID’ is assigned to content types within the ‘Site Management’ category. See the published eTMF content model for details.
Additional eTMF Domain Metadata terms may be added as needed in ‘Phase 2’ of the eTMF TC project
*For additional info, see Spec, Appendix 8
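The "required for each content item" note can be checked mechanically. A minimal sketch, assuming metadata is held as a plain dictionary keyed by term name; the function name and that representation are hypothetical.

```python
# The two eTMF domain terms the slide marks as required (*).
REQUIRED_DOMAIN_TERMS = ("Study ID", "Country")


def missing_required(item_metadata, required=REQUIRED_DOMAIN_TERMS):
    """Return the required term names that are absent or blank."""
    missing = []
    for term in required:
        value = item_metadata.get(term)
        if value is None or not str(value).strip():
            missing.append(term)
    return missing
```

An item tagged only with a Study ID would be reported as missing `Country`; optional terms such as Site ID are never reported.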
General Metadata
Note: General Metadata is not required. When used, it is obtained from published standards sources such as Dublin Core, DICOM, and other standards organizations
Metadata Component - Summary
Proposed Metadata Component has the following properties:
• Based on metadata terms from published, standards-based databases:
• W3C, NIH NCI Thesaurus, HL7, Dublin Core, DICOM
• W3C XML compliant
• No special characters: ( ) & # @ / … etc. per W3C rules
• Flexible and customizable for organizations, yet interoperable
• Core metadata: allows interoperable data exchange between orgs for a domain
• Core: Always included with domain; metadata attributes not modifiable, for interoperability
• Org-specific metadata: allows use of custom metadata for a specific study, instance
• Defined set of rules for exchanging, adding, modifying non-core metadata
• Any organization can modify/edit org-specific metadata using open source editors like Protégé
• Additional eTMF domain metadata may be added in Phase 2 of eTMF TC if required
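The "no special characters" rule can be illustrated with a simple name check. The sketch below uses a deliberately conservative ASCII-only pattern (a letter or underscore, followed by letters, digits, underscore, dot, or hyphen). That is a subset of what XML names allow, not the full W3C rule, and both function names are hypothetical.

```python
import re

# ASCII-only subset of XML name rules: must not start with a digit, dot, or
# hyphen, and contains no spaces or characters such as ( ) & # @ /.
_SAFE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def is_safe_term_name(name: str) -> bool:
    """True if the whole name matches the conservative ASCII pattern."""
    return _SAFE_NAME.fullmatch(name) is not None


def to_term_name(label: str) -> str:
    """Turn a display label into a safe name by replacing disallowed runs."""
    name = re.sub(r"[^A-Za-z0-9_.\-]+", "_", label).strip("_")
    # Prefix an underscore when the result is empty or starts badly.
    if not name or not re.match(r"[A-Za-z_]", name):
        name = "_" + name
    return name
```

With this sketch a display label such as `Site ID` would be stored as `Site_ID`, while the human-readable label could still be carried separately (for example, as an annotation).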
Content Model – OWL RDF/XML Format
• Content Model:
• Comprised of Classification Categories and Metadata: a ‘filing plan’
• Represented as RDF/XML machine readable code
• Created for a specific domain instance (e.g., Study)
• Content models can be created and shared with anyone, online or offline
• Easily editable in Protégé
• See specification section 8.3.1 for additional details on RDF/XML file format and content model interoperability
Diagram labels:
• Content Model: Classification Categories + Metadata in RDF/XML. For conceptual model design, editing, exchange
• Data Model*: Content Model instance and instance data in XML packet (Content Model Instance; Data Values; Content Resources: PDFs, media (URI))
*Data model: for data exchange (future TC meetings)
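To make "classification categories plus metadata in RDF/XML" concrete, here is a minimal sketch that writes a tiny content model with Python's standard library. It assumes categories map to `owl:Class` and data properties to `owl:DatatypeProperty`, which is a common OWL pattern but not confirmed here as the specification's exact mapping. The `http://example.org/etmf-demo#` namespace and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
BASE = "http://example.org/etmf-demo#"  # hypothetical namespace

# Register prefixes so the output uses rdf:, rdfs:, owl: instead of ns0:, ns1:.
for prefix, uri in (("rdf", RDF), ("rdfs", RDFS), ("owl", OWL)):
    ET.register_namespace(prefix, uri)


def build_content_model(categories, data_properties):
    """Serialize (name, label) pairs as a small RDF/XML document string."""
    root = ET.Element(f"{{{RDF}}}RDF")
    ET.SubElement(root, f"{{{OWL}}}Ontology", {f"{{{RDF}}}about": BASE.rstrip("#")})
    for name, label in categories:
        cls = ET.SubElement(root, f"{{{OWL}}}Class", {f"{{{RDF}}}about": BASE + name})
        ET.SubElement(cls, f"{{{RDFS}}}label").text = label
    for name, label in data_properties:
        prop = ET.SubElement(
            root, f"{{{OWL}}}DatatypeProperty", {f"{{{RDF}}}about": BASE + name}
        )
        ET.SubElement(prop, f"{{{RDFS}}}label").text = label
    return ET.tostring(root, encoding="unicode")


xml_text = build_content_model(
    categories=[("Site_Management", "Site Management")],
    data_properties=[("Site_ID", "Site ID")],
)
```

A file written this way is plain XML, so it can be exchanged like any other document; whether a given editor accepts it depends on the full ontology being valid, which this sketch does not check.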
Content Classification System Discussion Summary
• Classification Categories Component: Naming, Numbering
• Metadata Component: Interoperable metadata
• Content Model: Machine readable, standard format for exchange of models
*For additional info, see Spec details on Content Classification System
Draft Agenda: Next Meeting
• Roll call
• Reports
• Outreach
• Tech Discussion: Electronic/Digital Signatures, eTMF Data Model
• New business