OASIS Electronic Trial Master File Standard Technical Committee Content Classification Layer

January 20, 2014, 9:00 – 10:00 AM PST

Presentation Transcript


  1. OASIS Electronic Trial Master File Standard Technical Committee: Content Classification Layer
     January 20, 2014, 9:00 – 10:00 AM PST

  2. Agenda

  3. Roll Call

  4. Meeting Etiquette
     • Announce your name prior to making comments or suggestions
     • Keep your phone on mute when not speaking (#6)
     • Do not put your phone on hold
        • Hang up and dial in again when finished with your other call
        • Hold = Elevator Music = very frustrated speakers and participants
     • Meetings will be recorded and posted
        • Another reason to keep your phone on mute when not speaking!
     • Use the join.me “Chat” feature for questions / comments / votes
     • We will follow Robert’s Rules of Order
     From eTMF Std TC to Participants: Hi everyone: remember to keep your phone on mute
     NOTE: This meeting is being recorded and minutes will be posted on the TC page after the meeting

  5. Outreach Subcommittee
     • Status – New Members:
        • Oracle – Joined
        • In Progress: EMC, Kaiser Permanente, Shire, Medtronic
     • Activities / Milestones

  6. Tech Discussion
     • Status
     • Timeline
        • In parallel with other Tech work from charter

  7. Content Classification System Discussion
     Classification System Components:
     • Classification Categories
        • Taxonomy, hierarchy
     • Metadata (‘Tags’)
        • Characterizes content
     • Content Model
        • Published set of classifications and metadata for a domain (e.g., eTMF)

  8. Classification Categories Component: Classification Categories Hierarchy
     • Hierarchy of categories
        • Categories, subcategories, content types
        • Defined relationships with rules: Parent-Child
     • All categories and content types are required to have unique names and machine codes
     • Each content type is associated with Metadata Properties (core and domain-specific)
     • Content items are linked to content types
     • Unique classification and term codes are based on Universal Decimal Classification (UDC) numbering, widely used in libraries worldwide. Human and machine readable; expandable without a fixed limit
     • Can be described, edited and validated using an OWL editor (such as the open source editor Protégé)
     • Supports any simple text vocabulary, including the TMF Ref Model and other vocabularies
     • W3C OWL2 and RDF/XML supported
     [Figure: Classification Categories Hierarchy (labels: Study, Digital Content)]
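
The parent-child and uniqueness rules on this slide can be sketched as a small in-memory tree. This is only an illustration, not the committee's data model: the class names `Category` and `ClassificationTree`, the dotted code strings, and the example category names are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Category:
    """One node in the hierarchy: a category, subcategory, or content type."""
    code: str
    name: str
    parent: "Category | None" = field(default=None, repr=False)
    children: list = field(default_factory=list, repr=False)


class ClassificationTree:
    """Hypothetical registry enforcing unique names and unique machine codes."""

    def __init__(self):
        self._by_code = {}
        self._names = set()

    def add(self, code, name, parent_code=None):
        # Every category and content type needs a unique code and a unique name.
        if code in self._by_code:
            raise ValueError(f"duplicate code: {code}")
        if name in self._names:
            raise ValueError(f"duplicate name: {name}")
        parent = None
        if parent_code is not None:
            if parent_code not in self._by_code:
                raise ValueError(f"unknown parent: {parent_code}")
            parent = self._by_code[parent_code]
        node = Category(code, name, parent)
        if parent is not None:
            parent.children.append(node)
        self._by_code[code] = node
        self._names.add(name)
        return node

    def path(self, code):
        """Return the names from the root category down to the given code."""
        node = self._by_code[code]
        names = []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))


# Illustrative names only; they are not taken from the eTMF content model.
tree = ClassificationTree()
tree.add("100", "Trial Management")
tree.add("100.10", "Trial Oversight", parent_code="100")
tree.add("100.10.11", "Trial Master File Plan", parent_code="100.10")
print(tree.path("100.10.11"))
```

Keeping codes and names in separate lookup sets makes the uniqueness rule cheap to check on every insert, which matters if the codes are later used as machine identifiers.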

  9. Metadata Component
     • Used to tag or index digital content items
     Metadata Classes:
     • Core: comprises four areas: File Properties, Classification, Audit Trail, Business Process
     • Domain-specific: metadata for a domain in life sciences such as eTMF, finance, legal administration, or others. Uses standards-based terms from groups like NCI
     • Org-specific: metadata that meets an organization's needs; not standards based
     • General: obtained from public standards-based vocabulary and terminology resources like Dublin Core
     Annotation Properties:
     • Metadata about classification categories and metadata: Core, Org-Specific metadata
     [Figure: Core Metadata Example – File Properties]

  10. Content Model Component
     • Contains the classification hierarchy and metadata in machine-readable format
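
As a rough illustration of what a machine-readable content model could look like, the sketch below serializes a few categories as OWL classes in RDF/XML using only the Python standard library. The base IRI `http://example.org/etmf#`, the function name `build_model`, and the sample categories are hypothetical; the actual published model may be structured quite differently.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
BASE = "http://example.org/etmf#"  # hypothetical namespace, for illustration only


def build_model(categories):
    """Serialize (code, label, parent_code) tuples as OWL classes in RDF/XML."""
    for prefix, uri in (("rdf", RDF), ("rdfs", RDFS), ("owl", OWL)):
        ET.register_namespace(prefix, uri)
    root = ET.Element(f"{{{RDF}}}RDF")
    for code, label, parent in categories:
        # Each category becomes an owl:Class identified by its classification code.
        cls = ET.SubElement(root, f"{{{OWL}}}Class", {f"{{{RDF}}}about": BASE + code})
        ET.SubElement(cls, f"{{{RDFS}}}label").text = label
        if parent is not None:
            # The parent-child rule is expressed as rdfs:subClassOf.
            ET.SubElement(
                cls, f"{{{RDFS}}}subClassOf", {f"{{{RDF}}}resource": BASE + parent}
            )
    return ET.tostring(root, encoding="unicode")


# Illustrative categories; not taken from the eTMF content model.
xml_text = build_model([
    ("100", "Trial Management", None),
    ("100.10", "Trial Oversight", "100"),
])
print(xml_text)
```

A file of this shape can be opened in an OWL editor for inspection, although a real model would also carry the metadata and annotation properties described on the previous slide.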

  11. Classification System – Term Sources
     Term Sourcing Concepts:
     • Terms adopted by standards bodies should be used first in the eTMF model
     Primary Term Sources for eTMF Classification System:
     • Internet Standards Dev Orgs: W3C, IETF, ISO, etc.
        • Required for interoperability of machine code
     • NIH NCI Thesaurus: term database for FDA, CDISC, HL7, other orgs
        • Required for interoperability of clinical / health sciences data
     Secondary Term Sources for eTMF Classification System:
     • Industry sources: widely used terms in enterprise content mgmt software, TMF RM
     *Spec, Table 6, p21

  12. Classification Categories Component: Classification Categories Hierarchy and Numbering [1]
     • Classification hierarchy and numbering are based on the UDC library numbering standard and XML naming
     • Digital dot notation: designed for human and machine readability
     • Each number is also a unique code for naming and ordering in the hierarchy
     • Primary Categories (PC): three digits. eTMF: 100-200
     • Subcategories (SC): two digits: 10-99
     • Content Types (CT): two digits: 10-99
     • Maximum number of Sub-Category divisions is 5, excluding the 3 digits for the Primary Category
     [1] Per spec section 2.1.1; 6.0
     Hierarchy Numbering/Naming Considerations:
     • Flexible, standards-based approach (W3C XML compliant naming*)
     • Ability to add multiple hierarchy divisions / levels
     • Proposed: 5 divisions = 100 × 90^5 ≈ 5.9 × 10^11 Content Types
     • Uniqueness of numbers: usable as machine code identifiers
     • Machine readable, human readable
     • No sorting issues, no need for leading zeros*, no special chars
     *Leading zeros in XML syntax are ignored: http://www.w3.org/TR/REC-xml/
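
The numbering rules on this slide lend themselves to a simple check. The sketch below assumes a textual form that the slide does not spell out: a three-digit primary category followed by up to five two-digit divisions in the range 10-99, separated by dots (for example `100.10.25`). The function names are hypothetical. It also reproduces the capacity figure: 100 primary categories times 90 possible values in each of 5 divisions.

```python
import re

# Assumed layout: PC (three digits, no leading zero), then 0 to 5 divisions,
# each a two-digit number from 10 to 99, separated by dots.
CODE_PATTERN = re.compile(r"[1-9][0-9]{2}(\.[1-9][0-9]){0,5}")


def is_valid_code(code):
    """True if the code matches the assumed dot-notation layout."""
    return CODE_PATTERN.fullmatch(code) is not None


def parent_code(code):
    """Return the code one level up, or None for a primary category."""
    head, sep, _ = code.rpartition(".")
    return head if sep else None


def capacity(primary_count=100, values_per_division=90, divisions=5):
    """Number of distinct codes at the deepest level: 100 * 90**5."""
    return primary_count * values_per_division ** divisions


print(is_valid_code("100.10.25"))  # True
print(is_valid_code("100.05"))     # False: 05 is outside 10-99
print(parent_code("100.10.25"))    # 100.10
print(capacity())                  # 590490000000, about 5.9 x 10^11
```

Because every division has a fixed width and no leading zeros, plain string comparison of two codes at the same depth gives the same order as numeric comparison, which is the "no sorting issues" property the slide mentions.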

  13. Classification Categories Component: Numbering and Naming Scheme
     Numbering
     • Primary Categories and Sub-Categories:
        • Category Code number
     • Content Type:
        • Content Type ID
     Naming
     • Primary Categories and Sub-Categories
        • Simple text-based names
        • Unique name, 64 char limit
        • Abbreviation: 16 char limit suggested
        • Compatible with W3C XML naming standards: no special characters: ( ) < > ? / % # @ !
     [Figure: Example: Classification Categories Hierarchy, Naming, Numbering]
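
A minimal sketch of the naming limits listed above, assuming the special-character list on the slide is the full set to reject (the spec may exclude more). The function name `name_problems` and the example names are hypothetical.

```python
# Characters the slide lists as not allowed in names.
FORBIDDEN_CHARS = set("()<>?/%#@!")
MAX_NAME_LENGTH = 64
MAX_ABBREVIATION_LENGTH = 16  # the slide calls this limit "suggested"


def name_problems(name, abbreviation=None, existing_names=()):
    """Return a list of rule violations; an empty list means the name passes."""
    problems = []
    if not name:
        problems.append("name is empty")
    if len(name) > MAX_NAME_LENGTH:
        problems.append(f"name exceeds {MAX_NAME_LENGTH} characters")
    bad = sorted(set(name) & FORBIDDEN_CHARS)
    if bad:
        problems.append("name contains special characters: " + " ".join(bad))
    if name in existing_names:
        problems.append("name is not unique")
    if abbreviation is not None and len(abbreviation) > MAX_ABBREVIATION_LENGTH:
        problems.append(f"abbreviation exceeds {MAX_ABBREVIATION_LENGTH} characters")
    return problems


print(name_problems("Site Management", abbreviation="SiteMgmt"))
print(name_problems("Site/Management (draft)"))
```

Returning a list of problems rather than a single boolean lets an editing tool report every violation at once.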

  14. Classification Categories Component: Modifying Classification Category Entities – General Editing Rules
     Domain Specific
     – Classifications cannot be deleted -> Reserve/Unreserve
     – Modifications allowed to some annotation properties (see spec)
     – Codes (Category Codes, CT Type ID) cannot be generated
     Organization Specific
     – Classifications can be deleted
     – Modifications allowed for classification metadata, annotations
     – Codes (Category Codes, CT Type ID) can be generated
     [Table: Classification Category, Content Type Editing Rules*]
     **Annotation metadata
     *Spec, Table 6, p21
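
The editing rules above amount to a small permission table. The sketch below encodes only the two rules the slide states outright (deletion and code generation); the enum and function names are hypothetical, and the per-property rules for annotation metadata are left to the spec.

```python
from enum import Enum


class Scope(Enum):
    DOMAIN = "domain-specific"
    ORGANIZATION = "organization-specific"


class Action(Enum):
    DELETE = "delete classification"
    GENERATE_CODE = "generate category code or content type ID"


# Only the rules stated on the slide are encoded here.
ALLOWED_ACTIONS = {
    Scope.DOMAIN: set(),
    Scope.ORGANIZATION: {Action.DELETE, Action.GENERATE_CODE},
}


def is_allowed(scope, action):
    """True if the slide's general editing rules permit the action."""
    return action in ALLOWED_ACTIONS[scope]


def explain(scope, action):
    """Short human-readable verdict, pointing to Reserve/Unreserve where relevant."""
    if is_allowed(scope, action):
        return f"{action.value}: allowed for {scope.value} classifications"
    if scope is Scope.DOMAIN and action is Action.DELETE:
        return "delete classification: not allowed; use Reserve/Unreserve instead"
    return f"{action.value}: not allowed for {scope.value} classifications"


print(explain(Scope.DOMAIN, Action.DELETE))
print(explain(Scope.ORGANIZATION, Action.GENERATE_CODE))
```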

  15. Classification Editing Tool – Free, Open Source
     Protégé (from Stanford University: http://protege.stanford.edu/)
     Protégé Editor:
     - Edit Classification Taxonomy and Metadata Terms
     - Validate Taxonomy and Term name compliance
     - Create valid RDF/XML Ontology

  16. Classification Categories - Summary
     The proposed Classification System has the following properties:
     • Based on Naming and Numbering that is W3C XML compliant
        • No special characters: ( ) & # @ / … etc.
        • No leading zeros in classification numbers
     • Based on the Universal Decimal Classification (UDC) system for content classification:
        • 100-199: eTMF Domain
        • UDC system used in 170+ countries worldwide; expandable, human and machine readable, sortable http://en.wikipedia.org/wiki/Universal_Decimal_Classification
     • Flexible and customizable for organizations, yet interoperable
        • Domain classifications: standardized; Organization-specific classifications: editable
     • Defined set of rules for editing and modifying the Taxonomy
     • Any organization can modify/edit the taxonomy using open source editors like Protégé

  17. Appendix

  18. Classification System – Core Terms
     Content Classification System – Core Terms needed for Architecture – Objectives:
     • Classification, Subclassification concept
        • Supports RDF/XML, OWL languages
        • Non-domain specific, generic terms
        • Easily understandable by anyone; conveys concept
        • Conveys hierarchy
        • No conflicts: not a reserved term in RDF/XML, OWL or other compilers / IDEs
        • First priority: source terms from standards bodies

  19. Classification System – Core Terms
     Content Classification System – Core Terms needed for Architecture
     • Classification, Subclassification term concept: Proposed Term

  20. Classification System – Core Terms
     Content Classification System – Core Terms needed for Architecture
     • Classification, Subclassification term concept: Proposed Term

  21. Classification System – Core Terms
     Content Classification System – Core Terms needed for Architecture – Objectives:
     • Content Type concept
        • Supports RDF/XML, OWL languages
        • Non-domain specific, generic terms
        • Easily understandable by anyone; conveys concept
        • No conflicts: not a reserved term in RDF/XML, OWL or other compilers / IDEs
        • First priority: source terms from standards bodies

  22. Classification System – Core Terms
     Content Classification System – Core Terms needed for Architecture
     • Content Type term concept: Proposed Term

  23. Classification System – Core Terms
     Content Classification System – Core Terms needed for Architecture
     • Content Type term concept: Proposed Term

  24. Draft Agenda: Next Meeting
     • Roll call
     • Reports
        • Outreach
     • Tech Discussion: Classification Layer: Core Metadata (Charter item 2, p.2)
     • New business