caCORE Training Forms Based Metadata Curation Session 1.
caCORE Training Forms Based Metadata Curation Session 1 • Course Number: 1061 • Duration: 2 Hours • Intended Audience: Metadata Curators - Using Forms • Instructors: Mary Cooper (coopermj@saic.com), Brenda Maeske (maeskeb@saic.com) • NCI CBIIT Liaison: Dianne Reeves, Assoc Director, NCI CBIIT (reevesd@mail.nih.gov)
Course Details: Forms Based Metadata Curation Online Training Etiquette • Be an active learner • Ask lots of questions • Avoid the temptation to multi-task • Please keep your phone on MUTE • Please do not put your phone on HOLD - disconnect from the teleconference to take another call • Please state your name when asking a question or making a comment
Course Details: Forms Based Metadata Curation Training Glossary • Our training courses use terms and acronyms specific to the caDSR community. To help you with unfamiliar terms, a training glossary with definitions and additional information is available on our training wiki: NCI Training Wiki Glossary of Terms • If you cannot find what you are looking for and would like to request that a term be added to the glossary, please send an email to: reevesd@mail.nih.gov
Course Details: Forms Based Metadata Curation Session Outline • Learning Objectives • Lesson 1: Data Element Overview • Lesson 2: Analyzing Forms to Find Metadata • Lesson 3: Searching for Metadata • Lesson 4: Evaluating Data Elements for Reuse • Lesson 5: Designating Data Elements • Lesson 6: Alternate Names, Definitions, Question Text and Value Meanings
Course Details: Forms Based Metadata Curation Course Learning Objectives • When you complete this course, you will be able to: • Provide a Data Element Definition • Describe how Data Elements are formed • Identify the metamodel standard used to guide metadata curation • Describe how metadata is organized in the registry/repository • Discuss how to decompose a question on a Case Report Form (CRF) or Data Collection Instrument (DCI) into a Data Element • Access caDSR tools online to: • Search for Data Elements that match questions on forms • Evaluate candidates for reuse • Designate Data Elements for reuse in your Context • Curate Alternate Names, Definitions, Question Text, and Value Meanings of Administered Items to reflect your needs
Lesson 1: Data Element Overview • In this lesson, you will learn: • The meaning of a Data Element • How Data Elements are formed • The metamodel standard used to guide metadata curation • Organization of the caDSR (Cancer Data Standards Registry and Repository) • Two types of reuse
Lesson 1: Overview of the CDE Browser CDE Review • Data Element (DE) or Common Data Element (CDE) • Terms can be used interchangeably • “Common” refers to a CDE’s potential for reuse • Definition: the set of metadata attributes and descriptors used to identify, define, and represent a single data point or variable being collected • A Data Element is a “complete” piece of metadata • Metadata provides a framework or structured way to collect ALL the information about something
Lesson 1: Data Element Overview Cancer Data Standards Registry & Repository • NCI’s metadata repository is the caDSR (Cancer Data Standards Registry and Repository) • caDSR conforms to the ISO/IEC 11179 (2nd Edition) metamodel standard • Metadata in the caDSR is organized and administered by Context • Contexts are groups which create and maintain their metadata • Items contained in the caDSR are referred to as Administered Items, which include: • CDEs & their components: • Data Element Concepts (DEC) • Object Classes (OC) • Properties • Value Domains (VD) • Permissible Values / Value Meanings (concepts)
Lesson 1: Data Element Overview Data Element Construction (1) A Data Element is metadata that includes: • Conceptual entity (THING) about which data is being collected = Data Element Concept • Physical representation = Value Domain
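The construction above can be pictured as a small data model. This is only an illustrative sketch, not the caDSR schema: the class and field names are assumptions made for this example, and the sample values are taken from the education-level example used later in this course.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElementConcept:
    """The conceptual entity (THING) about which data is collected."""
    object_class: str   # e.g. "Person"
    property_name: str  # e.g. "Education Level"

    @property
    def name(self) -> str:
        # A DEC is named from its Object Class plus its Property.
        return f"{self.object_class} {self.property_name}"


@dataclass(frozen=True)
class ValueDomain:
    """The physical representation of the recorded answer."""
    name: str                 # e.g. "Person Education Level Type"
    representation_term: str  # e.g. "Type"


@dataclass(frozen=True)
class DataElement:
    """A Data Element pairs one concept with one Value Domain."""
    concept: DataElementConcept
    value_domain: ValueDomain


# Hypothetical instance built from the course's example question.
dec = DataElementConcept("Person", "Education Level")
vd = ValueDomain("Person Education Level Type", "Type")
de = DataElement(dec, vd)
print(de.concept.name, "+", de.value_domain.name)
```

A real Administered Item also carries a public ID, version, owner, workflow status and registration status, which this sketch leaves out.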
Lesson 1: Data Element Overview Data Element Construction (2) • Sample CDE Browser screen showing detailed view of a Data Element • Illustrates that a DE is more than a name and definition • Includes public ID, version, owner, workflow and registration status, and more
Lesson 1: Data Element Overview View caDSR Contexts • You can view a complete list of caDSR Contexts in the CDE Browser • Link: CDE Browser
Lesson 1: Data Element Overview Metadata Reuse vs. Designation • Reuse is a key principle of caDSR curation • Two types of reuse: • Designation: Whole Data Elements that meet criteria are tagged and reused across Contexts. This is covered in detail in Lesson 5. • Metadata Reuse: Components (DEC or VD) of Data Elements are reused freely across Contexts. • Curation Tool allows users to search for existing Data Element Concepts and Value Domains to create new Data Elements • Note: If you attempt to curate a DEC or CDE that already exists within your Context, a warning will appear
Lesson 1: Recap Now that you have completed this lesson, you should be able to: • Provide a definition for a Data Element • Describe how Data Elements are formed • Identify the metamodel standard used to guide metadata curation • Describe caDSR (Cancer Data Standards Registry and Repository) structure • List two types of reuse for metadata
Lesson 2: Analyzing Forms to Find Metadata In this lesson, you will learn how to: • Transform a question from a Case Report Form (CRF) into a complete Data Element • Identify the information (from a question on a form) that will help you create a Data Element Concept • Identify the information (from a list of responses) on a form that will help you create a Value Domain
Lesson 2: Analyzing Forms to Find Metadata Finding Metadata on Forms • Data Elements are used to describe the questions and possible answers on forms or other data collection instruments • A one-to-one relationship exists between each question/response and a single Data Element • Today we’ll look at CRF examples • CRFs are used to collect data about individuals participating in a clinical trial or research study • NCI groups assess and use existing Data Elements when creating their CRFs • NCIP Context (and also caBIG) has a Standard Group of forms • Data Elements used on these forms can be accessed using the CDE Browser
Lesson 2: Analyzing Forms to Find Metadata Example Form • We will go through the process of transforming a question on a form (or other data collection instrument) into a Data Element • Example uses a form from the fictional Mountain Lake Oncology Center
Lesson 2: Analyzing Forms to Find Metadata Transforming Questions and Answers into Data Elements On the left side of the slide, the question “Level of Education:” has been taken from a CRF or DCI. On the right side of the slide, a breakdown of a Data Element appears. The Data Element is divided into the Data Element Concept and the Value Domain. Each of these Administered Items is further broken down to show its components. Object Class and Property make up the Data Element Concept, while Representation Term Qualifier(s), a Representation Term, and Permissible Value, Meaning and Description make up the Value Domain.
Lesson 2: Analyzing Forms to Find Metadata Step 1: Identify the Object Class • First step is to analyze the question itself • What or who are we collecting information about? • This question is collecting information about a “Person” • This “who” (Person) is the OBJECT CLASS piece of a Data Element
Lesson 2: Analyzing Forms to Find Metadata Step 2: Identify the Property • What specific characteristics are we collecting about the “Person”? • The specific information we are looking for is a person’s Education Level • This “what” (Education Level) is the PROPERTY piece of a Data Element
Lesson 2: Analyzing Forms to Find Metadata Step 3: Map the Question to a Data Element Concept The example question, “Level of Education:”, appears on the left side of the slide. On the right side of the slide, the DEC has been broken down into the Object Class (Person) and the Property (Education Level).
Lesson 2: Analyzing Forms to Find Metadata Value Domain Component Review • How are you going to represent/record your data? • A Value Domain is composed of a Representation Term and Representation Term Qualifiers • Primary Representation Term: How you expect your real data to appear • There are 37 preferred (primary) Representation Terms in the caDSR. • Example: Range, Type, Number, Category, Indicator • Representation Term Qualifiers: add specific description to your Primary Representation Term • Representation Terms can have more than one Qualifier • Example: Primary Representation Term = Name (What kind of Name?) • First Name (Qualifier = First) • Last Name (Qualifier = Last) • If the Value Domain includes a pick list of choices, there will be a list of Permissible Values associated with the Value Domain
Lesson 2: Analyzing Forms to Find Metadata Primary Representation Terms NCI Wiki - Primary Representation Terms This is the list of 37 Primary Representation Terms used in curating Value Domains: Anatomic Site; Category; Code; Count; Date; Date and Time; Dose; Duration; Float; Frequency; Grade; Identifier; Ind-3; Ind3-b; Ind-4; Indicator; Integer; Interval; Measurement; Name; Number; Outcome; Range; Rate; Reason; Scale; Score; Source; Specify; State; Status; Text; Time; Type; Unit of Measure; Value; and Yes or No Response
Lesson 2: Analyzing Forms to Find Metadata Locating Preferred Representation Terms The left side of the slide depicts the Value Domain Details tab from the Create New Value Domain screen in the Curation Tool. Clicking the search button above the Primary Concept field will result in the List of Preferred Primary Rep Terms appearing on a new search screen.
Lesson 2: Analyzing Forms to Find Metadata Enumerated vs. Non-enumerated There are two types of Value Domains: • Enumerated • There is a pick list of values OR permissible values • Example: Education Level Type • Non-Enumerated • No list of values • Describes the format of how a value should be represented/recorded • Example: Specify Text (Data type: Character)
Lesson 2: Analyzing Forms to Find Metadata Enumerated Value Domains Enumerated Value Domains: • Permissible Values (PV) – actual data you are collecting; selection from the pick list (ex: 1st grade) • PV Meaning – the meaning of the permissible value or coded data (concept codes are used to create the meaning) • PV Meaning Description – definition or what you mean in your domain for the specific permissible value (ex: what does 1st grade mean?) Note: The use of coded data sets is highly recommended, especially in the case of data aggregation and reporting
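The difference between the two kinds of Value Domain can be sketched in a few lines. This is a simplified illustration rather than how the caDSR stores Value Domains: the class names, the `accepts` check and the PV Meaning strings are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class PermissibleValue:
    value: str        # what is actually recorded, e.g. "1st Grade"
    meaning: str      # PV Meaning (normally built from concept codes)
    description: str  # PV Meaning Description


@dataclass
class ValueDomain:
    name: str
    data_type: str = "Character"
    permissible_values: Dict[str, PermissibleValue] = field(default_factory=dict)

    def add(self, pv: PermissibleValue) -> None:
        self.permissible_values[pv.value] = pv

    @property
    def is_enumerated(self) -> bool:
        # Enumerated means a pick list of permissible values exists.
        return bool(self.permissible_values)

    def accepts(self, answer: str) -> bool:
        if self.is_enumerated:
            return answer in self.permissible_values
        # Non-enumerated: only the format (here, text) is constrained.
        return isinstance(answer, str)


# Enumerated example; the meaning strings are placeholders.
education = ValueDomain("Person Education Level Type")
education.add(PermissibleValue("1st Grade", "First Grade", "completed first grade"))
education.add(PermissibleValue("2nd Grade", "Second Grade", "completed second grade"))

# Non-enumerated example from the slide: Specify Text, data type Character.
specify = ValueDomain("Specify Text")

print(education.accepts("1st Grade"), education.accepts("Kindergarten"))
print(specify.is_enumerated, specify.accepts("any free text"))
```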
Lesson 2: Analyzing Forms to Find Metadata Step 4: Identify how to Represent the Answer – Value Domain How are we going to represent a person’s education level? • How is your information going to be represented/recorded? From the list, we have determined that our Primary Representation Term = Type • Definition for Type = Something distinguishable as an identifiable class based on common qualities • Do we have a pick list of values? Yes, so we have an enumerated Value Domain • 1st grade, 2nd grade = Permissible Values (determine PV Meaning & PV Meaning Description)
Lesson 2: Analyzing Forms to Find Metadata Step 5: Map the Responses to a Value Domain The example question, “Level of Education:”, appears on the left side of the slide. On the right side of the slide, the VD (Person Education Level Type) has been broken down into Representation Term Qualifier(s) (Person Education Level); Representation Term (Type); and Permissible Value, Meaning & Description (1st Grade – completed first grade; 2nd Grade – completed second grade; 3rd Grade – completed third grade…).
Lesson 2: Analyzing Forms to Find Metadata Removing Redundant Terms • Each Administered Item is fully defined on its own, so the items may repeat the same terms • In our example: • Data Element Concept: Person Education Level • Value Domain: Person Education Level Type • DE = Person Education Level Person Education Level Type • Best Practice is to remove redundant terms when naming Data Elements • Person Education Level Person Education Level Type = Person Education Level Type
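One way to picture the redundant-term rule is as a string operation on the two names. The helper below is a hypothetical sketch written for this example, not a caDSR or Curation Tool function: it assumes the redundancy is always a run of words that ends the DEC name and begins the Value Domain name.

```python
def data_element_name(dec_name: str, vd_name: str) -> str:
    """Join a DEC name and a VD name, dropping the words they share.

    Assumption for this sketch: the repeated words are the longest run
    that ends the DEC name and starts the Value Domain name.
    """
    dec_words = dec_name.split()
    vd_words = vd_name.split()

    overlap = 0
    # Try the longest possible overlap first, then shorter ones.
    for n in range(min(len(dec_words), len(vd_words)), 0, -1):
        if dec_words[-n:] == vd_words[:n]:
            overlap = n
            break

    return " ".join(dec_words + vd_words[overlap:])


print(data_element_name("Person Education Level", "Person Education Level Type"))
print(data_element_name("Person Race", "Person Race Category"))
```

When the two names share no words, the helper simply concatenates them, which mirrors the unreduced form shown on the slide.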
Lesson 2: Analyzing Forms to Find Metadata Step 6: Combine the DEC and VD to form a Data Element On the left side of the slide, the question “Level of Education:” has been taken from a CRF or DCI. On the right side of the slide, a breakdown of a Data Element (Person Education Level Type) appears. The Data Element is divided into the Data Element Concept (Person Education Level) and the Value Domain (Person Education Level Type). Each of these Administered Items is further broken down to show its components. Object Class (Person) and Property (Education Level) make up the Data Element Concept (Person Education Level), while Representation Term Qualifier(s) (Person Education Level), a Representation Term (Type), and Permissible Value, Meaning and Description (1st Grade – completed first grade, 2nd Grade – completed second grade, 3rd Grade – completed third grade…) make up the Value Domain (Person Education Level Type).
Lesson 2: Analyzing Forms to Find Metadata Applying what you have learned (1) • Here is another sample form with questions we can translate into Data Elements • This example has a question concerning the race of an individual
Lesson 2: Analyzing Forms to Find Metadata Applying what you have learned (2) • Who/What thing am I describing? [Object Class] • What is the characteristic? [Property] • What is the Primary Representation Term? • Can we use qualifiers for the Representation Term?
Lesson 2: Analyzing Forms to Find Metadata Applying what you have learned (3) • Who/What am I describing? [Object Class] • Person • What is its characteristic? [Property] • Race
Lesson 2: Analyzing Forms to Find Metadata Applying what you have learned (continued) • What is the Representation Term? • Category • Can we use qualifiers for the Representation Term? • Person Race Note: Rep Term qualifiers aren’t required but can help create the Value Domain name
Lesson 2: Analyzing Forms to Find Metadata Result of a Decomposition • Data Element Concept (Object Class, Property) is: • Person Race • Value Domain (Representation) is: • Person Race Category • Data Element: Person Race Person Race Category = Person Race Category
Lesson 2: Recap Now that you have completed this lesson, you should be able to: • Identify the information (from a question on a form) that will help you create a Data Element Concept • Identify the information (from a list of responses) on a form that will help you create a Value Domain • Transform a question from a CRF into a complete Data Element
Lesson 3: Searching for Metadata In this lesson, you will learn how to: • Search for existing metadata in the caDSR using the CDE Browser • Search for existing metadata in the caDSR using the Curation Tool • Identify the search differences between the CDE Browser and the Curation Tool
Lesson 3: Searching for Metadata caDSR Tools with Search Capabilities There are two tools you can use to search for metadata to reuse: • CDE Browser • Curation Tool
Lesson 3: Searching for Metadata Searching with the CDE Browser CDE Browser Search • Use when: • You are looking for Data Elements • You need to see all of the details for a Data Element and its components • You need to compare and evaluate Data Elements • Search Features: • Results are based on Data Elements • Advanced Search allows you to search by Data Element attributes • Tree search gives you the NCI Standards • You can download Data Elements and all of their components (Excel or XML) • Doesn’t display retired Data Elements (unless search preferences are changed)
Lesson 3: Searching for Metadata Standards What is a Standard Data Element? • A standard is an indicator of recommended usage across the community • The community has reviewed, vetted, and approved the Data Element for recommended reuse across all Contexts and caDSR users
Lesson 3: Searching for Metadata CDE Browser Search Methods There are two ways to search in the CDE Browser: • Tree Search (by Context, Classification, or Protocol Forms) • Search by Public ID, Name, or Key Word
Lesson 3: Searching for Metadata CDE Browser Text Search Example Text Search: • Search for Person Education Data Elements using wildcards (*) to expand your search results • Name (default) searches the Long Name, Preferred Question Text, Definition, and any alternate names and definitions for the search term (other option is Public ID) • 31 Search Results displayed
Lesson 3: Searching for Metadata Comparing CDEs in the CDE Browser Comparing CDEs Example • Curator searches *education* • 31 search results are returned • Select the CDEs to compare • Select the “Compare CDEs” button
Lesson 3: Searching for Metadata Successful Searching – Using the Advanced Search Example Advanced Search – look for Education Data Elements using wildcards (*) and advanced options: • Long name: *education* • Permissible Value: 1st grade • Workflow Status: RELEASED
Lesson 3: Searching for Metadata Successful Searching • Best • Search by specific Public ID • Recommended • Use a combination of search fields with wildcards (*) • Use the Advanced Search feature • If the results are not what you need, remove one criterion (e.g. permissible value)
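The advice to combine search fields and then drop one criterion can be illustrated with a toy search over a few in-memory records. Everything here is an assumption made for the sketch: the records, the field names and the `search` function are invented, and Python's `fnmatch` only approximates how the CDE Browser treats wildcards.

```python
from fnmatch import fnmatchcase

# Hypothetical records; these are not real caDSR Data Elements.
RECORDS = [
    {"long_name": "Person Education Level Type",
     "permissible_values": ["1st Grade", "2nd Grade"],
     "workflow_status": "RELEASED"},
    {"long_name": "Patient Education Material Indicator",
     "permissible_values": ["Yes", "No"],
     "workflow_status": "RELEASED"},
    {"long_name": "Person Education Level Category",
     "permissible_values": ["High School"],
     "workflow_status": "DRAFT NEW"},
]


def search(records, long_name=None, permissible_value=None, workflow_status=None):
    """Return long names of records matching every criterion supplied."""
    hits = []
    for rec in records:
        if long_name and not fnmatchcase(rec["long_name"].lower(), long_name.lower()):
            continue
        if permissible_value:
            values = [pv.lower() for pv in rec["permissible_values"]]
            if permissible_value.lower() not in values:
                continue
        if workflow_status and rec["workflow_status"] != workflow_status:
            continue
        hits.append(rec["long_name"])
    return hits


# All three criteria from the Advanced Search example.
strict = search(RECORDS, long_name="*education*",
                permissible_value="1st grade", workflow_status="RELEASED")

# Same search with the permissible value criterion removed.
relaxed = search(RECORDS, long_name="*education*", workflow_status="RELEASED")

print(strict)
print(relaxed)
```

Dropping a criterion can only keep or widen the result set, which is why it is a safe first step when a combined search returns too little.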
Lesson 3: Searching for Metadata Not So Successful Searching Things to consider if you are not getting the results you expect: • Word order counts • *Birth Date* yields different results than *Date Birth* for date of birth • Asterisk placement will also yield different results • *Adverse Event* - 1154 Results • *Adverse Event - 21 Results • Adverse Event* - 476 Results • *Adverse*Event* - 1156 Results • Adverse Event - 5 Results
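Why asterisk placement changes the result count is easier to see on a handful of names. The sketch below uses Python's `fnmatch` as a stand-in for the CDE Browser's matching, which is an assumption, and the names are invented for illustration, so the counts will not match the figures on the slide.

```python
from fnmatch import fnmatchcase

# Hypothetical long names, invented for this example.
NAMES = [
    "Adverse Event",
    "Adverse Event Start Date",
    "Serious Adverse Event",
    "Serious Adverse Event Grade",
    "Adverse Drug Event Type",
]

PATTERNS = [
    "*Adverse Event*",   # phrase anywhere in the name
    "*Adverse Event",    # name must end with the phrase
    "Adverse Event*",    # name must start with the phrase
    "*Adverse*Event*",   # both words in order, anything in between
    "Adverse Event",     # no wildcard: exact match only
]


def matches(pattern: str, name: str) -> bool:
    # Lower-case both sides so the comparison ignores case.
    return fnmatchcase(name.lower(), pattern.lower())


counts = {p: sum(matches(p, n) for n in NAMES) for p in PATTERNS}
for pattern, count in counts.items():
    print(f"{pattern!r}: {count} result(s)")
```

As on the slide, the pattern with an asterisk between the two words is the broadest, and the pattern with no asterisk is the narrowest.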
Lesson 3: Searching for Metadata Searching with the Curation Tool Curation Tool Search • Use when: • You can’t find what you’re looking for in the CDE Browser • You are searching for other metadata • Data Element Concepts • Value Domains • Object Class, Property • Your results list can be Data Elements OR other Administered Items • You are looking for “associated” Administered Items (using the “Get Associated” search feature)
Lesson 3: Searching for Metadata Search for a Specific Administered Item In the Curation Tool, you can search for most types of Administered Items (Data Element, Data Element Concept, Value Domain, Object Class, Property, etc.). Unlike in the CDE Browser, your results list in the Curation Tool contains the type of Administered Item you searched for. For example, if you search for a Value Domain, your results list contains only Value Domains.
A real life example Hi Dianne, I tried to reuse a VD 2017177 (Finding Result Status) for the Question below. I need to add “Unknown” as a PV. Would you be able to help? Thanks, Lady GaGa
Lesson 3: Searching for Metadata Review… • As a metadata curator, what are the tools you will use to search for metadata in the caDSR? • What is the best set of Data Elements for reuse? • What is one unique search capability in the Curation Tool?
Lesson 4: Evaluating Data Elements for Reuse In this lesson you will learn how to: • Identify three attributes of a well-formed data element • Evaluate data elements for reuse