Best Practices for Curating Measures in caDSR Using UML Models and CRFs Mary Cooper, SAIC July, 2010
Overview • Measures in Population Science and Clinical Trials • Curating Questionnaires • Approaches • Perceived Stress Scale • Naming Conventions • Classifications • Classes • Attributes • Value Domains • Definitions • Other Best Practices • Summary
Measures in PopSci and CTMS • Measures in Clinical Trials • Included as part of CRFs • Quality of Life • MMQL • Smoking/Alcohol • Demographics • Behavioral and Social Science Measures • Questionnaires and Surveys • National Health Interview Survey • Health Information National Trends Survey • Subjective Numeracy Scale • Perceived Stress Scale • FACT-G
Curating Questionnaires: Approach Curation of CRFs and questionnaires follows the standard business process • Analyze the source document • Choose the appropriate curation method • Create a working spreadsheet or develop a UML model • Search caDSR for existing metadata • Identify new items for curation • Consult with an SME to vet the metadata • Register the metadata in caDSR Unique considerations apply to questionnaires
Perceived Stress Scale • Curation challenges • Capturing concepts • Respecting validity/reliability • Definitions • Likert Scales in value domains • Value meaning concepts • Data Types
Package Naming Convention (caDSR Classification Schemes) • Packages = Classification Scheme Items in caDSR • Follow Java package naming conventions: hierarchical and lowercase • Naming pattern with levels in the hierarchy separated by periods (.) • Best Practice: • gov.nih.nci.dccps.measurename_measureacronym
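The package naming pattern above can be sketched as a small helper. This is a minimal illustration, not caDSR tooling; the function names and the validation regex are assumptions.

```python
import re

# Prefix from the slide's best-practice pattern:
# gov.nih.nci.dccps.measurename_measureacronym
PACKAGE_PREFIX = "gov.nih.nci.dccps"

def measure_package_name(measure_name: str, acronym: str) -> str:
    """Build a lowercase, period-separated package name for a measure."""
    suffix = f"{measure_name}_{acronym}".lower().replace(" ", "")
    return f"{PACKAGE_PREFIX}.{suffix}"

def is_valid_package_name(name: str) -> bool:
    """Check the Java-style convention: lowercase segments joined by periods."""
    return re.fullmatch(r"[a-z][a-z0-9_]*(\.[a-z][a-z0-9_]*)*", name) is not None

print(measure_package_name("Perceived Stress Scale", "PSS"))
# gov.nih.nci.dccps.perceivedstressscale_pss
```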
Classes • Attributes • Each class must have an attribute for “id” with data type “string” • Class Naming Conventions • The class name should be the measure name (e.g., SubjectiveNumeracyScale) • If there are sub-scores or total scores, a separate class should be added to capture these, using the measure name appended with “_Score” (e.g., SubjectiveNumeracyScale_Score) • Class Definitions • Use the descriptions of the scales from GEM as the class definitions
Attributes • Attribute Naming Conventions • Capture as much detail as possible • Respect the order of words in the questions • Interpreting words considered slang is acceptable • “In the past month, how often have you felt that you were on top of things?” • controlResponsibilitiesPastMonthFrequencyScore • Use “Score” for Likert scales • Add a qualifier to describe what the score measures • Include time periods
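The naming steps above can be sketched as a helper that joins manually chosen keywords into a lowerCamelCase attribute name. The keyword selection itself (including interpreting the slang "on top of things" as "control of responsibilities") is a human curation step; the function name here is illustrative.

```python
def attribute_name(*words: str) -> str:
    """Join curator-chosen keywords into a lowerCamelCase attribute name.

    Keywords should preserve the word order of the source question.
    """
    words = [w for w in words if w]
    return words[0].lower() + "".join(w.capitalize() for w in words[1:])

# Keywords picked manually from the example question, slang interpreted,
# time period and "Score" qualifier included:
name = attribute_name("control", "responsibilities", "past", "month",
                      "frequency", "score")
print(name)
# controlResponsibilitiesPastMonthFrequencyScore
```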
Attributes: Attribute Definitions • Process best practices • Include information from the question and instructions • Use SIW to insert the definitions • Question definition template • A person’s stated perception of (question intent), using a X-point Likert scale. • Total score definition template • The (mean or sum) of all responses by an individual to the “measure name” Scale. • Sub-score definition template • The (mean or sum) of all responses by an individual to the “domain name” domain of the “measure name” Scale.
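The definition templates above can be treated as fill-in-the-blank strings. A minimal sketch, with assumed function names; the template wording comes from the slide.

```python
QUESTION_TEMPLATE = ("A person's stated perception of {intent}, "
                     "using a {points}-point Likert scale.")
TOTAL_SCORE_TEMPLATE = ("The {aggregate} of all responses by an individual "
                        "to the {measure} Scale.")
SUB_SCORE_TEMPLATE = ("The {aggregate} of all responses by an individual "
                      "to the {domain} domain of the {measure} Scale.")

def question_definition(intent: str, points: int) -> str:
    """Fill the question definition template."""
    return QUESTION_TEMPLATE.format(intent=intent, points=points)

def total_score_definition(aggregate: str, measure: str) -> str:
    """Fill the total score definition template (aggregate: 'mean' or 'sum')."""
    return TOTAL_SCORE_TEMPLATE.format(aggregate=aggregate, measure=measure)

print(question_definition("being on top of things in the past month", 5))
print(total_score_definition("sum", "Perceived Stress"))
```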
Value Domains • Scale • Definition in caDSR • “An ordered reference standard used to measure incremental changes.” • Likert scale • Alternate definition in PopSci • A set of questions that have been validated as reflecting some abstraction • Score • A numeric value associated with a respondent’s answer to a question or set of questions. • Best Practices • Attribute names end in “Score” • Local value domain names end in “Scale” • Attributes capturing data about the measure use both
Value Domains • Naming Conventions • Use concepts in attributes to describe local value domains • Represent the Likert scale as “_XPointScale”, using “Scale” as the primary representation term and “X” as the number of points in the scale • HowGood_5PointScale • Describing Likert scale value meanings • Not at all good = 1 2 3 4 5 = Extremely good • Values between the endpoints are not explicitly described • Define each value meaning in relationship to the scale • 2 = A 2 on a five-point scale from not at all good to extremely good • Data Types • Values are represented as numbers in Likert scales • Use “int” (Java int) as the data type in EA
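The value-domain conventions above can be sketched as a small data structure: integer values, endpoint labels only, and each value meaning phrased in relation to the whole scale. The class and field names are illustrative, not caDSR constructs.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LikertValueDomain:
    """Local value domain for an X-point Likert scale (endpoints labeled only)."""
    name: str        # follows the "_XPointScale" naming pattern
    points: int      # the "X": number of points in the scale
    low_label: str
    high_label: str

    def value_meaning(self, value: int) -> str:
        """Define a permissible value in relation to the whole scale."""
        if not 1 <= value <= self.points:
            raise ValueError(f"value must be between 1 and {self.points}")
        return (f"A {value} on a {self.points}-point scale from "
                f"{self.low_label} to {self.high_label}")

vd = LikertValueDomain("HowGood_5PointScale", 5,
                       "not at all good", "extremely good")
print(vd.value_meaning(2))
# A 2 on a 5-point scale from not at all good to extremely good
```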
Other Best Practices • Associations • Associations should be created between the “question” class and the “score” class using a one-to-one relationship. Name each association end using the class name. • Reference Documents • The scoring algorithm should be included in the reference document or in the derived variable definition.
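The one-to-one association between the question class and the score class might look like this in a model sketch. The class names follow the earlier SubjectiveNumeracyScale example; the association end is named after the associated class. This is an illustration of the modeling convention, not generated caDSR code.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SubjectiveNumeracyScale_Score:
    id: str          # every class carries an "id" attribute of type string
    totalScore: int

@dataclass
class SubjectiveNumeracyScale:
    id: str
    # One-to-one association end, named using the associated class name:
    subjectiveNumeracyScale_Score: Optional[SubjectiveNumeracyScale_Score] = None

score = SubjectiveNumeracyScale_Score(id="2", totalScore=28)
question = SubjectiveNumeracyScale(id="1",
                                   subjectiveNumeracyScale_Score=score)
```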
Summary • Touch points • Measures used in PopSci and Clinical Trials • UML model and manual curation methods • Shared domain expertise • CTMS process for caDSR curation • PopSci vocabulary, concepts, and definitions • Best practices for naming, defining, and describing caDSR metadata • Next Steps • Document best practices for use by caDSR users
Acknowledgements • Working Group Leads • Jessica Bondy • Mike Collins • Hua Min • Members • Sana Ahmed • Nancy Avis • Mary Cooper • Paul Courtney • Richa Gandhi • Cindy Helba • Bob Lanese • Rick Moser • Riki Ohira • Dianne Reeves
Resources/References • caBIG Training Portal • https://cabig.nci.nih.gov/training/?pid=primary.2006-07-07.4911641845&sid=secondary.2006-10-24.0611379107&status=True • https://wiki.nci.nih.gov/display/COREtraining/caCORE+Training+Wiki • caCORE SDK Programmer’s Guide v4.2.1, Chapter 10 • https://gforge.nci.nih.gov/docman/index.php?group_id=148&selected_doc_group_id=5499&language_id=1 • https://gforge.nci.nih.gov/docman/view.php/148/20910/caCore_SDK_4.2.1_ProgrammersGuide_rv1.pdf • VCDE Presentation, “What Do Forms Curators Do?” • https://gforge.nci.nih.gov/docman/index.php?group_id=357&selected_doc_group_id=5793&language_id=1 • GForge UML Model Projects • http://gforge.nci.nih.gov/frs/?group_id=64 • DRAFT Best Practices for Curating Measures • https://gforge.nci.nih.gov/docman/index.php?group_id=623&selected_doc_group_id=5855&language_id=1
Comments/Questions • Thank you!