The Development of a Health Numeracy Measure NCI R01, September 2007 to August 2010 Marilyn M. Schapira, Kathlyn Fletcher, Mary Ann Gilligan, Prakash Laud, Toni King, Cynthia Walker: UWM Elizabeth Jacobs: Rush Medical Center Pam Ganschow: Rush Medical Center
Outline • Background and Specific Aims of Grant • Overview of Methods • Discussion of Item Response Theory
Background • The Institute of Medicine model of health literacy includes 4 constructs • Cultural and conceptual knowledge • Oral literacy (listening and speaking) • Print literacy (writing and reading) • Numeracy • Existing measures focus on print literacy • Numeracy measures do not assess all domains of numeracy in our empirically derived framework
Overall Purpose • A comprehensive and valid numeracy measure will advance the science of communication and decision making • Provide a tool to describe the level and components of numeracy present in a given individual or population • Guide the development of tailored communication and patient education materials • Support the development of interventions to improve health numeracy skills
Why Numeracy is Important in Health Care • Increased comprehension of risk information • Increased accuracy of numeric risk perceptions • Understanding probabilities of therapeutic benefit and risk in the context of informed consent, including enrollment in clinical trials • Informed and participatory decision making; use of decision aids • Effective use of chronic disease management strategies, including understanding concepts of efficacy in managing conditions such as diabetes mellitus (DM) and congestive heart failure (CHF)
Emotion/Affect Component to Communication with Numbers • Worry • Anxiety • Fear • Suspicion • Annoyance • Confusion • Hope • Relief • Trust • Happiness • Control
Specific Aim #1 • To develop a measure of health numeracy named the Numeracy Understanding in Medicine Instrument (NUMi) • Based on an empirically derived framework of health numeracy • Cross-culturally equivalent across racial (black and white) and ethnic (Hispanic and non-Hispanic) groups • Developed using Item Response Theory
Specific Aim #2 • To establish the reliability and validity of the NUMi • Internal reliability and parallel-forms reliability • Content validity: expert panel review of constructs, corresponding skills, and items generated • Construct validity: association of NUMi scores with levels of education and existing literacy measures • Criterion validity: association with the Medical Data Interpretation Test, the adoption of health behaviors, and perceived health
Methods • Phase A: Generation of NUMi Items • Establish the content validity of the framework and generate items to measure health numeracy • Phase B: Testing of NUMi Items • Establish the psychometric properties and refine pool of items • Phase C: Validation of NUMi • Establish the reliability, construct and criterion validity of the NUMi in a cross-cultural sample
Phase A: Generate Items • Conduct 6 focus groups in a Hispanic population • Community Recruitment (Milwaukee and Chicago) • Mexican American • 2 in Spanish: No high school or GED degree • 2 in English: High school graduates • 2 in English: Some college or college degree • Goals of focus groups: • Ensure that framework of health numeracy has conceptual equivalence among the Hispanic/Latino segment of our target population • Evaluate the interpretation of terms to be used in discussions of health numeracy
Development of Test Specification Table • Identify elements of each domain to test and the number of items to test in each element. • Will guide the development of the instrument and ensure that the final NUMi will include representative samples of items from each domain. • The TST will be reviewed by the Expert Panel to support content validity of the measure.
Expert Panel • Michael Farrell, MD: Communication • Elizabeth Hayes, PhD, UW Madison: Adult Education • Timothy Johnson, PhD: Survey Research, Cross-Cultural Research • Michael Paasch-Orlow, MD: Health Literacy • Ramona Rodriquez, MD: Bilingual Skills, Clinician, Hispanic Health
Item Development • 60 items to be developed by the research team • Assembled into two parallel forms of 30 items each • Expert panel will be convened • Panel will review each item and rate the following: • How well the item reflects the skill in the TST • Item clarity and appropriateness of the response scale • Revisions made based upon Expert Panel feedback • 48 cognitive interviews to be conducted and modifications made based upon feedback
Psychometric Testing Protocol • Recruitment of 1000 persons • Mix of White, Hispanic/Latino, Black • Community and clinical settings • Administer test items to groups of participants • Test psychometric properties using IRT • Validation • Existing literacy and achievements tests • Medical Data Interpretation Test • Patient Outcomes: Health protective behaviors and perceived health • Evaluate bias across gender, race/ethnicity groups
Justification for IRT • Limitations of Classical Test Theory • The use of item indices (e.g., item difficulty) whose values depend on the particular group of examinees with which they are obtained • Examinee ability estimates that depend on the particular choice of items selected for a test
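The sample-dependence problem can be illustrated with a toy calculation (hypothetical response vectors, not study data): the classical difficulty index, the proportion answering correctly, changes when the same item is given to groups of different ability.

```python
# Classical Test Theory difficulty index (p-value) for the SAME item,
# computed in two hypothetical examinee groups (1 = correct, 0 = incorrect).
high_ability_group = [1, 1, 1, 0, 1, 1, 1, 1, 0, 1]
low_ability_group = [0, 1, 0, 0, 1, 0, 0, 1, 0, 0]

p_high = sum(high_ability_group) / len(high_ability_group)
p_low = sum(low_ability_group) / len(low_ability_group)

# The item looks "easy" in one sample and "hard" in the other,
# even though the item itself has not changed.
print(p_high, p_low)  # 0.8 vs 0.3
```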
Item Response Theory Postulates • The performance of an examinee on a test item can be predicted by his or her level of the latent trait, or ability • The relationship between item performance and ability, or trait, can be described by a monotonically increasing function called an item characteristic curve (ICC) • This function specifies that as the level of the trait increases, the probability of a correct response increases.
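The ICC postulate can be written down directly; a minimal Python sketch of a two-parameter logistic ICC (parameter values are illustrative, not NUMi items):

```python
import math

def icc_2pl(theta, a, b):
    """Two-parameter logistic ICC: probability of a correct response
    given ability theta, discrimination a, and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Monotonically increasing: as ability rises, P(correct) rises.
probs = [icc_2pl(t, a=1.2, b=0.0) for t in (-2, -1, 0, 1, 2)]
assert all(p1 < p2 for p1, p2 in zip(probs, probs[1:]))

# At theta == b the probability of a correct response is exactly 0.5.
assert abs(icc_2pl(0.0, a=1.2, b=0.0) - 0.5) < 1e-9
```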
Property of Invariance • Parameters that characterize an item do not depend on the ability distribution of the examinees • Parameters that characterize an examinee do not depend on the set of test items • When the IRT model fits the data, the same ICC curve is obtained for the test item regardless of the distribution of ability in the group of examinees used to estimate the item parameters
Assumptions • Unidimensionality • Only one ability is measured by the items that make up the test. • Local independence • Responses to any pair of items are statistically independent; the set of abilities measured makes up the complete “latent space”.
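Local independence means the probability of a whole response pattern factors into a product of per-item probabilities once ability is conditioned on; a small sketch using hypothetical 2PL items:

```python
import math

def icc(theta, a, b):
    """2PL item characteristic curve."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def pattern_prob(theta, items, pattern):
    """Under local independence, P(pattern | theta) is the product of
    the individual item probabilities."""
    prob = 1.0
    for (a, b), u in zip(items, pattern):
        p = icc(theta, a, b)
        prob *= p if u else 1.0 - p
    return prob

items = [(1.0, -0.5), (1.3, 0.4)]
p = pattern_prob(0.0, items, (1, 0))  # right on item 1, wrong on item 2
assert abs(p - icc(0.0, 1.0, -0.5) * (1.0 - icc(0.0, 1.3, 0.4))) < 1e-12
```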
Models of IRT • One-parameter model: difficulty (beta) • Two-parameter model: difficulty (beta) and discrimination (alpha) • Three-parameter model: difficulty (beta), discrimination (alpha), and a guessing parameter (c)
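The three-parameter model adds a lower asymptote for guessing; a minimal sketch (illustrative values):

```python
import math

def icc_3pl(theta, a, b, c):
    """Three-parameter logistic model: the guessing parameter c sets
    the lower asymptote of the item characteristic curve."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# With c = 0.25 (e.g., a four-option multiple-choice item), even a very
# low-ability examinee keeps roughly a 25% chance of answering correctly.
low = icc_3pl(-4.0, a=1.5, b=0.0, c=0.25)
assert 0.25 < low < 0.30
```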
Differential Item Functioning • A way to see if there is bias in the items between various groups (gender, race/ethnicity) that have equal amounts of the trait being measured. • Definition: an item demonstrates DIF if individuals having the same ability, but from different groups, do not have the same probability of getting the item right.
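One standard screen for DIF is the Mantel-Haenszel procedure: match examinees on total score, then compare the odds of a correct response between the reference and focal groups within each score stratum. A minimal sketch with hypothetical counts (not study data):

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel common odds ratio across score strata.
    Each stratum is (ref_right, ref_wrong, focal_right, focal_wrong)."""
    num = den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    return num / den

# Hypothetical counts at three matched total-score levels.
strata = [(30, 20, 28, 22), (40, 10, 38, 12), (45, 5, 44, 6)]
or_mh = mantel_haenszel_or(strata)

# An odds ratio near 1 suggests no DIF: examinees with the same ability
# have similar odds of a correct answer regardless of group membership.
assert 0.5 < or_mh < 2.0
```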
Item Information Function • The contribution of test items to ability estimation at points along the ability continuum. • In general, an item is most informative at ability levels near its difficulty, beta. • Useful in developing tests if the fit of the data to the IRT model is good.
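For the 2PL model the item information function has a closed form, I(theta) = a^2 * P * (1 - P), which peaks where theta equals the item's difficulty b; a small sketch (illustrative values):

```python
import math

def info_2pl(theta, a, b):
    """Fisher information of a 2PL item: I(theta) = a^2 * P * (1 - P)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

# Information is highest at theta == b: the item contributes most to
# ability estimation for examinees near its own difficulty level.
b = 0.5
peak = info_2pl(b, a=1.5, b=b)
assert peak > info_2pl(b - 1.0, a=1.5, b=b)
assert peak > info_2pl(b + 1.0, a=1.5, b=b)
```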
Test Construction • IRT offers a more powerful method of item selection than classical test theory • Items that are most useful (i.e., discriminating) in certain regions of the ability continuum can be selected
Procedure for Test Construction • Develop a test specification table framework • Develop an item bank of potential items
Test Construction (continued) • Items are chosen that can discriminate best at varying degrees of difficulty • Decide on the shape of the desired test information function (broad ability test vs. criterion-referenced test) • Select items that will provide the desired discrimination at various levels of ability • Continue to select items until the desired test information function is obtained
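The selection loop above can be sketched as a greedy search: keep adding the bank item that most closes the gap to the target test information function. A toy Python version using 2PL items (hypothetical bank and target curve, not the NUMi specification):

```python
import math

def info_2pl(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def select_items(bank, theta_grid, target, max_items):
    """Greedily add the (a, b) item that most reduces the shortfall
    between the running test information and the target curve."""
    chosen, test_info = [], [0.0] * len(theta_grid)
    for _ in range(max_items):
        def shortfall_after(item):
            return sum(max(t - (ti + info_2pl(th, *item)), 0.0)
                       for th, ti, t in zip(theta_grid, test_info, target))
        remaining = [i for i in bank if i not in chosen]
        best = min(remaining, key=shortfall_after)
        chosen.append(best)
        test_info = [ti + info_2pl(th, *best)
                     for th, ti in zip(theta_grid, test_info)]
        if all(ti >= t for ti, t in zip(test_info, target)):
            break  # desired test information function obtained
    return chosen

bank = [(1.0, -1.5), (1.4, 0.0), (0.8, 1.0), (1.6, 0.5), (1.1, -0.5)]
grid = [-1.0, 0.0, 1.0]  # ability points for a broad-ability target
picked = select_items(bank, grid, target=[0.3, 0.3, 0.3], max_items=5)
assert len(picked) >= 1
```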
Computerized Adaptive Testing • A test provides the most precise measurement when the difficulty of the test is matched to the ability level of the examinee • The ideal testing situation is to give every examinee a test that is adapted to ability level • In computerized adaptive testing (CAT), the sequence of items administered depends on the examinee’s performance on earlier items
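That adaptive loop can be sketched in a few lines: administer the most informative remaining item at the current ability estimate, then re-estimate ability from all responses so far. A toy simulation under a 2PL model with a grid-search maximum-likelihood update and a simplified deterministic respondent (all values are illustrative assumptions, not the NUMi design):

```python
import math

def icc(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def info(theta, a, b):
    """2PL item information."""
    p = icc(theta, a, b)
    return a * a * p * (1.0 - p)

def cat_simulation(bank, true_theta, n_items, answer):
    """Minimal CAT loop: pick the most informative remaining item at the
    current estimate, record the response, then re-estimate ability by a
    grid-search maximum-likelihood step over the responses so far."""
    grid = [g / 10.0 for g in range(-40, 41)]  # theta in [-4, 4]
    theta_hat, remaining, responses = 0.0, list(bank), []
    for _ in range(n_items):
        item = max(remaining, key=lambda it: info(theta_hat, *it))
        remaining.remove(item)
        responses.append((item, answer(true_theta, item)))

        def loglik(th):
            ll = 0.0
            for (a, b), u in responses:
                p = icc(th, a, b)
                ll += math.log(p if u else 1.0 - p)
            return ll

        theta_hat = max(grid, key=loglik)
    return theta_hat

# Deterministic toy respondent: correct whenever true ability >= difficulty.
answer = lambda true_theta, item: 1 if item[1] <= true_theta else 0
bank = [(1.2, -1.0), (1.0, 0.0), (1.5, 0.5), (1.1, 1.5), (0.9, 2.0)]
est = cat_simulation(bank, true_theta=0.8, n_items=4, answer=answer)
assert -4.0 <= est <= 4.0
```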
Summary of Methods • Conduct focus groups • Modify framework to fit broader population • Support development of test items • Generate a bank of items • Investigative team, Expert Panel, Cognitive Interviews • Determine item and test characteristics using IRT in a large sample of participants • Conduct validation studies and differential item functioning • Future directions: Translation to Spanish instrument, Computer Adaptive Testing