Different purposes for a measurement hold implications for the way you write the questions


Presentation Transcript


  1. Different purposes for a measurement hold implications for the way you write the questions
     Descriptive measure:
     • Broad ranging. Goal = to classify groups
     • Themes of interest to people in general (“quality of life”, etc.), or issues of public concern
     • Should it emphasize modifiable themes?
     • Do you want a profile or an index?
     Evaluative measure:
     • Content should be tailored to the intervention; usually not comprehensive
     • Must be sensitive to change produced by intervention
     • Focused & fine-grained: select indicators that sample densely from relevant level of severity; unidimensional
     • Consider whether you need to have a summary score

  2. Side Bar: Two rival conceptual models of “population health”
     • Population health = sum total of individual health.
        • Idiographic: focus is on the individual
        • No attention to dynamics, complexity, etc.
        • Factors “outside the skin” seen as determinants, not aspects of population health.
        • Responsibility for health lies principally with individuals, so a survey of population health need only consider final outcomes.
     • Emergent model.
        • Here “population” implies more than the individuals within it: community health.
        • Ability of the society / community / population to respond to challenges is central to defining its health.
        • Externalities that influence individual health seen as aspects of population health: social capital, the existence of health protection legislation, acceptance of “the greater good,” etc.
        • Summary measures record outcomes of this process, but the core population health dynamics actually lie upstream.
     Indicators of population health need to be broader than just measures of individuals in the population.

  3. Parsimony vs. Sensitivity & Specificity
     These are in tension! Need for brief instrument implies:
     • If goal is to have broad coverage of domains (as in a descriptive measure), there can only be a few items in each domain
     • To achieve breadth within a domain in few items, we need to use generic items (e.g., “can you cut your toenails?”)
     • This can achieve sensitivity as a screen, but cannot classify type of condition and specificity may be low
     • Generic items also lose interpretability and unidimensionality
     • Need to think through whether we care whether it’s knee pain, or muscle weakness, or balance that limits walking ability.
     Do you want a descriptive or a diagnostic instrument?
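The trade-off on this slide can be made concrete with a small calculation. The sketch below is illustrative only: the ten respondents, the "knee condition" target, and the item responses are invented, and the function name is a hypothetical helper. It shows a generic item catching every true case (high sensitivity) while also flagging people whose difficulty has another cause (lower specificity).

```python
def sensitivity_specificity(screen_positive, has_condition):
    """Return (sensitivity, specificity) for two paired lists of booleans."""
    pairs = list(zip(screen_positive, has_condition))
    tp = sum(1 for s, c in pairs if s and c)          # flagged, condition present
    fn = sum(1 for s, c in pairs if not s and c)      # missed cases
    tn = sum(1 for s, c in pairs if not s and not c)  # correctly not flagged
    fp = sum(1 for s, c in pairs if s and not c)      # flagged for another reason
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical data for 10 respondents (assumption, not from the slides).
has_knee_condition = [True, True, True, False, False,
                      False, False, False, False, False]
# A generic item ("can you cut your toenails?") is positive for anyone with
# a limitation, whatever its cause.
generic_item = [True, True, True, True, True,
                True, False, False, False, False]

sens, spec = sensitivity_specificity(generic_item, has_knee_condition)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

With these invented numbers the generic item misses no one with the target condition, but three of the seven people without it also screen positive, which is the sense in which a generic item cannot classify the type of condition.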

  4. Unidimensionality
     • IRT goal of unidimensionality is hard to apply in health measurement. Some topics are hierarchical; others (symptoms of depression) are not. Depression or anxiety scales often do not meet IRT unidimensionality criterion
     • Unidimensionality is chiefly important for clinical interpretation & maybe evaluation; not the issue in a survey. Surveys focus on how bad it is, not what it is
     • If instrument will be scored as an index, unidimensionality becomes irrelevant as all the items are combined and it’s impossible to visualize the person’s disability anyway
     • There is an inherent tension between using generic, screening-type items (e.g., IADLs) and unidimensionality
     • Many functions involve more than one body system (e.g., recognizing a face across the street), so are not unidimensional

  5. The Time Frame Debate
     • Some questions refer to “present”; can use to get prevalence, but will not tell incidence, or duration of condition
     • Duration requires additional questions, as does change
     • Width of time window not very important: average is just calculated over a shorter or longer time
     • Suggest one week (to capture weekends, etc.) or else “yesterday” (as today is incomplete)
     [Figure: trajectories A, B and C crossing a sampling window. Problem! Change is only captured if additional questions are asked, so can’t distinguish A from B]
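A minimal sketch of why a windowed average cannot detect change on its own. The two trajectories below are invented for illustration and are not necessarily the ones drawn on the slide; the 0-10 scoring and the helper names are assumptions.

```python
from statistics import mean

# Hypothetical daily function scores (0-10) across a one-week sampling window.
trajectory_a = [5, 5, 5, 5, 5, 5, 5]  # stable throughout the window
trajectory_b = [8, 7, 6, 5, 4, 3, 2]  # steadily declining

def window_average(scores):
    """What a single 'over the past week' question effectively records."""
    return mean(scores)

def change_over_window(scores):
    """What an additional question about change would record."""
    return scores[-1] - scores[0]

for name, scores in [("A", trajectory_a), ("B", trajectory_b)]:
    print(name, window_average(scores), change_over_window(scores))
```

Both trajectories give the same weekly average, so only the extra change question separates the stable respondent from the declining one.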

  6. Time Window & Response Shift
     • Larger time windows, and phrasing in terms of “usual”, can face the issue of response shift (people recalibrate what they see as “normal”)
     • “Usual” phrasing seems most problematic: may miss chronic disabilities (cf. criticism of GHQ); cannot record incidence, maybe not even prevalence
     [Figure: Response Shift. Perception of “usual” function plotted against the actual trajectory. Delay between functional change & perception varies according to a range of factors]

  7. Continuous States vs. Episodic Events
     • Mobility limitations often endure. By contrast, pain, anxiety or marital disputes are commonly episodic
     • Averaging over a broad time window can be an issue for episodic events, because averaging episodes raises the issue of frequency vs. intensity of events (see next slide)
     • In general, time & averaging is less of an issue for capacity than for performance, because capacity is enduring; performance may fluctuate
     • However, the notion of capacity is hard to apply to pain, anxiety and depression (for which wording a question in capacity terms tends to approximate performance).

  8. Combining Severity & Frequency
     • Risk of trying to do too much. The problem of summarizing frequency & severity grows with increasing length of retrospection. If “yesterday” is used, you can only ask about severity
     • The term “level” (“How would you describe your level of anxiety?”) is unclear: presumably some combination of severity & frequency of episodes, but how does the respondent combine these?
     • Options: “Please judge the overall amount of pain, considering both intensity and frequency, you have experienced …”
     • Simpler: “How bad was your pain?” (Mild, moderate, severe…)
     [Figure: patterns of episodes compared (“versus ?”) along a time axis]
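To illustrate why a single “level” is ambiguous, the sketch below summarizes two invented one-week pain diaries three ways. The diaries, the 0-10 intensity scale, and the `summarize` helper are assumptions made for illustration, not part of any instrument discussed here.

```python
from statistics import mean

# Hypothetical daily pain intensity (0 = none, 10 = worst) over one week.
one_bad_day = [7, 0, 0, 0, 0, 0, 0]     # a single severe episode
mild_every_day = [1, 1, 1, 1, 1, 1, 1]  # constant mild pain

def summarize(daily_intensity):
    """Separate the components a respondent must somehow combine."""
    episodes = [x for x in daily_intensity if x > 0]
    return {
        "frequency": len(episodes),                 # days with any pain
        "peak_severity": max(episodes, default=0),  # worst day
        "weekly_average": mean(daily_intensity),    # one possible 'level'
    }

print(summarize(one_bad_day))
print(summarize(mild_every_day))
```

The two diaries share the same weekly average yet differ sharply in frequency and peak severity, so a respondent asked for an overall “level” has no obvious rule for choosing among these summaries.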

  9. Response options: Frequency vs. Difficulty
     • For chronic conditions, intensity responses are obviously more appropriate
     • For fluctuating conditions (insomnia, depression), frequency seems most appropriate
     • If brief recall periods, use intensity responses
     • For longer-term recall, use frequency phrasing
     • Also, need to decide on relative vs. absolute responses. E.g., “do you have difficulty keeping up with people your own age?”
     • Likewise, do we specify “level ground” for walking, or “where you live”? The first is close to disability and may not be relevant to them; the second (handicap) will be relevant but may make direct comparisons difficult.

  10. Discuss Structure of Overall Instrument
     • Can it be made dynamic? Item banking; tailored responses; computer administration or using skip patterns. Some examples:
     • Cella: http://outcomes.cancer.gov/conference/irt/cella_et_al.pdf
     • www.amIhealthy.com
     • Ware JE et al. Item banking and the improvement of health status measures. Quality of Life Newsletter 2004; Fall (Special Issue):2-5.
     • Bjorner JB et al. Using item response theory to calibrate the Headache Impact Test (HIT) to the metric of traditional headache scales. Qual Life Res 2003; 12:981-1002
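A skip pattern is the simplest way to make an instrument dynamic. The sketch below is one possible way to encode it under computer administration; the item identifiers, wordings, and routing rules are hypothetical and are not taken from any of the instruments cited above.

```python
# Each item names the next item to ask, given the answer (None = stop).
# All items and routing here are invented for illustration.
QUESTIONS = {
    "pain_any": {
        "text": "Did you have any pain yesterday?",
        "next": lambda a: "pain_severity" if a == "yes" else "mobility_any",
    },
    "pain_severity": {
        "text": "How bad was your pain? (mild / moderate / severe)",
        "next": lambda a: "mobility_any",
    },
    "mobility_any": {
        "text": "Did you have any difficulty walking yesterday?",
        "next": lambda a: "mobility_aids" if a == "yes" else None,
    },
    "mobility_aids": {
        "text": "Was that using any aids you normally use?",
        "next": lambda a: None,
    },
}

def administer(answers, start="pain_any"):
    """Return the ordered list of item ids actually asked for these answers."""
    asked = []
    current = start
    while current is not None:
        asked.append(current)
        current = QUESTIONS[current]["next"](answers[current])
    return asked

# A respondent with no pain but some walking difficulty skips the severity item.
path = administer({"pain_any": "no", "mobility_any": "yes", "mobility_aids": "no"})
print(path)
```

Respondents only see follow-up items that apply to them, which shortens administration without removing detail for those who report a problem. Item banking with IRT-based selection goes further than this fixed routing and is not shown here.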

  11. Prosthetics, Analgesics, etc. (points 20-25)
     Rocks & hard places…
     • Without aids approximates impairment; with aids = disability
     • But this distinction is hard to make in ICF: ‘activity’ and ‘participation’ both sound like performance rather than capacity
     • Not quite clear why eye glasses are singled out for inclusion, while walking sticks apparently are not
     • Asking an amputee about mobility without his prosthesis seems artificial (point #21)
     • Likewise, if they are taking effective analgesics, it’s hard for them to report pain without (points #24 & 25)
     • If purpose is to indicate health states in this nation, suggest the approach of “using any aids you normally use.”
     • Suggest not relying on use of analgesics as way to indicate severity (point #22), because availability will vary greatly

  12. Visual Analogue Scales
     • In clinical settings, VAS, NRS pain ratings intercorrelate highly. Verbal scales correlate with both, but less closely
     • VAS is visual, so implies use of paper & pencil
     • If used in telephone format, VAS reduces to a NRS, so why not just use NRS?
     • Less educated and older patients appear to find NRS easier than VAS, so these have been endorsed for use in cancer trials (Moinpour et al., J Natl Cancer Inst 1989; 81:485-495)
     • The FLIC began with VAS, but changed to 6-pt NRS
     • However, the VAS can be very responsive (e.g., Hagen et al, J Rheumatol 1999; 26:1474-1480). But do we need responsiveness?
     • Many alternative formats, including graphic rating scale (Dalton et al, Cancer Nurs 1998; 21:46-49) or box scale (Jensen et al, Clin J Pain 1998; 14:343-349). See also Cella & Perry, Psychol Rep 1986; 59:827-833, and Scott & Huskisson, Pain 1976; 2:175-184.

  13. Anxiety & Depression
     • Trying to discriminate between these may focus attention on the trees rather than the forest
     • Unitary theory sees A & D as expressions of the same pathology; the opposing perspective sees them as fundamentally different, while the compromise is to view them as having common roots but different expressions (Brown et al, J Abnorm Psychol 1998; 107:179-192).
     • Anxiety suggests arousal and an attempt to cope with a situation; depression suggests lack of arousal and withdrawal: the NE and SE quadrants of the diagram (next slide)
     • An anxious person might say “That terrible event is not my fault but it may happen again, and I may not be able to cope with it but I’ve got to be ready to try.” A depressed person might say “That terrible event may happen again and I won’t be able to cope with it, and it’s probably my fault anyway so there’s really nothing I can do.” (Barlow DH. The nature of anxiety: anxiety, depression, and emotional disorders. In: Rapee RM, Barlow DH, eds. Chronic anxiety: generalized anxiety disorder and mixed anxiety-depression. New York: Guilford, 1991: 1-28)

  14. A circumplex model of affect
     [Figure: circular diagram of eight affect poles with example descriptors; “Anxiety” and “Depression” are marked on the diagram]
     • High positive affect: active, elated, excited
     • Strong engagement: aroused, astonished, concerned
     • High negative affect: distressed, fearful, hostile
     • Unpleasantness: sad, lonely, withdrawn
     • Low positive affect: sluggish, dull, drowsy
     • Disengagement: inactive, still, quiet
     • Low negative affect: relaxed, calm, placid
     • Pleasantness: content, happy, satisfied

  15. Emotions & Affect: scattered thoughts
     • How to fit affect within capacity / performance distinction? Many anxiety questions use either state or performance wordings (“How severe was your anxiety?” or “Did anxiety limit your daily activities?”)
     • Why try to distinguish anxiety and depression?
     • Not completely clear why we need both positive and negative affect (point #68): if time frame correctly chosen, they should not be orthogonal
     • A phrase such as “upset or distressed” may capture general affect quite well
     • Stress may also be pertinent: cf. DASS of Lovibond (Manual for the Depression Anxiety Stress Scales. Sydney: Psychology Foundation, 1995)
