Standardized Scales

Standardization • Use of identical procedures to collect, score, interpret, and report results of a measure • Assures that differences over time or among different people are due to the variable being measured and not to different measurement procedures

What are Standardized Scales? • Set of uniform procedures to collect, score, interpret, and report numerical results • Usually have norms and empirical evidence of reliability and validity • Typically include multiple items aggregated into one or more composite scores • Frequently used to measure constructs

Construct • Complex concept (e.g., intelligence, well-being, depression) • Inferred or derived from a set of interrelated attributes (e.g., behaviors, experiences, subjective states, attitudes) of people, objects, or events • Typically embedded in a theory • Oftentimes not directly observable but measured using multiple indicators

Evaluating and Selecting Standardized Scales • Purpose • Reference populations and normative groups • Reliability • Validity • Practical considerations

Purpose • Identify whether or not a client has a significant problem • Measure and monitor your client’s outcomes to determine if your client is making satisfactory progress

Reference Population Population of people for which a measure is intended and from which a normative group is sampled and norms are created

Normative Group • Representative sample of a reference population, used to estimate norms for that population and, more generally, used to develop and test standardized measures • Also known as a “standardization group” or “standardization sample”
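As an illustration of how norms are used, the sketch below expresses a client's raw score as a z-score relative to a normative group. The normative scores and the function name `z_score` are invented for illustration; for a real scale, the published norms would normally supply the mean and standard deviation directly.

```python
from statistics import mean, stdev

def z_score(raw_score, norm_scores):
    """Distance of raw_score from the normative mean, in normative SD units."""
    return (raw_score - mean(norm_scores)) / stdev(norm_scores)

# Hypothetical normative sample (illustrative values only, not real norms).
norm_scores = [8, 10, 12, 14, 16]
print(z_score(16, norm_scores))
```

A z-score near 0 means the client scored close to the normative average. The result is only as meaningful as the normative sample is representative of the client's reference population.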

Reliability • Internal consistency reliability (coefficient alpha) (most important) • Interrater reliability (sometimes) • Test-retest reliability
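Coefficient alpha can be computed from item-level responses. The sketch below uses the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The function name `cronbach_alpha` and the sample data are assumptions made for illustration.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Coefficient alpha for a list of respondents' item responses.

    rows: one list per respondent, each with the same items in the same order.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical responses from three clients to a three-item scale.
rows = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(cronbach_alpha(rows))
```

Reverse-worded items should be rescored before alpha is computed; otherwise they pull the coefficient down.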

Validity • Face • Content • Criterion • Construct • Sensitivity to change especially important

Practical Considerations • Time • Effort • Training • Cost • Availability • Acceptability (e.g., clients, practitioners, etc.)

Decisions, Decisions… • Who • Where • When • How often to collect outcome data

Who • Client • Practitioner • Relevant others • Independent evaluators

Where and When • Private, quiet, physically comfortable location • Complete at about the same time and under the same conditions on a regular basis

How Often • Regular, frequent, pre-designated intervals • Often enough to detect significant changes in the problem, but not so often that it becomes burdensome • In general, about once per week

Engage and Prepare Clients • Be certain the client understands and accepts the value and purpose of monitoring progress • Discuss confidentiality • Present measures with confidence • Don’t ask for info the client can’t provide

Engage and Prepare Clients (cont’d) • Be sure the client is prepared • Be careful how you respond to information • Use the information that is collected

Administering, Scoring, and Interpreting Standardized Scales • Score, scoring formula, composite score • Unidimensional and multidimensional scales • Cut scores • Reverse-worded items • Reliable change, reliable improvement, reliable deterioration • Clinically significant improvement • Expected treatment response

Score • Generic term for a number derived from a measure that represents the quantity or amount of an attribute or observation (e.g., number of times a behavior is observed, value obtained from a standardized scale) • Interpret in context of all available quantitative and qualitative information

Scoring Procedure by which data from a measure are used to produce a score (e.g., number of times a behavior occurs or value on a standardized scale) or category (e.g., diagnostic category)

Scoring Formula A mathematical rule by which data from a measure are used to produce a score (e.g., sum or average of responses to items on a multi-item standardized scale)

Composite Score Score that combines results from two or more related items or other measures using a specified formula (e.g., percentage of items answered correctly on a statistics test)
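A scoring formula such as a sum or an average can be written as a short function. This is a minimal sketch; the function name `composite_score` and the example responses are hypothetical, and a real scale's manual defines the formula to use.

```python
def composite_score(responses, method="sum"):
    """Combine item responses into one composite score.

    method: "sum" adds the item responses; "mean" averages them.
    """
    if not responses:
        raise ValueError("no item responses to score")
    total = sum(responses)
    if method == "sum":
        return total
    if method == "mean":
        return total / len(responses)
    raise ValueError(f"unknown scoring method: {method}")

# Hypothetical responses to a four-item scale, each item rated 0 to 3.
responses = [2, 0, 3, 1]
print(composite_score(responses))          # sum of the four items
print(composite_score(responses, "mean"))  # average of the four items
```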

Unidimensional Scale Scale that measures a single attribute or construct (e.g., depression). (Contrast with multidimensional scale.)

Multidimensional Scale Scale that measures two or more distinct but related attributes or constructs; the measures of the different attributes or constructs are referred to as “subscales”
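Scoring a multidimensional scale amounts to applying a scoring formula to each subscale's items separately. In the sketch below, the subscale names, the item assignments, and the use of a simple sum are all hypothetical; an actual scale's manual specifies which items belong to which subscale.

```python
def subscale_scores(responses, subscales):
    """Sum the item responses belonging to each subscale.

    responses: list of item responses, in item order.
    subscales: dict mapping subscale name to a list of 0-based item positions.
    """
    return {
        name: sum(responses[i] for i in positions)
        for name, positions in subscales.items()
    }

# Hypothetical six-item scale with two three-item subscales.
subscales = {"mood": [0, 1, 2], "sleep": [3, 4, 5]}
responses = [3, 2, 3, 0, 1, 0]
print(subscale_scores(responses, subscales))
```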

Cut Scores • Specific predetermined numerical values along a continuum of scores • Used to separate people into categories with distinct substantive interpretations (e.g., clinically depressed or not) • Used to make decisions (e.g., provide treatment for depression or not) • Only as good as the normative sample(s) from which they are derived • Interpret in context of all available quantitative and qualitative information
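Applying a cut score is a simple comparison. The sketch below uses the QIDS-SR example that appears elsewhere in these slides (greater than 5 falls in the dysfunctional range); the function name `above_cut_score` and its `higher_is_worse` parameter are assumptions for illustration.

```python
def above_cut_score(score, cut_score, higher_is_worse=True):
    """Return True when the score falls on the problem side of the cut score."""
    if higher_is_worse:
        return score > cut_score
    return score < cut_score

# QIDS-SR example from the slides: greater than 5 is the dysfunctional range.
print(above_cut_score(9, 5))
print(above_cut_score(5, 5))
```

The boolean result is a starting point for a decision, not the decision itself; it still needs to be read alongside the other quantitative and qualitative information available.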

Reverse-Worded Item Item for which smaller numbers indicate a higher score on the measured variable because the item is worded to mean the opposite of the measured variable
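Before a composite score is computed, reverse-worded items are usually rescored so that all items point in the same direction. A common way to do this, sketched here with a hypothetical function name, is to subtract the response from the sum of the scale's minimum and maximum response values.

```python
def reverse_score(response, scale_min, scale_max):
    """Flip a response so that high becomes low and low becomes high."""
    if not scale_min <= response <= scale_max:
        raise ValueError("response is outside the item's response range")
    return scale_min + scale_max - response

# On a 1-to-5 item, a response of 4 is rescored as 2.
print(reverse_score(4, 1, 5))  # prints 2
```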

Reliable Change Change in a score from one time to another that is more than expected just from random measurement error

Reliable Improvement Improvement in a score from one time to another that is more than expected just from random measurement error

Reliable Deterioration Deterioration in a score from one time to another that is more than expected just from random measurement error
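The slides do not say how "more than expected from random measurement error" is calculated. One widely used approach is the Jacobson and Truax reliable change index (RCI), which divides the change score by the standard error of the difference. The sketch below assumes that approach, a critical value of 1.96, and a scale on which higher scores are worse by default; the function names and example numbers are hypothetical.

```python
from math import sqrt

def reliable_change_index(pre, post, sd, reliability):
    """Change score divided by the standard error of the difference.

    sd: standard deviation of the scale in the normative (or pretest) sample.
    reliability: reliability coefficient of the scale, between 0 and 1.
    """
    if sd <= 0 or not 0 <= reliability < 1:
        raise ValueError("need sd > 0 and 0 <= reliability < 1")
    sem = sd * sqrt(1 - reliability)   # standard error of measurement
    s_diff = sqrt(2) * sem             # standard error of the difference
    return (post - pre) / s_diff

def classify_change(pre, post, sd, reliability,
                    higher_is_worse=True, critical=1.96):
    """Label a change as reliable improvement, deterioration, or neither."""
    rci = reliable_change_index(pre, post, sd, reliability)
    if abs(rci) <= critical:
        return "no reliable change"
    improved = post < pre if higher_is_worse else post > pre
    return "reliable improvement" if improved else "reliable deterioration"

# Hypothetical scale with SD = 5 and reliability = 0.84.
print(classify_change(pre=20, post=10, sd=5, reliability=0.84))
print(classify_change(pre=20, post=18, sd=5, reliability=0.84))
```

With these illustrative values, the standard error of the difference is about 2.83, so a change of roughly 6 points or more would count as reliable.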

Clinically Significant Improvement Change that occurs when: • A client’s measured functioning on a standardized scale is in the dysfunctional range before intervention (e.g., greater than 5 on the QIDS-SR) • The client’s measured functioning is in the functional range after intervention (e.g., 5 or below on the QIDS-SR) • The change is reliable

Clinically Significant Improvement (cont’d) • Interpret in context of all available quantitative and qualitative information • Does not guarantee a meaningful change in a client’s real-world functioning or quality of life • Only as good as the normative sample(s) from which it is derived • Does not speak to the question of whether it was your intervention or something else that caused the change
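The three criteria for clinically significant improvement can be combined into a single check. This sketch assumes a scale on which higher scores are worse, uses the QIDS-SR cut score of 5 from the slides, and takes the reliability of the change as an input determined separately; the function name is hypothetical.

```python
def clinically_significant_improvement(pre, post, cut_score, change_is_reliable):
    """Apply the three criteria, assuming higher scores mean worse functioning."""
    was_dysfunctional = pre > cut_score   # dysfunctional range before intervention
    now_functional = post <= cut_score    # functional range after intervention
    return was_dysfunctional and now_functional and change_is_reliable

# QIDS-SR example: a score that moves from 12 to 4, with the change judged reliable.
print(clinically_significant_improvement(12, 4, 5, change_is_reliable=True))
```

A True result says the score crossed the cut score by a reliable amount. It does not show that the intervention caused the change or that the client's day-to-day functioning improved.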

Expected Treatment Response • Session-by-session progress is determined in comparison to normative data from ongoing responses to treatment of thousands of clients • Feedback used in real time to monitor client progress and modify services as needed to reduce treatment failures and increase overall effectiveness

Global Rating Single rating based on a rater’s integration of information about numerous factors (e.g., global rating of change, improvement, or social functioning)

Single-Item Global Standardized Scales • Global Assessment of Functioning (GAF) • Children’s Global Assessment Schedule (CGAS) • Social and Occupational Functioning Assessment Scale (SOFAS) • Global Assessment of Relational Functioning (GARF)

Potential Advantages of Standardized Scales • Pretested for reliability and validity • Structured, so information less likely to be missed • Can be used to compare individual functioning to normative group functioning • Can be efficient and simple to use

Cautions in the Use of Standardized Scales • May not measure concept suggested by scale name • Different measures of the same concept may not be equivalent • Sometimes limited information about reliability and validity • Concepts as measured may not be completely relevant to individual clients

Resources • Compendiums of measures • See Appendix B • Web measurement resources • See Appendix B
