
Chapter 6: Selecting Measurement Instruments


Presentation Transcript


  1. Chapter 6: Selecting Measurement Instruments • Objectives • State the relation between a variable and a construct, and distinguish among categories of variables (e.g., categorical and quantitative; dependent and independent) and the scales to measure them (e.g., nominal, ordinal, interval, and ratio). • Define measurement, and describe ways to interpret measurement data.

  2. Selecting Measurement Instruments Objectives • Describe the types of measuring instruments used to collect data in qualitative and quantitative studies (e.g., cognitive, affective, and projective tests). • Define validity, and differentiate among content, criterion-related, construct, and consequential validity.

  3. Selecting Measurement Instruments Objectives • Explain how to measure reliability, and differentiate among stability, equivalence, equivalence and stability, internal consistency, and scorer/rater reliability. • Identify useful sources of information about specific tests, and provide strategies for test selection. • Provide guidelines for test construction and test administration.

  4. Data & Constructs • Data are the pieces of information you collect and use to examine your topic. • You must determine what type of data to collect. • A construct is an abstraction that cannot be observed directly but is invented to explain behavior. • e.g., intelligence, motivation, ability

  5. Constructs & Variables • Constructs must be operationally defined to be observable and measurable. • Variables are operationally defined constructs. • Variables are placeholders that can assume any one of a range of values. • Variables may be measured by instruments.

  6. Measurement Scales • The measurement scale is a system for organizing data. • Knowing your measurement scale is necessary to determine the type of analysis you will conduct.

  7. Measurement Scales • Nominal variables describe categorical data. • e.g., gender, political party affiliation, school attended, marital status • Nominal variables are qualitative. • Quantitative variables range on a continuum with ordinal, interval, and ratio variables.

  8. Measurement Scales • Ordinal variables describe rank order with unequal units. • e.g., order of finish, ranking of schools or groups as levels • Interval variables describe equal intervals between values. • e.g., achievement, attitude, test scores

  9. Measurement Scales • Ratio variables describe all of the characteristics of the other levels but also include a true zero point. • e.g., total number of correct items on a test, time, distance, weight
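
The distinction among the four scales matters because it determines which statistics are meaningful. The sketch below, using small hypothetical variables, pairs each scale with a summary that is appropriate at that level.

```python
# Hypothetical examples of the four measurement scales and a summary
# statistic that is meaningful at each level.
from statistics import mode, median, mean

nominal  = ["Democrat", "Republican", "Independent", "Democrat"]  # categories only
ordinal  = [1, 2, 3, 5, 4]                                        # rank order, unequal units
interval = [98, 102, 87, 110]                                     # equal units, no true zero (e.g., scale scores)
ratio    = [12.5, 0.0, 7.3, 20.1]                                 # equal units plus a true zero (e.g., seconds)

print(mode(nominal))    # nominal data support only counting / the most frequent category
print(median(ordinal))  # ordinal data support medians and ranks
print(mean(interval))   # interval data add meaningful means and differences
print(max(ratio) / 2)   # ratio data add meaningful ratios ("twice as much")
```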

  10. Independent & Dependent Variables • Dependent variables are those believed to depend on or to be caused by another variable. • Dependent variables are also called criterion variables. • Independent variables are the hypothesized cause of the dependent variable. There must be at least two levels of an independent variable. • Independent variables are also called experimental variables, manipulated variables, or treatment variables.

  11. Characteristics of Instruments • There are three major ways for researchers to collect data. • A researcher can administer a standardized test. • e.g., an achievement test • A researcher can administer a self-developed instrument. • e.g., a survey you might develop • A researcher can record naturally-occurring events or use already available data. • e.g., recording off-task behavior of a student in a classroom

  12. Instruments • Using standardized instruments takes less time than developing an instrument. • With standardized instruments, results from different studies that use the same instrument can be compared. • At times researchers may need to develop their own instruments. • Designing an effective instrument requires expertise and time.

  13. Instruments • A test is a formal, systematic procedure for gathering information about people. • Cognitive characteristics (e.g., thinking, ability) • Affective characteristics (e.g., feelings, attitude)

  14. Instruments • A standardized test is administered, scored, and interpreted the same way across administrations. • e.g., the ACT, the SAT, or the Stanford Achievement Test

  15. Instruments • Assessment refers to the process of collecting, synthesizing, and interpreting information, including data from tests as well as from observations. • Formal or informal • Numerical or textual • Measurement is the process of quantifying or scoring assessment information. • Occurs after data collection

  16. Instruments • Qualitative researchers often use interviews and observations. • Quantitative researchers often use paper and pencil (or electronic) methods. • Selection methods: The respondent selects from possible answers (e.g., multiple choice test). • Supply methods: The respondent has to provide an answer (e.g., essay items).

  17. Instruments • Performance assessments emphasize the processes students use and require them to create a product (e.g., completing a project).

  18. Interpreting Instrument Data • Raw Score • Number or point value of items correct (e.g., 18/20 items correct). • Norm-referenced scoring • Student’s performance is compared with performance of others (e.g., grading on a curve).

  19. Interpreting Instrument Data • Criterion-referenced scoring • Student’s performance is compared to a preset standard (e.g., class tests). • Self-referenced scoring • How an individual student’s scores change over time is measured (e.g., speeded math facts tests).
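
To make the contrast concrete, the sketch below interprets one hypothetical raw score three ways: against the rest of the class (norm-referenced), against a preset cutoff (criterion-referenced), and against the same student’s earlier score (self-referenced). All numbers are assumptions for illustration.

```python
# Hypothetical data: one student's raw score interpreted three different ways.
class_scores = [12, 14, 15, 16, 17, 18, 18, 19, 20]  # raw scores out of 20
student_score = 18
cutoff = 16          # assumed criterion-referenced passing standard
previous_score = 15  # assumed earlier score for the same student

# Norm-referenced: compare the student with the group (simple percentile rank).
percentile = 100 * sum(s < student_score for s in class_scores) / len(class_scores)

# Criterion-referenced: compare the student with the preset standard.
meets_standard = student_score >= cutoff

# Self-referenced: compare the student with his or her own earlier performance.
growth = student_score - previous_score

print(f"raw {student_score}/20 | percentile {percentile:.0f} | "
      f"meets standard: {meets_standard} | growth {growth:+d}")
```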

  20. Types of Instruments • Cognitive tests measure intellectual processes (e.g., thinking, memorizing, calculating, analyzing). • Standardized tests measure an individual’s current proficiency in given areas of knowledge or skill. • Standardized tests are often given as a test battery (e.g., Iowa Test of Basic Skills, CTBS).

  21. Types of Instruments • Diagnostic tests provide scores to facilitate identification of strengths and weaknesses (e.g., tests given for diagnosing reading disabilities). • Aptitude tests measure potential and predict future performance rather than what has already been learned (e.g., Wechsler Scales).

  22. Affective Instruments • Affective tests measure affective characteristics (e.g., attitude, emotion, interest, personality). • Attitude scales measure what a person believes or feels. • Likert scales measure agreement on a scale. • Strongly agree, Agree, Undecided, Disagree, Strongly disagree

  23. Affective Instruments • Semantic differential scales require the individual to indicate attitude by position on a scale between bipolar adjectives. • Fair 3 2 1 0 -1 -2 -3 Unfair • Rating scales may require a participant to check the most appropriate description. • 5 = always; 4 = almost always; 3 = sometimes… • Thurstone and Guttman scales are also used to measure attitudes.
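
As a small illustration of how attitude-scale responses become scores, the sketch below sums a hypothetical set of five-point Likert responses, reverse-scoring one item that is assumed to be negatively worded.

```python
# Hypothetical Likert-scale scoring with one reverse-scored item.
LIKERT = {"Strongly agree": 5, "Agree": 4, "Undecided": 3,
          "Disagree": 2, "Strongly disagree": 1}

responses = ["Agree", "Strongly agree", "Disagree", "Undecided"]
reverse_items = {2}  # assume the third item (index 2) is negatively worded

total = 0
for i, answer in enumerate(responses):
    score = LIKERT[answer]
    if i in reverse_items:
        score = 6 - score  # flip the scale: 5 becomes 1, 4 becomes 2, ...
    total += score

print(f"attitude score: {total} (possible range {len(responses)}-{5 * len(responses)})")
```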

  24. Additional Inventories • Interest inventories assess personal likes and dislikes (e.g., occupational interest inventories). • Values tests assess the relative strength of a person’s values (e.g., Study of Values instrument).

  25. Additional Inventories • Personality inventories provide participants with statements that describe behaviors characteristic of given personality traits and the participant answers each statement (e.g., MMPI). • Projective tests were developed to eliminate some of the concerns with self-report measures. These tests are ambiguous so that presumably the respondent will project true feelings (e.g., Rorschach).

  26. Criteria for Good Instruments • Validity refers to the degree to which a test measures what it is supposed to measure. • Validity is the most important test characteristic.

  27. Criteria for Good Instruments • There are numerous established validity standards. • Content validity • Criterion-related validity • Concurrent validity • Predictive validity • Construct validity • Consequential validity

  28. Content Validity • Content validity addresses whether the test measures the intended content area. • Content validity is an initial screening type of validity. • Content validity is sometimes referred to as face validity. • Content validity is assessed through expert judgment (content validation).

  29. Content Validity • Content validity is concerned with both: • Item validity: Are the individual test items measuring the intended content? • Sampling validity: Do the items, as a set, adequately sample the full content area being tested? • One example of a lack of content validity is a math test with heavy reading requirements. It may measure reading ability as well as math and is therefore not a valid math test.

  30. Criterion-Related Validity • Criterion-related validity is determined by relating performance on a test to performance on an alternative test or other measure (the criterion). • The correlation coefficient between the two measures (the validity coefficient) is used to quantify criterion-related validity.

  31. Criterion-Related Validity • Two types of criterion-related validity include: • Concurrent: The scores on a test are correlated with scores on an alternative test given at about the same time (e.g., two measures of reading achievement). • Predictive: The degree to which a test can predict how well a person will do in a future situation (e.g., the GRE, with the GRE score as the predictor and success in graduate school as the criterion).
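
A minimal sketch of how a criterion-related validity coefficient is computed: correlate test scores with scores on the criterion measure. The GRE scores and graduate GPAs below are hypothetical, and the pearson_r helper is written out so the example is self-contained.

```python
# Criterion-related (predictive) validity as a correlation between a predictor
# test and a criterion measure. All scores are hypothetical.
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

gre_scores = [150, 155, 160, 162, 168, 170]   # predictor (hypothetical)
grad_gpa   = [3.0, 3.2, 3.4, 3.3, 3.8, 3.9]   # criterion (hypothetical)

# The closer the validity coefficient is to 1.0, the stronger the evidence
# that the test predicts the criterion.
print(f"validity coefficient: {pearson_r(gre_scores, grad_gpa):.2f}")
```

For concurrent validity the calculation is the same; the only difference is that the criterion is measured at about the same time as the test rather than later.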

  32. Construct Validity • Construct validity is the most important form of validity. • Construct validity assesses what the test is actually measuring. • It is very challenging to establish construct validity.

  33. Construct Validity • Construct validity requires confirmatory and disconfirmatory evidence. • Scores on a test should relate to scores on tests of the same or similar constructs and should NOT relate strongly to scores on tests of different constructs. • For example, scores on a math test should be more highly correlated with scores on another math test than with scores on a reading test.
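
The confirmatory/disconfirmatory logic can be shown with two correlations, as in the sketch below: a hypothetical new math test should correlate more strongly with another math test (convergent evidence) than with a reading test (discriminant evidence). The scores are invented, and statistics.correlation requires Python 3.10 or later.

```python
# Convergent vs. discriminant evidence for construct validity (hypothetical data).
from statistics import correlation  # available in Python 3.10+

new_math   = [12, 15, 9, 18, 14, 11]
other_math = [14, 16, 10, 19, 13, 12]  # same construct -> expect a strong correlation
reading    = [22, 10, 17, 12, 20, 15]  # different construct -> expect a weak correlation

convergent = correlation(new_math, other_math)
discriminant = correlation(new_math, reading)
print(f"convergent r = {convergent:.2f}, discriminant r = {discriminant:.2f}")
```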

  34. Consequential Validity • Consequential validity refers to the extent to which an instrument creates harmful effects for the test taker. • Some tests may harm the test taker. • For example, a measure of anxiety may make a person more anxious.

  35. Validity • Some factors that threaten validity include: • Unclear directions • Confusing or unclear items • Vocabulary or required reading ability too difficult for test takers • Subjective scoring • Cheating • Errors in administration

  36. Self-Report Instruments There are some concerns with data derived from self-report instruments. • One concern is response set, or the tendency for a participant to respond in a certain way (e.g., social desirability). • Bias may also play a role in self-report instruments (e.g., cultural norms).

  37. Reliability • Reliability refers to the consistency with which an instrument measures a construct. • Reliability is expressed as a reliability coefficient based on a correlation. • Reliability coefficients should be reported for all measures. • Reliability affects validity: an instrument that is not reliable cannot yield valid scores. • There are several forms of reliability.

  38. Reliability • Test-retest (stability) reliability measures the stability of scores over time. • To assess test-retest reliability, a test is given to the same group twice and a correlation is computed between the two sets of scores. • The correlation is referred to as the Coefficient of Stability.

  39. Reliability • Alternate forms (Equivalence) reliability measures the relationship between two versions of a test that are intended to be equivalent. • To assess alternate forms reliability, both tests are given to the same group and the scores on each test are correlated. • The correlation is referred to as the Coefficient of Equivalence.

  40. Reliability • Equivalence and stability reliability is represented by the relationship between equivalent versions of a test given at two different times. • To assess equivalence and stability reliability, one form of the test is given, the equivalent form is given after a period of time, and the two sets of scores are correlated. • The correlation is referred to as the Coefficient of Stability and Equivalence.

  41. Reliability • Internal consistency reliability represents the extent to which items in a test are consistent with one another. • Split-half: The test is divided into halves, the scores on the two halves are correlated, and the result is typically adjusted upward with the Spearman-Brown correction. • Coefficient (Cronbach’s) alpha and the Kuder-Richardson formulas estimate the consistency among all items and between each item and the total test score.
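
A minimal sketch of coefficient alpha on a small, hypothetical item-score matrix (rows are examinees, columns are items). A split-half correlation would be computed like any other correlation and then stepped up with the Spearman-Brown formula, r_full = 2r / (1 + r).

```python
# Coefficient alpha on a hypothetical matrix of item scores
# (rows = examinees, columns = items scored 0/1).
from statistics import pvariance

scores = [
    [1, 1, 1, 0, 1],
    [1, 0, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 0],
    [0, 1, 0, 0, 1],
]

k = len(scores[0])                     # number of items
totals = [sum(row) for row in scores]  # each examinee's total score
item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]

# alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores)
alpha = (k / (k - 1)) * (1 - sum(item_vars) / pvariance(totals))
print(f"coefficient alpha: {alpha:.2f}")
```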

  42. Reliability • Scorer and rater reliabilities reflect the extent to which independent scorers or a single scorer over time agree on a score. • Interjudge (inter-rater) reliability: Consistency of two or more independent scorers. • Intrajudge (intra-rater) reliability: Consistency of one person over time.
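
As a simple illustration of interjudge reliability, the sketch below computes the percent agreement between two hypothetical independent scorers; Cohen's kappa is a common refinement that corrects this figure for chance agreement.

```python
# Inter-rater reliability as simple percent agreement (hypothetical ratings).
rater_a = [3, 4, 2, 5, 4, 3, 2, 4]
rater_b = [3, 4, 3, 5, 4, 3, 2, 5]

agreements = sum(a == b for a, b in zip(rater_a, rater_b))
print(f"percent agreement: {100 * agreements / len(rater_a):.0f}%")  # 6 of 8 = 75%
```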

  43. Reliability • The Standard Error of Measurement (SEM) is an estimate of how often one can expect errors of a given size in an individual’s test score. • SEM = SD × √(1 − r) • SEM = standard error of measurement; SD = standard deviation of the test scores; r = the reliability coefficient
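
A short worked example of the formula above, with an assumed standard deviation of 10 and an assumed reliability coefficient of .91; roughly speaking, about two-thirds of observed scores fall within one SEM of the corresponding true scores.

```python
# Worked example of the standard error of measurement: SEM = SD * sqrt(1 - r).
# The SD and reliability coefficient below are assumed values.
from math import sqrt

sd = 10.0  # standard deviation of the test scores (assumed)
r = 0.91   # reliability coefficient (assumed)

sem = sd * sqrt(1 - r)
print(f"SEM = {sem:.1f}")  # 10 * sqrt(0.09) = 3.0
```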

  44. Selecting a Test Once you have defined the purpose of your study: • Determine the type of test that you need. • Identify and locate appropriate tests. • Determine which test to use after a comparative analysis.

  45. Selecting a Test There are several locations where one can obtain information and reviews about available tests. These are a good place to start when selecting a test. • MMY: The Mental Measurements Yearbook is the most comprehensive source of test information. • Pro-Ed Publications • ETS Test Collection Database • Professional Journals • Test publishers and distributors

  46. Selecting a Test When comparing the tests you have located to decide which one to use, attend to each of the following: • First, examine validity. • Next, consider reliability. • Consider ease of test use. • Ensure participants have not been previously exposed to the test. • Ensure sensitive information is not unnecessarily included.
