Quality Control in Survey Design: Evaluating a Rating Scale of Educators’ Attitudes Toward Differentiated Compensation. Shannon O. Sampson and Kelly D. Bradley, University of Kentucky
Background • Surveys are the most common example of self-reported data collection and one of the most popular research methodologies for graduate studies and published papers in education (Aiken, 1988; Babbie, 1992; Gay, 1981). • Even so, the efficiency and effectiveness of the instrument as a measurement tool are often overlooked or underemphasized.
“Operationalizing and then measuring variables are two of the necessary first steps in the empirical research process. Statistical analysis, as a tool for investigating relations among the measures, then follows. Thus, the interpretation of analyses can only be as good as the quality of the measures.” (Bond and Fox, 2001)
Objectives of Study • Utilize Rasch analysis to evaluate the quality of a survey instrument designed to measure educators’ attitudes about differentiated compensation • Employ a data-driven model for improving survey instrumentation
Assumptions with traditional rating scale data analysis • Each item contributes equally to the measure of the construct • Each item is measured on the same interval scale • Respondents have appropriately interpreted the directions • All items are written clearly such that only one interpretation is possible
However… • Items generally represent different amounts of a variable • Scales are ordinal, so categories are not necessarily spaced equally • Respondents often misinterpret directions • Items are often open to multiple interpretations
Furthermore… • Estimates for items depend on severity of respondents in sample • Estimates of item ratings cannot be compared across groups • Complete records required • Single standard error of measurement is produced for the composite of ratings
Rasch model • Probabilistic version of the scalogram • Parameters are neither sample-dependent nor test-dependent, so missing data are not problematic • Standard error estimates are produced for each discrete raw score
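As a minimal sketch of what "probabilistic" means here: in the standard dichotomous Rasch model, the probability of endorsing an item depends only on the difference between the person measure and the item measure, both expressed in logits. The function name and the example values are illustrative assumptions, not taken from the study, and the rating scale analysis in these slides would use a polytomous extension of this form.

```python
import math


def rasch_probability(person_measure: float, item_measure: float) -> float:
    """Dichotomous Rasch model: P(endorse) = exp(b - d) / (1 + exp(b - d))."""
    return 1.0 / (1.0 + math.exp(-(person_measure - item_measure)))


# Hypothetical measures in logits, chosen only for illustration.
p_equal = rasch_probability(1.0, 1.0)    # person and item at the same location
p_easier = rasch_probability(1.0, -1.0)  # item located 2 logits below the person

print(round(p_equal, 3), round(p_easier, 3))
```

Because only the difference between the two measures enters the formula, shifting every person and every item by the same constant leaves all probabilities unchanged.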
Attitudes about differentiated compensation • Differentiated compensation: Range of incentives added to present compensation • Salary bonuses for teaching in critical shortage areas • Financial support for seeking advanced degrees • Participation in voluntary career advancement opportunities
Instrumentation and Sample • 10 Kentucky school districts involved in a differentiated compensation program pilot • University of Kentucky faculty constructed a pencil-and-paper survey instrument • Survey administered to four groups: teachers (n = 438), mentor teachers/achievement coaches (n = 60), principals (n = 63), and superintendents (n = 10)
“Prior to analysis, our preliminary ideas about items and persons we choose to study obligates us to form specific hypotheses about both items and persons.” (Wright & Stone, 2004)
Evaluating the quality of the instrument • Evaluate the coherence of the data (does a yardstick exist?) • Evaluate the rating scale structure (how accurate is the yardstick?) • Evaluate the individual items (can the yardstick be refined?)
1. Evaluate the coherence of the data (do I have a yardstick?) • Have items been keyed as intended? • Are there problems with the data coding? • Are the items measuring only one variable? • Are all items pointing in the same direction?
Kentucky Department of Education Differentiated Compensation Survey
2. Evaluate the rating scale structure (how accurate is my yardstick?) • Do the mean measures for the responses in each category increase as the categories step up the scale in the direction defined as “more”? • Do the categories fit the expectations of the model? • How are the respondents using the rating scale?
Do the mean measures for the responses in each category increase as the categories step up the scale in the direction defined as “more”?
Empirical item-category measures for administrators (measure axis from -2 to 5; category codes shown in their printed left-to-right order)
CATEGORIES | NUM | ITEM
12 3 4     | 17  | Non-cert Ts should pay all...
1 23 4     | 7   | linking teacher salary to...
1 2 3 4    | 11  | Students' standardized tes...
1 2 3 4    | 2   | DC would not enhance the p...
1 2 34     | 18  | When non-cert Ts are hired...
1 2 3 4    | 24  | If Ts receive a DC bonus...deserve it
3 24       | 33  | There is too much peer pre...
1 2 3 4    | 25  | The size of the salary bon...
342        | 19  | cert Ts are more effective...
1 2 3 4    | 8   | Ts receiving differentiate...
1 2 3 4    | 1   | DC will attract better qua...
3 2 4      | 32  | Ts believe their school stresses excellence...
3 4        | 31  | Ts feel a sense of ownership in student lrng
3 4        | 28  | Ts identify with their sch...
3 2 4      | 29  | Ts take pride in being a p...
3 4        | 30  | Ts feel a sense of ownership in school
3 4        | 27  | Improving knowledge and sk...
3 4        | 34  | Ts are encouraged to make suggestions
143        | 16  | All Ts should be required...
How are the respondents using the rating scale? Differentiated Compensation Survey, administrators. [Figure: category probabilities (modes), with structure measures at the intersections. Vertical axis: probability of response, 0 to 1.0. Horizontal axis: person measure minus item measure, -3 to 3. Categories: 1 = Strongly disagree, 2 = Disagree, 3 = Agree, 4 = Strongly agree.]
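Curves of this kind can be approximated from the structure measures reported for administrators in the category table (-1.12, -0.67, 1.79). The sketch below assumes the Andrich rating scale model, which the slides do not name explicitly; the function names are hypothetical.

```python
import math

# Structure measures (thresholds) for administrators, from the category table.
THRESHOLDS = [-1.12, -0.67, 1.79]


def category_probabilities(person_minus_item, thresholds=THRESHOLDS):
    """Return P(category 1..4) under the Andrich rating scale model (assumed)."""
    # Each log-numerator is the running sum of (measure - threshold);
    # the lowest category has an empty sum, i.e. 0.
    log_numerators = [0.0]
    for tau in thresholds:
        log_numerators.append(log_numerators[-1] + (person_minus_item - tau))
    numerators = [math.exp(v) for v in log_numerators]
    total = sum(numerators)
    return [n / total for n in numerators]


def most_probable_category(person_minus_item):
    """Return the label (1..4) of the modal category at this location."""
    probs = category_probabilities(person_minus_item)
    return probs.index(max(probs)) + 1


for x in (-3.0, -0.9, 0.0, 3.0):
    probs = category_probabilities(x)
    print(x, [round(p, 2) for p in probs], most_probable_category(x))
```

Under this model, adjacent categories are equally probable exactly at each structure measure, which is what "structure measures at intersections" refers to. Category 2 is modal only in the narrow interval between -1.12 and -0.67.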
Do the categories fit the expectations of the model? (administrators)
CATEGORY LABEL | SCORE | OBSERVED COUNT | OBSERVED % | OBSVD AVRGE | SAMPLE EXPECT | INFIT MNSQ | OUTFIT MNSQ | STRUCTURE MEASURE | CATEGORY MEASURE
1              | 1     | 93             | 4          | -.76        | -.85          | 1.12       | 1.39        | NONE              | (-2.54)
2              | 2     | 220            | 9          | .32         | .29           | .97        | .92         | -1.12             | -.96
3              | 3     | 982            | 40         | 1.31        | 1.36          | 1.00       | .96         | -.67              | .70
4              | 4     | 1145           | 47         | 2.60        | 2.57          | 1.01       | 1.00        | 1.79              | (2.95)
MISSING        |       | 8              | 0          | .78         |               |            |             |                   |
• Category fit is acceptable • Respondent use of the scale is almost dichotomous
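The checks behind these two conclusions can be sketched directly from the administrators' category table. The values below are transcribed from that table; the outfit limit of 2.0 is a common rule of thumb and an assumption here, since the slides do not state which criterion was used.

```python
# Values transcribed from the administrators' category table.
counts = {1: 93, 2: 220, 3: 982, 4: 1145}
observed_average = {1: -0.76, 2: 0.32, 3: 1.31, 4: 2.60}
outfit_mnsq = {1: 1.39, 2: 0.92, 3: 0.96, 4: 1.00}
structure_measure = {2: -1.12, 3: -0.67, 4: 1.79}  # category 1 has none


def strictly_increasing(values):
    """True when every value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


# Do average measures and thresholds advance with the categories?
averages_ordered = strictly_increasing(
    [observed_average[k] for k in sorted(observed_average)]
)
thresholds_ordered = strictly_increasing(
    [structure_measure[k] for k in sorted(structure_measure)]
)

# Assumed rule of thumb: category outfit mean-square below 2.0.
OUTFIT_LIMIT = 2.0
fit_ok = all(value < OUTFIT_LIMIT for value in outfit_mnsq.values())

# Share of non-missing responses falling in the top two categories.
total = sum(counts.values())
top_two_share = (counts[3] + counts[4]) / total

print(averages_ordered, thresholds_ordered, fit_ok, round(top_two_share, 2))
```

The large share of responses in categories 3 and 4 is one way to quantify the observation that administrators used the four-point scale almost dichotomously.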
3. Evaluate the individual items (how can I refine my yardstick?) • Do the items fall into the hypothesized hierarchy? • Do the items spread evenly across the intended range of the instrument? • Do any items clump at a point on the scale? • Which items are misfitting? Why might they be misfitting?
Do the items fall into the hypothesized hierarchy? (Teacher hierarchy)
Difficult to endorse
• Students’ standardized test scores would improve if a differentiated compensation program were adopted
• Differentiated compensation would not enhance the positive relationship among teachers and administrators
• A differentiated compensation program will positively affect teacher morale
• Relations between administrative and instructional staff will be negatively affected if a differentiated compensation program is adopted
• Differentiated compensation will have a negative impact on the morale of the teachers in the system
• Differentiated compensation programs recognize teacher contributions to student learning
• Differentiated compensation programs help recruit teachers who can improve student learning
• A differentiated compensation program will result in a higher teacher retention rate
• It is appropriate for teachers to receive bonuses to serve in rural schools classified as a “difficult assignment” or “hard-to-fill” position
• The size of the salary bonus I could receive to become certified in a critical shortage area or “difficult assignments” must be large enough to motivate me.
• A differentiated compensation program helps recruit better-qualified people to the teaching profession
• A differentiated compensation program helps retain teachers in critical shortage areas
• I identify with this school
• All teachers should be required to be certified to teach
• I take pride in being a part of this school
• Improving teachers’ knowledge and skills enhances student learning
• School districts should support teachers who are voluntarily advancing their careers
• I feel a sense of ownership in student learning
Easy to endorse
Do the items spread evenly across the intended range of the instrument? [Figure: map of persons and items on a common measure scale from -4 to 5. Persons are plotted on the left (<more> at the top, <less> at the bottom) and items on the right (<rare> at the top, <frequent> at the bottom).]
Which items are misfitting? Why might they be misfitting? Do these items tap the construct?
These items may have multiple interpretations: In which content areas? On which standardized tests? For all teachers and all content areas? What is “large enough to motivate”?
Less variability than the model would predict: redundancy or high agreement among respondents
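Misfit of this kind is usually summarized with infit and outfit mean-square statistics, where values below 1 indicate less variability than the model predicts and values above 1 indicate more. The sketch below applies the standard definitions to a single dichotomous item. The data and the function name are hypothetical, and the survey's four-category items would need expected scores and variances from a rating scale model instead.

```python
import math


def item_fit(responses, person_measures, item_measure):
    """Return (infit_mnsq, outfit_mnsq) for one dichotomous item."""
    squared_residuals = []
    variances = []
    for x, b in zip(responses, person_measures):
        p = 1.0 / (1.0 + math.exp(-(b - item_measure)))  # expected score
        variances.append(p * (1.0 - p))                  # model variance
        squared_residuals.append((x - p) ** 2)
    # Outfit: unweighted mean of squared standardized residuals.
    outfit = sum(r / w for r, w in zip(squared_residuals, variances)) / len(responses)
    # Infit: information-weighted mean square.
    infit = sum(squared_residuals) / sum(variances)
    return infit, outfit


# Hypothetical person measures (logits) and two response patterns.
persons = [-2.0, -1.0, 0.0, 1.0, 2.0]
too_predictable = [0, 0, 1, 1, 1]  # perfectly ordered by person measure
contradictory = [1, 1, 0, 0, 0]    # the reverse of what the model expects

overfit = item_fit(too_predictable, persons, item_measure=-0.5)
underfit = item_fit(contradictory, persons, item_measure=-0.5)
print([round(v, 2) for v in overfit], [round(v, 2) for v in underfit])
```

The perfectly ordered pattern yields mean-squares below 1 (overfit, as on this slide), while the reversed pattern yields values well above 1.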
“The problem of measurement, and especially of attaining interval scales, is an extremely serious one for the social and behavioral sciences. It is unfortunate that in their search for quantitative methods, researchers sometimes overlook the question of level of measurement and tend to read quite unjustified meanings into their results… However, the core problem of level of measurement lies outside the province of mathematics and statistics.” (Hays, 1988)
Educational Importance • The education community benefits from better-informed results when data are collected with a more valid and reliable instrument. • Rasch analysis offers a sound methodology for evaluating the quality of a measurement instrument.