IE 486 Work Analysis & Design II Instructor: Vincent Duffy, Ph.D. Associate Professor of IE Lecture 11 – Questionnaire Design & Evaluation Thurs. Feb. 22, 2007
Administrative • Briefly review QOTD answer sheet from L10 • Questionnaire design & analysis • (in preparation for Lab 3 beginning Friday) • Lab 2 due in class next Tuesday, Feb. 27
QOTD- Lecture 11 • Q.1. Briefly discuss what is meant by criteria, measures & dependability. • Q.2 What methods are appropriate for evaluating questionnaire data? • Q.3 Should the same analyses be used for nominal/category data and continuous data?
Questionnaire design • QOTD • included in slides • Methods in Work Analysis & Design • Dependability of measures • Evaluating questionnaire data • Factor analysis • Internal consistency • Questionnaires • examples & scales
Q.1 Briefly discuss what is meant by criteria, measures & dependability This is related to methods of data collection & evaluation • A criterion is an evaluation standard that can be used as a surrogate or correlate of outcome measures…such as system effectiveness, human performance and attitudes. • Eg. For driving performance we previously considered stopping distance and time to lane change (or lane deviations)
Q.1 Briefly discuss what is meant by criteria, measures & dependability • Important aspects of criteria include: • Relative strengths/weaknesses of the data collection methods • Balance between costs of methods & available resources • (eg. Consider motion capture test bed). • Costs include money, time, personnel & expertise
Q.1 Briefly discuss what is meant by criteria, measures & dependability • Relative strengths/weaknesses of the data collection methods (dependability) • Today, aspects of criteria & methods of evaluation • Dependability is related to validity & reliability • Brief intro to dependability and measures of validity & reliability – more in lab tomorrow • Important aspects of criteria include: • Relevance, Linearity and Homogeneity
Q.1 Briefly discuss what is meant by criteria, measures & dependability • Relevance • Evaluation criteria must contribute to the overall system mission • Eg. Speed, response time, errors, comfort, acceptance • Linearity – usually assumed • However, for industrial performance vs. intelligence, no relationship was shown initially • Additional analysis showed a piecewise-linear relationship: up to an IQ of 90, r=0.46; between 90 and 110, r=0.04; for IQ>110, r=-0.52 • (non-linear) IQ initially contributes to performance, then has no impact; too high an IQ appears related to boredom • [Figure: performance vs. IQ] • see also Salvendy & Carayon (1997) supplementary reading for more detail
Q.1 Briefly discuss what is meant by criteria, measures & dependability • Homogeneity – performance vs. time of day • Note how performance changes with time • there appears to be warm up & slow down for the same operator • The lack of consistency in performance over time of day raises issues of how/when to best collect data
Methods – evaluating questionnaire data • Q.2 Which statistical technique can be used to evaluate questionnaire data? • It will depend on the objective of the analysis...
Methods – evaluating questionnaire data • Which statistical technique becomes especially appropriate when questionnaire responses are measured on a nominal scale? • First, what is meant by nominal scale? • Quality categorized as high, medium, low (discrete data) • Machine breakdowns due to mechanical failure, electrical failure, or operator misuse • For analysis of discrete data you can use a Chi-Square analysis
Methods – evaluating questionnaire data • Q.3 Which statistical technique becomes especially appropriate when questionnaire responses are measured on a nominal scale? A Chi-Square analysis. For example, you can test whether color choice is related to gender (percentages are of the grand total):

          Female         Male           Total
Green     70 (38.9%)     40 (22.2%)     110 (61.1% chose green)
Blue      30 (16.7%)     40 (22.2%)      70 (38.9% chose blue)

Of those who chose green, 63.6% were female and 36.4% were male; of those who chose blue, 42.9% were female and 57.1% were male. Is there a significant difference in color choice, depending on gender? A chi-square statistic (shown after analysis of the original data set in SAS) gives χ²=7.5, p=0.006 (p<0.05). Hence, we would conclude yes.
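The chi-square result on this slide can be reproduced by hand. A minimal sketch in Python (standard library only), using the Pearson formula without Yates continuity correction, which is how the SAS figure appears to have been computed:

```python
import math

# Observed counts from the color-preference example:
#            Female  Male
#   Green      70     40
#   Blue       30     40
observed = [[70, 40], [30, 40]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Pearson chi-square: sum over cells of (O - E)^2 / E,
# where E = row total * column total / grand total
chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi2 += (obs - expected) ** 2 / expected

# For a 2x2 table df = 1, where the chi-square survival function
# reduces to erfc(sqrt(x / 2)) -- no stats package needed.
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

This reproduces χ² ≈ 7.5 and p < 0.01, matching the slide's conclusion that color choice depends on gender.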
Methods – evaluating questionnaire data • Q.3 How might a response scale be designed to maximize the chance that responses (on a questionnaire) will be on a ratio measurement scale? • Use continuous, numerical, and anchored scales and do pilot testing – the average of questions/items can give a continuous measure. • A factor analysis will likely be done before an ANOVA (test of differences). • Factor analysis tries to find a factor (can think of it as a new variable) that will provide the highest set of correlations with the original variables (squares of these variables) producing the largest eigenvalue.
Examples of Scales • Eg. Range from Very important – Very unimportant • With Neither important nor unimportant in the middle – could have ‘moderately’ important in between. • Or …Strongly agree, agree, undecided, disagree, strongly disagree • Very much, much, fair, a little, not at all • Much worse than usual, worse than usual, about the same, better than usual, much better than usual • Excellent, very good, good, fair, poor • All, most, a good bit, some, a little, none • Always, very often, fairly often, sometimes, almost never, never
The Hackman & Oldham Job Satisfaction Survey (1975) How satisfied are you with this aspect of your job? __1.The amount of job security I have. __2. The amount of pay and fringe benefits I receive __3. The amount of personal growth and development I get in doing my job. __4. The people I talk to and work with on my job. __5. The degree of respect and fair treatment I receive from my boss. __6. The feeling of worthwhile accomplishment I get from doing my job. __7. The chance I get to know other people while on the job.
Methods – evaluating questionnaire data • Check eigenvalues of each factor (or variable) before deciding how many variables to include/consider - Eigenvalues should be greater than 1 for each factor that you include • Check percent of variance explained by each factor by taking the eigenvalue divided by the # of items • For example, if the eigenvalue (from linear algebra) is 3.66 with 6 questions (max. 6 units in the eigenvalue) – • then 3.66/6 or 0.61 (61%) of total variance in the questionnaire (whole questionnaire) is explained by the first factor. • You would expect a sum of at least 50% of the variance to be explained by the factors you have chosen to represent important variables.
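As an illustration of the eigenvalue-greater-than-1 rule and the variance-explained arithmetic, consider a hypothetical two-item questionnaire: the 2×2 correlation matrix [[1, r], [r, 1]] has eigenvalues 1 + r and 1 − r exactly, so the calculation can be checked without a linear-algebra library. The value r = 0.6 below is assumed for illustration only:

```python
# Hypothetical two-item questionnaire: the correlation matrix
# [[1, r], [r, 1]] has eigenvalues 1 + r and 1 - r (exact result),
# so no linear-algebra library is needed for this illustration.
r = 0.6          # assumed inter-item correlation (illustrative only)
n_items = 2
eigenvalues = [1 + r, 1 - r]

for i, ev in enumerate(eigenvalues, start=1):
    pct = ev / n_items   # fraction of total variance = eigenvalue / # items
    keep = ev > 1        # Kaiser criterion: retain factors with eigenvalue > 1
    print(f"Factor {i}: eigenvalue = {ev:.2f}, "
          f"variance explained = {pct:.0%}, retain = {keep}")
```

With the slide's numbers, the same arithmetic gives 3.66 / 6 = 0.61, i.e. 61% of total questionnaire variance explained by the first factor.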
Methods – evaluating questionnaire data • An illustration of a principal components factor analysis with varimax rotation – item coefficients are shown below (look for coefficients > 0.4) • The values shown below are after rotation – and are easier to interpret

                                 Factor 1   Factor 2
Ques. 1 Feel blue                 -0.898      0.047
Ques. 2 People stare at me        -0.165      0.935
Ques. 3 People follow me          -0.222      0.926
Ques. 4 Basically happy            0.905     -0.279
Ques. 5 People want to hurt me    -0.549      0.544
Ques. 6 Enjoy going to parties     0.647     -0.302

Note: an item/question that loads on two different factors (such as question 5) would likely be dropped from further consideration; these factor analyses tend to be more useful when n is large (eg. n>100 participants)
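The drop-items-that-cross-load rule from the note above can be sketched programmatically. The loadings are the ones from the slide's table, and the 0.4 cutoff is the slide's rule of thumb:

```python
# Rotated item coefficients from the loadings table: (factor 1, factor 2)
loadings = {
    "Q1 Feel blue":              (-0.898,  0.047),
    "Q2 People stare at me":     (-0.165,  0.935),
    "Q3 People follow me":       (-0.222,  0.926),
    "Q4 Basically happy":        ( 0.905, -0.279),
    "Q5 People want to hurt me": (-0.549,  0.544),
    "Q6 Enjoy going to parties": ( 0.647, -0.302),
}
CUTOFF = 0.4  # rule of thumb from the slide: |loading| > 0.4 is salient

assignment = {}      # item -> the single factor it loads on
cross_loading = []   # items salient on more than one factor (drop candidates)
for item, lds in loadings.items():
    salient = [k + 1 for k, ld in enumerate(lds) if abs(ld) > CUTOFF]
    if len(salient) > 1:
        cross_loading.append(item)
    elif salient:
        assignment[item] = salient[0]

print("assignment:", assignment)
print("cross-loading (drop candidates):", cross_loading)
```

Only question 5 exceeds the cutoff on both factors, so it is the lone drop candidate, consistent with the note on the slide.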
• Anchor • Avoid ambiguity • Avoid leading questions • Measure component • Measure whole • Sensitivity of scale • Halo effect • Dependability • Structured questionnaire
Internal Consistency – testing reliability of the measures • Internal consistency is the extent to which tests or procedures assess the same construct. • It is a measure of the precision between the observers or of the measuring instruments used in a study. • Cronbach’s Alpha measures how well a set of items (or variables) measures a single latent construct. • Cronbach’s Alpha can be used as a measure of internal consistency.
General steps toward analysis using a questionnaire • Step 1: Conduct the survey using the structured questionnaire • Step 2: Analyze the collected data • Step 3: Make recommendation; report, presentation of results
Examples of inferences that can be drawn from questionnaire data • 1) Which generic features (across manufacturers and models) are liked most, disliked most, or cause the greatest difficulty in usage? • 2) Which manufacturer, and which model from that manufacturer, is preferred with regard to each feature by the surveyed customers? • 3) Which generic features (within manufacturers) are liked most, disliked most, or cause the greatest difficulty in usage?
Can a measure be valid if it is not reliable? • A measure cannot be valid if it is not reliable. • If we cannot measure it consistently, it is hard to imagine that it can be correct. • Shown quantitatively… • In lab, more on…Reliability of Predictors and Criterion
Criterion related validity • Reliability of Predictors and Criterion • R0 = Observed correlation (Validity) between the predictor and criterion • RT = “True” correlation (Validity) between the predictor and criterion • The “True” correlation is one previously reported – possibly in the literature. • Rp = Reliability of predictor • Rc = Reliability of criterion
Impact of Reliability on Validity • Suppose there are four tests in a battery with reliabilities of R1=0.60, R2=0.70, R3=0.78 and R4=0.92 • (eg. These could be internal consistency reliabilities); • and three criteria are utilized with the following reliabilities: r1=0.45, r2=0.60, and r3=0.75. • In other words, these reliabilities (r) are believed to be the relationships between two (or more) measures intended to be measuring the same criterion • Then: (part of Lab exercise tomorrow)
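The lab exercise itself is not shown here, but a standard way to combine these reliabilities is the classical correction for attenuation, RT = R0 / √(Rp·Rc), which also bounds the observed validity at √(Rp·Rc). A sketch under that assumption, using the reliabilities from this slide (the observed validity R0 = 0.30 is a hypothetical value, not from the slide):

```python
import math

# Reliabilities from the slide
predictor_rel = [0.60, 0.70, 0.78, 0.92]   # four tests in the battery
criterion_rel = [0.45, 0.60, 0.75]         # three criteria

# Classical bound: observed validity R0 cannot exceed sqrt(Rp * Rc)
for i, rp in enumerate(predictor_rel, start=1):
    for j, rc in enumerate(criterion_rel, start=1):
        max_r0 = math.sqrt(rp * rc)
        print(f"Test {i} x criterion {j}: max observed validity = {max_r0:.2f}")

# Correction for attenuation: RT = R0 / sqrt(Rp * Rc).
# R0 = 0.30 is a hypothetical observed validity for illustration.
r0 = 0.30
r_true = r0 / math.sqrt(0.60 * 0.45)
print(f"RT = {r_true:.2f}")   # prints RT = 0.58
```

Note how low reliabilities cap attainable validity: with Rp=0.60 and Rc=0.45 the observed correlation can never exceed √0.27 ≈ 0.52, no matter how good the predictor really is.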
Cronbach’s Alpha as a measure of Internal Consistency • α = σ²p / (σ²p + σ²res / k) • k is the number of items in the group. • σ²res is the variance of the residual component, which cannot be controlled. • σ²p is the variance component for persons. • * The alpha coefficient is interpreted as the ratio of true score variance to observed score variance.
Cronbach’s Alpha as a measure of Internal Consistency • For example, suppose 13 people were asked to rate a pair of questions on a 7-point scale. • The pair of questions look different but they are testing the same item. • For example • How much do you like the weather today? • How do you feel about the weather today?
Layout of Data Sheet – calculation as part of lab exercise tomorrow

People         Observed score        Total (Yi.)
                 Q1      Q2
p1                7       6              13
p2                3       5               8
p3                3       3               6
p4                4       3               7
p5                5       5              10
p6                5       3               8
p7                6       6              12
p8                5       3               8
p9                5       5              10
p10               1       2               3
p11               7       6              13
p12               2       2               4
p13               4       5               9
Total (Y.j)      57      54             111 (Y..)
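Alpha for this data sheet can be computed with the common item-variance form of the coefficient, α = k/(k−1) × (1 − Σ item variances / variance of total scores), which is algebraically equivalent to the variance-components form above. A sketch using the 13 × 2 scores:

```python
# Scores from the data sheet: 13 people x 2 questions
q1 = [7, 3, 3, 4, 5, 5, 6, 5, 5, 1, 7, 2, 4]
q2 = [6, 5, 3, 3, 5, 3, 6, 3, 5, 2, 6, 2, 5]

def sample_var(xs):
    """Sample variance with n - 1 in the denominator."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

k = 2  # number of items
totals = [a + b for a, b in zip(q1, q2)]  # each person's total score (Yi.)

# alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)
alpha = (k / (k - 1)) * (
    1 - (sample_var(q1) + sample_var(q2)) / sample_var(totals)
)
print(f"alpha = {alpha:.2f}")   # prints alpha = 0.86
```

An alpha of about 0.86 suggests the two differently worded weather questions are indeed measuring the same item consistently.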
Questionnaire Design • Supplemental material in R.W. Bailey, Human Performance Engineering 3rd Ed. pp. 559-568 (Appendix). • And Webpage as shown: http://www.ucc.ie/hfrg/resources/qfaq1.html