Contextualizing The Meaning of Probabilities C. Y. Joanne Peng, Anne Buu, and Bernard Flury Indiana University-Bloomington. Paper presented at the 1998 Taipei International Statistical Symposium August 15-17, 1998, Taipei, Taiwan, Republic of China
A copy of this paper and its accompanying handout can be located at http://www.indiana.edu/~jopeng/98tiss/
Research Questions • 1. What is the effect of the magnitude of numbers in the fractional restatement on the use of probability expressions by prospective teachers? • 2. To what degree did prospective teachers’ expectations of the real probability of an event influence their use of probability expressions? • 3. What numeric meanings can be derived from probability expressions in the numeric-to-lexical mappings within educational contexts? • 4. What are the differences between scaled values of probability expressions estimated by the least squares method and those estimated by logit modeling? • 5. Are there any inconsistencies in results between the 1992 and the 1998 studies?
Subjects • 1992 Sample • 188 undergraduates enrolled at a large mid-western university. • 81% were education majors. • 33 (18%) were male and 155 (82%) were female. • Age ranged from 18 to 44 years old with 96% falling into the traditional student range of 18-22 years old. • Approximately 65% planned to teach at the elementary school level. • Over 70% were sophomores. • 1998 Sample • 100 undergraduates enrolled at the same university. • 93% were education majors. • 18 (18%) were male and 82 (82%) were female. • Age ranged from 18 to 28 years old with 98% falling into the traditional student range of 18-22 years old. • Approximately 83% planned to teach at the elementary school level. • Over 70% were sophomores.
Instrument • An interactive computer program developed by Peng and Bolte (1992) was used for stimulus presentation and data recording. • Three tasks were administered in this sequence: (1) the expectation task, (2) the paired comparison task, and (3) the probability restatement task. • When the instrument was administered, the order of presentation of stimuli within each task was randomized by the computer program. • In general, subjects were able to complete all tasks in less than 20 minutes.
Procedure: the First (Expectation) Task • The first (expectation) task asked subjects to indicate their expectations regarding the reported percentage in 21 educationally related statements, which were extracted from reputable references. • Each statement contained a numeric percentage that was near one of the medians (7%, 22%, 36%, 50%, 64%, 79%, and 93%) of 7 intervals that equally divided percentages from 0 to 100. • The numeric percentage in each statement had been judged to be equal to, higher than, or lower than what was expected by the majority of the subjects in the pilot study. Thus, in each of the 7 probability intervals, three statements were presumed to carry different expectation information.
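The seven interval centers can be reproduced with a few lines of arithmetic. The sketch below assumes the intervals are equal-width slices of 0 to 100 and that "median" means the center of each slice; the function name `interval_midpoints` is hypothetical. Under that assumption the second center comes out near 21.4%, so the 22% listed above may reflect a slightly different rounding or definition in the original.

```python
def interval_midpoints(n_intervals=7, upper=100.0):
    """Centers of n equal-width intervals covering 0..upper, rounded to 0.1."""
    width = upper / n_intervals
    return [round((k + 0.5) * width, 1) for k in range(n_intervals)]


if __name__ == "__main__":
    for k, mid in enumerate(interval_midpoints(), start=1):
        print(f"interval {k}: center near {mid}%")
```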
21 Educationally Related Statements • (1E) Of those children in public school, 12% are in federally supported special education programs. • (1H) Of parents who abuse their child, 7% will improve without outside assistance. • (1L) In high school, girls constituted 7% of interscholastic sports participants in 1971. • (2E) It is reported that 20% of anorexics starve to death. • (2H) Students who plan to go to a four-year college take 29% of the vocational education courses offered in high school. • (2L) The average elementary teacher spends 22% of the workday interacting verbally with students.
21 Educationally Related Statements (continued) • (3E) Of adolescent girls who become pregnant, 37% do so in the first three months of sexual activity. • (3H) Of those children who are homeless, 38% attend school on a regular basis. • (3L) In a national poll, 36% of the parents said they would not like their child to take up teaching as a career. • (4E) Baccalaureate degrees are obtained within 4 years of graduation from high school by 49% of college students. • (4H) Minority high school graduates go on to college at a rate of 48%. • (4L) Teacher salaries were considered too low by 50% of respondents in a Gallup poll last year. • (5E) Occasional use of cocaine is believed to be a great risk by 65% of high school seniors. • (5H) White students make up 66% of high school dropouts. • (5L) Corporal punishment by teachers should not be allowed in the opinion of 67% of Americans.
21 Educationally Related Statements (continued) • (6E) When asked about sex education, 72% of respondents said they would require it for high school students. • (6H) By the year 2000, 75% of jobs will require less than a college degree. • (6L) Of the respondents to a recent Gallup survey, 77% said they would require AIDS education for high school students. • (7E) Of high schoolers responding to a national survey, 93% said they believe their peers cheat sometimes. • (7H) Three years ago, 91% of Americans between the ages of 25 and 29 held high school diplomas or GED (General Education Development) certificates. • (7L) In the United States, 90% of the households have a television.
Procedure: the Second (Paired Comparison) Task • Each of the 11 most commonly used probability expressions was paired with the remaining 10. Subjects were asked to choose, in each pair, the expression that implies a greater likelihood of occurrence for an event. • Point and interval estimates of medians from Reagan et al. (1989): • “almost impossible” (2%, 0%-5%) • “very unlikely” (10%, 2%-15%) • “improbable” (15%, 10%-20%) • “unlikely” (15%, 10%-25%) • “possible” (40%, 40%-70%) • “an even chance” (50%, 45%-55%) • “probable” (70%, 60%-75%) • “likely” (70%, 65%-85%) • “very probable” (80%, 75%-90%) • “very likely” (85%, 80%-90%) • “almost certain” (90%, 90%-100%)
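Pairing each of the 11 expressions with the remaining 10 yields 11 × 10 / 2 = 55 unordered pairs. A minimal sketch of how such a stimulus list could be enumerated follows; the variable names are illustrative and this is not the original program by Peng and Bolte.

```python
from itertools import combinations

# The 11 probability expressions listed on the slide, lowest to highest.
EXPRESSIONS = [
    "almost impossible", "very unlikely", "improbable", "unlikely",
    "possible", "an even chance", "probable", "likely",
    "very probable", "very likely", "almost certain",
]

# Every unordered pair appears exactly once.
PAIRS = list(combinations(EXPRESSIONS, 2))

if __name__ == "__main__":
    print(len(PAIRS), "pairs; first pair:", PAIRS[0])
```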
Procedure: the Third (Probability Restatement) Task • The third (probability restatement) task asked subjects to select the one probability expression (out of 11) that best conveyed the meaning of the percentage (numeric probability) embedded in each of the 21 educational statements. • About a third of the subjects were presented the 21 statements without any fractional restatement (i.e., exactly the same as the 21 statements used in the expectation task). For example, “In a national poll, 36% of the parents said they would not like their child to take up teaching as a career.” • Another third were presented the statements with fractional restatements using 100 as the denominator. For example, “In a national poll, 36% of the parents - or 36 of every 100 - said they would not like their child to take up teaching as a career.” • The remaining third were presented the statements with fractional restatements using 100,000 as the denominator. For example, “In a national poll, 36% of the parents - or 36000 of every 100000 - said they would not like their child to take up teaching as a career.”
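The three presentation conditions differ only in the clause inserted after the percentage. The sketch below rebuilds the example statement for each condition; the function names are hypothetical, whole-number percentages are assumed, and the wording is copied from the slide's example rather than from the original instrument.

```python
def fractional_clause(percent, denominator=None):
    """Return the restatement clause, or an empty string for the control condition."""
    if denominator is None:
        return ""
    count = percent * denominator // 100  # assumes a whole-number percent
    return f" - or {count} of every {denominator} -"


def build_statement(percent, denominator=None):
    """Rebuild the slide's example statement for one presentation condition."""
    return (
        f"In a national poll, {percent}% of the parents"
        f"{fractional_clause(percent, denominator)}"
        " said they would not like their child to take up teaching as a career."
    )


if __name__ == "__main__":
    for denom in (None, 100, 100000):
        print(build_statement(36, denom))
```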
Results: the Effect of the Magnitude of Numbers in the Fractional Restatement on the Use of Probability Expressions • This context effect was analyzed statement by statement. • For each of the 21 statements, a 3 by 11 contingency table was constructed to test whether the magnitude of numbers in the fractional restatement had an impact on the subject’s selection of probability expressions. • Because the ratio of our 1998 sample size (100) to the number of cells (33) was low (about 3, less than 5), the asymptotic chi-square tests were not valid; Fisher’s exact test was therefore adopted to test the association (Agresti, 1990). • The p-values for the 21 contingency tables were all nonsignificant at the .05 level. Based on these results, we concluded that this context effect was minimal and ignored it in all subsequent analyses.
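When cell counts are too sparse for the asymptotic chi-square test, the association in a 3 by 11 table can also be checked by resampling. The sketch below is a Monte Carlo permutation test on the Pearson chi-square statistic. It is a stand-in that approximates an exact conditional test, not Fisher's exact test itself and not the authors' procedure. The function names and the coding of conditions and expressions as integer labels are assumptions made for illustration.

```python
import random


def chi_square_stat(rows, cols, n_rows, n_cols):
    """Pearson chi-square statistic for paired integer category labels."""
    n = len(rows)
    counts = [[0] * n_cols for _ in range(n_rows)]
    for r, c in zip(rows, cols):
        counts[r][c] += 1
    row_tot = [sum(row) for row in counts]
    col_tot = [sum(counts[r][c] for r in range(n_rows)) for c in range(n_cols)]
    stat = 0.0
    for r in range(n_rows):
        for c in range(n_cols):
            expected = row_tot[r] * col_tot[c] / n
            if expected > 0:  # empty rows or columns contribute nothing
                stat += (counts[r][c] - expected) ** 2 / expected
    return stat


def permutation_p_value(rows, cols, n_rows, n_cols, n_perm=2000, seed=0):
    """Share of label shuffles whose statistic is at least the observed one."""
    rng = random.Random(seed)
    observed = chi_square_stat(rows, cols, n_rows, n_cols)
    shuffled = list(cols)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # keeps both margins fixed
        if chi_square_stat(rows, shuffled, n_rows, n_cols) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Here `rows` would hold each subject's restatement condition (0, 1, or 2) and `cols` the index of the expression chosen (0 to 10), with one call per statement.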
Results: the Effect of Subjects’ Expectation of the Real Probability of the Event on Their Use of Probability Expressions • An analysis of the expectations subjects reported in 1998 for the percentage in each of the 21 statements in the first task revealed that most expectations agreed with those collected in the 1992 study. • In both studies, 18 out of 21 statements (86%) were expected in the presumed direction. This supported our manipulation, which was intended to evoke different expectations in subjects about the likelihood of the events stated in the 21 statements. • This context effect of expectation is presented in Figures 1A through 11A for the 1998 data and in Figures 1B through 11B for the 1992 data.
Results: Numeric Meanings of Probability Expressions in the Numeric-to-lexical Mapping within Educational Contexts • “Almost impossible” and “very unlikely” anchored the low end of numerical probabilities; when contexts were considered, “almost impossible” was judged to convey lower probabilities than “very unlikely.” • Expressions incorporating the stem “probable” seemed to be used less frequently than expressions incorporating the stem “likely.” • “Unlikely” and “possible” both expressed a broad range of probabilities less than 50%. • On the scale of 0% to 100%, “an even chance” had an intrinsic meaning of 50%. • “Probable” and “likely” both corresponded to the two intervals adjacent to the middle interval (centered around 50%) while meanings of “likely” gravitated toward slightly higher probabilities than “probable.” • While “very probable” and “very likely” both corresponded to a broad range of probabilities larger than 50%, “very likely” peaked around 79%. • “Almost certain” anchored the high end of the probability scale.
Two Statistical Methods Used to Analyze the Paired Comparison Data • The first method was least squares unidimensional scaling applied to incomplete data, assuming a constant variance of discriminal differences for all pairs of stimuli (Gulliksen, 1956; Torgerson, 1958). This assumption is identical to Thurstone’s case V or Torgerson’s condition C (Thurstone, 1927; Torgerson, 1958). In the 1998 data, “likely” was judged to imply greater probability than “very unlikely” by all subjects, so this observed proportion (1.00) could not be transformed to the corresponding unit normal deviate. Thus, the solution for incomplete data was used. • The second method was logit modeling using 11 indicator variables to model a total of 55 (11*10/2) contrasts within 55 pairs of probability expressions (Agresti, 1990; Bradley & Terry, 1952; Flury, 1997). For each pair of expressions i and j, suppose expression i is chosen nij times (out of Nij) over the other expression j. When the Nij comparisons are independent, with the same probability Aij applying to each comparison, nij has a binomial (Nij, Aij) distribution. If the comparisons for different pairs of expressions are also assumed to be independent, logit modeling can be applied to these paired comparison data.
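Both scaling approaches can be sketched in a few dozen lines. The code below is a minimal illustration under stated assumptions, not the authors' software. `least_squares_scale` converts choice proportions to unit normal deviates, skips pairs with proportions of exactly 0 or 1 (the incomplete-data case), and solves the least squares normal equations by simple iteration. `bradley_terry_logit_scale` fits the Bradley-Terry form of the logit model, logit(Aij) = b_i - b_j, with a standard iterative update. Both function names are hypothetical, and the second assumes every expression is chosen at least once and passed over at least once.

```python
import math
from statistics import NormalDist


def least_squares_scale(prop, n_iter=500):
    """Case V scale values; prop[i][j] = share judging i greater than j."""
    k = len(prop)
    z = {}
    for i in range(k):
        for j in range(k):
            # Proportions of 0 or 1 have no finite normal deviate: skip them.
            if i != j and 0.0 < prop[i][j] < 1.0:
                z[(i, j)] = NormalDist().inv_cdf(prop[i][j])
    s = [0.0] * k
    for _ in range(n_iter):
        for i in range(k):
            nbrs = [j for j in range(k) if (i, j) in z]
            if nbrs:
                # Normal equation for s[i] given the other scale values.
                s[i] = sum(z[(i, j)] + s[j] for j in nbrs) / len(nbrs)
        mean = sum(s) / k
        s = [v - mean for v in s]  # fix the origin at the mean
    return s


def bradley_terry_logit_scale(wins, n_iter=200):
    """Logit scale values; wins[i][j] = times i was chosen over j."""
    k = len(wins)
    p = [1.0] * k
    for _ in range(n_iter):
        updated = []
        for i in range(k):
            total_wins = sum(wins[i][j] for j in range(k) if j != i)
            denom = sum((wins[i][j] + wins[j][i]) / (p[i] + p[j])
                        for j in range(k) if j != i)
            updated.append(total_wins / denom)
        log_mean = sum(math.log(v) for v in updated) / k
        p = [v / math.exp(log_mean) for v in updated]  # logs sum to zero
    return [math.log(v) for v in p]
```

With real data, `prop` and `wins` would each be 11 by 11, built from the 55 paired comparisons. The two sets of scale values are on different units (normal deviates versus logits), so they are comparable in ordering and spacing rather than in raw magnitude.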
Results: the Comparison of the Least Squares Scales and the Logit Scales • Least Squares & Logit Scales for the 1992 & the 1998 Paired-Comparison Data
Results: the Comparison of the Least Squares Scales and the Logit Scales • Correlations between Paired-Comparison Scales
Results: the Consistency between the 1992 and the 1998 Study • There were no differences between the 1992 and the 1998 findings regarding the context effects, numeric meanings of probability expressions, and paired comparison scales. This observation suggested consistency in subjects’ behavior between years.