  1. The Statistics Concept Inventory (SCI) • ARTIST Roundtable Conference, Appleton, WI • August 4, 2004 • Teri Reed Rhoads¹, Teri J. Murphy², Robert Terry³, Kirk Allen¹, Andrea Stone², and Marie Cohenour³ • University of Oklahoma: ¹School of Industrial Engineering, ²Department of Mathematics, ³Department of Psychology

  2. Background • Statistics Concept Inventory (SCI) project began in Fall 2002 • NSF Assessment of Student Achievement Grant #0206977; PI: Teri Reed Rhoads, co-PI: Teri J. Murphy • Based on the format of the Force Concept Inventory (FCI) by Hestenes and Halloun • Shifts focus away from problem solving, the typical classroom format, toward conceptual understanding • Multiple-choice format, with a goal of 30 items

  3. SCI Pilot Study I (2003 FIE) • Pilot version of the instrument tested in 6 classes near the end of the Fall 2002 semester • Intro classes: Engineering, Communications, Mathematics (2 sections) • Advanced classes: Regression, Design of Experiments • 32 questions • Results • Males significantly outperformed females on the SCI • Mathematics majors outperformed social science majors, but no other pair of majors differed significantly • Most of the social science majors were in a class with poor testing conditions, which may explain their low scores; in addition, their course was not calculus-based • SCI scores correlated positively with statistics experience and with a statistics attitudinal measure

  4. SCI Pilot Study II (2004 ASEE) • Instrument revised in Spring 2003 based on the Fall 2002 pilot • Reported results from the revised SCI administered in Summer 2003 and Fall 2003 • Mathematics and Engineering Statistics courses at OU • Incorporated 3 "outside" schools: 2 colleges of engineering and one regional college mathematics statistics course • 34 questions, 322 total post-semester responses • Focus on assessing and improving the validity, reliability, and discriminatory power of the instrument • Utilized focus groups, open-ended responses, psychometric analyses, and expert opinions

  5. SCI Pilot Study II (2004 ASEE) Validity • Content validity is the extent to which items are (1) representative of the knowledge base being tested and (2) constructed in a "sensible" manner (Nunnally) • Assessed using focus groups, a modified Delphi survey of faculty, AP Statistics Course Outlines, and Gibb's Criteria • Concurrent validity is "assessed by correlating the test with other tests" (Klein) • For the "other test," we used overall course grade; the SCI was found valid in engineering courses but not in mathematics courses, in both Summer 03 (2 courses) and Fall 03 (5 courses) • Predictive validity is the test's ability to accurately predict future performance • Used pre-test results to predict performance in the course • Summer 03 lacked predictive validity for both the Engr and Math courses (pre-test scores correlated positively with course grade for Engr and negatively for Math, neither significantly); Fall 2003 showed predictive validity for 2 Engr courses (one external) but not for Math or the second external Engr course (a correlation sketch follows below)
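
A minimal sketch of the correlation behind the concurrent and predictive validity checks, assuming hypothetical score and grade arrays (the real class data are not reproduced here):

    # Concurrent validity: correlate SCI post-test scores with course grades.
    # Predictive validity uses pre-test scores in place of post-test scores.
    from scipy.stats import pearsonr

    sci_post = [22, 18, 25, 14, 20, 27, 16, 23]  # hypothetical SCI post-test scores
    grades = [88, 75, 92, 60, 81, 95, 70, 85]    # hypothetical final course grades (%)

    r, p = pearsonr(sci_post, grades)
    print(f"r = {r:.2f}, p = {p:.3f}")  # a significant positive r supports validity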

  6. SCI Pilot Study II (2004 ASEE) Reliability • A reliable instrument is one in which measurement error is small; equivalently, the extent to which the instrument is repeatable • Most commonly measured using internal consistency • Coefficient alpha, usually Cronbach's alpha, which generalizes the Kuder-Richardson formula 20 (KR-20) used for dichotomous variables (an alpha sketch follows below) • Alpha above 0.80 is reliable by virtually any standard • Alpha of 0.60 to 0.80 is considered reliable for classroom tests according to some references (e.g., Oosterhof) • In Summer 03, the SCI was generally reliable as a post-test (end of semester) but not as a pre-test • Cause for concern: low alpha at External #1, the first administration outside OU • Fortunately, Fall 2003 showed that alpha is more consistent across universities, although a new test site had a somewhat lower alpha • More data needed to draw firm conclusions
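
Since the slides lean on Cronbach's alpha, here is a minimal sketch of the computation for a dichotomous (0/1) response matrix, where it coincides with KR-20; the matrix below is made up, not SCI data:

    import numpy as np

    responses = np.array([  # rows = students, columns = items (1 = correct)
        [1, 1, 1, 1],
        [1, 1, 0, 1],
        [0, 1, 1, 0],
        [0, 0, 0, 1],
        [0, 0, 0, 0],
    ])

    k = responses.shape[1]                         # number of items
    item_var = responses.var(axis=0, ddof=1)       # per-item score variance
    total_var = responses.sum(axis=1).var(ddof=1)  # variance of students' total scores
    alpha = (k / (k - 1)) * (1 - item_var.sum() / total_var)
    print(f"Cronbach's alpha = {alpha:.2f}")       # ~0.69: the "classroom test" range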

  7. SCI Pilot Study II (2004 ASEE) Discrimination • Discriminatory Power • Ferguson's delta describes the test's ability to produce a range of scores • Above 0.90 is considered acceptable • All courses are above 0.90; most are even above 0.93 • Discriminatory Index • Compares the top quartile to the bottom quartile on each item within the instrument • Roughly 1/3 of the items fall into each of the ranges poor (< 0.20), moderate (0.20 to 0.40), and high (> 0.40) • Plan to remove several advanced questions, which usually discriminate poorly, so this distribution should improve (both measures are sketched below)
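
Both discrimination measures are easy to state in code; this sketch assumes a 0/1 item-response matrix like the one above and is illustrative only:

    import numpy as np

    def fergusons_delta(total_scores, n_items):
        """Ferguson's delta: 1.0 means total scores spread uniformly over 0..k."""
        n, k = len(total_scores), n_items
        freqs = np.bincount(np.asarray(total_scores), minlength=k + 1)
        return (k + 1) * (n**2 - np.sum(freqs**2)) / (k * n**2)

    def discrimination_index(responses):
        """Per-item proportion correct: top quartile minus bottom quartile."""
        totals = responses.sum(axis=1)
        order = np.argsort(totals)
        q = max(len(totals) // 4, 1)
        low, high = responses[order[:q]], responses[order[-q:]]
        return high.mean(axis=0) - low.mean(axis=0)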

  8. SCI Pilot Study II (2004 ASEE) Item Analysis • Discriminatory index • Alpha-if-deleted • Reported by SPSS or SAS • Shows how the overall alpha would change if that one item were deleted (sketched below) • Answer distribution • Try to eliminate or improve choices that are consistently not chosen • Focus group comments and results from open-ended responses inform the distracters • Recently added bifactor results
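
A sketch of the alpha-if-deleted statistic named above (the SPSS/SAS output is not reproduced; this only shows the computation), reusing the alpha formula from the reliability sketch:

    import numpy as np

    def cronbach_alpha(mat):
        k = mat.shape[1]
        return (k / (k - 1)) * (1 - mat.var(axis=0, ddof=1).sum()
                                / mat.sum(axis=1).var(ddof=1))

    def alpha_if_deleted(mat):
        # If dropping item j raises alpha, item j is a candidate for revision.
        return [cronbach_alpha(np.delete(mat, j, axis=1)) for j in range(mat.shape[1])]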

  9. Item Analysis • If P(A|B) = 0.70, what is P(B|A)? a) 0.70 b) 0.30 c) 1.00 d) 0 e) Not enough information (** correct **) f) Other: __________________________ • A question that was completely replaced • Fall 2002: • Discriminatory index poor (0.16) • Alpha-if-deleted above the overall alpha (i.e., deleting the item would increase alpha) • Too symbol-oriented, not focused on the concept • Topic of conditional probability too important to delete (faculty survey); a Bayes' theorem illustration follows below
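
To see why "not enough information" is the keyed answer: by Bayes' theorem, P(B|A) = P(A|B) P(B) / P(A), so P(A|B) alone cannot determine P(B|A). A short numerical illustration (the probabilities below are invented):

    # Two consistent scenarios with the same P(A|B) = 0.70:
    for p_b, p_a in [(0.10, 0.50), (0.40, 0.35)]:
        print(f"P(B)={p_b}, P(A)={p_a} -> P(B|A) = {0.70 * p_b / p_a:.2f}")
    # Prints 0.14 and 0.80: same P(A|B), very different P(B|A).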

  10. Item Analysis • Replacement item • In a manufacturing process, the error rate is 1 in 1000. However, errors often occur in bursts. Given that the previous output contained an error, what is the probability that the next unit will also contain an error? • a) Less than 1 in 1000 • b) Greater than 1 in 1000 (** correct **) • c) Equal to 1 in 1000 • d) Insufficient information

  11. Item Analysis • Summer 2003: • Three of four classes had discriminatory indices above 0.30 (max 0.55) • The same three classes also showed a positive effect on alpha • Focus groups: students commented on the "non-memoryless" property and that bursts would "throw off the odds" (a burst simulation is sketched below) • Possible problem: some students chose D because they were unsure how a "burst" is defined
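
The "non-memoryless" intuition can be made concrete with a small Monte Carlo sketch: a two-state process whose transition probabilities are invented here, tuned so the long-run error rate is about 1 in 1000 while errors cluster in bursts:

    import random

    random.seed(0)
    p_err_after_ok, p_err_after_err = 0.0005, 0.5  # assumed burst parameters
    # Stationary error rate = 0.0005 / (0.0005 + 0.5), i.e. about 1 in 1000

    prev_err, n_err, after_err, err_after_err = False, 0, 0, 0
    N = 1_000_000
    for _ in range(N):
        err = random.random() < (p_err_after_err if prev_err else p_err_after_ok)
        n_err += err
        if prev_err:
            after_err += 1
            err_after_err += err
        prev_err = err

    print(f"overall error rate    ~ {n_err / N:.4%}")                          # ~0.1%
    print(f"P(error | prev error) ~ {err_after_err / max(after_err, 1):.0%}")  # ~50%, answer (b)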

  12. Current Progress on the SCI • Currently working on construct validity by performing and analyzing a factor analysis of Fall 2003 scores (approx. 350 students in 7 courses); a hypothetical sketch follows below • Defining content sub-topics from which we can calculate sub-scores and examine the validity and reliability of parts of the instrument • Probability, Descriptive Statistics, Inferential Statistics, and Graphical Displays are the current sub-topics; we are also working with an Advanced Topics sub-topic, though with a very small n • Continuing to improve the instrument by modifying items that perform poorly on the previously discussed validity, reliability, and discrimination measures • Looking for people interested in administering the instrument in calculus-based statistics courses at least once, at the end of the semester
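
A hypothetical sketch of the factor-analysis and sub-score step (the real Fall 2003 response matrix and the item-to-subtopic assignment are not reproduced; random placeholder data stand in, and factor analysis of dichotomous items is shown here only schematically):

    import numpy as np
    from sklearn.decomposition import FactorAnalysis

    rng = np.random.default_rng(0)
    responses = rng.integers(0, 2, size=(350, 37))  # placeholder 0/1 answers

    fa = FactorAnalysis(n_components=4).fit(responses)  # e.g., one factor per sub-topic
    print(fa.components_.shape)  # (4, 37): loading of each item on each factor

    # Sub-scores over an assumed item-to-subtopic mapping (illustrative only):
    subtopics = {"Probability": range(0, 9), "Descriptive": range(9, 17)}
    sub_scores = {name: responses[:, list(ix)].sum(axis=1) for name, ix in subtopics.items()}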

  13. Current Progress on the SCI • Added graphical questions for the graphical sub-topic • 37 total multiple-choice questions • Probability (9) • Descriptive Statistics (8) • Inferential Statistics (13) • Graphical Representation (7) • Advanced Topics (5) • Given with a brief demographic questionnaire • Ultimately will be joined with an attitude instrument and a teaching-styles instrument

  14. Contact Information • Website: http://coecs.ou.edu/sci/ • Information on scores from various classes • Other papers and presentations relating to the SCI • Example report that a participating instructor receives after reporting data and final course grades • Email: • teri.rhoads@ou.edu • tjmurphy@math.ou.edu

  15. Concurrent Validity • Summer 2003 – 2 courses, one in the Math dept. and one in the Engr dept. • Valid for Engr but not for Math • Possible explanations: different teaching style, topic coverage, and textbook

  16. Concurrent Validity • For Fall 2003 – Added 2 external universities (3 classes total, intro level, Engineering depts.), along with Engr & Math • Valid as a post-test for all four engineering stats courses, but again not for Math

  17. Reliability • The SCI is generally reliable as a post-test (end of semester) but not as a pre-test (Summer 2003)

  18. Reliability

  19. Discriminatory Index – Post Tests
