The Andes Intelligent Tutoring System: Five years of evaluations. Kurt VanLehn Pittsburgh Science of Learning Center (PSLC) University of Pittsburgh. The physics LearnLab course committee. Andes development Anders Weinstein Brett van de Sande Kurt VanLehn (co-chair) U.S. Naval Academy
The Andes Intelligent Tutoring System: Five years of evaluations Kurt VanLehn Pittsburgh Science of Learning Center (PSLC) University of Pittsburgh
The physics LearnLab course committee • Andes development • Anders Weinstein • Brett van de Sande • Kurt VanLehn (co-chair) • U.S. Naval Academy • Don Treacy (co-chair) • Bob Shelby • Mary Wintersgill • Kay Schulze • Experimenters • Scotty Craig • Sandy Katz • Bob Hausmann • Michael Ringenberg • Meet weekly • Thursdays, 3:30
Funding • The U.S. Office of Naval Research Cognitive Science Program • The U.S. National Science Foundation Pittsburgh Science of Learning Center
Research question • Given • Whole semester of instruction • No change to content of course • No change to lectures, labs, assignments • Standard exams (not designed by experimenters) • Can a homework helper increase learning?
Prior work with answer-only tutoring systems • Web-based homework grading systems • E.g., Web-assign, CAPA, Mastering Physics • Provide feedback & hints on the answer only • Compared to ordinary paper-based homework: positive benefits • Compared to paper-based homework that is collected & graded: no benefits (Pascarella, 2002; Dufresne, Mestre & Rath, 2002) • Interpretation • Motivating students to do their homework provides benefits, but the answer-only tutoring system provides no additional benefits
Prior work with tutoring systems that give feedback & hints on steps • Lisp Tutor (Corbett, 2001) and many others • Same homework problems & text • Experimenter’s exams only • But not a whole semester (only 5 lessons) • Pump curriculum + Pat tutor (Koedinger et al.) • Whole year of high-school algebra • Both experimenter’s exams & standard exams • But content confounded with tutoring system • Earlier evaluations of Andes • First half-semester only • Experimenter’s exams only
Why does it matter? • Ideally, an intelligent homework helper… • can increase learning without changing the course, and • the increase is strong enough to show in final exam • The diligent always do well & slackers always do poorly • Cramming • If not… • still useful if it facilitates content upgrades, and • the upgrades cause robust increases in learning
Outline • Andes (next) • Evaluation • Discussion
What kind of physics? • US university introductory physics courses • US high school advanced physics courses • A typical problem: A 2000 kg car at the top of a 20 degree inclined driveway 20 m long slips its parking brake and rolls down. If we ignore friction and drag, what is the magnitude of the velocity of the car when it hits the garage door?
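The typical problem above can be worked with conservation of energy. The following is a minimal sketch, assuming the car starts from rest, the garage door sits at the bottom of the driveway, and g = 9.8 m/s²; none of these assumptions is stated on the slide, and the function name is made up for illustration.

```python
import math


def speed_at_bottom(length_m, incline_deg, g=9.8):
    """Speed after rolling from rest down a frictionless incline.

    Energy conservation: m*g*h = (1/2)*m*v**2 with h = length*sin(angle),
    so v = sqrt(2*g*h). The mass cancels, so the 2000 kg is not needed.
    """
    height_m = length_m * math.sin(math.radians(incline_deg))
    return math.sqrt(2 * g * height_m)


# Values from the slide: 20 m driveway inclined at 20 degrees.
v = speed_at_bottom(length_m=20.0, incline_deg=20.0)
print(f"{v:.1f} m/s")  # prints "11.6 m/s"
```

The mass given in the problem is a distractor under these assumptions, since it drops out of the energy balance.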
Andes user interface Read a physics problem Draw vectors Type in equations Type in answer
Andes feedback and hints “What should I do next?” “What’s wrong with that?” Green means correct Red means incorrect Dialogue & hints
Major challenges • Dealing with equations • Giving red/green feedback • Undoing algebraic combination • For “what should I do next?” • Analyzing errors in equations • Scale-up • 13 chapters, 500 textbook pages • 350+ problems • 300+ principles
Outline • Andes • Evaluation • Method (next) • Main results • Which students benefited? • Which knowledge benefited? • Interpretation of results • Discussion
Evaluations of Andes at the US Naval Academy • Fall semesters 2000, 2001, 2002 & 2003 • Only the homework modality was varied: Andes vs. paper-based • Same textbook • Similar lectures, labs, recitations • Similar homework problems • Same exams • Students were motivated to do paper-based homework • Either collected and graded • Or 1 homework problem on each quiz
Exams • Midterm exam • 1 hour, 4 problems • Scored on derivation & answer • Drawings (30%) • Variable definitions (20%) • Equations (40%) • Answers (10%) • Final exam • 3 hours, 50 problems • Multiple choice
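The midterm rubric can be read as a weighted sum of four subscores. The sketch below is illustrative only: the weights come from the slide, but the function name, the 0-to-1 subscore scale, and the example student are assumptions.

```python
# Rubric weights from the slide; the dictionary keys are illustrative names.
MIDTERM_WEIGHTS = {
    "drawings": 0.30,
    "variables": 0.20,
    "equations": 0.40,
    "answers": 0.10,
}


def midterm_problem_score(subscores):
    """Combine per-component fractions (each 0..1) into one problem score (0..1)."""
    return sum(weight * subscores[name] for name, weight in MIDTERM_WEIGHTS.items())


# Hypothetical student: correct drawing and variables, half of the
# equations correct, wrong final answer.
example = {"drawings": 1.0, "variables": 1.0, "equations": 0.5, "answers": 0.0}
score = midterm_problem_score(example)
print(f"{score:.2f}")  # prints "0.70"
```

Under this rubric the final answer carries only 10% of the credit, whereas the multiple-choice final exam scores the answer alone.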
Checking prior competence of Andes and control students • Grade-point averages equal • Distribution of majors equal • Engineering majors vs. • Science majors vs. • Other majors
Midterm exam results (All differences reliable, p < .01) How to calculate effect size?
Calculating effect size over 4 different midterm exams • Normalize each score z_score(student) = [raw_score(student) – mean(exam)] / standard_deviation(exam) • For each condition, pool z-scores across years • Effect size = 0.61
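The normalization and pooling steps above can be sketched as follows. This is a hedged illustration, not the analysis actually run: it assumes each exam's mean and standard deviation are taken over all students who sat that exam (both conditions), that the population standard deviation is used, and that the effect size is the difference between the two conditions' pooled mean z-scores. The scores are invented.

```python
from statistics import mean, pstdev


def z_scores(raw_scores):
    """Normalize one exam's raw scores to zero mean and unit standard deviation."""
    exam_mean = mean(raw_scores)
    exam_sd = pstdev(raw_scores)  # assumption: population SD over all takers
    return [(score - exam_mean) / exam_sd for score in raw_scores]


def pooled_effect_size(exams):
    """exams: list of (andes_scores, control_scores) pairs, one pair per exam.

    Returns mean pooled z-score of the Andes group minus that of the controls.
    """
    andes_z, control_z = [], []
    for andes, control in exams:
        z = z_scores(andes + control)       # normalize within this exam
        andes_z.extend(z[:len(andes)])      # pool across years, by condition
        control_z.extend(z[len(andes):])
    return mean(andes_z) - mean(control_z)


# Invented raw scores for two exam years (not data from the study).
hypothetical_exams = [
    ([80, 90], [60, 70]),
    ([75, 85], [65, 75]),
]
effect_size = pooled_effect_size(hypothetical_exams)
print(f"{effect_size:.2f}")
```

Normalizing within each exam first keeps a year with an easier or harder exam from dominating the pooled comparison.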
Final exam • Exam covers 100% of course, but Andes didn’t • Does now • Use 2003 exam only; Andes covered 70% • 89 Andes students • 823 non-Andes students
Prior competence not equal • Majors not equally distributed • Andes group had more engineering majors • GPAs not equally distributed • Andes group had marginally higher GPAs • Factor out prior competence statistically • For each major, regress final exam score on GPA • Residual_score(student) = raw_score(student) – predicted_score(student’s major, student’s GPA)
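The residual-score adjustment above can be sketched as below. It assumes a simple one-predictor least-squares line fitted separately for each major over all of that major's students; the slide does not specify the regression form, and the field names and student records are invented for illustration.

```python
from collections import defaultdict
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope*x + intercept.

    Requires at least two distinct x values, otherwise the slope is undefined.
    """
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def residual_scores(students):
    """Return raw score minus the score predicted from major and GPA, in input order."""
    by_major = defaultdict(list)
    for s in students:
        by_major[s["major"]].append(s)
    # One regression line per major: final exam score as a function of GPA.
    lines = {
        major: fit_line([s["gpa"] for s in group], [s["score"] for s in group])
        for major, group in by_major.items()
    }
    residuals = []
    for s in students:
        slope, intercept = lines[s["major"]]
        residuals.append(s["score"] - (slope * s["gpa"] + intercept))
    return residuals


# Invented records (not data from the study).
students = [
    {"major": "engineering", "gpa": 3.0, "score": 70},
    {"major": "engineering", "gpa": 3.5, "score": 80},
    {"major": "engineering", "gpa": 4.0, "score": 84},
    {"major": "other", "gpa": 2.5, "score": 55},
    {"major": "other", "gpa": 3.0, "score": 65},
    {"major": "other", "gpa": 3.5, "score": 69},
]
residuals = residual_scores(students)
print([round(r, 2) for r in residuals])
```

Comparing the mean residual of the Andes group with that of the control group then removes the part of the score difference that major and GPA would predict.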
Final exam results Difference is reliable (p = 0.028) Effect size = 0.25
Outline • Andes • Evaluation • Method • Main results • Which students benefited? (next) • Which knowledge benefited? • Interpretation of results • Discussion
Benefits varied by major on final exam but not on midterm exam
Outline • Andes • Evaluation • Method • Main results • Which students benefited? • Which knowledge benefited? (next) • Interpretation of results • Discussion
Interpretation of results • Engineering & science majors learned the red path and prefer it • Andes does not increase their final exam scores • They use blue path on the midterm • Andes increases their midterm exam scores • Other majors do not have red path, so they use the blue path on both exams • Andes increases both exams’ scores • On midterm exams, subscores measure components of blue path separately • Biggest benefit for diagrams & variables • Smaller on equations; none on answer
[Figure: path diagram with nodes Problem, Diagram & variables, Equations, Answer; labels: Andes, Prior physics, Prior math & physics]
Summary of results • Main result: Andes provides benefits • Midterm exam effect size: 0.61 • Final exam effect size: 0.25 • Andes helps students learn conceptual skills • Effect sizes on conceptual subscores: 1.21 & 0.69 • Effect sizes on calculational subscores: 0.11 & -0.08 • Some students appear to have a non-conceptual method for solving problems • Competes with the conceptual method taught by Andes • They use it on the (answer-only) final exam • This dilutes the benefit of Andes on final exam
Outline • Andes • Evaluation • Discussion • Andes compared to others (next) • Why is Andes effective?
Effect sizes on experimenter’s & standard exams of 3 tutoring systems
Interpretation of the comparison with other tutors • Andes is about the same as other tutoring systems that give feedback and hints on steps • Perhaps the Pump+Pat benefits are due solely to the tutoring system and not the content upgrade
Summary: Studies of homework helpers when content is controlled • Motivated paper-based homework vs. ordinary paper-based homework: large benefits • Feedback & hints on answer only vs. motivated paper-based homework: no benefits • Feedback & hints on steps vs. feedback & hints on answer only: large benefits
Outline • Andes • Evaluation • Discussion • Andes compared to others • Why is feedback & hints on steps so effective? (next)
Hypothesis: Andes increases the number of successful knowledge events • Without feedback & hints on steps, students skip them • Guess • Copy similar example’s step & edit • Copy & edit a higher goal’s outcome • Doing a step correctly requires • Figuring out how the first time (sense-making) • Figuring out why the second & third times (refinement) • Recalling why & how the other times (fluency building) • This increases number of successful knowledge events • Wherein a student constructs or applies a knowledge component
Thanks for your attention! • At www.andes.pitt.edu • Download stand-alone version of Andes • Try OLI version of Andes • Download papers on Andes • Sorry, but Andes only runs on Windows