AutoTutor: An Intelligent Tutoring System with Mixed Initiative Dialog
Art Graesser, University of Memphis, Department of Psychology & the Institute for Intelligent Systems
Supported on grants from the NSF, ONR, ARI, IDA, IES, US Census Bureau, and CHI Systems
Interdisciplinary Approach Psychology Education Computational Linguistics Computer Science
Overview • Brief comments on my research on question asking and answering • Primary focus is on AutoTutor -- a collaborative reasoning and question answering system
Overview of my Research on Questions • Psychological Models • Question asking (PREG, ONR, NSF, ARI) • Question answering (QUEST, ONR) • Computer Artifacts • Tutor (AutoTutor, Why/AutoTutor, Think like a commander, NSF, ONR, ARI, CHI Systems) • Survey question critiquer (QUAID, US Census, NSF) • Point & Query software (P&Q, ONR) • Query-based information retrieval (HURA Advisor, IDA)
AutoTutor Collaborative reasoning and question answering in tutorial dialog
Think Like a Commander Vignettes
1. Trouble in McLouth
2. Save the Shrine
3. The Recon Fight
4. A Shift In Forces
5. The Attack Begins
6. The Bigger Picture
7. Looking Deep
8. Before the Attack
9. Meanwhile Back at the Ranch
Themes: • Keep Focus on Mission? Higher’s Intent? • Model a Thinking Enemy? • Consider Effects of Terrain? • Use All Assets Available? • Consider Timing? • See the Bigger Picture? • Visualize the Battlefield • Accurately? - Realistic Space-Time Forecast • Dynamically? - Entities Change Over Time • Proactively? - What Can I Make Enemy Do • Consider Contingencies and Remain Flexible?
What does AutoTutor do? • Asks questions and presents problems: Why? How? What-if? What is the difference? • Evaluates the meaning and correctness of the learner's answers (LSA and computational linguistics) • Gives feedback on answers • Face displays emotions + some gestures • Hints • Prompts for specific information • Adds information that is missed • Corrects some bugs and misconceptions • Answers student questions • Holds mixed-initiative dialog in natural language
Pedagogical Design Goals • Simulate normal human tutors and ideal tutors • Active construction of student knowledge rather than information delivery system • Collaborative answering of deep reasoning questions • Approximate evaluation of student knowledge rather than detailed student modeling • A discourse prosthesis
Feasibility of Natural Language Dialog in Tutoring • Learners are forgiving when the tutor’s dialog acts are imperfect. • They are even more forgiving when the bar is set low during instructions. • There are learning gains. • Learning is not correlated with liking.
Human Tutors • Analyses of hundreds of hours of human tutoring • Research methods in college students • Basic algebra in 7th grade • Typical unskilled cross-age tutors • Studies from the Memphis labs • Graesser & Person studies • Studies from other labs • Chi, Evens, McArthur …
Characteristics of students that we wish were better • Student question asking • Comprehension calibration • Self-regulated learning, monitoring, & error correction • Precise, symbolic articulation of knowledge • Global integration of knowledge • Distant anaphoric reference • Analogical reasoning • Application of principles to a practical problem
Pedagogical strategies not used by unskilled tutors • Socratic method (Collins, Stevens) • Modeling-scaffolding-fading (Rogoff) • Reciprocal training (Brown, Palincsar) • Anchored Learning (Bransford, Vye, CTGV) • Error diagnosis & repair (Anderson, van Lehn, Lesgold) • Building on prerequisites (Gagne) • Cascade techniques (van Lehn, Schank) • Sophisticated motivational techniques (Lepper)
AutoTutor components: Language Extraction • Problem Selection • Speech Act Classifier • Dialog Management • Latent Semantic Analysis • Curriculum Script • Talking Head with Gestures
Managing One AutoTutor Turn • Short feedback on the student’s previous turn • Advance the dialog by one or more dialog moves that are connected by discourse markers • End turn with a signal that transfers the floor to the student • Question • Prompting hand gesture • Head/gaze signal
Dialog Moves • Positive immediate feedback: “Yeah” “Right!” • Neutral immediate feedback: “Okay” “Uh huh” • Negative immediate feedback: “No” “Not quite” • Pump for more information: “What else?” • Hint: “How does tossing the pumpkin affect horizontal velocity?” • Prompt for specific information: “Vertical acceleration does not affect horizontal _______.” • Assert: “Vertical acceleration does not affect horizontal velocity.” • Correct: “Air resistance is negligible” • Repeat: “So, once again, how does tossing the pumpkin affect horizontal velocity?” • Summarize: “So to recap, [succinct summary].”
Sample dialog
Tutor: This might help. If the runner speeds up, will the pumpkin land behind the runner, next to the runner, or in front of the runner?
Student: The pumpkin will land behind the runner.
Tutor: Right. What is the horizontal force applied to the pumpkin after it leaves the runner's hands?
Student: There is no horizontal force applied to the pumpkin after it leaves the runner's hands.
Tutor: Right.
LSA match scores: expectation(1): 0.836, expectation(2): 0.851, expectation(3): 0.584, misconception(1): 0.371, misconception(2): 0.605
Student answer: “As zero force is acting on it, it will have constant velocity, hence it will land on his head.”
LSA cosine matches: Expectation-1 = .51, Expectation-2 = .66, Expectation-3 = .44
Expectation-1: The pumpkin has the same horizontal velocity as the runner.
Expectation-2: Zero force is needed to keep an object going with constant velocity.
Expectation-3: Vertical forces on the pumpkin do not affect its horizontal velocity.
Student answer: “The horizontal velocity of the pumpkin is the same as the runner.”
LSA cosine matches: Expectation-1 = .99, Expectation-2 = .66, Expectation-3 = .87
Expectation-1: The pumpkin has the same horizontal velocity as the runner.
Expectation-2: Zero force is needed to keep an object going with constant velocity.
Expectation-3: Vertical forces on the pumpkin do not affect its horizontal velocity.
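The match scores on these slides are cosine similarities between LSA vectors. A minimal sketch of that computation, using made-up three-dimensional vectors as stand-ins for real LSA document vectors (which have hundreds of dimensions):

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors (e.g., LSA document vectors)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Made-up low-dimensional vectors standing in for LSA representations.
student = [0.9, 0.2, 0.1]
expectation_1 = [0.8, 0.3, 0.0]
print(round(cosine(student, expectation_1), 2))
```

An expectation whose cosine with the student's accumulated answers exceeds the threshold counts as covered.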
How does Why/AutoTutor select the next expectation? • Don't select expectations that the student has already covered: cosine(student answers, expectation) > threshold • Frontier learning, zone of proximal development: select the highest sub-threshold expectation • Coherence: select the next expectation with the highest overlap with previously covered expectations • Pivotal expectations
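The covered/frontier heuristics above can be sketched roughly as follows; the threshold value and the coverage dictionary are illustrative assumptions, and the coherence and pivotal-expectation criteria are not modeled:

```python
def select_next_expectation(expectations, coverage, threshold=0.8):
    """Pick the next expectation to flesh out.

    coverage maps each expectation to its max cosine with the student's
    answers so far. Expectations above the threshold are already covered;
    among the rest, pick the one with the highest sub-threshold score
    (frontier learning / zone of proximal development).
    """
    uncovered = [e for e in expectations if coverage.get(e, 0.0) <= threshold]
    if not uncovered:
        return None  # everything covered; move to the summary
    return max(uncovered, key=lambda e: coverage.get(e, 0.0))

coverage = {"E1": 0.99, "E2": 0.66, "E3": 0.44}
print(select_next_expectation(["E1", "E2", "E3"], coverage))  # E2
```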
How does AutoTutor know which dialog move to deliver? • Dialog Advancer Network (DAN) for mixed-initiative dialog • 15 fuzzy production rules, conditioned on: quality of the student's assertion(s) in the preceding turn, student ability level, topic coverage, and student verbosity (initiative) • Hint-Prompt-Assertion cycles for expected good answers
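A toy illustration of how rules of this kind might map turn features to a dialog move; the crisp thresholds and rule set below are invented for illustration and are not AutoTutor's actual 15 fuzzy production rules:

```python
def next_dialog_move(answer_quality, student_ability, topic_coverage):
    """Map features of the preceding turn to a dialog move.

    All inputs are in [0, 1]. These crisp thresholds are illustrative
    assumptions; the real system uses fuzzy production rules.
    """
    if topic_coverage > 0.9:
        return "summarize"
    if answer_quality > 0.8:
        return "positive feedback + pump"
    if answer_quality > 0.5:
        # abler students get a subtler nudge (hint) than a pointed prompt
        return "hint" if student_ability > 0.5 else "prompt"
    return "assertion"

print(next_dialog_move(0.6, 0.7, 0.4))  # hint
```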
Hint-Prompt-Assertion Cycles to Cover Good Expectations
The cycle (Hint → Prompt → Assertion) fleshes out one expectation at a time. Exit the cycle when cos(S, E) > T, where S = student input, E = expectation, T = threshold.
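The exit condition can be sketched as a short loop; `score_reply` stands in for cos(student input, expectation), and both it and the reply list are hypothetical stand-ins for live dialog:

```python
def cover_expectation(score_reply, replies, threshold=0.8):
    """One Hint-Prompt-Assertion cycle for a single expectation.

    score_reply(reply) stands in for cos(student input, expectation);
    replies is the sequence of student turns, one per tutor move.
    Exit as soon as the student's reply clears the threshold; otherwise
    the tutor ends the cycle by asserting the expectation itself.
    """
    delivered = []
    it = iter(replies)
    for move in ("hint", "prompt", "assertion"):
        delivered.append(move)
        if move == "assertion":
            break  # tutor supplied the missing information
        if score_reply(next(it, "")) > threshold:
            break  # student covered the expectation
    return delivered

score = lambda reply: 0.9 if "velocity" in reply else 0.3
print(cover_expectation(score, ["dunno", "same horizontal velocity"]))  # ['hint', 'prompt']
```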
Who is delivering the answer? Pump → Hint → Prompt → Assertion: moving along this continuum shifts the burden from the student providing the information to the tutor providing it.
Question Taxonomy (category: generic question frames and examples)
1. Verification: Is X true or false? Did an event occur? Does a state exist?
2. Disjunctive: Is X, Y, or Z the case?
3. Concept completion: Who? What? When? Where?
4. Feature specification: What qualitative properties does entity X have?
5. Quantification: What is the value of a quantitative variable? How much? How many?
6. Definition questions: What does X mean?
7. Example questions: What is an example or instance of a category?
8. Comparison: How is X similar to Y? How is X different from Y?
9. Interpretation: What concept/claim can be inferred from a static or active data pattern?
10. Causal antecedent: What state or event causally led to an event or state? Why did an event occur? Why does a state exist? How did an event occur? How did a state come to exist?
11. Causal consequence: What are the consequences of an event or state? What if X occurred? What if X did not occur?
12. Goal orientation: What are the motives or goals behind an agent’s action? Why did an agent do some action?
13. Instrumental/procedural: What plan or instrument allows an agent to accomplish a goal? How did an agent do some action?
14. Enablement: What object or resource allows an agent to accomplish a goal?
15. Expectation: Why did some expected event not occur? Why does some expected state not exist?
16. Judgmental: What value does the answerer place on an idea or advice? What do you think of X? How would you rate X?
Speech Act Classifier • Assertions • Questions (16 categories) • Directives • Metacognitive expressions (“I’m lost”) • Metacommunicative expressions (“Could you say that again?”) • Short responses • 95% accuracy on tutee contributions
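A toy surface-cue classifier in the spirit of this slide; the cue patterns below are illustrative assumptions, not the actual classifier (which reached 95% accuracy with richer features):

```python
import re

def classify_speech_act(utterance):
    """Classify a student turn by surface cues (toy version)."""
    u = utterance.strip().lower()
    if re.search(r"\b(i'?m lost|i don'?t understand|i don'?t know)\b", u):
        return "metacognitive"
    if re.search(r"\b(say that again|repeat that|speak up)\b", u):
        return "metacommunicative"
    if u.endswith("?") or re.match(r"^(who|what|when|where|why|how|is|did|does)\b", u):
        return "question"
    if re.match(r"^(show|give|tell|go)\b", u):
        return "directive"
    if len(u.split()) <= 2:
        return "short response"
    return "assertion"

print(classify_speech_act("Could you say that again?"))  # metacommunicative
```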
A New Query-Based Information Retrieval System (Louwerse, Olney, Mathews, Marineau, Hite-Mitchell, Graesser, 2003)
Pipeline: input speech act → syntactic parser, lexicons, surface cues, frozen expressions → classify the speech act (QUEST’s 16 question categories, assertion, directive, other) → augment retrieval cues with word particles of the question category and the input context (text and screen) → search documents via LSA → select the highest-matching document
Evaluations of AutoTutor
LEARNING GAINS (effect sizes)
.42 Unskilled human tutors (Cohen, Kulik, & Kulik, 1982)
.75 AutoTutor, 7 experiments (Graesser, Hu, Person)
1.00 Intelligent tutoring systems: PACT (Anderson, Corbett, Koedinger); Andes, Atlas (VanLehn)
2.00 (?) Skilled human tutors
Spring 2002 Evaluations: Conceptual Physics (VanLehn & Graesser, 2002) Four conditions • Human tutors • Why/Atlas • Why/AutoTutor • Read control 86 college students
Measures in Spring Evaluation • Multiple Choice Test • Pretest and posttest (40 multiple choice questions in each) • Essays graded by 6 physics experts • 4 pretest and 4 posttest essays • Expectations versus misconceptions • Holistic grades • Generic principles and misconceptions (fine-grained) • Learner perceptions • Time on task
Effect Sizes on Learning Gains (pretest to posttest; no differences among tutoring conditions)
Fall 2002 Evaluations: Conceptual Physics (Graesser, Moreno, et al., 2003) Three tutoring conditions • Why/AutoTutor • Read textbook control • Read nothing 63 subjects
2002-3 Evaluations: Computer Literacy (Graesser, Hu, et al., 2003) 2 Tutoring Conditions • AutoTutor • Read nothing 4 Media Conditions • Print • Speech • Speech+Head • Speech+Head+Print 96 subjects
What Expectations are LSA-worthy? Compute correlation between: • Experts’ ratings of whether essay answers have expectation E • Maximum LSA cosine between E and all possible combinations of sentences in essay A high correlation means the expectation is LSA-worthy
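The LSA-worthiness check above can be sketched as follows; `to_vec` and `cosine` are hypothetical callbacks standing in for a real LSA model, and the Pearson correlation is computed over a sample of expert-rated essays:

```python
import math
from itertools import combinations

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def max_cosine_over_sentence_sets(sentences, expectation_vec, to_vec, cosine):
    """Max cosine between the expectation and every nonempty combination
    of essay sentences (the slide's 'all possible combinations' step)."""
    best = 0.0
    for k in range(1, len(sentences) + 1):
        for combo in combinations(sentences, k):
            best = max(best, cosine(to_vec(" ".join(combo)), expectation_vec))
    return best

# An expectation is LSA-worthy when pearson(expert_ratings, max_cosines)
# is high across the sample of essays.
```

Note the combinatorial cost: an essay with n sentences yields 2^n - 1 combinations, so this is only feasible for short essays.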
Expectations and Correlations (expert ratings, LSA) • After the release, the only force on the balls is the force of the moon’s gravity (r = .71) • A larger object will experience a smaller acceleration for the same force (r = .12) • Force equals mass times acceleration (r = .67) • The boxes are in free fall (r = .21)