AutoTutor: An Intelligent Tutoring System with Mixed Initiative Dialog
Art Graesser, University of Memphis, Department of Psychology & the Institute for Intelligent Systems
Supported on grants from the NSF, ONR, ARI, IDA, IES, US Census Bureau, and CHI Systems
Interdisciplinary Approach Psychology Education Computational Linguistics Computer Science
Overview • Brief comments on my research on question asking and answering • Primary focus is on AutoTutor -- a collaborative reasoning and question answering system
Overview of my Research on Questions • Psychological Models • Question asking (PREG, ONR, NSF, ARI) • Question answering (QUEST, ONR) • Computer Artifacts • Tutor (AutoTutor, Why/AutoTutor, Think like a commander, NSF, ONR, ARI, CHI Systems) • Survey question critiquer (QUAID, US Census, NSF) • Point & Query software (P&Q, ONR) • Query-based information retrieval (HURA Advisor, IDA)
AutoTutor Collaborative reasoning and question answering in tutorial dialog
Think Like a Commander Vignettes
1. Trouble in McLouth
2. Save the Shrine
3. The Recon Fight
4. A Shift In Forces
5. The Attack Begins
6. The Bigger Picture
7. Looking Deep
8. Before the Attack
9. Meanwhile Back at the Ranch
Themes: • Keep Focus on Mission? Higher’s Intent? • Model a Thinking Enemy? • Consider Effects of Terrain? • Use All Assets Available? • Consider Timing? • See the Bigger Picture? • Visualize the Battlefield • Accurately? - Realistic Space-Time Forecast • Dynamically? - Entities Change Over Time • Proactively? - What Can I Make Enemy Do • Consider Contingencies and Remain Flexible?
What does AutoTutor do? • Asks questions and presents problems: Why? How? What-if? What is the difference? • Evaluates the meaning and correctness of the learner's answers (LSA and computational linguistics) • Gives feedback on answers • Face displays emotions + some gestures • Hints • Prompts for specific information • Adds information that is missed • Corrects some bugs and misconceptions • Answers student questions • Holds mixed-initiative dialog in natural language
Pedagogical Design Goals • Simulate normal human tutors and ideal tutors • Active construction of student knowledge rather than information delivery system • Collaborative answering of deep reasoning questions • Approximate evaluation of student knowledge rather than detailed student modeling • A discourse prosthesis
Feasibility of Natural Language Dialog in Tutoring • Learners are forgiving when the tutor’s dialog acts are imperfect. • They are even more forgiving when the bar is set low during instructions. • There are learning gains. • Learning is not correlated with liking.
Human Tutors • Analyses of hundreds of hours of human tutoring • Research methods in college students • Basic algebra in 7th grade • Typical unskilled cross-age tutors • Studies from the Memphis labs • Graesser & Person studies • Studies from other labs • Chi, Evens, McArthur …
Characteristics of students that we wish were better • Student question asking • Comprehension calibration • Self-regulated learning, monitoring, & error correction • Precise, symbolic articulation of knowledge • Global integration of knowledge • Distant anaphoric reference • Analogical reasoning • Application of principles to a practical problem
Pedagogical strategies not used by unskilled tutors • Socratic method (Collins, Stevens) • Modeling-scaffolding-fading (Rogoff) • Reciprocal training (Brown, Palincsar) • Anchored Learning (Bransford, Vye, CTGV) • Error diagnosis & repair (Anderson, van Lehn, Lesgold) • Building on prerequisites (Gagne) • Cascade techniques (van Lehn, Schank) • Sophisticated motivational techniques (Lepper)
AutoTutor components: Language Extraction • Problem Selection • Speech Act Classifier • Dialog Management • Latent Semantic Analysis • Curriculum Script • Talking Head with Gestures
Managing One AutoTutor Turn • Short feedback on the student’s previous turn • Advance the dialog by one or more dialog moves that are connected by discourse markers • End turn with a signal that transfers the floor to the student • Question • Prompting hand gesture • Head/gaze signal
Dialog Moves • Positive immediate feedback: “Yeah” “Right!” • Neutral immediate feedback: “Okay” “Uh huh” • Negative immediate feedback: “No” “Not quite” • Pump for more information: “What else?” • Hint: “How does tossing the pumpkin affect horizontal velocity?” • Prompt for specific information: “Vertical acceleration does not affect horizontal _______.” • Assert: “Vertical acceleration does not affect horizontal velocity.” • Correct: “Air resistance is negligible” • Repeat: “So, once again, how does tossing the pumpkin affect horizontal velocity?” • Summarize: “So to recap, [succinct summary].”
Sample dialog
Tutor: This might help. If the runner speeds up, will the pumpkin land behind the runner, next to the runner, or in front of the runner?
Student: The pumpkin will land behind the runner.
Tutor: Right. What is the horizontal force applied to the pumpkin after it leaves the runner's hands?
Student: There is no horizontal force applied to the pumpkin after it leaves the runner's hands.
Tutor: Right.
LSA match scores: expectation(1): 0.836, expectation(2): 0.851, expectation(3): 0.584, misconception(1): 0.371, misconception(2): 0.605
Student answer: “As zero force is acting on it, it will have constant velocity, hence it will land on his head.”
LSA cosine matches: Expectation-1 = .51, Expectation-2 = .66, Expectation-3 = .44
Expectation-1: The pumpkin has the same horizontal velocity as the runner.
Expectation-2: Zero force is needed to keep an object going with constant velocity.
Expectation-3: Vertical forces on the pumpkin do not affect its horizontal velocity.
Student answer: “The horizontal velocity of the pumpkin is the same as the runner.”
LSA cosine matches: Expectation-1 = .99, Expectation-2 = .66, Expectation-3 = .87
Expectation-1: The pumpkin has the same horizontal velocity as the runner.
Expectation-2: Zero force is needed to keep an object going with constant velocity.
Expectation-3: Vertical forces on the pumpkin do not affect its horizontal velocity.
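The match scores on these slides are cosine similarities between LSA vectors. A minimal sketch of that computation, using made-up three-dimensional vectors as stand-ins for real LSA document vectors (which have hundreds of dimensions):

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors (e.g., LSA document vectors)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Made-up low-dimensional vectors standing in for LSA representations.
student = [0.9, 0.2, 0.1]
expectation_1 = [0.8, 0.3, 0.0]
print(round(cosine(student, expectation_1), 2))
```

An expectation whose cosine with the student's accumulated answers exceeds the threshold counts as covered.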
How does Why/AutoTutor select the next expectation? • Don't select expectations that the student has already covered: cosine(student answers, expectation) > threshold • Frontier learning, zone of proximal development: select the highest sub-threshold expectation • Coherence: select the next expectation with the highest overlap with previously covered expectations • Pivotal expectations
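The covered/frontier heuristics above can be sketched roughly as follows; the threshold value and the coverage dictionary are illustrative assumptions, and the coherence and pivotal-expectation criteria are not modeled:

```python
def select_next_expectation(expectations, coverage, threshold=0.8):
    """Pick the next expectation to flesh out.

    coverage maps each expectation to its max cosine with the student's
    answers so far. Expectations above the threshold are already covered;
    among the rest, pick the one with the highest sub-threshold score
    (frontier learning / zone of proximal development).
    """
    uncovered = [e for e in expectations if coverage.get(e, 0.0) <= threshold]
    if not uncovered:
        return None  # everything covered; move to the summary
    return max(uncovered, key=lambda e: coverage.get(e, 0.0))

coverage = {"E1": 0.99, "E2": 0.66, "E3": 0.44}
print(select_next_expectation(["E1", "E2", "E3"], coverage))  # E2
```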
How does AutoTutor know which dialog move to deliver? • Dialog Advancer Network (DAN) for mixed-initiative dialog • 15 fuzzy production rules, conditioned on: quality of the student's assertion(s) in the preceding turn, student ability level, topic coverage, and student verbosity (initiative) • Hint-Prompt-Assertion cycles for expected good answers
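A toy illustration of how rules of this kind might map turn features to a dialog move; the crisp thresholds and rule set below are invented for illustration and are not AutoTutor's actual 15 fuzzy production rules:

```python
def next_dialog_move(answer_quality, student_ability, topic_coverage):
    """Map features of the preceding turn to a dialog move.

    All inputs are in [0, 1]. These crisp thresholds are illustrative
    assumptions; the real system uses fuzzy production rules.
    """
    if topic_coverage > 0.9:
        return "summarize"
    if answer_quality > 0.8:
        return "positive feedback + pump"
    if answer_quality > 0.5:
        # abler students get a subtler nudge (hint) than a pointed prompt
        return "hint" if student_ability > 0.5 else "prompt"
    return "assertion"

print(next_dialog_move(0.6, 0.7, 0.4))  # hint
```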
Hint-Prompt-Assertion Cycles to Cover Good Expectations
The cycle (Hint → Prompt → Assertion) fleshes out one expectation at a time. Exit the cycle when cos(S, E) > T, where S = student input, E = expectation, T = threshold.
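The exit condition can be sketched as a short loop; `score_reply` stands in for cos(student input, expectation), and both it and the reply list are hypothetical stand-ins for live dialog:

```python
def cover_expectation(score_reply, replies, threshold=0.8):
    """One Hint-Prompt-Assertion cycle for a single expectation.

    score_reply(reply) stands in for cos(student input, expectation);
    replies is the sequence of student turns, one per tutor move.
    Exit as soon as the student's reply clears the threshold; otherwise
    the tutor ends the cycle by asserting the expectation itself.
    """
    delivered = []
    it = iter(replies)
    for move in ("hint", "prompt", "assertion"):
        delivered.append(move)
        if move == "assertion":
            break  # tutor supplied the missing information
        if score_reply(next(it, "")) > threshold:
            break  # student covered the expectation
    return delivered

score = lambda reply: 0.9 if "velocity" in reply else 0.3
print(cover_expectation(score, ["dunno", "same horizontal velocity"]))  # ['hint', 'prompt']
```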
Who is delivering the answer? Pump → Hint → Prompt → Assertion: moving along this continuum shifts the burden from the student providing the information to the tutor providing it.
Question Taxonomy (category: generic question frames and examples)
1. Verification: Is X true or false? Did an event occur? Does a state exist?
2. Disjunctive: Is X, Y, or Z the case?
3. Concept completion: Who? What? When? Where?
4. Feature specification: What qualitative properties does entity X have?
5. Quantification: What is the value of a quantitative variable? How much? How many?
6. Definition questions: What does X mean?
7. Example questions: What is an example or instance of a category?
8. Comparison: How is X similar to Y? How is X different from Y?
9. Interpretation: What concept/claim can be inferred from a static or active data pattern?
10. Causal antecedent: What state or event causally led to an event or state? Why did an event occur? Why does a state exist? How did an event occur? How did a state come to exist?
11. Causal consequence: What are the consequences of an event or state? What if X occurred? What if X did not occur?
12. Goal orientation: What are the motives or goals behind an agent’s action? Why did an agent do some action?
13. Instrumental/procedural: What plan or instrument allows an agent to accomplish a goal? How did an agent do some action?
14. Enablement: What object or resource allows an agent to accomplish a goal?
15. Expectation: Why did some expected event not occur? Why does some expected state not exist?
16. Judgmental: What value does the answerer place on an idea or advice? What do you think of X? How would you rate X?
Speech Act Classifier • Assertions • Questions (16 categories) • Directives • Metacognitive expressions (“I’m lost”) • Metacommunicative expressions (“Could you say that again?”) • Short responses • 95% accuracy on tutee contributions
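A toy surface-cue classifier in the spirit of this slide; the cue patterns below are illustrative assumptions, not the actual classifier (which reached 95% accuracy with richer features):

```python
import re

def classify_speech_act(utterance):
    """Classify a student turn by surface cues (toy version)."""
    u = utterance.strip().lower()
    if re.search(r"\b(i'?m lost|i don'?t understand|i don'?t know)\b", u):
        return "metacognitive"
    if re.search(r"\b(say that again|repeat that|speak up)\b", u):
        return "metacommunicative"
    if u.endswith("?") or re.match(r"^(who|what|when|where|why|how|is|did|does)\b", u):
        return "question"
    if re.match(r"^(show|give|tell|go)\b", u):
        return "directive"
    if len(u.split()) <= 2:
        return "short response"
    return "assertion"

print(classify_speech_act("Could you say that again?"))  # metacommunicative
```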
A New Query-Based Information Retrieval System (Louwerse, Olney, Mathews, Marineau, Hite-Mitchell, Graesser, 2003)
Pipeline: input speech act → syntactic parser, lexicons, surface cues, frozen expressions → classify the speech act (QUEST’s 16 question categories, assertion, directive, other) → augment retrieval cues with word particles of the question category and the input context (text and screen) → search documents via LSA → select the highest-matching document
Evaluations of AutoTutor
LEARNING GAINS (effect sizes)
.42 Unskilled human tutors (Cohen, Kulik, & Kulik, 1982)
.75 AutoTutor, 7 experiments (Graesser, Hu, Person)
1.00 Intelligent tutoring systems: PACT (Anderson, Corbett, Koedinger); Andes, Atlas (VanLehn)
2.00 (?) Skilled human tutors
Spring 2002 Evaluations: Conceptual Physics (VanLehn & Graesser, 2002) Four conditions • Human tutors • Why/Atlas • Why/AutoTutor • Read control 86 college students
Measures in Spring Evaluation • Multiple Choice Test • Pretest and posttest (40 multiple choice questions in each) • Essays graded by 6 physics experts • 4 pretest and 4 posttest essays • Expectations versus misconceptions • Holistic grades • Generic principles and misconceptions (fine-grained) • Learner perceptions • Time on task
Effect Sizes on Learning Gains (pretest to posttest; no differences among tutoring conditions)
Fall 2002 Evaluations: Conceptual Physics (Graesser, Moreno, et al., 2003) Three tutoring conditions • Why/AutoTutor • Read textbook control • Read nothing 63 subjects
2002-3 Evaluations: Computer Literacy (Graesser, Hu, et al., 2003) 2 Tutoring Conditions • AutoTutor • Read nothing 4 Media Conditions • Print • Speech • Speech+Head • Speech+Head+Print 96 subjects
What Expectations are LSA-worthy? Compute correlation between: • Experts’ ratings of whether essay answers have expectation E • Maximum LSA cosine between E and all possible combinations of sentences in essay A high correlation means the expectation is LSA-worthy
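The LSA-worthiness check above can be sketched as follows; `to_vec` and `cosine` are hypothetical callbacks standing in for a real LSA model, and the Pearson correlation is computed over a sample of expert-rated essays:

```python
import math
from itertools import combinations

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def max_cosine_over_sentence_sets(sentences, expectation_vec, to_vec, cosine):
    """Max cosine between the expectation and every nonempty combination
    of essay sentences (the slide's 'all possible combinations' step)."""
    best = 0.0
    for k in range(1, len(sentences) + 1):
        for combo in combinations(sentences, k):
            best = max(best, cosine(to_vec(" ".join(combo)), expectation_vec))
    return best

# An expectation is LSA-worthy when pearson(expert_ratings, max_cosines)
# is high across the sample of essays.
```

Note the combinatorial cost: an essay with n sentences yields 2^n - 1 combinations, so this is only feasible for short essays.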
Expectations and Correlations (expert ratings, LSA) • After the release, the only force on the balls is the force of the moon’s gravity (r = .71) • A larger object will experience a smaller acceleration for the same force (r = .12) • Force equals mass times acceleration (r = .67) • The boxes are in free fall (r = .21)