Adapting to Student Uncertainty Improves Tutoring Dialogues
Kate Forbes-Riley and Diane Litman
University of Pittsburgh
Pittsburgh, PA USA
Outline • Overview • System: Original and Adaptive Versions • Evaluation of Uncertainty Adaptations • Conclusions, Future Work
Background
• Student uncertainty is of interest in the tutoring community
  • Correlates with learning (Craig et al., 2004)
  • Co-occurs with incorrectness (Bhatt et al., 2003)
  • Annotated and detected (D’Mello et al., 2008)
• Few computer tutors have evaluated substantive responses to uncertainty
  • Human-based positive feedback responses improved satisfaction (Tsukahara and Ward, 2001) and persistence (Aist et al., 2002)
  • Human-based substantive responses didn’t improve learning, but limited detection scheme (Pon-Barry et al., 2006)
This Paper
• We show that responding to uncertainty with additional content can significantly improve computer tutoring performance
• 2 uncertainty adaptations evaluated in a Wizard of Oz experiment
• Performance gains measured for learning efficiency and user satisfaction
Normal (non-adaptive) Computer Tutor
• ITSPOKE (Intelligent Tutoring Spoken Dialogue System)
  • Back-end: text-based Why2-Atlas (VanLehn, Jordan, Rosé et al., 2002)
  • Tutors 5 qualitative physics problems
• Dialogue Format: Question – Student Answer – Response
• Response Types:
  • to Corrects (C): positive feedback (e.g. “Fine”)
  • to Incorrects (I): negative feedback (e.g. “Well…”) and
    • Bottom Out: correct answer with reasoning (easier)
    • Subdialogue: questions walk through reasoning (harder)
Adaptive Computer Tutor(s)
• Tutoring Theory: Uncertainty and Incorrectness both signal Learning Impasses (opportunities to better learn concepts (VanLehn et al., 2003))
• Our Prior Work: Rank correctness (C, I) + uncertainty (U, nonU) states in terms of impasse severity
    State:     I+nonU   I+U    C+U     C+nonU
    Severity:  most     less   least   none
• Adaptation Hypothesis:
  • ITSPOKE already provides content to resolve I impasses (I+U, I+nonU), but it ignores one type of U impasse (C+U)
  • Performance improvement if ITSPOKE provides additional content to resolve all impasses
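The severity ranking on this slide can be written as a small lookup. This is a minimal sketch only: the names `SEVERITY`, `impasse_severity`, and `is_impasse` and the string labels are illustrative assumptions, not part of ITSPOKE.

```python
# Illustrative sketch only: names and labels are assumptions, not ITSPOKE code.
# Ranking from the slide: I+nonU (most) > I+U (less) > C+U (least) > C+nonU (none).
SEVERITY = {
    ("I", "nonU"): "most",
    ("I", "U"): "less",
    ("C", "U"): "least",
    ("C", "nonU"): "none",
}


def impasse_severity(correctness: str, uncertainty: str) -> str:
    """Return the impasse severity for a (correctness, uncertainty) answer state."""
    return SEVERITY[(correctness, uncertainty)]


def is_impasse(correctness: str, uncertainty: str) -> bool:
    """Every state except correct-and-certain (C+nonU) counts as an impasse."""
    return impasse_severity(correctness, uncertainty) != "none"
```

Under this encoding the adaptation hypothesis amounts to giving additional content whenever `is_impasse` is true, which adds C+U to the two incorrect states the original system already handles.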
Two Uncertainty Adaptations
• Simple Adaptation
  • Provide same substantive response for all 3 impasses
• Complex Adaptation
  • Provide different substantive responses for the 3 impasses
  • Acknowledge student uncertainty in feedback
Simple Adaptation to Uncertainty
• For C+U, I+U, I+nonU answers:
  • ITSPOKE gives same content with same dialogue act
  • ITSPOKE gives feedback on (in)correctness
Simple Adaptation Example
TUTOR1: By the same reasoning that we used for the car, what’s the overall net force on the truck equal to?
STUDENT1: The force of the car hitting it?? [C+U]
TUTOR2: Fine. [FEEDBACK] We can derive the net force on the truck by summing the individual forces on it, just like we did for the car. First, what horizontal force is exerted on the truck during the collision? [SUBDIALOGUE]
• Same TUTOR2 subdialogue if student was I+U or I+nonU
Complex Adaptation to Uncertainty
• Depending on whether the answer is C+U, I+U, or I+nonU:
  • ITSPOKE gives same content but varies dialogue act
    • Based on human tutor responses significantly associated with C+U, I+U, I+nonU answers
  • ITSPOKE gives complex feedback on uncertainty and (in)correctness
    • Based on empathetic computer tutor literature (Wang et al., 2005; Hall et al., 2004; Burleson et al., 2004)
Complex Adaptation Example: C+U
TUTOR1: By the same reasoning that we used for the car, what’s the overall net force on the truck equal to?
STUDENT1: The impact force of the car on the truck?? [C+U]
TUTOR2: That’s exactly right, but you seem unsure, so let’s sum up. [FEEDBACK] The net force on the truck is equal to the impact force on it. We can prove this just like we did for the car. First, we know that gravity and the normal force on the truck must cancel each other, otherwise the truck would not be at rest vertically. Second, we know that the impact force is the only horizontal force exerted on the truck. [LONG BOTTOM OUT]
Complex Adaptation Example: I+U
TUTOR1: By the same reasoning that we used for the car, what’s the overall net force on the truck equal to?
STUDENT1: Zero?? [I+U]
TUTOR2: That’s not correct, but don’t worry. You seem to be aware of your mistake. Let’s resolve it. [FEEDBACK] The net force on the truck is equal to the impact force on it. Let’s walk through this answer step by step. [SHORT BOTTOM OUT] We can derive the net force on the truck by summing the individual forces on it, just like we did for the car. First, what horizontal force is exerted on the truck during the collision? [SUBDIALOGUE]
Complex Adaptation Example: I+nonU
TUTOR1: By the same reasoning that we used for the car, what’s the overall net force on the truck equal to?
STUDENT1: The force of gravity. [I+nonU]
TUTOR2: I’m sorry, but I see an error in your answer. Let’s fix it. [FEEDBACK] We can derive the net force on the truck by summing the individual forces on it, just like we did for the car. First, what horizontal force is exerted on the truck during the collision? [SUBDIALOGUE]
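The response patterns in the example dialogues can be summarized as a lookup from tutor version and answer state to a sequence of dialogue-act tags. This is a rough sketch under stated assumptions: the function name `select_response` and the condition labels are hypothetical, the tags are copied from the bracketed labels in the examples, and the mapping only covers a question whose original response to incorrect answers is a subdialogue, as in the truck example.

```python
# Illustrative sketch only: function name and condition labels are assumptions;
# the tags come from the bracketed labels in the example dialogues.
# Covers a question whose original response to incorrect answers is a subdialogue.

def select_response(condition: str, correctness: str, uncertainty: str) -> list[str]:
    """Return the sequence of dialogue-act tags for the tutor's next turn."""
    state = (correctness, uncertainty)
    if condition not in ("normal", "simple", "complex"):
        raise ValueError(f"unknown condition: {condition}")
    if state == ("C", "nonU"):
        # Not an impasse: every version just gives positive feedback.
        return ["FEEDBACK"]
    if condition == "normal":
        # The original system adds content only after incorrect answers.
        return ["FEEDBACK", "SUBDIALOGUE"] if correctness == "I" else ["FEEDBACK"]
    if condition == "simple":
        # Same content with the same dialogue act for all three impasse states.
        return ["FEEDBACK", "SUBDIALOGUE"]
    # Complex: same content, but the dialogue act varies with the impasse state.
    return {
        ("C", "U"): ["FEEDBACK", "LONG BOTTOM OUT"],
        ("I", "U"): ["FEEDBACK", "SHORT BOTTOM OUT", "SUBDIALOGUE"],
        ("I", "nonU"): ["FEEDBACK", "SUBDIALOGUE"],
    }[state]
```

The wording of the feedback itself (for example, acknowledging that the student seems unsure) is not modeled here; only the ordering of content elements is.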
Experimental Design
• Wizard of Oz version of ITSPOKE
  • Human wizard performed speech recognition and natural language understanding, and annotated correctness and uncertainty
• 4 Conditions
  • Simple Adaptation: used same response for all impasses
  • Complex Adaptation: used different responses for each impasse
  • Normal Control: used original system (no adaptation)
  • Random Control: gave Simple Adaptation to random 20% of correct answers (to control for additional tutoring)
• Prediction: Complex Adaptation > Simple Adaptation > Random Control > Normal Control
• Procedure: reading, pretest, 5 problems, survey, posttest
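One way to picture the Random Control condition is as a coin flip on each correct answer. The sketch below assumes each correct answer is sampled independently with probability 0.2; the slide only says "random 20% of correct answers", so the sampling scheme and the name `random_control_adapts` are assumptions.

```python
import random

# Illustrative sketch only: independent per-answer sampling is an assumption;
# the slide states only that a random 20% of correct answers were adapted to.

def random_control_adapts(correctness: str, rng: random.Random, rate: float = 0.2) -> bool:
    """Decide whether a correct answer receives the Simple Adaptation content."""
    if correctness != "C":
        # Incorrect answers are never selected by this rule.
        return False
    return rng.random() < rate


rng = random.Random(0)  # fixed seed so the sketch is reproducible
adapted = sum(random_control_adapts("C", rng) for _ in range(10_000))
```

Because the selection ignores uncertainty, some of the adapted answers will be C+U and some C+nonU.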
Evaluation Metrics
• Learning efficiency: amount of learning achieved in a given amount of tutoring (e.g., Ringenberg and VanLehn, 2006)
  • Learning gain / total tutoring time in minutes
• User Satisfaction: subjective student perceptions of system performance as measured by survey (e.g., Baylor et al., 2003; Walker et al., 2001)
  • Total survey score
  • Score for each survey question
• For each metric:
  • 1-way ANOVA with condition as between-subjects factor
  • Paired contrast tests for each pair of conditions
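The two computations named on this slide can be sketched in a few lines. This is an illustration under assumptions: learning gain is taken to be posttest minus pretest (the slide does not define it), and the function names are hypothetical. The F statistic follows the standard one-way between-subjects formula.

```python
from statistics import mean

# Illustrative sketch only: gain = posttest - pretest is an assumption.

def learning_efficiency(pretest: float, posttest: float, minutes: float) -> float:
    """Learning gain per minute of tutoring."""
    return (posttest - pretest) / minutes


def one_way_anova_f(groups: list[list[float]]) -> tuple[float, int, int]:
    """Return (F, df_between, df_within) for a one-way between-subjects ANOVA."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = mean(x for g in groups for x in g)
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

With four conditions, `df_between` is 3, which matches the F(3, 77) reported on the results slides.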
Survey
[Figure: survey questions grouped under Tutoring, Uncertainty, and Spoken Dialogue]
Learning Efficiency Results
F(3, 77) = 3.56, p = 0.02
• Given the same amount of tutoring time, Simple Adaptation yields more student learning than either Normal Control or Complex Adaptation
• Results also hold using raw learning gain, and total number of student turns
Survey Results
F(3, 77) = 2.69, p = 0.05
• Spoken Dialogue Question 13: “It was easy to understand the tutor”
• Students perceive tutor in Simple Adaptation as hard to understand
• May reflect student confusion as to why Simple Adaptation was treating C+U answers as incorrect – students already uncertain at this point
Satisfaction-Learning Correlations
• Survey results suggest no strong student preference for either uncertainty-adaptive ITSPOKE tutoring system
• Is there a relationship between student preferences and learning?
  • E.g., subjects who prefer Complex Adaptation may learn more from it than those who don’t prefer it
  • Mixed prior results (e.g., Moreno et al., 2002; Rotaru, 2008)
• Pearson’s correlation between each user satisfaction metric and posttest (controlled for pretest) over all ITSPOKE tutors (conditions) and for each tutor
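A correlation between satisfaction and posttest that controls for pretest can be computed as a first-order partial correlation. The sketch below shows that standard formula; whether the authors used exactly this procedure is an assumption, and the function names are hypothetical.

```python
from math import sqrt

# Illustrative sketch only: using a first-order partial correlation to
# "control for pretest" is an assumption about the analysis.

def pearson(x: list[float], y: list[float]) -> float:
    """Pearson's correlation coefficient between two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_correlation(x: list[float], y: list[float], z: list[float]) -> float:
    """Correlation of x and y with the linear effect of z removed from both."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

Here `x` would hold a satisfaction score per student, `y` the posttest scores, and `z` the pretest scores. When pretest is uncorrelated with both, the partial correlation reduces to the plain Pearson correlation.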
Satisfaction-Learning Correlations: Simple Adaptation
• Tutoring Question 7: “The tutor helped me concentrate.” (R = 0.482, p = 0.037)
  • Those who perceived more concentration learned more
• Uncertainty Question 12: “The tutor’s responses decreased my uncertainty about my understanding of the content.” (R = 0.432, p = 0.065)
• Simple Adaptation “works”: even if not most preferred overall, it is decreasing uncertainty while increasing learning
Discussion
• Why didn’t Simple Adaptation and Complex Adaptation outperform Random Control?
  • Random Control adapted to some C+U, diminishing differences
  • Adapting to C+nonU may increase certainty
• Why didn’t Complex Adaptation outperform Simple Adaptation?
  • Complex Adaptation’s feedback and content elements may differ in effectiveness
  • Complex Adaptation’s human-based content responses were based on frequency, not effectiveness
Conclusions
• Adapting to student uncertainty during wizarded computer tutoring improves learning efficiency and user satisfaction
• Simple Adaptation improved learning efficiency and showed a positive correlation between learning and student perception of decreased uncertainty
• Complex Adaptation showed a trend for improvement on user perception of tutor response quality
Current and Future Work
• User Modeling (Interspeech 2009) and Metacognitive data analysis
• Investigate other approaches for developing complex uncertainty adaptations
  • reinforcement learning
  • dialogue act-learning correlations
• Replicate analysis using recently collected data from fully automated ITSPOKE
Questions? Further Information? web search: ITSPOKE Thank You!
Two Uncertainty Adaptations
• Simple Adaptation: For CU, IU, InonU answers:
  • ITSPOKE gives same content with same dialogue act
  • ITSPOKE gives feedback on (in)correctness
• Complex Adaptation: Depending on whether the answer is CU, IU, or InonU:
  • ITSPOKE gives same content but varies dialogue act
    • Based on human tutor responses significantly associated with CU, IU, InonU answers
  • ITSPOKE gives complex feedback on affect and (in)correctness
    • Based on empathetic computer tutor literature (Wang et al., 2005; Hall et al., 2004; Burleson et al., 2004)
Two Uncertainty Adaptations
• Tutoring Theory: Uncertainty and Incorrectness both signal a Learning Impasse: opportunity to better learn concept (VanLehn et al., 2003)
• Uncertainty indicates impasse perceived, so rank correctness (C, I) + uncertainty (U, nonU) states in terms of impasse severity:
    State:     InonU   IU     CU      CnonU
    Severity:  most    less   least   none
• Adaptation Hypothesis:
  • ITSPOKE already provides additional content to resolve I impasses (IU, InonU), but it ignores one type of U impasse (CU)
  • Performance improvement if ITSPOKE provides additional content to resolve all impasses
Satisfaction-Learning Correlations
• Normal: “The tutor worked the way I expected it to.” (R = -0.382, p = 0.096)
  • Those who perceived a hard time using system learned more
• Random: “It was easy to learn from the tutor.” (R = 0.401, p = 0.089)
  • Those who perceived an easy time using system learned more
• Simple: “The tutor helped me to concentrate.” (R = 0.482, p = 0.037)
  • Those who perceived more concentration learned more
• Simple: “The tutor’s responses decreased my uncertainty about my understanding of the content.” (R = 0.432, p = 0.065)
  • Simple “works”: even if not most preferred overall, it is decreasing uncertainty while increasing learning
Efficiency (TOT) Differences F(3, 77) = 0.774, p = 0.51