Computer Supported Collaborative Learning. Language Technologies Institute Carnegie Mellon University. Wednesday, March 18, 2009: Speech and NLP for Educational Applications. Research Group. Rohit Kumar. Our work: Questions. Conversational Agents (Among other things)
Computer Supported Collaborative Learning Language Technologies Institute Carnegie Mellon University Wednesday, March 18, 2009: Speech and NLP for Educational Applications
Research Group Rohit Kumar
Our work: Questions • Conversational Agents (among other things) • To support human users at various tasks • What kind of tasks? • Supporting Learning tasks • What kind of support? (Agents of course, but…) • What role? • Manifestation • How do we make sure the support is getting through? • What kind of users? • Individuals / Pairs / Groups? • Students? Teachers/facilitators? • What environment? • How do we build this support? • How do we evaluate if the support helps?
Our focus today: Rohit Kumar, Carolyn P. Rosé, Yi-Chia Wang, Mahesh Joshi, Allen Robinson. Tutorial Dialogue as Adaptive Collaborative Learning Support. Artificial Intelligence in Education 2007. Particularly interesting: this paper marks the transition of the CycleTalk project from Tutorial Dialog to CSCL
CycleTalk • CyclePad (Forbus et al., 1999) • Thermodynamic cycles simulation environment • Designed to engage students in engineering design • Exploratory learning • CycleTalk Goals: • Support engineering students learning the principles involved in designing thermodynamic cycles • Study the benefits of tutorial dialogue in an exploratory learning context
CycleTalk: History • Rosé et al., CycleTalk: Towards a Dialogue Agent that Guides Design with an Articulate Simulator, 2004 • Cognitive Task Analysis • Task steps (from the diagram): Build cycle; Explore relationships between parameters; Assume component parameters; Incorporate conceptual understanding; Generate plan to improve cycle; Investigate variable dependencies; Compare multiple cycle improvements; Compare cycle to alternatives • Observation: CyclePad’s significant pedagogical potential tends to be underutilized when students do not receive tutorial guidance
CycleTalk: History • Rosé et al., A First Evaluation of the Instructional Value of Negotiable Problem Solving Goals on the Exploratory Learning Continuum, 2005 • 3 conditions • NPSG: Human Tutoring + written material (script) • PS: Example tracing tutors (Aleven et al.) + script • S: Script only • Overall order: PS < S < NPSG • Significant difference between NPSG and PS • Marginal difference between PS and S • By-product: Conversational data collected from human tutoring • Difference between tutors • One of the tutors was very impatient and terse. Most of his students learnt less than expected from the pre-test! • Conclusion: the content and presentation of the tutorial dialog are critical to success
CycleTalk: History • Since we had some conversational data now: • Analysis of Topics • Using InfoMagnets (Arguello et al., 2006) • 15 topics identified • Each human-tutoring log covered between 4 and 11 of these topics • Correlation between coverage of these topics and learning in the human-tutoring data • So, let’s get the machine to do the talking.
CycleTalk: History • Developing a Tutorial Dialog System for CycleTalk • 12 Knowledge Construction Dialogs (KCDs) authored based on the topics identified • 7 related to optimization relationships • 5 related to theoretical concepts • Authored for TuTalk • When do we bring these up?
Tutor: Let’s stop for a minute and reflect on what we have seen. Having seen how changing T-Max (i.e., the Maximum Temperature at the Heater Outlet) affects the efficiency of the cycle, do you think it is a good idea in general for the Maximum Temperature to be increased and kept high?
Student: yes
Tutor: Okay. Let’s think now about one important consequence of keeping T-Max high. What happens to the steam quality at the turbine exit if you increase the maximum temperature of the cycle?
Student: Increases
Tutor: Right. As increasing the temperature makes the efficiency and steam quality better, we should increase the maximum temperature of the cycle. But can we keep on increasing the maximum temperature at the heater outlet indefinitely, or are there limitations to keep in mind?
Student: there are limitations such as material constraints
Tutor: Great. we should try to keep T-Max (i.e., the maximum temperature at the Heater outlet of the cycle) as high as possible without risking safety concerns or exceeding the maximum temperature the material can withstand. Keeping T-Max high increases efficiency and keeps the steam quality high, which are both important. This is an important principle to keep in mind while optimizing the efficiency of your cycle.
CycleTalk: History • Developing a Tutorial Dialog System for CycleTalk • 12 Knowledge Construction Dialogs authored based on the topics identified • 7 related to optimization relationships • 5 related to theoretical concepts • Authored for TuTalk • Integrated with the example tracing tutor at relevant nodes • So, instead of hints, a window with the dialog agent would pop up • Not a clean integration
CycleTalk: History • Ran an experiment with our first CycleTalk with Tutorial Dialog • Kumar et al., Evaluating the Effectiveness of Tutorial Dialogue Instruction in an Exploratory Learning Context, 2006 • 3 conditions • S: Script only • PSHELP: PS + Tutorial Dialog triggered in place of some hints • PSSUCCESS: PSHELP + Tutorial Dialog triggered on successful completion of certain trace nodes • Effect size • CMU: 0.35σ comparing PSHELP to PSSUCCESS • USNA: 0.25σ comparing S to PSSUCCESS • Average KCD launches • PSHELP: 1.8, PSSUCCESS: 2.7 • Human Tutoring: 4 to 11
CycleTalk: Fall 2006 • Collaborative Learning Setup: Implementations • Interaction Model: Keep the students more engaged • Hinting prompts, e.g. “Try to think of an idea related to manipulating a property of the pump.” • Every one minute (too many!) • Targeted prompting (based on contribution rate) • Dynamic/Adaptive support for collaboration vs. Scripting • Topic Filter for triggering of dialogs • Trained models to classify turns into topics (TagHelper) • Training data from Human-Tutoring corpus (Rosé et al., 2005) • 2-step classification • Topic worthiness: SVM (Q: How do we know if worthy?) • Topic labeling: TF-IDF scoring • Chatting software • Agent as an observer/participant in chat
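The two-step topic filter on this slide can be sketched roughly as follows. This is a minimal illustration, not the TagHelper models used in the study: the topic descriptions in `TOPIC_DOCS` are made up, the worthiness check is a simple vocabulary-count stand-in for the SVM, and all function names are hypothetical. Only the second step (scoring a turn against each topic with TF-IDF weights) follows the technique named on the slide.

```python
import math
import re
from collections import Counter

# Hypothetical topic descriptions (assumption); the real system was trained
# on the human-tutoring corpus.
TOPIC_DOCS = {
    "t-max": "maximum temperature heater outlet efficiency steam quality",
    "p-max": "maximum pressure heater inlet efficiency steam quality",
    "reheat": "reheat cycle second turbine steam quality",
}

def tokenize(text):
    """Lowercase the text and keep alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_idf(docs):
    """Smoothed inverse document frequency over the topic descriptions."""
    n = len(docs)
    df = Counter()
    for text in docs.values():
        df.update(set(tokenize(text)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}

def tfidf_vector(text, idf):
    """Term frequency times IDF, restricted to the known vocabulary."""
    counts = Counter(t for t in tokenize(text) if t in idf)
    return {t: c * idf[t] for t, c in counts.items()}

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def is_topic_worthy(turn, idf, min_known_terms=2):
    """Step 1 stand-in for the SVM: require a few in-vocabulary terms."""
    return sum(1 for t in set(tokenize(turn)) if t in idf) >= min_known_terms

def label_topic(turn, docs=TOPIC_DOCS):
    """Step 2: return the best-scoring topic, or None if the turn is not worthy."""
    idf = build_idf(docs)
    if not is_topic_worthy(turn, idf):
        return None
    turn_vec = tfidf_vector(turn, idf)
    scores = {name: cosine(turn_vec, tfidf_vector(text, idf)) for name, text in docs.items()}
    return max(scores, key=scores.get)
```

A turn such as "what if we raise the maximum pressure at the heater inlet" passes the gate and is labeled with the pressure topic, while off-topic chat returns `None` and triggers no dialog.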
CycleTalk: Procedure • 1. 15 min: CyclePad training (led by experimenter) • 2. 70 min: Work through material on domain content and using CyclePad • 3. 15 min: Pre-Test • 42 multiple choice, 8 open response questions • Contest announcement • 4. 25 min: Review and write notes about what students learnt into the chat window • 5. 10 min: Planning/Synthesis of 2 design plans • 6. 25 min: Implementation of designs in CyclePad • 7. 15 min: Post-Test • Questionnaire (Pairs only)
CycleTalk: Experimental Design • Manipulation in step 4: Review and write notes about what students learnt into the chat window • 3x2 Full factorial design • Support • None (N): No support • Static (S): Written material (Script) • Dynamic (D): Adaptive dialog agent • Collaboration: Alone (I) / Pair (P) • CMU Sophomores: 87 students over 4 days
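As a small illustration of what "3x2 full factorial" means here, the sketch below enumerates every combination of the two factors from the slide. The variable and function names are illustrative assumptions only.

```python
from itertools import product

# Factor levels as listed on the slide.
SUPPORT = ["None", "Static", "Dynamic"]
COLLABORATION = ["Individual", "Pair"]

def factorial_conditions(*factors):
    """Return every combination of factor levels (a full factorial design)."""
    return list(product(*factors))

conditions = factorial_conditions(SUPPORT, COLLABORATION)
for support, collaboration in conditions:
    print(f"{collaboration} + {support} support")
```

Three support levels crossed with two collaboration levels give six experimental conditions.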
CycleTalk: Outcome Metrics • Pre/Post Tests • Objective type questions • Open response questions • Ability to design and implement an efficient Rankine cycle • Questionnaire
CycleTalk: Results: Collaboration • Effect size: 0.4σ (Q: Does this mean significant?) • Reflection in pairs is more effective than reflection alone
CycleTalk: Results: Support • Positive effect of Dynamic support • Dynamic Support > No support • Effect size: 0.7σ
CycleTalk: Results: Combined • Marginal Interaction, p=.07 • Pair+Dynamic > Individual+No Support, 1.2σ • Pair+Static > Individual+No Support, .9σ • Individual+Dynamic > Individual+No Support, 1.06σ
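The effect sizes on these results slides are reported in units of σ, i.e. as standardized mean differences. A minimal sketch of one common way to compute such a value (Cohen's d with a pooled standard deviation) is given below. It is an assumption that this matches the exact computation behind the slides, which may have used ANCOVA-adjusted means. The function name is hypothetical.

```python
import math
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)
```

An effect size describes the magnitude of a difference. Whether that difference is statistically significant also depends on the sample size, so the two are reported separately.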
CycleTalk: More Results • Open response questions • Advantage for dialog based support, but not collaboration • Effect size: 0.5σ (Simpler ANCOVA model) • Practical Assessment • No significant differences • 91% of students built one fully defined cycle • 64% built two • Questionnaire • Students with dynamic support rate it high on benefit, but low on engagement when in pairs • Collaboration & Dynamic support not working together? • Desirable Difficulty? (Ref: Robert Bjork)
CycleTalk: Observations • Individual interaction with agents • Highly tutor directed • Students rarely ignore tutor prompts • More complicated dynamics with pairs • Students talk around tutor agent (Q) • Tutor agent is an interruption • Students treat agent contributions like hints • Evidence that they read although they often don’t respond • Frustration
AIED 2007: Your questions (Q) • Is this reflection? • Dialogs were about material already given to them, which they review in step 2 • Were students receiving instruction twice? • Multi-Party Turn-Taking • This was a problem back then • Recent attempts to improve this • And more needs to be done… • Why do students ignore the Tutors? • Pairwise Tukey post hoc analysis • A pair differs significantly if (difference of means) / SE > q_critical
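The pairwise Tukey criterion on this slide can be sketched as below. This is a simplified illustration that assumes equal group sizes and takes SE as the square root of the within-group mean square divided by the group size. The critical value `q_critical` is not computed here and must be looked up in a studentized range table for the number of groups and the error degrees of freedom. The function name and data layout are hypothetical.

```python
import math
from itertools import combinations
from statistics import mean

def tukey_pairs(groups, q_critical):
    """Compare every pair of groups with the studentized range statistic.

    groups: dict mapping condition name to a list of scores (equal sizes).
    q_critical: value taken from a studentized range table (assumption:
    supplied by the caller, not derived here).
    Returns {(a, b): (q, is_significant)}.
    """
    sizes = {len(scores) for scores in groups.values()}
    if len(sizes) != 1:
        raise ValueError("this sketch assumes equal group sizes")
    n = sizes.pop()
    k = len(groups)
    means = {name: mean(scores) for name, scores in groups.items()}
    # Within-group (error) mean square from the one-way layout.
    ss_within = sum((x - means[name]) ** 2 for name, scores in groups.items() for x in scores)
    mse = ss_within / (k * n - k)
    se = math.sqrt(mse / n)
    results = {}
    for a, b in combinations(groups, 2):
        q = abs(means[a] - means[b]) / se
        results[(a, b)] = (q, q > q_critical)
    return results
```

Each pair of conditions gets its own q statistic, so the analysis can report that one pair differs while another does not.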
That was Fall 2006: Since then… • Making the agent more “Social” • Kumar et al., SLaTE 2007 • Engaging students through small talk • Promoting Collaboration • Unpublished Manuscript (Kumar et al.) • Attention Grabbing • Chaudhuri et al., AIED 2009 • Pointers • Basilica • Architecture for developing conversational agents • Virtual Environments • ConcertChat • SecondLife • More than 2 students? • Recent work in a freshman Mechanical Engineering class
Kumar et al., SLaTE 2007 • Mathematics Problem Solving • Fractions • Addition, Subtraction, Multiplication, Division • CTAT Problem Solving Interface with Cognitive Tutor based feedback [Aleven, Koedinger] (shared by students) • Conversation Interface (shared by students & conversational agents)
Kumar et al., SLaTE 2007: Social Strategy
Tutor: Student1, If you had to choose between a long flight and a long car ride, which seems less uncomfortable?
Student1: I’ll take the car ride
Tutor: Ok Student1.
Tutor: Student2, Which are more entertaining – books or movies?
Student2: definitely books!
Tutor: Ok Student2.
Tutor: Please work on the problem shown in the side panel.
• Motivated by the intention of engaging students • By showing interest in their personal preferences • Goal of this social conversation is to make the students feel that they worked together to construct the problem statement • Comes up every time the students are about to start solving a new problem • Related: (Bickmore & Cassell) • Small talk by the Embodied Conversational Agent REA
Example: Jan packed several books to amuse herself on a long car ride to visit her grandma. After 1/5 of the trip, she had already finished 6/8 of the books she brought. How many times more books should she have brought than what she packed?
Kumar et al., SLaTE 2007: Questionnaire • Significant (Effect Size = 1.15): Students perceived higher help offering by their partners in the Experimental condition • Significant (Effect Size = 1.18): Students perceived they offered more help to their partners in the Experimental condition
Kumar et al., SLaTE 2007: Results • Observations from Conversation Analysis • Average number of Help Provisions not significantly different across conditions • More help related episodes per problem in the Experimental condition: Mean (Control) = 0.30, Mean (Experimental) = 0.69, F(1, 15) = 16.8, p < 0.001 • More episodes of Deny Help in the Control condition: Mean (Control) = 40.2, Mean (Experimental) = 24.7, F(1, 62) = 3.46, p = 0.001 • Students displayed more negative attitude in the Control condition • Insults (“you stink”, “stupid”) occurred only in the Control condition
CycleTalk Fall 2007: Instructive Conversation with Attention Grabbing • Two motivational prompts, at the 2 minute mark and at the 30 minute mark • “Solving this problem can be a lot of fun if you cooperate well together” • “Winning isn’t everything. Don’t worry. Be Happy.” • Attention grabbing prompt • “Now might be a good time for reflection.” • Instructional tutor turns • “Consider the effect of increasing Qin which is heat input to a cycle. First, let’s consider what happens to power out when Qin is increased. What happens to power out when steam is cooled?” • “Power is generated when steam is cooled. If there is more heat input, will there be more or less potential for cooling?” • “There will be more potential for Cooling. If there is more potential for cooling, is there more or less potential for power generation?” • Student (St13BP) turns • “well the idea is to increase the heat in” • “because that increases power”
CycleTalk Fall 2007: Conversation Analysis • More Attention to the Tutor • More Relevant Responses • Numbers here are: average percentage of instructional turns each student responded to
CycleTalk Fall 2007: Conversation Analysis
CycleTalk Fall 2008 (Chaudhuri et al. ‘09)
B6: so i think the best thing to go with first is to increase the pressure entering the connector
Tutor: You might want to review what you saw earlier in the sensitivity analysis investigating the effect of changing P-max on steam quality and efficiency, from the top half of page 13 in your book.
Tutor: type HELP WITH CHANGEPMAX if you want to discuss it with me.
B1: help with changepmax
Tutor: Remember the sensitivity analysis you have completed where you manipulated P-Max and observed it's effect on the efficiency of the cycle. You can refer to the figure that I have pasted onto your whiteboard for your convenience. Having seen how changing P-Max, the Maximum Pressure at the Heater Inlet, effects efficiency and Steam, do you think as a general principle the Maximum pressure should be increased ?
B6: Yes
Tutor: So the Efficiency of a Cycle increases by increasing the maximum pressure of the cycle. In addition to the effect on efficiency, what happens to the steam quality at the turbine exit if you increase the maximum pressure of the cycle ?
B6: Decreases
Tutor: So let's think for a minute about these competing effects: Increasing P-Max makes the efficiency better but the steam quality degrades. Is the steam quality at the turbine exit acceptable at the maximum allowable pressure within safety limits?
CycleTalk Fall 2008: Experiment • Manipulation (2x2): • Pointer Hints: Yes/No • Dialog Support: Yes/No • Results • Higher learning gains for the Pointer+Dialog condition • Pointer+Dialog vs Dialog: 0.8σ • Pointer+Dialog vs None: 0.6σ • Pointer vs None: 0.35σ • Fewer dialogs in the Pointer+Dialog condition than in the Dialog-only condition • Are too many dialogs distracting?
Basilica • Multi-Expert model for building conversational agents • Terminology: • Components, Actors, Filters, Events, Connections • Actors: Generate user perceivable events • Filters (everything else, mostly): Interpret events generated by other components • Data is encapsulated as Events
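To make the terminology concrete, here is a toy sketch of components joined by connections and exchanging events. It is not Basilica's actual API: every class and method name below (`Event`, `Component`, `KeywordFilter`, `PromptActor`, `connect`, `emit`, `receive`) is a hypothetical stand-in chosen for illustration.

```python
class Event:
    """Data passed between components is wrapped in an event."""
    def __init__(self, name, payload=None):
        self.name = name
        self.payload = payload

class Component:
    """Base class: a node that can be connected to other components."""
    def __init__(self, name):
        self.name = name
        self.listeners = []  # outgoing connections

    def connect(self, other):
        self.listeners.append(other)

    def emit(self, event):
        for listener in self.listeners:
            listener.receive(event)

    def receive(self, event):
        raise NotImplementedError

class KeywordFilter(Component):
    """Filter: interprets text events and emits a launch event on a keyword."""
    def __init__(self, name, keyword):
        super().__init__(name)
        self.keyword = keyword

    def receive(self, event):
        if event.name == "TextMessage" and self.keyword in event.payload.lower():
            self.emit(Event("Launch", self.keyword))

class PromptActor(Component):
    """Actor: turns a launch event into a message the users can see."""
    def __init__(self, name, outbox):
        super().__init__(name)
        self.outbox = outbox

    def receive(self, event):
        if event.name == "Launch":
            self.outbox.append(f"Tutor: Let's talk about {event.payload}.")

# Wire one filter to one actor and feed it two chat turns.
outbox = []
topic_filter = KeywordFilter("topic_filter", "pressure")
actor = PromptActor("prompt_actor", outbox)
topic_filter.connect(actor)
topic_filter.receive(Event("TextMessage", "Should we raise the pressure?"))
topic_filter.receive(Event("TextMessage", "I am bored"))
```

Because filters and actors only communicate through events, a new behavior can be added by wiring in another component without changing the existing ones.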
Basilica (component diagram) • External: Student1, Student2, ConcertChat Server, TuTalk Server • Actors: Presence Actor, Hinting Actor, Prompting Actor, Attention Grabbing Actor, Tutoring Actor • Filters: OutGoingMessage Spooling Filter, CCText Filter, Channel Filter, TurnTaking Filter, Hinting Filter, Launch Filter, Attention Grabbing Filter, Tutoring Filter • Events: TextMessageEvent, ProduceHintEvent, LaunchEvent, TakeTutorTurnEvent, TookTutorTurnEvent, GrabAttentionEvent, AttentionGrabbedEvent, StartTutoringEvent, TutoringStartedEvent, DoneTutoringEvent • Other labels: Polling
Basilica • Novel Architecture • Re-usable components • Rapid prototypes • Easy integration of the same agent with many environments • Incremental developments of components • Meta-architecture for bringing together other dialog management components
Conversational Agents in Second Life (architecture diagram) • Middleware: Session1, Session2, Session3, Session4, Session5, … • Translation: Object 1 Internal Representation, Object 2 Internal Representation • Object 1 and Object 2: each with an Interface, a Message Receiver and a Message Queue • Links labeled HTTP
Supporting Groups • Freshman Mechanical Engineering class • Sort of repeat the process of CycleTalk • Initial data collection done with agents though • One advantage of Basilica here • Observations: (3 out of 6 sessions) • New vocabulary collection (for dynamic triggers) • Lots of Sarcasm, Teasing, Cursing, Discontent, Silliness • Some positivity too • Abuse directed at the Tutor • Opportunities identified that the next version of the tutor can use • Current unresponsiveness • Frequent questions/concepts • GRASP • Supporting teachers/facilitators for long-term group assessment
Our work: Questions • Supporting Learning tasks • What kind of tasks? • What kind of support? (Agents of course, but…) • What role? • Manifestation • How do we make sure the support is getting through? • What kind of users? • Individuals / Pairs / Groups? • Students? Teachers/facilitators? • What environment? • How do we build this support? • How do we evaluate if the support helps?
Done: Thanks for tuning in. Most questions are good questions. So, please ask.