Embodied Conversational Agents: A Case Study of Freudbot Bob Heller, PhD Athabasca University

Embodied Conversational Agents: A Case Study of Freudbot Bob Heller, PhD Athabasca University November 3, 2004

Acknowledgements
• Mike Proctor – AIML programmer
• Dean Mah – Web implementation
• Billy Cheung – Graphics, test chatter
• Lisa Jewell – Chat log analysis, content developer, test chatter
• Julianna Charchun – Chat log analysis
• Jude Onuh – AIML programmer

Embodied Conversational Agents
Definitions
• Embodiment in Conversational Interfaces: REA (Cassell et al., 1999)
• Embodied Conversational Agents (Cassell, Sullivan, Prevost, & Churchill, 2000)
  • FMTB model
Vos (2002) offers 5 features of an ECA
• Human-like appearance
• Body used for communication purposes
• Natural communication protocols
• Multimodality
• Social role

Embodied Conversational Agents
Anthropomorphic Agents
Animated Interface Agents
Animated Pedagogical Agents
Pedagogical Agent Persona
Intelligent Tutoring Systems
• AutoTutor (Graesser et al.) http://www.autotutor.org/index.htm
Chatterbots or Chatbots
• Weizenbaum’s (1966) Eliza

Embodied Conversational Agents
Why?
• Primacy of conversation
• Constructivist theory
• The Media Equation
• Persona effect
• Cognitive load

Embodied Conversational Agents
Richard Wallace and A.L.I.C.E.
• Artificial Linguistic Internet Computer Entity http://alicebot.org/
• 3-time winner of the Loebner Contest (the holy grail for chatbots) http://www.loebner.net/
• AIML – Artificial Intelligence Markup Language http://www.aimlbots.com/
• PandoraBots http://www.pandorabots.com

Embodied Conversational Agents
‘Theory’ behind ALICE
• Pattern matching
• Zipf distribution
• Iterative
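The pattern-matching idea can be illustrated with a minimal Python sketch. It is not AIML and not the ALICE engine: the categories, replies, and function names below are hypothetical, and an ordered list with first-match-wins stands in for the interpreter's own matching priorities. The one AIML convention it borrows is that "*" stands for one or more words.

```python
import re

# Hypothetical categories as (pattern, template) pairs, checked in order.
# "*" stands for one or more words, following the AIML wildcard convention.
CATEGORIES = [
    ("WHO ARE YOU", "I am Sigmund Freud."),
    ("WHAT IS *", "Tell me what you already know about {star}."),
    ("*", "Please go on."),
]


def normalize(text):
    """Uppercase the input and drop punctuation before matching."""
    return re.sub(r"[^A-Z0-9 ]", "", text.upper()).strip()


def pattern_to_regex(pattern):
    """Turn a wildcard pattern into an anchored regular expression."""
    parts = [r"(.+)" if word == "*" else re.escape(word) for word in pattern.split()]
    return re.compile("^" + " ".join(parts) + "$")


def respond(text):
    """Return the template of the first category whose pattern matches."""
    cleaned = normalize(text)
    for pattern, template in CATEGORIES:
        match = pattern_to_regex(pattern).match(cleaned)
        if match:
            # Text captured by the wildcard, if the pattern has one.
            star = match.group(1).lower() if match.groups() else ""
            return template.format(star=star)
    return ""


print(respond("What is the id?"))
```

The catch-all "*" category at the end is what keeps the bot talking when nothing specific matches; iterating on chat logs then means adding more specific categories above it.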

Freudbot 1
Why Freud?
• Initial plan of deployment
• The famous personality application
  • Emile http://www.hud.ac.uk/hhs/research/emile/emileframeset.htm
  • Shakespeare http://www.pandorabots.com/pandora/talk?botid=c6937cfb3e354738
  • Hans Christian Andersen http://www.niceproject.com/about/
  • John Lennon

Freudbot 1
Developing the AIML
• Narrative structure
• Test chatters
• How much ALICE?

Freudbot 1
Research Questions
• Is it worth it?
• Is ‘chattiness’ related to the subjective evaluation of chat experience?
• Are there individual difference variables that are related to measures of chat performance/experience?

Freudbot 1: Methodology
• Online recruitment
  • Restricted to psychology students
  • Incentive (1/30 chance at $300)
• Random assignment to bot type
• Controlled chat
  • Automatically directed to questionnaire after 10 minutes of chat
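The slide does not say how random assignment was implemented. As a minimal sketch, assuming simple equal-probability assignment at the moment a participant arrives, it could look like the following. The condition names are taken from the results slide; the function name and seed are hypothetical.

```python
import random

# Condition names as they appear on the results slide.
BOT_TYPES = ("FreudAlice", "JustFreud")


def assign_bot_type(rng):
    """Pick one bot type with equal probability for a new participant."""
    return rng.choice(BOT_TYPES)


# The seed is arbitrary and only makes this sketch repeatable.
rng = random.Random(2004)
assignments = [assign_bot_type(rng) for _ in range(67)]
counts = {bot: assignments.count(bot) for bot in BOT_TYPES}
print(counts)
```

Simple randomization of this kind does not force equal group sizes, so the two conditions can end up with different numbers of participants.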

Freudbot 1: Participants (N=67)
                                          n   Percent
Gender                       Men         12   18%
                             Women       55   82%
Age Distribution             18-22        6    9%
                             23-27       15   22%
                             28-32       11   16%
                             33-37        7   10%
                             38-42       15   22%
                             42+         13   19%
Student Status               Full-time   27   40%
                             Part-time   35   52%
                             Non-student  5    8%
Self-rated academic ability  Below avg    0    0%
                             Average     13   19%
                             Above avg   39   58%
                             Excellent   15   22%

Is it worth it?
• self-report data*
                      Would you chat again?
            Mean      Yes (n=30)   No (n=35)
Useful      2.2       2.7          1.8
Recommend   2.4       3.4          1.6
Overall     2.4       3.2          1.8
Enjoyable   2.6       3.4          1.9
Engaging    2.7       3.4          2.1
Memorable   2.8       3.6          2.2
Expansion   3.4       4.1          2.8
* 5-point scale
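The Yes/No group means on this slide can be checked against the overall means, since each overall mean should be close to the size-weighted average of the two groups. The sketch below assumes the Yes/No pairs follow the same item order as the overall means (Useful through Expansion); the function names are illustrative, and small gaps are expected because the slide reports values rounded to one decimal.

```python
def pooled_mean(mean_yes, n_yes, mean_no, n_no):
    """Size-weighted average of two group means."""
    return (mean_yes * n_yes + mean_no * n_no) / (n_yes + n_no)


N_YES, N_NO = 30, 35  # group sizes from the slide

# item: (reported overall mean, mean for "Yes", mean for "No")
RATINGS = {
    "Useful": (2.2, 2.7, 1.8),
    "Recommend": (2.4, 3.4, 1.6),
    "Overall": (2.4, 3.2, 1.8),
    "Enjoyable": (2.6, 3.4, 1.9),
    "Engaging": (2.7, 3.4, 2.1),
    "Memorable": (2.8, 3.6, 2.2),
    "Expansion": (3.4, 4.1, 2.8),
}


def max_discrepancy(ratings, n_yes, n_no):
    """Largest gap between a reported mean and the pooled group means."""
    return max(
        abs(pooled_mean(yes, n_yes, no, n_no) - overall)
        for overall, yes, no in ratings.values()
    )


for item, (overall, yes, no) in RATINGS.items():
    print(item, overall, round(pooled_mean(yes, N_YES, no, N_NO), 2))
```

Under that pairing, every pooled value falls within rounding distance of the reported mean, which supports reading the numbers in that order.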

Is it worth it?
Best Features
Interactivity                          16
Able to ask questions with answers     16
Learning about Freud & theories        13
Simplicity/ease of use                  5
Entertaining/humorous                   5
Thought provoking                       5
No good features                        5
Technological features of Freudbot      4
Potential of Freudbot                   4
Alternative learning style              3
Novelty/uniqueness of Freudbot          3
Tricking Freudbot                       2
Unpredictable                           2
Worst Features
Repetition                             33
Unable to answer questions             23
Conversation did not flow              12
Limited knowledge base                 10
User needed prior knowledge             3
User was uncertain about what to do     3
Not an effective learning tool          3
Conversation was too short              1
No sound                                1

Is it worth it?
• Chat logs
                                                        Mean   Range
Number of exchanges                                     31.0   5-82
Mean proportion of on-task responses by participant*    .60
    questions                                           .37
    comments*                                           .23
Proportion of repetitions by Freudbot                   .25
Proportion of nonsensical responses by Freudbot         .39
* correlated with a composite measure of self-rated chat experience

Chattiness?
                     FreudAlice   JustFreud
                     n=35         n=32
Useful               2.2          2.3
Recommend            2.5          2.4
Overall              2.5          2.4
Enjoyable            2.7          2.6
Memorable            3.0          2.7
Engaging             2.8          2.7
Expansion            3.3          3.5
# of Exchanges       32.2         29.7
On-task Response*    .56          .64
* significant difference between groups

Individual difference variables?
• Demographic
  • Gender
  • Age
  • Student status*
  • Self-rated academic ability
• Computer experience & self-rated skill
• Academic background
  • # of university courses
  • # of distance ed courses*
  • # of psychology courses
• Rated importance of Freud*

Individual difference variables?
• Attitudes towards technology and education
  • Positive aspects of on-line activities
  • Independent learner
  • Negative aspects of on-line activities*

Freudbot 1 Summary
• Is it worth it?
  • Worth another look
• Is ‘chattiness’ related to the subjective evaluation of chat experience?
  • ‘Chattiness’ is not the right level
  • Nass and Reeves (1998)
• Are there individual difference variables that are related to measures of chat performance/experience?
  • Some relations that make sense and others that don’t

Freudbot 2 Research Goals
1. Improve Performance
  • Fix repetition problem
  • Topic tags
  • More content
2. Replication
3. Instructional Set
4. Future Development

Freudbot 2: Methodology
http://psych.athabascau.ca/html/Freudbot/test.html
• Online recruitment, incentive, & controlled chat identical to Freudbot 1
• Random assignment to instructional set
• Similar questionnaire with additional questions on applications and improvements

Participants (N=55)
                                          n   Percent
Gender                       Men         10   18%
                             Women       45   82%
Age Distribution             18-22        7   13%
                             23-27       17   31%
                             28-32        7   13%
                             33-37       11   20%
                             38-42        6   11%
                             42+          7   13%
Student Status               Full-time   26   47%
                             Part-time   28   51%
                             Non-student  1    2%
Self-rated academic ability  0-50         0    4%
                             50-65        2    4%
                             66-79       11   20%
                             80-89       30   55%
                             90+         10   18%

Improvement?
• self-report data (5-point scale)
                                          Would you chat again?
              Freudbot 1   Freudbot 2     Yes (n=37)   No (n=18)
Useful**      2.2          3.0            3.3          2.4
Recommend**   2.4          2.9            3.4          1.7
Overall**     2.4          3.0            3.4          2.2
Enjoyable     2.6          3.0            3.3          2.3
Engaging**    2.7          3.1            3.5          2.2
Memorable     2.8          3.1            3.6          2.1
Expansion**   3.4          4.1            4.4          3.3
** statistically significant

Improvement?
• Chat logs
                                                        Mean   Range
Number of exchanges                                     28.4   3-115
Mean proportion of on-task responses by participant*    .90
    questions                                           .36
    comments                                            .48
Proportion of appropriate responses by Freud            .60
* correlated with a composite measure of self-rated chat experience

Replication?
• Demographic
  • Gender*
  • Age
  • Student status*
  • Self-rated academic ability
• Computer experience
• Academic background
  • # of university courses
  • # of distance ed courses
  • # of psychology courses
• Rated importance of Freud*

Replication?
• Attitudes towards technology and education
  • Positive aspects of on-line activities
  • Independent learner
  • Negative aspects of on-line activities*

Instructional Set?
                    Brief Set   Elaborate Set
                    n=27        n=28
Useful              3.1         2.9
Recommend           2.8         2.9
Overall             2.9         3.1
Enjoyable           2.9         3.0
Memorable           3.2         3.0
Engaging            3.0         3.3
Expansion           3.9         4.2
# of Exchanges      25.3        31.3
On-task Response    .90         .90

Future Development?
Freudbot Improvements    Mean*
Chat behaviour           4.2
Audio response           3.1
Voice recognition        2.6
Synchronization          2.5
Animation/movement       2.3
Other Applications       Mean*
Practice quizbot         4.1
Famous personality       4.1
Course content           3.4
Chatroom                 3.3
Course admin             3.2
* 5-point scale

Freudbot 2: Summary
1. Improvement - yes, but clearly room for more
2. Replication - some
3. Instructional Set - no effects
4. Development

Future Direction
• Haptek Freud
  • Animacy/agency hypothesis http://psych.athabascau.ca/html/Freudbot/haptek.html
• Piagetbot (Support from MCR)
  • Learning outcomes
• Skinnerbot (Lyle Grant)
• Coursebot
• Quizbot

Questions?
