Explore cooperation in dialogue systems through user behavior interaction and assessment in various focus conditions. Study includes error-resolution techniques, phonetic analysis, emotional outcomes, and human-machine interaction dynamics.
Evoking and Assessing Cooperation in Dialogue (Systems)
Talk in the EGK colloquium, Dec 3, 2001
Bettina Braun
Content • Introduction: what I did so far... • What is cooperation? • in general terms • in phonetic terms • Plans from now on: user behaviour in interaction with dialogue systems and human operators • Experiment: cooperative operators, ... Bettina Braun, Institute of Phonetics
So far: Error resolution in computer-directed speech • Hyperarticulation to correct machines • Increased error rates in ASR systems • Find solutions for ASR to cover changes in • F0 movement • intensity/spectral tilt • duration • segmental realisation, e.g. formant values
Pilot study in speech-operated lift • Evoke corrections in several focus conditions, triggered by the system’s output • broad focus • narrow focus • contrastive focus • Broad focus not possible in that domain! • Distinction between contrastive and narrow focus very vague • Signal quality too poor for phonetic analysis (noise, reverberation, off-talk, ...)
Loophole -- possible ways to go... • New recordings with a more elaborate system to evoke hyperarticulation in several contexts (and focus conditions) • already quite thoroughly analysed • Besides hyperarticulation, various forms of “emotions” and attitudes (uncertainty, anger), e.g. calling contour when addressing the lift • Analysis of global behaviour of users, because...
Communicative situation • Computer-directed speech differs from “normal” conversation (back-channels, interruptions, visual co-presence) • Conceptual processing is partner-dependent (e.g. mutual beliefs, grounding) • Error resolution demands cooperation! • Emotional component (relationship to computer, with respect to greetings, politeness, etc.)
Now: Analysis of human behaviour • Interest in human behaviour when interacting with machines (as opposed to human partners) • How would “human operators” act? • How can cooperation be evoked in dialogue? • How can cooperation be assessed (in dialogue systems)?
“Definition” of Cooperation (Allwood 1976) • Take partner into cognitive consideration • Joint purpose (of understanding) • Ethical consideration • not to hurt each other • not to force each other • facilitate rational behaviour • Trust that the partner acts in accordance with the above points
Phonetic Realisation of Cooperation • Expressing degree of “understanding”, either by appropriate dialogue act or by prosody • cinema info: Sie möchten in N. den Film “..” SEHEN? (You would like to SEE the film “..” in N.?) • train timetable: Wann möchten Sie von SB nach HH fahren? (When would you like to travel from SB to HH?) • Expressing what part of the information was not understood • Shift focus to appropriate position • Deaccent understood information • Importance?
Existing dialogue systems (I) • Technique: finite-state automata; transitions and speech output (mainly canned speech) determined by keyword or key-phrase spotter • obligatory mapping to items in lexicon • Difference in how information slots are filled • Barge-in not always allowed • Speech output not prosodically adapted with respect to • given/new information • certainty in checks/query-YN
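The finite-state, keyword-spotting technique named on this slide can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a description of any deployed system: the prompts are borrowed from the banking WoZ setup later in the talk, and the names `ACTIONS`, `spot`, `state_of` and `step` are hypothetical.

```python
import re

# Hypothetical lexicon: every spotted keyword must map to a lexicon item.
ACTIONS = {"withdraw": "withdraw", "deposit": "deposit"}

# Canned prompts, one per state (wording taken from the banking WoZ setup).
PROMPTS = {
    "ASK_ACTION": "Deposit or withdraw?",
    "ASK_AMOUNT": "How much {action}?",
    "CHECK": "You want to {action} DEM {amount}?",
}


def spot(utterance):
    """Keyword spotter: return (action, amount); each is None if not found."""
    words = re.findall(r"[a-z]+|\d+", utterance.lower())
    action = next((ACTIONS[w] for w in words if w in ACTIONS), None)
    amount = next((int(w) for w in words if w.isdigit()), None)
    return action, amount


def state_of(slots):
    """The automaton state is determined by which slots are still empty."""
    if "action" not in slots:
        return "ASK_ACTION"
    if "amount" not in slots:
        return "ASK_AMOUNT"
    return "CHECK"


def step(slots, utterance):
    """One transition: fill slots from spotted keywords, return (state, prompt)."""
    action, amount = spot(utterance)
    if action is not None:
        slots["action"] = action
    if amount is not None:
        slots["amount"] = amount
    state = state_of(slots)
    return state, PROMPTS[state].format(**slots)
```

Note that the prompt is chosen from the state alone, so nothing in this sketch adapts the output to given/new information or to how certain the spotter was, which is the limitation the slide points at.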
Existing dialogue systems (II) • Often (?) successful interaction, because • users are very adaptive • only limited domain applications • BUT: user satisfaction also important • certainty of checks (which check wrong info) • information packaging mostly inappropriate • user frustration ...
How to deal with frustrated users? • Automatic detection of anger to pass over to human operator • very difficult task (cf. Batliner et al. 2000) • Different strategies of clarification initiations (e.g. apologies by the system, cf. Fischer 2000) • Approach here: Find out which “system behaviour” is judged positively • appropriate • cooperative
Overall experimental design • Investigate user behaviour in 4 conditions • WoZ with synthesised speech output • WoZ with “canned” speech • human operator with restricted set of utterances • (human operator with free interaction) • Ask subjects to judge appropriateness of each “system’s” behaviour (set of subjects!!!)
Experimental design (overview): How do human “bankers” behave? • Production study with 12 human operators • studio recordings of all possible utterances (16) • interactive condition with restricted set of utterances and customers believed to be real • Evaluation of dialogue structure • especially check vs. query-YN • certainty/uncertainty • after previous correction by customer • Perception test
Why this setting? • Banking scenario very familiar • Domain small, but extendable (transfers) • Studio recordings as “no context condition” • Restricted set of utterances to simulate real systems and to exclude other factors that signal cooperation • adaptation to lexical and syntactic forms • adaptation to utterance length • different customers to exclude adaptation to voice characteristics, F0, ...
Collecting and preparing “customer” data • Limited domain WoZ system with bad performance • System utterances (from the dialogue diagram): Pardon? / Deposit or withdraw? / How much withdraw? / How much deposit? / Check amount & action / Please take your money / Please put money in case • Data collection: 5 customers (4 tasks) • Parts of these signals were distorted • amount of money (I want to withdraw 2x DEM) • action: withdraw or deposit (I want to xxx 5000 DEM)
Constructing possible dialogues... • Dialogue interaction with 7 customers • Things to consider: • (re)action of operator must be predictable • customer sound files must be prepared for each possible reaction • customer’s utterances must fit (prosodically) into the dialogue (e.g. hyperarticulation in corrections) • Problem: customer reactions for free interaction cannot be simulated!
Instruction of operators as bank assistants • Utterances restricted to a fixed set: Pardon? / Deposit or withdraw? / How much withdraw? / How much deposit? / Check amount & action / Please take your money / Please put money in case
Stimuli and (expected) reactions • Amount not mentioned/distorted • “I want to withdraw money” • “I want to withdraw xxxx” • How much do you want to withdraw? (but with differing intonation) • “I want to withdraw DEM 2xx” • You want to withdraw DEM 2000 (200), then? (query-YN-like, because of uncertainty) • Inappropriate intonation • Certain check
Stimuli and (expected) reactions • Action distorted • “I want to xxx money” • “I want to xxx DEM 5000” • Do you want to deposit or withdraw money? (“Do you want to deposit or withdraw DEM 5000?” is not a permitted utterance) • How much do you want to deposit/withdraw? • Pardon?
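One possible reading of the stimulus/reaction pairs on the last two slides, written as a small decision function. This is a sketch under assumptions: the function name `expected_reaction` and its argument encoding are hypothetical, a partly understood amount is treated like a fully understood one, and “Pardon?” as well as the intonation differences the slides care about are not modelled.

```python
def expected_reaction(action, amount):
    """Pick the operator utterance expected for a (possibly distorted) request.

    action: "withdraw" or "deposit", or None if the action was distorted.
    amount: an int, or None if the amount was missing or distorted.
    """
    if action is None:
        # The restricted set has no utterance that echoes the amount here,
        # so the question is the same whether or not an amount was heard.
        return "Do you want to deposit or withdraw money?"
    if amount is None:
        return f"How much do you want to {action}?"
    # Both slots filled: check them before acting.
    return f"You want to {action} DEM {amount}, then?"
```

For example, `expected_reaction(None, 5000)` yields the deposit-or-withdraw question, and a follow-up with the action resolved but the amount unknown yields the how-much question.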
Stimuli and (expected) reactions • Check after user correction • apologising impression • Query-W after user correction • action distorted: shift from question particle to deposit or withdraw • Repeated misunderstanding • change intonation of “pardon?” (did not happen) • add explanation to “pardon?”, react with a check
Reasons for not being cooperative • Humans are not cooperative (cf. map task) • Situation too unnatural • no real customers • permitted utterances normally not actively used • Task too complicated • too little time to get used to the utterances, i.e. finding the fitting one poses problems (mapping problem) • signal quality of customers too poor • Operators indeed understood every customer
Possible solutions • Assessment of signal quality: ratings from 1 to 6 after each dialogue (=> very poor ratings) • Increase familiarity with utterances: • first: role plays with experimenter (maybe with changed roles) • first ratings of appropriateness, cooperation, etc. of other operators • two runs, discussing problems of first run in between (customers then no longer believed to be real)
Which “behaviour” is appropriate for dialogue systems? • Should systems behave like humans? • Assessment of cooperation, appropriateness, etc. under 2 conditions: • assuming that human-human interaction is judged • assuming that man-machine interaction is judged
Assessment of cooperation • User satisfaction in interaction with dialogue systems (e.g. PARADISE, M. Walker): mixed-initiative vs. system-initiative, ... • Assessing appropriateness, naturalness of synthetic speech output (cf. J. House, S. Hawkins) • How to tease dialogue behaviour and prosodic cues to cooperation apart?
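For orientation, the performance function usually given for PARADISE combines normalised task success with normalised dialogue costs. The slide names the framework only; the formula and symbols below are supplied from the PARADISE literature rather than from the talk.

```latex
\text{Performance} = \alpha \cdot \mathcal{N}(\kappa) - \sum_{i=1}^{n} w_i \cdot \mathcal{N}(c_i),
\qquad
\mathcal{N}(x) = \frac{x - \bar{x}}{\sigma_x}
```

Here \kappa is the kappa coefficient measuring task success, the c_i are dialogue cost measures (e.g. number of turns or repairs), \alpha and w_i are weights estimated by regressing user satisfaction ratings on these measures, and \mathcal{N} is z-score normalisation. Prosodic cues to cooperation are not among the standard cost measures, which is one reason the last question on the slide remains open.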
Perception test I • Comparison between studio and interactive behaviour of same dialogue with respect to • cooperation, friendliness • naturalness, appropriateness of operator • only possible for selected cooperative utterances! • Interactive conditions contain a lot of extralinguistic noise (breathing, smacking, long pauses) => “clean” them beforehand • Audio examples: interactive vs. studio
Perception test II • Comparison of dialogues with same goal but different operators to evaluate different (prosodic and non-prosodic ???) strategies • too many variables: • dialectal influence • male/female • voice characteristics • Examples:
Remaining questions... • How to build a “natural”, but controllable environment for collecting cooperative operator data? • First perform a map task to investigate cooperation in human-human dialogue? (changed roles?)