User-Focused VUI Design. Susan L. Hura, PhD Principal, SpeechUsability SpeechTEK 2007. Agenda. Preliminaries Speech, Language & Computers 101 The Design Work Before the Design Heuristics for VUI Design Usability Testing. Speech, Language & Computers 101. The Speech Chain.
User-Focused VUI Design Susan L. Hura, PhD Principal, SpeechUsability SpeechTEK 2007
Agenda • Preliminaries • Speech, Language & Computers 101 • The Design Work Before the Design • Heuristics for VUI Design • Usability Testing
The Speech Chain “What’s my balance?” “She wants her balance.”
Language • “An over-learned behavior” • Exposure even before birth • Continual and immersive • Unconscious rules • And therefore unconscious expectations… • Unless those expectations are violated • A socio-cultural and linguistic phenomenon
Rules of Conversation • Mike: “Could you pass me the salt?” • Sally: “Yes.” (does nothing) • Mike’s intent: politely requesting that Sally pass him the salt • Sally’s expected response: pass the salt • Sally’s actual response is uncooperative, but logically appropriate
Rules of Conversation • In VUI design, we often force people to give uncooperative responses, requiring answers that do not advance the conversation
Sound is the Medium
Not Beads on a String… • The sounds in “balance…” are not beads on a string • But inter-connected puzzle pieces: b æ l æ
Variance Everywhere • The point of all of this: speech recognition is hard! • Meaning: “right” = “right” = “right” • Pronunciation: [raIt] [ræt] [roIt] • Acoustics
How Do We Know He Said “Right”? • Top-down knowledge • Context • Previous experience • Real world knowledge • Complex mappings from sound to meaning
How Does the ASR Engine Know He Said “Right”? • It doesn’t. • All the engine “knows” is statistics • Effects of training: labeling & transcription • [raIt] = right, [ræt] = right, [roIt] = right, etc. • Acoustics: “right” “right” “right”
What Does This Mean for VUI Design? • You are better at speech recognition than any ASR engine • Remember, for the computer, it’s all acoustics! • You can still fool a recognizer • So plan on recognition failures • Grammars matter • A grammar is the set of words or phrases you expect to recognize
Phonetic Recognition Process (diagram) • Stages shown: Caller speaks an utterance; Capture & Digitization; Spectral Representation; Phonetic Classification; Segmentation; Search & Match; “n-best” list • Knowledge sources: Acoustic Models; Vocab & Grammar • Intermediate representations: Sound Segment; Phonetic Network; Lexical Network • Example phoneme probabilities: ao .92, b .22, ae .43, eh .32, aw .51
How Speech Recognition Works • Caller responds to a prompt, e.g., “account balance” • Speech is detected through a process called endpointing • The sound is captured, digitized, and pre-processed in a variety of ways • Echo cancellation • Background noise reduction • The resulting “clean” speech signal undergoes spectral analysis (to produce a spectrogram) • The spectrogram can then be divided into acoustically distinct segments
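The endpointing step can be illustrated with a minimal sketch. The slides do not say how endpointing is implemented; the energy-threshold approach, the function names, and every parameter value below are assumptions chosen only for illustration.

```python
from typing import List, Optional, Tuple


def frame_energies(samples: List[float], frame_len: int) -> List[float]:
    """Mean squared amplitude of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(s * s for s in frame) / frame_len)
    return energies


def find_endpoints(
    samples: List[float],
    frame_len: int = 4,           # hypothetical frame size, in samples
    threshold: float = 0.01,      # hypothetical energy threshold
    min_silence_frames: int = 2,  # trailing quiet frames that end the utterance
) -> Optional[Tuple[int, int]]:
    """Return (start_sample, end_sample) of the first speech region, or None."""
    start = None
    last_loud = None
    silent_run = 0
    for i, energy in enumerate(frame_energies(samples, frame_len)):
        if energy >= threshold:
            if start is None:
                start = i          # first frame that looks like speech
            last_loud = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_silence_frames:
                break              # enough trailing silence: utterance is over
    if start is None:
        return None                # nothing exceeded the threshold
    return (start * frame_len, (last_loud + 1) * frame_len)


if __name__ == "__main__":
    signal = [0.0] * 8 + [0.5] * 8 + [0.0] * 12
    print(find_endpoints(signal))
```

A detector this simple cannot tell speech from a cough or a slammed door, which is one reason the later slides tell designers to plan for side speech, mouth sounds, and background noise.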
How Speech Recognition Works • Acoustic segments are then compared against the set of acoustic models being used • Not a simple one-to-one mapping • There are different acoustic models for each language supported by the recognizer • Even the “same” sound will be different in different languages • Acoustic models are influenced heavily by training data • Acoustic models are phonetically-based • Phonemes • Features • Diphones, triphones
How Speech Recognition Works • A number of different possible “phonetic paths” are calculated • Paths are compared with items in the grammar • Grammars contain words the user might say • Result: • N-best list: list of possible user utterances • Confidence scores: statistical likelihood for each • Likelihood based on the closeness of the match between the incoming signal and the stored representations
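The shape of the result described above (an n-best list with a confidence score per item) can be sketched as follows. This is not how any particular engine computes its scores; the `Hypothesis` type, the `n_best` function, the sample grammar, and the score values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Hypothesis:
    text: str          # a phrase the caller may have said
    confidence: float  # closeness of match, here assumed to lie in 0.0 to 1.0


def n_best(match_scores: Dict[str, float], grammar: Set[str], n: int = 3) -> List[Hypothesis]:
    """Keep only phrases that are in the grammar, best match first, at most n."""
    candidates = [
        Hypothesis(text, score)
        for text, score in match_scores.items()
        if text in grammar  # the recognizer can only return in-grammar items
    ]
    candidates.sort(key=lambda h: h.confidence, reverse=True)
    return candidates[:n]


if __name__ == "__main__":
    grammar = {"account balance", "transaction history", "transfer funds"}
    scores = {
        "account balance": 0.81,
        "a count ballads": 0.40,  # acoustically plausible, but not in the grammar
        "transaction history": 0.12,
        "transfer funds": 0.05,
    }
    for hyp in n_best(scores, grammar, n=2):
        print(hyp.text, hyp.confidence)
```

The point of the sketch is that the grammar bounds what can ever appear in the list: a phrase the designer did not anticipate cannot be returned, however clearly it was spoken.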
What We Do with the Results • Use confidence scores to determine what the application should do next • Upper confidence limit: point at which the application assumes correct recognition and proceeds to the next dialog state • Correct recognition of in-grammar utterance • False acceptance of an in-grammar utterance • False acceptance of an out-of-grammar utterance • Lower confidence limit: point at which the application assumes that no recognition is possible for this input and moves into error handling • Correct rejection of out-of-grammar utterance • False rejection of an in-grammar utterance
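The five outcomes listed above can be told apart only when you know what the caller actually said, for example from transcribed call recordings. A minimal sketch of that bookkeeping, with a hypothetical function name and labels:

```python
from typing import Optional, Set


def classify_outcome(spoken: str, recognized: Optional[str], grammar: Set[str]) -> str:
    """Label one recognition attempt.

    spoken      what the caller really said (known from a transcript)
    recognized  what the application accepted, or None if it rejected the input
    """
    in_grammar = spoken in grammar
    if recognized is None:
        # The application fell below the lower limit and went to error handling.
        return "false rejection" if in_grammar else "correct rejection"
    if not in_grammar:
        # Something was accepted even though the caller said nothing valid.
        return "false acceptance (out-of-grammar)"
    if recognized == spoken:
        return "correct recognition"
    # A valid phrase was spoken, but a different valid phrase was accepted.
    return "false acceptance (in-grammar)"


if __name__ == "__main__":
    grammar = {"account balance", "transaction history"}
    print(classify_outcome("account balance", "transaction history", grammar))
```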
What We Do with the Results • The tricky case: when the top item in the n-best list has a confidence score between the upper and lower confidence limits • Historically recommended course of action is confirmation • “I think you said ‘transaction history.’ Is that correct? Please say yes or no.” • Not always necessary or smart
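The three-way decision driven by the upper and lower confidence limits can be sketched as below. The threshold values and function names are assumptions; real limits are tuned per application and often per dialog state.

```python
def next_action(confidence: float, lower: float = 0.3, upper: float = 0.7) -> str:
    """Map the top hypothesis's confidence score to what the application does next."""
    if confidence >= upper:
        return "accept"   # assume correct recognition, go to the next dialog state
    if confidence < lower:
        return "reject"   # assume no recognition is possible, go to error handling
    return "confirm"      # the tricky middle band


def confirmation_prompt(phrase: str) -> str:
    """Confirmation wording taken from the slide's example."""
    return f"I think you said '{phrase}.' Is that correct? Please say yes or no."


if __name__ == "__main__":
    score = 0.55
    if next_action(score) == "confirm":
        print(confirmation_prompt("transaction history"))
```

Since confirmation is not always necessary or smart, a design might instead proceed without confirming when a wrong guess is cheap to undo, and reserve explicit confirmation for costly or irreversible steps.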
Then Repeat Many Times • The same process occurs every time we recognize speech • At every dialog state (every time the recognizer is listening for a response following a prompt) • For all universal commands, whenever they’re spoken • If the user has “fooled” the recognizer, the results are comical or annoying or disastrous • Side speech • Non-verbal mouth sounds • Background noise
Before the Design • Information-gathering is vital, but often neglected in project plans • A list of functions to be automated is not the sole goal of requirements gathering! • Designers alone carry all this information and must rely on multiple sources and techniques to find it
The Role of VUI in Speech Projects • A three-way intersection: the VUI sits where Business Goals, User Goals, and Technological Constraints meet
Design as Therapy • Goal of any design is creative synthesis from disparate goals, making best use of the technology • Possible only with information from all three points of the triangle • Designer must act as translator and message bearer to project sponsors
Project Overview • Information gathering phase • Design based on information; design feeds testing • Testing puts designs in front of real users and exposes issues
Ideals in Information-Gathering
Stakeholder interviews • What is the business trying to accomplish? • What is most important? • Overall customer contact strategy? • How will we know if we succeed?
User interviews • What are they trying to accomplish? • What’s the context of use? • Urgency? Critical? Private? • Other points of contact? • Overall technology comfort level
Technology plan • Abilities, limitations? • What user data do we have access to and when? • What data do we need to collect from users and when? • CTI: what data can we pass to agents?
Guerilla Tactics
Stakeholders • Initial design presentation as a method to flush out disagreements, educate sponsors about technical limitations, and prevent misunderstandings
Users • Look for surrogate users • Role play • Investigate other customer contact: website, commercials, print ads, current IVR, etc.
Technology • Make friends with a developer or telephony manager • Talk to call center staff
Design Strategy • The logical next step after requirements gathering and analysis • Definition of the “sound and feel” • All the elements that contribute to the overall user experience • The answer to the question “I have this requirements data, so what do I do with it?” • First opportunity for testing
Design Strategy as a VUI Style Guide • Design Strategy establishes a set of standards against which the VUI design team can measure every design decision • Give examples • Wording • Functionality
Strategy as a Communication Tool • Allows the VUI designer to… • Re-explain the design process • Reinforce that the time and effort spent on requirements was worthwhile • Show that each prompt is an instantiation of an overall strategy, so making any change can have repercussions • And thus may help to curb the client’s tendency to be an ‘armchair quarterback’ and modify prompts because they sound better to them • Design Strategy is often very impressive to clients • They are often not expecting it • Reinforces the expertise of the VUI design team
VUI Elements Defined • High level elements that need to be defined are: • Recognition Strategy • Directed dialogue versus ‘how may I help you’ open-ended dialogues • Sound and Feel • Persona: personal characteristics that are conveyed by the application • Style and flow of discourse, use of earcons, etc. • Information Architecture • Functionality and how it will be arranged, including how it is presented to users and how they navigate among functions • There are many smaller decisions for each of these high-level areas
Rules of Thumb • Make It Real: Be sure there is a match between the application and the real world. • It’s Not Star Trek: Clearly and consistently communicate system capabilities. • Talking & Listening: Minimize the limitations of the modality. • Give Me a Hint: Help users avoid escalating errors and recover from errors gracefully. • Warm Fuzzies: Make the caller comfortable with the technology.
Principle 1: Make It Real • Users bring in their experiences and terminology. • They have a mental model of the domain and their interaction. • Usable applications tap into this knowledge to give users a head start in understanding how to interact with them.
1.1 A Rose by Any Other Name… • Is not as easy to recognize! • Users must remember what to say. • Hunting by trial and error is frustrating and time-consuming for users.
1.1 A Rose by Any Other Name • Is the terminology easy for users to understand? • Are the branded terms familiar enough to be comfortable? • Does the system call things what users call them? • Is terminology consistent for all customer contact?
1.2 Make the Common Tasks Faster • Efficiency matters, sometimes. • The more often users perform a task, the quicker it needs to be.
1.2 Make the Common Tasks Faster • How often do users call the system? • Within a system, which tasks get requested repeatedly by the same user? • Which tasks are rarely used? • Where are speed and efficiency most important?
1.3 Getting There from Here • Dead ends are a disaster in speech applications. • Remember that users are sometimes hunting, and sometimes the system makes mistakes.
1.3 Getting There from Here • Is there a way for the user to back up to a known point in the application? • Has it been clearly provided to them?
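One way to guarantee that callers can always back up to a known point is to keep a history of dialog states and handle a small set of universal commands everywhere. The sketch below is only an illustration: the class name, the state names, and the command phrases “go back” and “main menu” are hypothetical and do not come from the slides.

```python
from typing import List


class DialogNavigator:
    """Tracks visited dialog states so the caller is never stuck in a dead end."""

    def __init__(self, start: str = "main menu") -> None:
        self._start = start
        self._history: List[str] = [start]

    @property
    def current(self) -> str:
        return self._history[-1]

    def go_to(self, state: str) -> None:
        """Advance to a new dialog state after a successful recognition."""
        self._history.append(state)

    def handle_universal(self, utterance: str) -> bool:
        """Apply a universal command; return False if the utterance is not one."""
        if utterance == "go back":
            if len(self._history) > 1:
                self._history.pop()         # return to the previous state
            return True
        if utterance == "main menu":
            self._history = [self._start]   # jump to the known starting point
            return True
        return False


if __name__ == "__main__":
    nav = DialogNavigator()
    nav.go_to("transfers")
    nav.go_to("choose account")
    nav.handle_universal("go back")
    print(nav.current)
```

Providing the mechanism is only half of the checklist above: the prompts still have to tell callers that these commands exist.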
Principle 2: It’s Not Star Trek • Clearly and consistently communicate system capabilities to the user. • Interfaces need to guide users to speak predictable utterances and avoid the unconstrained conversational speech that we use talking to another person.
2.1 Asking the Right Questions • Speech applications need to ask questions to lead users to give the right answers. • We don’t always answer exactly the question that was asked, but the other person can generally understand what we intended. • Speech applications aren’t that clever.
2.1 Asking the Right Questions • Does the application ask targeted questions? • Do the questions elicit a very limited set of likely responses? • Do they lead users to provide the “right” kind of response?
2.2 Raising Their Expectations (for a while) • “Natural Language” is here, sort of. • Statistical language models allow speech applications to categorize free-form utterances and “understand” the user, without offering a limited menu of options. • Costly to implement. • Tend to be used only for part of the automated customer interaction.
2.2 Raising Their Expectations (for a while) • Are users clearly told how they should respond in different portions of the application? • Is it natural for them to do so?
Principle 3: Talking and Listening • Conversation between people works because we share a set of rules and assumptions about talking and listening. • These rules are largely unconscious, but when they are not followed, the conversation that results is difficult to follow and uncomfortable.
3.1 Talk to Me • There are unspoken rules of conversation that tell us when it’s OK to talk. • We’ve all been conditioned to follow these rules and converse politely.
3.1 Talk to Me • Are users given clear signals of when to talk? • Do prompts phrased as statements contain a clear indication of when it’s the user’s turn?
3.2 Just the Facts, Ma’am • Listening is a difficult task. • Auditory memory is limited. • The implications for readouts in a voice user interface are substantial.
3.2 Just the Facts, Ma’am • Are there other demands on the user’s attention? • How many items can they remember? • How familiar are the terms? • How long is each item? • How long can they remember them?