Beyond Usability: Measuring Speech Application Success. Silke Witt-Ehsani, PhD VP, VUI Design Center TuVox. Outline. What is Success? Success Criteria Success Metrics Putting it all together: A health check methodology. Success vs Design How they affect each other Case studies.
Beyond Usability: Measuring Speech Application Success Silke Witt-Ehsani, PhD VP, VUI Design Center TuVox
Outline • What is Success? • Success Criteria • Success Metrics • Putting it all together: A health check methodology • Success vs Design • How they affect each other • Case studies
Success Criteria: What is “success”? • Different questions (Success Criteria) require different answers (Success Metrics). How do we do that? • Common criteria: • Are callers transferred to the correct destination? • How many callers are being helped? • How do callers like my speech applications? • What is the system recognition accuracy?
Success Metrics: Subjective vs Objective • Subjective: • Usability study • Whole call recordings • Individual caller feedback • Objective (= Application Statistics): • Automation rates • Containment rates • Non-cooperative caller rate
Success Metrics: Business vs Technical • Business metrics for business users: • Routing Accuracy • Agent Transfers • Customer Satisfaction • Technical users: • need detailed application performance at the dialog-state level • grammar coverage • NoMatch, NoInput • need the ability to drill down • Higher routing accuracy = fewer agent-to-agent transfers • More transfers out of the application = higher call center cost • Business stakeholders care about the bottom-line impact of several application and speech events
Common Business Metrics • Containment rate = “keep caller hostage in the system” • Automation rate = “offer complete functionality…” • Successful routing = “get the caller to the right expert” • Average call duration • And many, many more ….
Application Health Check - Business • The 3 main elements of a Business Health Check are: • Custom-defined success rate • Non-cooperative caller rate • Agent transfer rate: • Transfer due to explicit caller request • Transfer due to errors (both speech and system) • Transfer by design (i.e. correctly routed calls)
Example Success Metric: Routing Accuracy • Definition: confirmed routed calls (calls reaching an end destination) over all calls • Useful metric when using: • Skills-based routing • Routing application with N routing points • [Chart: % Routing Accuracy. Values shown: 85%, 77%, 68.3%. Bar labels: ~150 routing points, ~50 routing points, 4 routing points]
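The routing accuracy definition above can be sketched in a few lines of Python. This is a minimal illustration, not the TuVox implementation: the function name `routing_accuracy` and the call counts are hypothetical and are not taken from the slides.

```python
def routing_accuracy(confirmed_routed_calls: int, total_calls: int) -> float:
    """Percentage of all calls that reached a confirmed end destination."""
    if total_calls <= 0:
        raise ValueError("total_calls must be positive")
    return 100.0 * confirmed_routed_calls / total_calls


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(f"Routing accuracy: {routing_accuracy(770, 1000):.1f}%")
```

The denominator is all calls entering the application, not only the calls that were routed, which is what makes the metric comparable across applications with different numbers of routing points.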
Example: Non-cooperative Callers • Definition: the percentage of all callers who immediately hang up or request an agent without ever interacting with the application • Possible reasons: • Degree of caller acceptance of the system • Non-application-related causes, such as a wrong number, a child crying, etc. • Expected range: 5-10% of call volume • [Chart: % Non-cooperative Callers. Values shown: 8.6%, 6.3%. Labels: Open-ended Router, Directed Dialog Technical Support]
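As a rough sketch of how this definition might be computed from call logs, the Python below counts calls that ended in a hang-up or agent request without any caller interaction. The `Call` record, its field names, and the outcome labels are assumptions made for illustration; real logs will look different.

```python
from dataclasses import dataclass


@dataclass
class Call:
    interacted: bool  # assumed flag: caller gave at least one response to the application
    outcome: str      # assumed labels: "hangup", "agent_request", "completed"


def non_cooperative_rate(calls: list[Call]) -> float:
    """Percentage of all calls where the caller hung up or asked for an
    agent without ever interacting with the application."""
    if not calls:
        return 0.0
    non_cooperative = sum(
        1
        for call in calls
        if not call.interacted and call.outcome in {"hangup", "agent_request"}
    )
    return 100.0 * non_cooperative / len(calls)


def within_expected_range(rate: float, low: float = 5.0, high: float = 10.0) -> bool:
    """Check a rate against the 5-10% range quoted on the slide."""
    return low <= rate <= high
```

A rate well above the quoted range would suggest looking at caller acceptance of the system rather than at individual dialog states, since these callers never reach one.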
Example: Agent Transfers • Definition: % agent transfers of all calls • Applications tend to have many different types of agent transfers. • Main categories: • Customer zeroing out • Routing to an agent based on caller information is a “Designed Transfer” • Routing due to some logic in the application is a “Necessary Transfer” • Agent transfers have an immediate impact on call center cost • [Chart: % Agent Requests. Values shown: 45%, 4.7%. Example from a telecommunications company]
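Because the transfer categories have different causes, it can help to report each one as a share of all calls. The following Python sketch assumes each transferred call has been tagged with one reason; the tag strings and the function name are hypothetical.

```python
from collections import Counter


def agent_transfer_breakdown(transfer_reasons: list[str], total_calls: int) -> dict[str, float]:
    """Return each transfer category as a percentage of all calls.

    transfer_reasons holds one (assumed) tag per transferred call,
    e.g. "zero_out", "designed", or "necessary".
    """
    if total_calls <= 0:
        raise ValueError("total_calls must be positive")
    counts = Counter(transfer_reasons)
    return {reason: 100.0 * n / total_calls for reason, n in counts.items()}


def overall_transfer_rate(breakdown: dict[str, float]) -> float:
    """Overall % agent transfers of all calls (sum of the categories)."""
    return sum(breakdown.values())
```

Splitting the rate this way separates transfers the design intends from callers zeroing out, which is the part tuning can usually reduce.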
Baseline and Trending • Numbers are relative; they only have meaning in context. • When defining success metrics, create a baseline, then compare against it. • Potential baselines: • Previous touch-tone IVR application • Go-live performance • [Chart: Customers finding speech easier or much easier than IVR. Values shown: 76%, 66%, 52%. Labels: Usability, Go-live, Tuning 1]
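One simple way to put a metric in context is to report every later measurement as a difference from the chosen baseline. The Python below is a sketch under that assumption; the function name and the sample values are hypothetical and do not reproduce the chart on the slide.

```python
def trend_against_baseline(
    baseline: float, measurements: list[tuple[str, float]]
) -> list[tuple[str, float, float]]:
    """For each (label, value) pair, return (label, value, change vs. baseline).

    All values are percentages; the change is in percentage points.
    """
    return [(label, value, value - baseline) for label, value in measurements]


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only.
    for label, value, delta in trend_against_baseline(
        60.0, [("go-live", 64.0), ("tuning 1", 71.0)]
    ):
        print(f"{label}: {value:.1f}% ({delta:+.1f} points vs. baseline)")
```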
Application Health Check - Technical • Purpose of hotspot analysis: • Identify areas where the application is performing sub-optimally • Hotspot analysis should be done for each dialog state • Important: Hotspot analysis gives the “where” of issues, not the “why”!
Framework for Technical Health Check • TuVox hotspot analysis = integrated view of: • Hang-up (%H) • % Final NoInput (%NI) • % Final NoMatch (%NM) • Transfer Requests (%TR) • Rule of thumb: State Exit Count = # of calls * (%H + %NI + %NM + %TR) • These numbers are a first-order approximation: • Sort by highest state exit count • Review one by one in context, e.g. a high hang-up rate may simply mark a logical end point
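The rule of thumb above translates directly into a ranking of dialog states. This Python sketch assumes per-state rates are available as fractions (0.05 for 5%); the `StateStats` record and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StateStats:
    name: str
    calls: int               # number of calls that entered this dialog state
    hangup: float            # %H as a fraction
    final_no_input: float    # %NI as a fraction
    final_no_match: float    # %NM as a fraction
    transfer_request: float  # %TR as a fraction

    def exit_count(self) -> float:
        """State Exit Count = # of calls * (%H + %NI + %NM + %TR)."""
        return self.calls * (
            self.hangup
            + self.final_no_input
            + self.final_no_match
            + self.transfer_request
        )


def rank_hotspots(states: list[StateStats]) -> list[StateStats]:
    """Sort dialog states by highest state exit count first."""
    return sorted(states, key=lambda s: s.exit_count(), reverse=True)
```

The ranking only says where callers leave. Each state at the top of the list still needs to be reviewed in context, since a high exit count at a goodbye prompt is expected rather than a problem.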
Success and Design are tightly linked • Design influences success • Success determines the design • [Diagram: Design and Success Metric, linked in both directions] • [Flowchart of an example loan application design: Authentication → Look up all loans for this caller → Does caller have more than 1 loan? (yes: Caller selects from list of loans; no: continue) → Loan Menu: Balance, More loan details, Make loan payment. A further decision node, “Does caller have a line of credit?” (yes/no), also appears in the chart.]
Case Study 1: Airline application • Customer requirement: 64% Success • Success definition: • “For 64% of the callers entering the application, their ticket reservation record has to be retrieved from the back-end” • Design consequences: • Ensure via prompting that callers have their record identifier number before entering the application • Make it hard to get to an agent, i.e. multiple retries • Explain what the record identifier is • Design tailored to success criteria, but at the expense of ease of use and caller experience
Case Study 2: Travel Application • Hotspot analysis identifies an excessive number of exits at a main menu • Observation: One menu option is much more common than the other 5 choices • Old Design: Menu with 6 options • New Design: Yes/no question followed by a menu • Impact on Application Performance: • Turn failure rate = Decreased by 39% • Opt-out rate to the call center = Decreased by 44%
Case Study 3: HighTech Routing Application • 3 success criteria: • Average call handling time of less than 30 secs • High customer satisfaction • 4 queues to route to, but many different call reasons • Influence of these criteria on the design: • Only 1 reprompt instead of the standard 2 attempts • No traditional error prompting à la ‘sorry I didn’t get that’ • Natural language open-ended prompting with a high-coverage grammar
Summary • Define Application Success Criteria • Based on that, define success metrics • Use trending and baseline to put data in context • Success Criteria and Design are highly interlinked, i.e. success criteria determine the design • The design influences how targeted success metrics can be met