VUI Evaluation: PARADISE Framework
PARADISE Paradigm for Dialogue System Evaluation
• Goal: Maximize User Satisfaction
PARADISE Paradigm for Dialogue System Evaluation
• Performance is modeled as a weighted function of a task-based success measure and dialogue-based cost measures. The weights are estimated by correlating user satisfaction with those success and cost measures (in PARADISE, via multiple linear regression).
• Dialogue tasks are represented as Attribute Value Matrices (AVMs) of attribute-value pairs.
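The weighted performance function described on this slide can be sketched as below. In the published PARADISE formulation, performance is alpha * N(success) minus the sum of w_i * N(cost_i), where N is z-score normalization. The function names, the cost names, the sample data, and the weight values are illustrative assumptions and do not come from these slides.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Normalize values to zero mean and unit standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values identical: nothing to distinguish, so return zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def paradise_performance(success, costs, alpha, weights):
    """Per-dialogue score: alpha * N(success) - sum_i w_i * N(cost_i).

    success: one task-success score per dialogue (e.g. kappa)
    costs:   dict mapping a cost name to a list with one value per dialogue
    weights: dict mapping a cost name to its weight (assumed values here)
    """
    norm_success = z_scores(success)
    norm_costs = {name: z_scores(vals) for name, vals in costs.items()}
    scores = []
    for d in range(len(success)):
        penalty = sum(weights[name] * norm_costs[name][d] for name in costs)
        scores.append(alpha * norm_success[d] - penalty)
    return scores


# Made-up measurements for three dialogues.
success = [0.9, 0.6, 0.3]
costs = {"turns": [10, 14, 18], "asr_errors": [0, 2, 4]}
scores = paradise_performance(
    success, costs, alpha=0.5, weights={"turns": 0.3, "asr_errors": 0.2}
)
```

With this made-up data the first dialogue (high success, few turns, no errors) receives the highest score and the third the lowest. In practice the weights would be estimated from user satisfaction ratings rather than chosen by hand.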
PARADISE Paradigm for Dialogue System Evaluation
• Advantages
  • Addresses both system performance and user satisfaction.
• Disadvantages
  • Too complex to compute.
  • Needs a large sample of dialogues up front.
Alternative Approaches
• What’s important?
  • Maximize User Satisfaction
  • Maximize Task Success
User Satisfaction
• How do we measure user satisfaction?
  • Questionnaires
  • Interviews
  • Focus Groups
Task Success
• How do we measure task success?
  • Logging Actual Use
  • Performance Measurement
  • Walkthroughs
  • Pilot Testing
Task Success
• Establish AVMs for each dialogue and for the entire conversation.
• Measure task success with respect to:
  • Task completion time
  • Accuracy or errors (e.g. misinterpretations)
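One way to turn an AVM into an accuracy figure is to compare the reference AVM for a task (the values the user actually wanted) with the values the system ended up with. The sketch below assumes an AVM can be stored as a plain dictionary. The function name, attribute names, and values are hypothetical.

```python
def avm_accuracy(reference, observed):
    """Compare two AVMs and return (accuracy, list of mismatched attributes).

    reference: dict of attribute -> value the user intended
    observed:  dict of attribute -> value the system understood
    """
    if not reference:
        raise ValueError("reference AVM must contain at least one attribute")
    errors = [attr for attr, value in reference.items()
              if observed.get(attr) != value]
    accuracy = 1 - len(errors) / len(reference)
    return accuracy, errors


# Hypothetical train-timetable task with one misinterpreted attribute.
reference = {
    "depart_city": "Torino",
    "arrival_city": "Milano",
    "depart_range": "evening",
    "depart_time": "8pm",
}
observed = {
    "depart_city": "Trento",  # misinterpretation
    "arrival_city": "Milano",
    "depart_range": "evening",
    "depart_time": "8pm",
}

accuracy, errors = avm_accuracy(reference, observed)
```

Here one of four attributes is wrong, so the accuracy is 0.75 and `errors` lists `depart_city`. Task completion time would be logged separately, for example from dialogue start and end timestamps.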
Conclusions
• PARADISE is good, but too complex!
• Measure user satisfaction and task success.
• Develop a formula that considers Task Completion Time, Accuracy/Errors and User Satisfaction.
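The slides do not give the formula itself. As a possible first step toward one, the sketch below checks how strongly completion time and error counts correlate with questionnaire-based satisfaction scores. All of the data is made up for illustration, and the use of Pearson correlation is an assumption, not something the slides prescribe.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / (spread_x * spread_y)


# Made-up per-dialogue measurements.
satisfaction = [5, 4, 4, 2, 1]            # questionnaire rating, 1 to 5
completion_time = [60, 75, 90, 150, 200]  # seconds
error_count = [0, 1, 0, 3, 4]             # misinterpretations per dialogue

r_time = pearson(completion_time, satisfaction)
r_errors = pearson(error_count, satisfaction)
```

With this sample data both correlations are negative: longer dialogues and more errors go with lower satisfaction. The relative strength of each correlation suggests how heavily each measure might be weighted in a combined formula.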