20th International Symposium 2006. Evaluation of a multimodal Virtual Personal Assistant Glória Branco. Sophia Antipolis, March 23, 2006. Agenda. Introduction FASiL project and consortium The Virtual Personal Assistant (VPA) Architecture Functionalities Interface
20th International Symposium 2006 Evaluation of a multimodal Virtual Personal Assistant Glória Branco Sophia Antipolis, March 23, 2006
Agenda • Introduction • FASiL project and consortium • The Virtual Personal Assistant (VPA) • Architecture • Functionalities • Interface • Global Evaluation Methodology • Heuristic Evaluation • User Trials • The Portuguese trials • Method • Results • Users' comments • Conclusions
FASiL Project • FASiL: “Flexible and Adaptive Multi-Modal Spoken Interface Language” • EU-IST funded, multimodal, multilingual, conversational application for e-mail management. • Objectives • “...to pilot a full multi-modal voice portal application that is 3G mobile network ready, along with tools for rapid development of new applications. FASiL targets the languages of UK English, Portuguese and Swedish… [with] intelligent, friendly adaptive multi-modal interaction.”
FASiL Consortium • FASiL: “Flexible and Adaptive Multi-Modal Spoken Interface Language” • [Partner logos, including PT Inovação]
VPA Functionalities • Hear a summary of the Inbox. • Navigation: next, previous. • Select specific e-mails: search by State (new, old), Sender, Date, Priority and Category. • Read, compose, reply to, forward and delete e-mails. • Recipient list management. • Summarisation. • Categorisation.
VPA Interface (VUI and multimodal GUI) • Output • Voice • Avatar • Screen • PDA • Input • Voice • Keyboard • Mouse • Touch • Stylus • Available in English, Swedish and Portuguese
Global Evaluation “to iteratively gather information about the usability and accessibility of the system” • Set-up of the test environment • Task design covering the VPA functionalities. • Test mailbox populated with a restricted set of contacts and e-mails. • Heuristic Evaluation • 5 expert assessments for each language. • Experts in accessibility, usability and voice interaction. • User Tests • 20 users for accessibility testing, English version only (RNIB and RNID). • 20 Swedish and English users and 12 Portuguese. • Experts in e-mail usage.
The Portuguese Trial • Laboratory environment • The graphical interface was a web page simulating a mobile phone. Users interacted with the GUI on a desktop PC with Internet access and used a fixed phone to convey voice to the system. • 12 native Portuguese speakers • 8 males and 4 females • Aged 19 to 46 years (mean 30.6 years) • 75% of the participants had high-level education and 16.7% had mid-level education • ICT professionals and experienced e-mail users. • 5 typical e-mail tasks • Log in and browse the mailbox • Search for and reply to an e-mail • Search for and forward an e-mail • Administer and manage the recipient list • Find, reply to and delete an e-mail
Task results summary [chart not recovered; labels: Spoken interaction (%), Interactions]
Post-test satisfaction questionnaire [chart not recovered; label: Frequency]
Statistical analysis • Significant correlation (Spearman’s correlation coefficient ρ) between overall satisfaction and: • Quality of dialog: ρ = 0.87 • Confidence: ρ = 0.79 • Ease of use: ρ = 0.74 • Interaction control: ρ = 0.73 • Interaction quality (error recognition): ρ = 0.69 • Significant correlation (Spearman’s correlation coefficient) between overall satisfaction (subjective) and concept accuracy (objective value of correct responses): ρ = 0.85. • No differences between females and males (Mann-Whitney test), nor between experienced and naïve users.
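For readers unfamiliar with the two tests named on this slide, the following is a minimal sketch of how Spearman's ρ and the Mann-Whitney U statistic can be computed. It is not the analysis code used in the trial. The function names and the sample scores are hypothetical, invented purely for illustration, and the sketch stops at the U statistic without deriving a p-value.

```python
import math
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: pairs where a > b count 1, ties count 0.5."""
    return sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in group_a
        for b in group_b
    )


if __name__ == "__main__":
    # Hypothetical questionnaire scores (1-5 scale), NOT the trial data.
    overall_satisfaction = [4, 3, 5, 2, 4, 3, 5, 1, 2, 4, 3, 5]
    dialog_quality = [4, 3, 4, 2, 5, 3, 5, 1, 3, 4, 2, 5]
    print("rho:", spearman_rho(overall_satisfaction, dialog_quality))

    # Hypothetical split of the same scores into two groups of users.
    print("U:", mann_whitney_u(overall_satisfaction[:8], overall_satisfaction[8:]))
```

Both measures work on ranks rather than raw values, which suits ordinal questionnaire scores and a small sample such as the 12 participants here.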
Users' approach • The preferred modality was speech. • Natural language, using short phrases but with complex commands. • Speech input to convey commands, and the graphical interface to read messages and scroll quickly through the contact list. • More intensive use of the GUI to work around recognition problems and slow system responses. • Mixed-initiative dialog.
Interaction Example 1 • U I want replace [recipient name] by carbon copy. • S Who would you like to send copy to? • U (barge-in) [recipient name] • S Send copy to [recipient name] • U I want change the recipient list.
Interaction Example 2 • U mailbox • S You have 4 e-mails • U New search. Find high priority messages from [recipient name] • S You have 1 new priority e-mail from [recipient name] • U Read it
Users' appreciation • The conversational and multimodal VPA concept was attractive to all users and was seen as a key enabler for increasingly mobile users. • The VPA was seen as easy to use and intuitive. The Help part of the system was almost never used. • Users did not like excessive confirmations. • The Portuguese TTS voice was well accepted by the users. • Users liked voice input combined with VUI and GUI output in a small-screen environment. • Multimodality was seen as a very good way to overcome the recognition problems encountered in the VUI.
Future Use • However, when asked about future use: • 58% of the users said that they would not use the system in its current form. • Main reasons: • slow response time • recognition/understanding problems.
Failure? Tell me “when it’s time” to stop!
NO! Lessons learned • Speed of feedback is very important. Users dislike latency and long periods of silence. • Improvements are needed to increase the recognition accuracy of the spoken components. • Natural language is working ... with limitations. • Multimodal interfaces can overcome the weaknesses of each modality and exploit the full strengths of the combined modes.
Thank you! Glória Branco: gloria@ptinovacao.pt www.ptinovacao.pt