
Evaluation of a multimodal Virtual Personal Assistant Glória Branco

20th International Symposium 2006. Sophia-Antipolis, March 23, 2006.


Presentation Transcript


  1. 20th International Symposium 2006 • Evaluation of a multimodal Virtual Personal Assistant • Glória Branco • Sophia-Antipolis, March 23, 2006

  2. Agenda • Introduction • FASiL project and consortium • The Virtual Personal Assistant (VPA) • Architecture • Functionalities • Interface • Global Evaluation Methodology • Heuristic Evaluation • User Trials • The Portuguese trials • Method • Results • Users' comments • Conclusions

  3. FASiL Project • FASiL – “Flexible and Adaptive Multi-Modal Spoken Interface Language” • EU-IST funded, multimodal, multi-lingual, conversational application to e-mail management. • Objectives • “...to pilot a full multi-modal voice portal application that is 3G mobile network ready, along with tools for rapid development of new applications. FASiL targets the languages of UK English, Portuguese and Swedish… [with] intelligent, friendly adaptive multi-modal interaction.”

  4. FASiL Consortium • FASiL: “Flexible and Adaptive Multi-Modal Spoken Interface Language” • [Consortium member logos; recoverable text: PT Inovação]

  5. VPA Architecture

  6. VPA Functionalities • Hear a summary of the Inbox. • Navigation: next, previous. • Select specific e-mails: search by State (new, old), Sender, Date, Priority and Category. • Read, compose, reply, forward and delete e-mails. • Recipient list management. • Summarisation. • Categorisation.

  7. VPA Interface (diagram labels: VUI, Multimodal, GUI) • Output • Voice • Avatar • Screen • PDA • Input • Voice • Keyboard • Mouse • Touch • Stylus • Available in English, Swedish and Portuguese

  8. Global Evaluation “to iteratively gather information about the usability and accessibility of the system” • Setup of test environment • Task design, to cover the VPA functionalities. • Test mailbox populated with a restricted set of contacts and e-mails. • Heuristic Evaluation • 5 expert assessments for each language. • Experts in accessibility, usability and voice interaction. • User Tests • 20 users for accessibility testing, English version only (RNIB and RNID) • 20 Swedish and English users and 12 Portuguese. • Experts in e-mail usage.

  9. The Portuguese Trial • Laboratory environment • The graphical interface was a web page simulating a mobile phone. Users interacted with the GUI on a desktop PC with Internet access and spoke to the system over a fixed phone. • 12 native Portuguese speakers • 8 males and 4 females • from 19 to 46 years (mean 30.6 years) • 75% of the participants had high-level education and 16.7% had mid-level education • ICT domain professionals and experienced e-mail users. • 5 typical e-mail tasks • log in and browse the mailbox • search for and reply to an e-mail • search for and forward an e-mail • administer and manage the recipient list • find, reply to and delete an e-mail

  10. Task results summary • [Chart; recoverable labels: Spoken interaction (%), Interactions]

  11. Post-test satisfaction questionnaire • [Chart; recoverable label: Frequency]

  12. Statistical analysis • Significant correlation (Spearman’s correlation coefficient) between overall satisfaction and: • Quality of dialog: ρ = 0.87 • Confidence: ρ = 0.79 • Ease of use: ρ = 0.74 • Interaction control: ρ = 0.73 • Interaction quality (error recognition): ρ = 0.69 • Significant correlation (Spearman’s correlation coefficient) between overall satisfaction (subjective) and concept accuracy (objective value of correct responses): ρ = 0.85. • No differences between females and males (Mann-Whitney test), nor between experienced and naive users.
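
The two statistics named on this slide can be sketched in a few lines of Python. This is an illustration only, not the analysis the authors ran: the function names are hypothetical, and the ratings in the usage lines are made-up placeholder values, not trial data. Spearman's coefficient is computed here as the Pearson correlation of the ranks, and the Mann-Whitney U statistic from the rank sum of one group; neither function computes a p-value.

```python
def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the window over all entries tied with the one at `pos`.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def mann_whitney_u(group_a, group_b):
    """Mann-Whitney U statistic (the smaller of the two U values)."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)


# Placeholder questionnaire ratings (1-5 scale), NOT data from the trial.
overall_satisfaction = [4, 3, 5, 2, 4, 3]
dialog_quality = [4, 2, 5, 1, 3, 3]
print("rho:", round(spearman_rho(overall_satisfaction, dialog_quality), 2))

# Placeholder satisfaction scores for two participant groups, NOT trial data.
print("U:", mann_whitney_u([4, 3, 5, 2], [4, 3]))
```

A rank-based coefficient suits questionnaire ratings because they are ordinal: only the ordering of the answers is used, not the distance between scale points.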

  13. Users' approach • The preferred modality was speech. • Natural language, using short phrases but with complex commands. • Speech input to convey commands, and the graphical interface to read messages and scroll quickly through the contacts list. • More intensive use of the GUI to overcome recognition problems and slow system response. • Mixed-initiative dialog.

  14. Interaction Example 1 • U I want replace [recipient name] by carbon copy. • S Who would you like to send copy to? • U (barge-in) [recipient name] • S Send copy to [recipient name] • U I want change the recipient list.

  15. Interaction Example 2 • U mailbox • S You have 4 e-mails • U New search. Find high priority messages from [recipient name] • S You have 1 new priority e-mail from [recipient name] • U Read it

  16. Users' appreciation • The conversational and multimodal VPA concept was attractive to all users and was seen as a key enabler supporting users' growing mobility. • The VPA was seen as easy to use and intuitive. The Help part of the system was hardly used. • Users did not like excessive confirmations. • The Portuguese TTS voice was well accepted by the users. • Users liked voice input combined with VUI and GUI output in a small-screen environment. • Multimodality was seen as a very good way to overcome recognition problems encountered in the VUI.

  17. Future Use • However, when asked about future use, 58% of the users said that they would not use the system in its current form. • Main reasons: • slow response time • recognition/understanding problems.

  18. Failure? Tell me “when it’s time” to stop!

  19. NO! Lessons learned • Speed of feedback is very important. Users dislike latency and long periods of silence. • Improvements are needed to increase the recognition accuracy of the spoken components. • Natural language is working ... with limitations. • Multimodal interfaces can overcome the weaknesses of each modality and exploit the full strengths of combined modes.

  20. Thank you! Glória Branco: gloria@ptinovacao.pt www.ptinovacao.pt
