  1. eNTERFACE'08 Project #1 "MultiParty Communication with a Tour Guide ECA" Final presentation, August 29th, 2008

  2. Outline
  • Project Overview
  • Objectives, Issues & Work Done
  • System Overview
  • Configuration and Design
  • Conclusion

  3. Project Objectives
  • Main objective: develop an ECA Tour Guide system which can interact with one or two users
  • Research features:
    • a multiparty dialogue model and scenario between two humans and an ECA
    • handling and combining input data: users' presence and behaviors (speech, tracking)
    • gaze behavior control and a nonverbal model for the ECA

  4. Work Done: Component Functionality Overview
  • We implemented components which support a scenario based on narration and interruptions
  • The ECA is the narrator; users can ask context-related questions ("where", "how", "when")
  • Speaker, addressee and listener identification; ECA gaze model
  • The ECA can ask users simple "yes/no" questions to keep their attention
  • The system can detect users' appearance and dynamically initiate/end a session (a session life-cycle sketch follows this slide)
  • The system can detect and handle situations in which users are paying less attention
  • The system can recover from failures (e.g. SR does not recognize a user's speech)
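
As an illustration of the session life cycle above: the system becomes active when at least one user appears and returns to idle when everyone leaves. This is a minimal sketch; the Session states and the update function are assumptions for illustration, not the project's actual implementation.

    from enum import Enum, auto

    class Session(Enum):
        IDLE = auto()    # nobody detected in front of the system
        ACTIVE = auto()  # one or two users engaged with the ECA

    def update(state: Session, users_present: int) -> Session:
        """Initiate or end the session from the detected number of users
        (illustrative only; names and logic are assumptions)."""
        if state is Session.IDLE and users_present > 0:
            return Session.ACTIVE   # greet the newcomers, start narration
        if state is Session.ACTIVE and users_present == 0:
            return Session.IDLE     # everyone left: close the session
        return state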

  5. Work done... about to be done...
  • Components are implemented
  • The system is being integrated
    • debugging and full testing are still needed
  • Not supported:
    • detecting when users start a conversation among themselves
    • detecting speech collisions between users
    • smart scheduling and control of the ECA's behaviors

  6. System Configuration

  7. Speech Recognition

  8. Speech Recognition
  • Functionality:
    • detects users' requests ("Where", "How", "When", "Who")
    • detects users' willingness to leave the system
    • detects answers to simple "yes/no" questionnaires
    • detects unknown words
  • Implementation:
    • keyword detection with a confidence score and speech duration, implemented using the Loquendo API (a keyword-handling sketch follows this slide)
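
A minimal sketch of how recognizer output (keyword, confidence score, speech duration) might be mapped to the events listed above. The Loquendo API is proprietary and not shown in the deck, so the RecognitionResult type, the keyword lists, and the confidence threshold here are illustrative assumptions.

    from dataclasses import dataclass

    @dataclass
    class RecognitionResult:          # hypothetical recognizer output
        keyword: str                  # best-matching keyword, lower-cased
        confidence: float             # 0.0 .. 1.0 score from the recognizer
        duration_ms: int              # speech duration

    REQUEST_WORDS = {"where", "how", "when", "who"}   # context-related requests
    LEAVE_WORDS = {"bye", "goodbye", "exit"}          # assumed leave phrases
    YES_NO_WORDS = {"yes", "no"}                      # questionnaire answers
    CONFIDENCE_THRESHOLD = 0.5                        # assumed rejection threshold

    def classify(result: RecognitionResult):
        """Map a recognition result to one of the dialogue input events."""
        if result.confidence < CONFIDENCE_THRESHOLD:
            return ("unknown", result.keyword)        # triggers SR-failure recovery
        if result.keyword in REQUEST_WORDS:
            return ("user_request", result.keyword)
        if result.keyword in LEAVE_WORDS:
            return ("leave_session", result.keyword)
        if result.keyword in YES_NO_WORDS:
            return ("questionnaire_answer", result.keyword)
        return ("unknown", result.keyword)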

  9. Nonverbal Inputs and Understanding

  10. Nonverbal Inputs: Users' Appearance and Face Orientation
  • Functionality of the components:
    • detect motion and users' appearance/disappearance
    • detect the number of users present
    • detect users' face orientation and increased/decreased attention (left user, right user)
  • Implementation:
    • OpenCV (motion) & Okao Vision (face orientation, gazing); a motion-detection sketch follows this slide
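
Okao Vision is a commercial library, so its face-orientation interface is not reproduced here; the sketch below covers only the OpenCV motion-detection half, using plain frame differencing. It uses today's cv2 Python bindings for readability (the 2008 project would have used OpenCV's C API), and the camera index and both thresholds are assumed values.

    import cv2

    MOTION_THRESHOLD = 25      # assumed per-pixel difference threshold
    MIN_CHANGED_PIXELS = 5000  # assumed area below which motion is ignored

    def detect_motion(prev_gray, frame):
        """Return (motion_present, current_gray) via frame differencing."""
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        gray = cv2.GaussianBlur(gray, (21, 21), 0)    # suppress pixel noise
        if prev_gray is None:                         # first frame: no reference yet
            return False, gray
        diff = cv2.absdiff(prev_gray, gray)
        _, mask = cv2.threshold(diff, MOTION_THRESHOLD, 255, cv2.THRESH_BINARY)
        return cv2.countNonZero(mask) > MIN_CHANGED_PIXELS, gray

    cap = cv2.VideoCapture(0)  # assumed camera index
    prev = None
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        moving, prev = detect_motion(prev, frame)
        if moving:
            print("motion detected")  # would raise a user-appearance event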

  11. Decision Making Component

  12. Decision Making Component - Functionalities
  • Makes decisions about "when and what to do to whom":
    • handles multimodal input events (number of users, attention, speech channels)
    • handles user interruptions while the ECA is speaking
    • handles failures from the SR component
    • generates multimodal output and controls the ECA's gazing
  • Simple rule: "first one will be served" (sketched below)
    • the "yes"/"no" questionnaire is the exception
  • No domain knowledge and no behavior scheduling
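
A sketch of the "first one will be served" rule, under an assumed event-tuple shape of (user, kind, value): events are served strictly in arrival order, except that during a yes/no questionnaire the component first collects answers from both speech channels (the real component adds time control for this, see the next slide).

    import queue

    events: queue.Queue = queue.Queue()  # multimodal input events, in arrival order

    def next_action(awaiting_yes_no: bool):
        """'First one will be served', with the yes/no questionnaire as the
        exception: then answers from both speech channels are collected."""
        if not awaiting_yes_no:
            user, kind, value = events.get()        # the first event wins
            return ("serve", user, kind, value)
        answers = {}
        while len(answers) < 2:                     # wait for both users
            user, kind, value = events.get()
            if kind == "questionnaire_answer":
                answers[user] = value               # e.g. answers["left"] = "yes"
        return ("respond_to_answers", answers)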

  13. Decision Making Component - Implementation
  • The Decision Making Component uses ideas from information state theory [Larsson'00] and AIML:
    • the progress of the dialogue is represented by a set of variables
    • the most appropriate plans are selected and scheduled by simple inference (a minimal sketch follows this slide)
    • time control is used to obtain messages from both speech channels in the case of "yes/no" questions
  • The component is being developed using the MIDIKI toolkit as a reference
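
A minimal illustration of the information-state idea as described on the slide, not MIDIKI's actual data model: the dialogue state is a flat set of variables, and "simple inference" is read here as a first-match scan over condition/plan rules. Every variable and plan name is invented for the example.

    # Hypothetical information state: a flat set of dialogue variables.
    state = {
        "users_present": 2,
        "eca_speaking": True,
        "awaiting_yes_no": False,
        "pending_event": ("left", "user_request", "where"),
    }

    # Plan rules: (condition, plan). The first matching rule is selected.
    RULES = [
        (lambda s: s["users_present"] == 0, "end_session"),
        (lambda s: s["pending_event"] and s["eca_speaking"],
         "pause_narration_and_answer"),               # user interrupted the ECA
        (lambda s: s["pending_event"], "answer_question"),
        (lambda s: s["awaiting_yes_no"], "collect_yes_no_answers"),
        (lambda s: True, "continue_narration"),       # default plan
    ]

    def select_plan(s):
        """Return the first plan whose condition holds in the current state."""
        return next(plan for cond, plan in RULES if cond(s))

    print(select_plan(state))  # -> "pause_narration_and_answer"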

  14. Animation Player

  15. Animation Player
  • Functionality:
    • the animation player uses scripted behaviors (in the GSML language) to generate speech and animation
    • a model of gaze in multiparty communication is supported:
      • gaze control operates at the utterance level (an illustrative sketch follows this slide)
      • the gaze pattern follows conversational rules (who is the addressee, who is the listener)
  • Implementation:
    • Visage SDK (based on the MPEG-4 standard)
    • 3ds Max
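
The deck does not spell out the gaze rules, so the sketch below shows only one plausible utterance-level policy built on the roles the slide names (addressee, listener); the 70/30 split between them is purely an assumption, not the project's model.

    import random

    def gaze_target(addressee: str, listener: str | None) -> str:
        """Pick one gaze target per utterance: mostly the addressee, with
        occasional glances at the side-participant listener to keep them
        engaged. The 0.7 proportion is an assumed value."""
        if listener is None:
            return addressee                      # two-party case
        return addressee if random.random() < 0.7 else listener

    # One gaze decision per utterance, as on the slide:
    for utterance in ["Welcome!", "This hall dates from 1900.", "Any questions?"]:
        print(utterance, "-> gaze at", gaze_target("left_user", "right_user"))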

  16. Conclusion
  • Components supporting context-based two-party human-ECA communication are implemented
  • The system is being integrated, but not fully tested
  • Component issues:
    • missing face tracking and domain knowledge about users' behaviors
    • simple dialogue management and control (no smart scheduling, no smart gaze control)
  • Future directions: system debugging and testing, implement tracking, improve gaze control, study users' behaviors and gazing, system evaluation
