A Speech Interface to Virtual Environment • Authors: Scott McGlashan and Tomas Axling • Swedish Institute of Computer Science
Presentation Agenda • Introduction • The TALKING AGENT system • DIVE • SR/TTS • Agent Modeling Framework • Interaction Metaphor • Reference Resolution • Future Work • Conclusion
Purposes of this paper • Analyze the technical and design issues in combining a virtual world with a speech interface. • Describe the system architecture of the TALKING AGENT system.
Problems of Integration • Speech Recognition: the vocabulary is limited to gain accuracy. • Language Understanding: the system's knowledge is limited to maximize understanding. • Interaction Metaphor: who does the user talk to? (These questions are discussed in detail in the authors’ earlier paper “Speech Interfaces to Virtual Reality”.)
Innovation of this System • Combines intelligent agents with a speech interface to carry out specialized functions in the VR world. • Functions implemented so far: • Transporting objects • Fetching objects • Painting objects • Increasing the size of objects
DIVE: Virtual Reality System • DIVE (Distributed Interactive Virtual Environment) is a multi-user virtual environment. • DIVE allows users and the environment to interact in real time. • DIVE contains a database composed of hierarchically organized objects.
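To make the idea of a database of hierarchically organized objects concrete, here is a minimal Python sketch. It is not DIVE's actual API: the `WorldObject` class, its fields, and the sample room are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional


@dataclass
class WorldObject:
    """Hypothetical node in a hierarchical object database (not DIVE's real API)."""
    name: str
    properties: Dict[str, str] = field(default_factory=dict)
    children: List["WorldObject"] = field(default_factory=list)

    def add(self, child: "WorldObject") -> "WorldObject":
        """Attach a child object and return it so calls can be chained."""
        self.children.append(child)
        return child

    def walk(self) -> Iterator["WorldObject"]:
        """Yield this object and all of its descendants, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()

    def find(self, name: str) -> Optional["WorldObject"]:
        """Return the first object with the given name, or None."""
        return next((o for o in self.walk() if o.name == name), None)


# Assumed sample world: one room containing two pieces of furniture.
world = WorldObject("world")
room = world.add(WorldObject("room"))
room.add(WorldObject("table", {"color": "brown"}))
room.add(WorldObject("chair", {"color": "red"}))
```

A speech command such as "paint the chair" would then reduce to a lookup like `world.find("chair")` followed by a property update.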
Speech Recognition • SR with a limited set of pre-defined phrases promises good recognition performance. • A grammar constrains the recognizer's search space. • A commercial SR engine (Nuance) is used.
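The effect of a grammar on the search space can be illustrated with a small Python sketch. It does not use the Nuance engine or its grammar format; the word lists, `allowed_phrases`, and `accept` are assumptions made for illustration. The grammar licenses only a small, finite set of phrases, and any recognizer hypothesis outside that set is rejected.

```python
from itertools import product

# Hypothetical mini-grammar: COMMAND -> VERB "the" [COLOR] OBJECT
VERBS = ["fetch", "paint", "enlarge"]
COLORS = ["", "red", "blue"]  # the empty string means "no color word"
OBJECTS = ["chair", "table"]


def allowed_phrases():
    """Enumerate every phrase the grammar licenses."""
    phrases = set()
    for verb, color, obj in product(VERBS, COLORS, OBJECTS):
        words = [verb, "the"] + ([color] if color else []) + [obj]
        phrases.add(" ".join(words))
    return phrases


def accept(hypotheses, grammar):
    """Return the first hypothesis that the grammar licenses, else None."""
    for hypothesis in hypotheses:
        normalized = hypothesis.lower().strip()
        if normalized in grammar:
            return normalized
    return None


GRAMMAR = allowed_phrases()
```

With three verbs, three color options, and two objects, the recognizer only has to choose among 18 phrases, which is the trade-off the slide describes: less coverage in exchange for better accuracy.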
Agent Modeling Framework • Agent modeling needs complex symbolic computation, which many languages do not support well. • Oz is well suited for this purpose. • ODI serves as the interface between Oz and DIVE. • The parent agent provides the basic functions. • More specific agents are defined by extending the parent agent.
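The parent/specialized agent relationship can be sketched with ordinary class inheritance. The system itself models agents in Oz; the Python below is only an analogy, and the class names, the `handle` signature, and the reply texts are hypothetical.

```python
class Agent:
    """Hypothetical parent agent: holds only the basic, shared behavior."""

    def __init__(self, name):
        self.name = name
        self.log = []  # everything the agent has said so far

    def say(self, text):
        self.log.append(f"{self.name}: {text}")

    def handle(self, command, target, value=None):
        """Default behavior: the parent agent knows no specialized commands."""
        self.say(f"I cannot {command} the {target}.")
        return False


class PainterAgent(Agent):
    """Specialized agent: adds painting, defers everything else to the parent."""

    def __init__(self, name):
        super().__init__(name)
        self.colors = {}  # target name -> color applied by this agent

    def handle(self, command, target, value=None):
        if command == "paint" and value is not None:
            self.colors[target] = value
            self.say(f"Painted the {target} {value}.")
            return True
        return super().handle(command, target, value)
```

For example, `PainterAgent("painter").handle("paint", "chair", "blue")` succeeds, while a "fetch" command falls through to the parent's refusal.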
Interaction Metaphor • Direct manipulation: personal presence. • Various metaphors for spoken interaction have been proposed: • Proxy • Divinity • Telekinesis • Interface Agent • This system adopts the Proxy metaphor.
Addressing Agent • Agent within the user’s field of view • The dialogue is initiated by clicking on the agent. • Agent outside the user’s field of view • Phone agent: first press the phone agent, then connect to the remote agent.
Feedback • Given speech input, the system should give visual feedback to the user. • Is the agent listening or not? • What feedback is given when talking to an agent far away?
Reference Resolution • Given a description, the reference resolution engine maps it to the object the user is referring to. • Considerations • Object focus. • Property perception. • Discourse modeling.
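A minimal Python sketch of how such a resolver might combine these considerations follows. It is not the authors' engine: the `resolve` function, its parameters, and the ranking rule are assumptions. Described properties filter the candidates, the set of currently visible objects stands in for focus and perception, and a mention history stands in for the discourse model.

```python
def resolve(description, objects, visible, history):
    """Return the name of the best-matching object, or None.

    description: required properties, e.g. {"type": "chair", "color": "red"}
    objects:     mapping of object name -> property dict
    visible:     set of object names the user can currently perceive
    history:     object names mentioned so far, most recent last
    """
    # Step 1: keep only objects whose properties match the description.
    candidates = [
        name for name, props in objects.items()
        if all(props.get(key) == value for key, value in description.items())
    ]

    # Step 2: prefer objects the user can see, if any of them match.
    seen = [name for name in candidates if name in visible]
    if seen:
        candidates = seen
    if not candidates:
        return None

    # Step 3: among the rest, prefer the most recently mentioned object.
    def recency(name):
        return max((i for i, n in enumerate(history) if n == name), default=-1)

    return max(candidates, key=recency)
```

With two visible red chairs, "the red chair" resolves to whichever one was mentioned last, which is one simple way a discourse model can break ties.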
Robust Interaction • When errors don’t matter • The user can view the results and correct them by direct manipulation. • Safety-critical applications • Confirm the user's command. • Clarify incomplete or ambiguous commands.
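The choice between executing, confirming, and clarifying can be sketched as a small decision function. This is an illustration under assumptions, not the system's dialogue manager: `next_step`, its arguments, and the prompt wording are hypothetical.

```python
def next_step(command, candidates, safety_critical):
    """Decide how the agent should respond to a parsed command.

    command:         verb phrase, e.g. "paint"
    candidates:      object names the command could refer to
    safety_critical: True if a wrong action cannot simply be undone
    Returns an (action, text) pair.
    """
    if not candidates:
        # Incomplete command: no referent could be identified.
        return ("clarify", f"Which object should I {command}?")
    if len(candidates) > 1:
        # Ambiguous command: several objects fit the description.
        options = ", ".join(candidates)
        return ("clarify", f"Which one do you mean: {options}?")
    if safety_critical:
        # A single referent, but errors matter: ask before acting.
        return ("confirm", f"Do you want me to {command} {candidates[0]}?")
    # Errors do not matter: act now and let the user correct the result.
    return ("execute", f"{command} {candidates[0]}")
```

In the non-critical case the agent acts immediately, relying on the user to fix mistakes by direct manipulation, as the slide suggests.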
Future Work • Agent behavior should be related to its previous actions. • Add mental components. • Talk to agents through aura-driven interaction. • Evaluate the system with a realistic scenario. • Ex: virtual travel agency.
Conclusions • Add a speech interface to a VR system. • Use constrained SR to achieve high accuracy. • Develop an appropriate interaction metaphor. • The agents modeled in this system provide specific functions in the virtual world.
Paper Source • McGlashan, S. “Speech Interfaces to Virtual Reality.” In Proceedings of the Second Conference on the Military Applications of Synthetic Environments and Virtual Reality, Stockholm, Sweden, 1995.