
HUMANOID ANIMATION DRIVEN BY HUMAN VOICE

Thesis Advisor: Dr. Donald P. Brutzman. Second Reader: Dr. Xiaoping Yun. A thesis by Ozan APAYDIN, Turkish Navy, March 2002.



Presentation Transcript


  1. HUMANOID ANIMATION DRIVEN BY HUMAN VOICE Thesis Advisor: Dr. Donald P. Brutzman Second Reader: Dr. Xiaoping Yun A Thesis By Ozan APAYDIN, Turkish Navy March 2002

  2. GOALS
     • Perform a background search on speech recognition technology to find a suitable component for this project,
     • Develop a VUI (Voice User Interface) that maps between human voice commands and a set of animations of the avatar and provides access to the application,
     • Build a motion library to animate available humanoids,
     • Demonstrate interchangeability of the behaviors and the humanoids,
     • Create humanoid animation driven by a human voice.

  3. INTRODUCTION [Diagram labels: HUMAN VOICE; MEDIUM (AIR); VOICE RECEIVER; COMPUTER ENVIRONMENT; SPEECH RECOGNITION APPLICATION; RULE CHOOSER; rules (Rule A, Rule B, Rule C, ...); animations (Animation X, Animation Y, Animation Z, ...); GEOMETRY]
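The rule chooser named in the introduction diagram maps a recognized rule to an animation. A minimal sketch of that idea follows; the class name `RuleChooser`, its methods, and the command strings are hypothetical and are not taken from the thesis.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

class RuleChooser {
    // Hypothetical rule table: recognized command text -> animation name.
    private final Map<String, String> rules = new LinkedHashMap<>();

    // Register a command and the animation it should trigger.
    RuleChooser addRule(String command, String animation) {
        rules.put(normalize(command), animation);
        return this;
    }

    // Return the animation for the recognized text, or empty if no rule matches.
    Optional<String> choose(String recognizedText) {
        if (recognizedText == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(rules.get(normalize(recognizedText)));
    }

    // Ignore case and surrounding whitespace when matching commands.
    private static String normalize(String text) {
        return text.trim().toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        RuleChooser chooser = new RuleChooser()
                .addRule("walk", "Animation X")
                .addRule("jump", "Animation Y");
        System.out.println(chooser.choose(" Walk "));
        System.out.println(chooser.choose("fly"));
    }
}
```

An unmatched command yields an empty `Optional`, so the caller decides what to do when no rule applies.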

  4. SPEECH RECOGNITION TECHNOLOGY (SRT) HISTORY – THE FIRST A toy company logged the first success story in the field of speech recognition, decades before major research in the area began. “Radio Rex” was a celluloid dog that responded to its name. Lacking the computational power that drives recognition devices today, Radio Rex was a simple electromechanical device. The dog was held inside its house by an electromagnet, which was energized while current flowed through a circuit bridge. The bridge was sensitive to acoustic energy at about 500 cycles per second (cps). The energy of the vowel sound in the word “Rex” made the bridge vibrate, breaking the electrical circuit and allowing a spring to push Rex out of his house.

  5. SRT - BASIC CONCEPTS • Grammar, • Training, • Speaker Dependence vs. Independence, • Natural Language Commands, • Accuracy.

  6. SRT – APPLICATION FEATURES • Command & Control • Dictation • Synthesizing

  7. SRT – FACTORS AFFECTING ACCURACY • Environment • Hardware • Speaker/User • Vocabulary Size • Grammar • Training

  8. SRT – LIMITATIONS • Free-form Speech Input • Mistakes • Rejection • Misrecognition • Misfire
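The three mistake types can be told apart by comparing what was said with what the recognizer returned. The sketch below assumes the usual reading of the terms, which the slide itself does not define: a rejection is speech that matches no result, a misrecognition is speech matched to the wrong result, and a misfire is non-speech sound accepted as a command. All names are hypothetical.

```java
class RecognitionOutcome {
    enum Kind { CORRECT, REJECTION, MISRECOGNITION, MISFIRE, NOISE_IGNORED }

    // spoken == null means no command was actually spoken (background noise only).
    // recognized == null means the recognizer returned no result.
    static Kind classify(String spoken, String recognized) {
        if (spoken == null) {
            return recognized == null ? Kind.NOISE_IGNORED : Kind.MISFIRE;
        }
        if (recognized == null) {
            return Kind.REJECTION;
        }
        return spoken.equalsIgnoreCase(recognized) ? Kind.CORRECT : Kind.MISRECOGNITION;
    }

    public static void main(String[] args) {
        System.out.println(classify("walk", "walk")); // command understood
        System.out.println(classify("walk", null));   // speech not matched
        System.out.println(classify("walk", "jump")); // wrong command returned
        System.out.println(classify(null, "jump"));   // noise taken as a command
    }
}
```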

  9. SRT POTENTIALS VUIs have their greatest potential in the following cases:
     • Users with various disabilities that prevent them from using a mouse and/or keyboard.
     • All users, with or without disabilities, who are in an eyes-busy, hands-busy situation.
     • Users who don’t have access to a keyboard and/or a monitor, for example when accessing a system through a payphone.

  10. JAVA SPEECH API “The Java Speech API, developed by Sun Microsystems in cooperation with speech technology companies, defines a software interface that allows developers to take advantage of speech technology for personal and enterprise computing.”

  11. JAVA SPEECH API • Cross-Platform, Cross-Vendor • Support for Speech Synthesizers and for both Command & Control and Dictation Speech Recognizers • Integration with Other Capabilities of the Java Platform

  12. IBM VIAVOICE SDK • Implementation of Java Speech API • Provides access to the IBM ViaVoice engine • Requires IBM ViaVoice or ViaVoice Runtimes

  13. H-ANIM WORKING GROUP GOALS • Specify a way of defining interchangeable humanoids and animations • Allow people to author humanoids and animations independently

  14. H-ANIM WORKING GROUP SPECIFICATIONS • H-Anim 1.0 Specification • H-Anim 1.1 Specification • H-Anim 2001 Specification (Draft)

  15. ? MODELS

  16. MODELS

  17. INTERCHANGEABLE ACTORS Putting the avatars and their behaviors together in such a way that the final product should be: • Efficient, • Easy to expand.

  18. INTERCHANGEABLE ACTORS • Creating behavior prototypes, • Converting to X3D native tags, • Forming a switchable design for avatars, • Employing dynamic routing.
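The interchangeability idea can be illustrated outside X3D: if every humanoid exposes its joints under shared names, one behavior can drive any of them. The Java sketch below is an analogy only, not the thesis's prototype or routing design. All class and method names are hypothetical, and the joint names are assumed to follow H-Anim naming.

```java
import java.util.HashMap;
import java.util.Map;

class InterchangeableActors {
    // A humanoid is reduced here to a name and a set of named joint angles (radians).
    static final class Humanoid {
        final String name;
        final Map<String, Double> jointAngles = new HashMap<>();

        Humanoid(String name, String... jointNames) {
            this.name = name;
            for (String joint : jointNames) {
                jointAngles.put(joint, 0.0);
            }
        }
    }

    // A behavior sets joint angles for a point in the animation cycle (fraction in [0, 1]).
    interface Behavior {
        void applyTo(Humanoid humanoid, double fraction);
    }

    // Build a behavior that swings one joint up to maxAngleRadians and back.
    static Behavior swing(String jointName, double maxAngleRadians) {
        return (humanoid, fraction) -> {
            if (!humanoid.jointAngles.containsKey(jointName)) {
                throw new IllegalArgumentException(humanoid.name + " has no joint " + jointName);
            }
            humanoid.jointAngles.put(jointName, maxAngleRadians * Math.sin(Math.PI * fraction));
        };
    }

    public static void main(String[] args) {
        Humanoid first = new Humanoid("AvatarA", "r_shoulder", "l_shoulder");
        Humanoid second = new Humanoid("AvatarB", "r_shoulder", "r_elbow");
        Behavior wave = swing("r_shoulder", 1.5);
        // The same behavior object is applied to both humanoids without modification.
        for (Humanoid humanoid : new Humanoid[] {first, second}) {
            wave.applyTo(humanoid, 0.5);
            System.out.println(humanoid.name + " r_shoulder = " + humanoid.jointAngles.get("r_shoulder"));
        }
    }
}
```

Because the behavior refers to joints only by name, adding a new avatar or a new behavior does not require changing the other side.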

  19. INTERCHANGEABLE ACTORS

  20. SYSTEM INFRASTRUCTURE [Diagram labels, in extraction order: VIAVOICE ENGINE; VIAVOICE SDK (JAVA SPEECH API IMPLEMENTATION); BROWSER INVOKER; CLIENT; RECOGNIZER AND SERVER; VRML SCENE; ORDER EXECUTOR AND CLIENT]

  21. FINAL PRODUCT • Hybrid (VUI + GUI), • Networked (UDP/IP), • User-Independent, • Mono-Lingual, • Multi-Platform.
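The networked (UDP/IP) feature implies that a recognized command travels as a datagram from the recognizer side to the side that executes it. The sketch below shows one such round trip over the loopback interface using the standard `java.net` classes. The class name, the plain-text message format, and the buffer size are assumptions and do not reflect the thesis's actual protocol.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

class UdpCommandDemo {
    // Send one command as a UTF-8 datagram (the role of the recognizer side).
    static void send(String command, InetAddress host, int port) throws IOException {
        byte[] data = command.getBytes(StandardCharsets.UTF_8);
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.send(new DatagramPacket(data, data.length, host, port));
        }
    }

    // Wait for one datagram and decode it (the role of the executing side).
    static String receive(DatagramSocket socket, int timeoutMillis) throws IOException {
        socket.setSoTimeout(timeoutMillis);
        byte[] buffer = new byte[512];
        DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
        socket.receive(packet);
        return new String(packet.getData(), packet.getOffset(), packet.getLength(),
                StandardCharsets.UTF_8);
    }

    // Send a command to a receiver on the loopback interface and return what arrived.
    static String roundTrip(String command) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the operating system for any free port.
        try (DatagramSocket executor = new DatagramSocket(0, loopback)) {
            send(command, loopback, executor.getLocalPort());
            return receive(executor, 2000);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Executor received: " + roundTrip("walk"));
    }
}
```

UDP gives no delivery guarantee, so a real application would need to tolerate lost or repeated commands; the receive timeout here only keeps the demo from blocking forever.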

  22. FINAL PRODUCT

  23. DEMO

  24. CONCLUSIONS • Speech Recognition Technology (SRT) can be integrated into Virtual Environments (VEs). • Hybrid (VUI + GUI) applications can be very powerful. • Humanoids and animation behaviors can be designed interchangeably.

  25. FUTURE WORK • Simulation of a scenario or a game, • Improving networking, • Expanding the motion library, • Combination of animation behaviors, for example Walk & Jump. • Thesis Follower: Ekrem SERIN
