This presentation explores BEAT, a toolkit for animating a human-like body and voice intonation from text input alone. The system analyzes the linguistic and contextual information in the text to control hand, arm and face movements and voice intonation, synchronized with speech. It covers conversational behavior, the system architecture (knowledge base, language tagging, behavior suggestion, behavior selection, and behavior scheduling and animation modules), the toolkit's extensibility, and how synchronization between speech and nonverbal behaviors is achieved.
Chapter 7. BEAT: the Behavior Expression Animation Toolkit • Soft Computing Laboratory, Yonsei University • October 27, 2004
Outline • Introduction • Conversational behavior • Related work • System • Knowledge base • Language tagging • Behavior suggestion • Behavior selection • Behavior scheduling and animation • Extensibility • Example animation • Conclusion
Introduction • The association between speech and other communicative behaviors poses particular challenges to procedural character animation techniques • Particularly when a voice is called for, issues of synchronization and appropriateness render otherwise more-than-adequate techniques disfluent • BEAT • Allows one to animate a human-like body using just text as input • Uses the linguistic and contextual information contained in the text to control the movement of the hands, arms and face, and the intonation of the voice
Conversational behavior • To communicate with one another, people use not only words but also intonation, hand gestures, facial displays, eye gaze, head movements and body posture • The co-occurrence of these behaviors with speech is almost equally important: both their communicative intention and their timing are grounded in the accompanying speech • When people tried to tell a story without words, their gestures demonstrated entirely different shape and meaning characteristics than when the gestures accompanied speech
Related work • Until the mid-1980s or so • Animators manually entered the phonetic script that would result in lip-synching of a facial model to speech • Today • Visemes are automatically extracted from typed text in order to synchronize lip shapes with synthesized or recorded speech • There have been a smaller number of attempts to synthesize human behaviors specifically in the context of communicative acts • Synthesis of animated communicative behavior has started from • An underlying, computation-heavy representation of the intention to communicate • A set of natural language instructions that guide the character's actions but not its speech • A state machine specifying whether or not the avatar or human participant was speaking
System BEAT • Goal • Input is a typed script • Output is automatically produced, appropriate nonverbal behavior synchronized with speech • Approach • Analyze the text for certain linguistic features • Generate nonverbal behavior based on those features, knowledge bases, and research into human conversational behavior • Compile the behaviors and schedule them to be animated in synchrony with speech
System • Figure: BEAT system architecture
System • Figure: XML trees passed among the modules
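The architecture figures show a typed script flowing through the modules as an XML tree that each stage enriches in turn. A minimal sketch of that data flow, with stub functions standing in for the real modules (these names and annotations are illustrative, not the toolkit's actual interfaces):

```python
# A minimal sketch of BEAT's pipeline shape: each module consumes the XML
# tree produced by the previous one and returns an enriched tree.
import xml.etree.ElementTree as ET

def tag_language(text):
    # Stand-in for the Language Tagging module: wrap words in an utterance
    utt = ET.Element("UTTERANCE")
    for w in text.split():
        ET.SubElement(utt, "W").text = w
    return utt

def suggest_behaviors(tree):
    # Stand-in for Behavior Suggestion: annotate the tree with candidates
    tree.set("suggestions", "beat,gaze")
    return tree

def select_behaviors(tree):
    # Stand-in for Behavior Selection: filter the candidates
    tree.set("selected", "beat")
    return tree

tree = select_behaviors(suggest_behaviors(tag_language("hello there")))
print(ET.tostring(tree, encoding="unicode"))
```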
System Knowledge base • Knowledge base • Adds some basic knowledge about the world to what we can understand from the text itself • Allows us to • Draw inferences from the typed text • Specify the kinds of gestures that should illustrate it, and the kinds of places where emphasis should be created • Common gestures include • Beat, deictic and contrast gestures • Gestures are added to the database by the animator
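As a rough illustration, the knowledge base can be pictured as animator-edited lookup tables from domain objects and actions to gesture descriptions. A minimal sketch with hypothetical entries (not BEAT's actual database format):

```python
# A sketch of a BEAT-style gesture knowledge base. Entries and field
# names are illustrative, not the toolkit's actual format.
GESTURE_KB = {
    # Object features the animator wants illustrated with iconic gestures
    "objects": {
        "house": {"feature": "pointed roof",
                  "gesture": "two hands form a peak"},
    },
    # Actions mapped to hand-shape and trajectory descriptions
    "actions": {
        "type": {"gesture": "both hands mime typing"},
    },
}

def lookup_gesture(kind: str, word: str):
    """Return the animator-supplied gesture spec for a word, if any."""
    return GESTURE_KB.get(kind, {}).get(word)

print(lookup_gesture("actions", "type"))
```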
System Language tagging • Language module • Responsible for annotating input text with the linguistic and contextual information that allows successful nonverbal behavior assignment and scheduling • Automatically recognizes and tags units in the text typed by the user • Language tags • Clause • Theme and rheme • Word newness • Contrast • Objects and actions
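These tags nest into an XML tree over the utterance. A sketch of the kind of tree the tagger might produce for one clause, using the tag vocabulary above (the exact element names and structure are assumptions):

```python
# Build a tagged clause by hand to show the shape of the tagger's output:
# a CLAUSE split into THEME and RHEME, with new words marked NEW.
import xml.etree.ElementTree as ET

clause = ET.Element("CLAUSE")
theme = ET.SubElement(clause, "THEME")
rheme = ET.SubElement(clause, "RHEME")

for w in ["it", "is"]:
    ET.SubElement(theme, "W").text = w
for w, is_new in [("a", False), ("virtual", True), ("actor", True)]:
    ET.SubElement(rheme, "NEW" if is_new else "W").text = w

print(ET.tostring(clause, encoding="unicode"))
```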
System Behavior suggestion • Behavior suggestion module • Operates on the XML trees produced by the Language Tagging module • Behavior suggestions are specified with • Tree node, priority, required animation degrees-of-freedom, and any specific information needed to render them • Current set of behavior generators implemented in the toolkit • Beat gesture generator • Surprising feature iconic gesture generator • Action iconic gesture generator • Contrast gesture generator • Eyebrow flash generator • Gaze generator • Intonation generator
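Each generator walks the tagged tree independently and attaches suggestions for later filtering. A sketch of a beat-gesture generator in this style, suggesting a beat on each new word in the rheme (the priority value and degree-of-freedom names are illustrative):

```python
# A sketch of one behavior generator: suggest a beat gesture on every
# NEW word inside a RHEME, recording node, priority, and required DOF.
import xml.etree.ElementTree as ET

def suggest_beats(clause):
    suggestions = []
    for rheme in clause.iter("RHEME"):
        for node in rheme.iter("NEW"):
            suggestions.append({
                "node": node,           # tree node the behavior attaches to
                "behavior": "BEAT",
                "priority": 1,          # beats are low-priority by default
                "dof": {"right_hand"},  # degrees of freedom required
            })
    return suggestions

clause = ET.fromstring(
    "<CLAUSE><THEME><W>it</W></THEME>"
    "<RHEME><NEW>gestures</NEW></RHEME></CLAUSE>")
print(suggest_beats(clause))
```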
System Behavior selection • Behavior selection module • Analyzes the tree that contains many, potentially incompatible, gesture suggestions • Reduces these suggestions down to the set that will actually be used in the animation • Conflict resolution filter • Detects all nonverbal behavior suggestion conflicts • Resolves the conflicts by deleting the suggestions with lower priorities • Priority threshold filter • Removes all behavior suggestions whose priority falls below a user-specified threshold
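Both filters are straightforward to express. A sketch, with conflict detection simplified to suggestions on the same node that require overlapping degrees of freedom:

```python
# A sketch of the two Behavior Selection filters. The conflict test is
# simplified to "same node, overlapping degrees of freedom".
def priority_threshold(suggestions, threshold):
    """Drop suggestions whose priority falls below the user threshold."""
    return [s for s in suggestions if s["priority"] >= threshold]

def resolve_conflicts(suggestions):
    """Keep higher-priority suggestions; drop conflicting lower ones."""
    kept = []
    for s in sorted(suggestions, key=lambda s: -s["priority"]):
        clashes = any(k["node"] == s["node"] and k["dof"] & s["dof"]
                      for k in kept)
        if not clashes:
            kept.append(s)
    return kept

suggestions = [
    {"node": "w3", "behavior": "ICONIC", "priority": 5,
     "dof": {"right_hand"}},
    {"node": "w3", "behavior": "BEAT", "priority": 1,
     "dof": {"right_hand"}},
]
print(resolve_conflicts(priority_threshold(suggestions, 1)))
```

Processing in descending priority order means a suggestion is only ever dropped in favor of one that outranks it.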
System Behavior scheduling and animation • Two ways to achieve synchronization between the character animation subsystem and the subsystem that produces the character’s speech • Obtain estimates of word and phoneme timings and construct an animation schedule prior to execution • Assume the availability of real-time events from a TTS engine and compile a set of event-triggered rules to govern the generation of the nonverbal behavior
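Under the first strategy, compiling the schedule amounts to mapping word-index spans to absolute start and end times taken from the TTS timing estimates. A minimal sketch, with made-up timings and behavior tuples (not BEAT's actual schedule format):

```python
# A sketch of the first synchronization strategy: turn per-word timing
# estimates into an absolute animation schedule before playback.
def compile_schedule(behaviors, word_times):
    """behaviors: (name, start_word, end_word) tuples;
    word_times: start time of each word plus the utterance end time."""
    events = []
    for name, start_w, end_w in behaviors:
        events.append((word_times[start_w], word_times[end_w + 1], name))
    return sorted(events)

times = [0.0, 0.3, 0.55, 0.9, 1.4]  # 4 words + utterance end, in seconds
print(compile_schedule([("BEAT", 2, 2), ("GAZE_AWAY", 0, 1)], times))
# -> [(0.0, 0.55, 'GAZE_AWAY'), (0.55, 0.9, 'BEAT')]
```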
System Extensibility • Collaborating with Alias/Wavefront to integrate BEAT with Maya • Designed to be extensible in several significant ways • New entries can easily be made in the knowledge base to add new hand gestures corresponding to domain object features and actions • The range of nonverbal behaviors, and the strategies for generating them, can easily be modified by defining new behavior suggestion generators • Entire modules can be re-implemented simply by adhering to the XML interfaces
Example Animation • First example • “You just have to type in some text, and the actor is able to talk and gesture by itself”
Example Animation • Second example • “I don’t know if this is a good thing or a bad thing”
Conclusion • BEAT • A flexible platform for procedural character animation of nonverbal conversational behaviors synchronized with speech • “It’s not what you say, but how you say it” • Future work • More complete coverage of conversational behavior • Extending to multiple characters • Extending to additional animation systems • Speed