Towards a Reactive Virtual Trainer

Zsófia Ruttkay, Job Zwiers, Herwin van Welbergen, Dennis Reidsma
HMI, Dept. of CS, University of Twente
Amsterdam, The Netherlands
zsofi@cs.utwente.nl

Overview
• RVT usage
• Related work
• RVT technological challenges
• Architecture
• Integration of reactive and proactive actions
• Multi-modal sync
• A close look at clapping - demos

RVT usage
• RVT = IVA with expert and psychological knowledge of a real physiotherapist, to be used e.g. to:
  • prevent RSI for computer workers
  • preserve/restore weight and physical condition as (personal) trainer
  • act as physiotherapist to cure illnesses affecting motion
• RVT is both a medium and an empathic consultant
• Relevance for society
  • ageing population, unhealthy life-style
  • human experts: low number, expensive, at certain locations
• RVT usage context
  • PC + 1-2 cameras in normal setting (homes, offices)
  • ‘instructed’ by authorized person (may be the user, as well as developer)
  • can be adapted/extended

Related work

Own related work – Virtual Rap Dancer

Own related work – Virtual Conductor

RVT technological challenges
• Vision-based perception, may be extended with biosignals
• Reactive to exercise performance, physical state, overall performance
• Small talk, exercise correction, plan revision
• RVT body and motion parameters adaptable/calibrated
• Authoring by human
• Extensible by expert (new exercises)
• Motion with music, speech or clapping (also as input for tempo)
• Playground for multi-modal output generation
• “Exercise motion intelligence”: timing, concatenation, idle poses, …

RVT architecture
[Architecture diagram. Labels: Human expert; VT; User; Monitoring the user; Planning action of VT; Presentation of feedback of VT; Calibration of user; Authoring scenario; Multi-modal feedback; Exercise scenario revision; Motion interpretation; Multi-sensor integration; Motion specification; Motion demonstration; Interfaces; Biosensing module(s); Optical motion tracking; Acoustic beat tracking]

Multi-modal sync
• Exercises are executed using several modalities
  • Body movement
  • Speech
  • Music
  • Sound (clap, foot tap)
• Challenges
  • Synchronization
  • Monitoring user => real-time (re)planning
  • Exaggeration to point out details
  • Speed up / slow down
  • Feedback/correction
  • …

Synchronization: related work
• Classic approach in speech/gesture synchronization:
  • Speech leads, gesture follows
• MURML (Kopp et al.)
  • No leading modality
  • Planning in sequential chunks containing one piece of speech and one aligned gesture
  • Co-articulation at the border of chunks
• BML (Kopp, Krenn, Marsella, Marshall, Pelachaud, Pirker, Thórisson, Vilhjalmsson)
  • No leading modality
  • Synchronized alignment points in behavior phases
  • For now, aimed mainly at speech/gesture synchronization
  • In development

Synchronization: own previous work
• Virtual Dancer
  • Synchronization between music (beats) and dance animation
  • Dance move selection by user interaction
• Virtual Presenter
  • Synchronization between speech, gesture, posture and sheet display
  • Leading modality can change over time
  • GESTYLE markup language with par/seq and wait constructs

Close look at clapping
[Figure: phases of a clap: stroke, (hold), retraction, (hold)]

Clapping Exercise

Close look at clapping
• Start with a simple clap exercise and see what we run into
• The clap exercise:
  • Clap to the tempo of the beat of a metronome (later: of music)
  • When the palms touch, a clap sound is heard
  • Count while clapping, using speech synthesis
• Possible alignment at: word start/end, phonological peak start/center/end
• For now, we pick the center of the phonological peak, but we do generate the other alignment points for easy adaptation
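The alignment choice in the clap exercise can be illustrated with a small scheduling sketch. This is only an illustration under stated assumptions, not the system's planner: the `Word` class, the `schedule_counted_claps` function, the fixed `stroke_duration` and all timing values are hypothetical, and in practice the peak offsets would come from the speech synthesizer. Each metronome beat is treated as the moment the palms touch, and the word is started early enough that the chosen alignment point falls on that beat.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Timing of one synthesized count word (hypothetical structure).

    All offsets are in seconds, measured from the start of the word.
    """
    text: str
    duration: float
    peak_start: float
    peak_end: float

    def alignment_points(self) -> dict:
        # All candidate alignment points are generated, so the choice
        # can be changed later without touching the scheduler.
        return {
            "word_start": 0.0,
            "word_end": self.duration,
            "peak_start": self.peak_start,
            "peak_center": (self.peak_start + self.peak_end) / 2.0,
            "peak_end": self.peak_end,
        }


def schedule_counted_claps(words, bpm, first_beat=1.0,
                           point="peak_center", stroke_duration=0.2):
    """Place one clap and one count word on each metronome beat.

    Assumption: the clap sound (palms touching) must coincide with the
    beat, so the stroke starts `stroke_duration` seconds before it.
    """
    period = 60.0 / bpm  # seconds between beats
    plan = []
    for i, word in enumerate(words):
        beat = first_beat + i * period
        offset = word.alignment_points()[point]
        plan.append({
            "word": word.text,
            "beat": beat,
            "speech_start": beat - offset,
            "stroke_start": beat - stroke_duration,
        })
    return plan


if __name__ == "__main__":
    count = [Word("one", 0.40, 0.10, 0.30), Word("two", 0.30, 0.05, 0.15)]
    for entry in schedule_counted_claps(count, bpm=120):
        print(entry)
```

Switching `point` to `"word_start"` or `"peak_end"` changes only the speech start times; the clap timing stays tied to the beat.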

Two examples for multi-modal sync
• Specification in BMLT
• Planning in real-time – under/overspecification!

What if we speed up the tempo?
• The clapping animation should be faster
• Possibilities:
  • Lower amplitude?
  • Linear speedup?
  • Speedup of stroke?
  • Speedup of retraction?
  • A combination of the above?
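The speed-up options can be compared with a small sketch. It assumes, purely for illustration, that one clap cycle consists of a stroke followed by a retraction with no holds, and that the cycle must fill exactly one beat period; the function names and the example durations are hypothetical and are not taken from the system described in these slides.

```python
def fit_clap_to_period(stroke, retraction, period, strategy="linear"):
    """Return (stroke, retraction) durations in seconds that sum to `period`.

    strategy:
      "linear"     scale both phases by the same factor
      "stroke"     keep the retraction, shorten only the stroke
      "retraction" keep the stroke, shorten only the retraction
    """
    if period <= 0:
        raise ValueError("period must be positive")
    if strategy == "linear":
        scale = period / (stroke + retraction)
        return stroke * scale, retraction * scale
    if strategy == "stroke":
        new_stroke = period - retraction
        if new_stroke <= 0:
            raise ValueError("period too short to keep the retraction unchanged")
        return new_stroke, retraction
    if strategy == "retraction":
        new_retraction = period - stroke
        if new_retraction <= 0:
            raise ValueError("period too short to keep the stroke unchanged")
        return stroke, new_retraction
    raise ValueError(f"unknown strategy: {strategy}")


def amplitude_for_constant_speed(amplitude, natural_period, period):
    """Lower-amplitude option.

    Assumption: the hands keep their average speed, so the distance
    they travel shrinks in proportion to the time available.
    """
    return amplitude * period / natural_period


if __name__ == "__main__":
    # Hypothetical natural clap: 0.3 s stroke, 0.5 s retraction.
    for strategy in ("linear", "stroke", "retraction"):
        print(strategy, fit_clap_to_period(0.3, 0.5, 0.6, strategy))
    print("amplitude", amplitude_for_constant_speed(0.4, 0.8, 0.6))
```

The phase-specific strategies fail once the beat period drops below the duration of the phase that is kept fixed, which is one reason a combination of strategies may be needed at high tempos.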

What if we slow down the metronome?
• Slower clapping? (movies here)
  • Linear slowdown?
  • Slowdown of stroke?
  • Slowdown of retraction?
• Hold at end of retraction (hands open)?
• Hold after stroke (clap)?
• A combination of the above?
• Back to idle position?
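The hold-based options for a slower metronome can be sketched as follows. The sketch assumes the stroke and retraction keep their natural durations and that the extra time in each beat period is distributed over the two hold phases; the function name, the `hold_after_stroke_fraction` parameter and the example values are hypothetical.

```python
def slow_clap_cycle(stroke, retraction, period, hold_after_stroke_fraction=0.0):
    """Fill a longer beat period with holds instead of slowing the motion.

    Returns a list of (phase, duration) pairs, in seconds, covering one cycle.
    hold_after_stroke_fraction = 0.0 puts all spare time in the hold at the
    end of the retraction (hands open); 1.0 puts it all in the hold after
    the stroke (hands together); values in between combine the two.
    """
    if not 0.0 <= hold_after_stroke_fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    slack = period - (stroke + retraction)
    if slack < 0:
        raise ValueError("period is shorter than the natural clap cycle")
    hold_closed = slack * hold_after_stroke_fraction
    hold_open = slack - hold_closed
    return [
        ("stroke", stroke),
        ("hold_after_stroke", hold_closed),
        ("retraction", retraction),
        ("hold_after_retraction", hold_open),
    ]


if __name__ == "__main__":
    # Hypothetical natural clap of 0.8 s fitted to a 1.0 s beat period.
    for phase, duration in slow_clap_cycle(0.3, 0.5, 1.0, 0.25):
        print(f"{phase}: {duration:.2f} s")
```

A linear slowdown would instead stretch the stroke and retraction themselves; which option looks most natural is an empirical question about what real humans do.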

Open issues on planning
• What do real humans do?
• Do the semantics of a motion (clap) change if we change its amplitude or velocity profile? E.g. emotions, individual features
• Smooth tempo changes
• Automatic concatenation and inserted idle poses
• Appropriate high-level parameters
  • Related (e.g. amplitude/speed)?
  • Different from the parameters for communicative gestures (e.g. by Pelachaud)?
  • Amplitude and motion path specification
• Is our synchronization system capable of re-planning in real time?
