Towards a Reactive Virtual Trainer

This paper discusses the usage, technological challenges, and architecture of a Reactive Virtual Trainer (RVT), which integrates reactive and proactive actions and synchronizes its multi-modal output. It explores the use of the RVT as a medium and empathic consultant in scenarios such as preventing RSI, preserving or restoring physical condition, and acting as a physiotherapist. The relevance of the RVT for society is highlighted in the context of an ageing population and unhealthy lifestyles. The paper also examines the challenges of multi-modal synchronization and real-time planning.

Presentation Transcript


  1. Towards a Reactive Virtual Trainer • Zsófia Ruttkay, Job Zwiers, Herwin van Welbergen, Dennis Reidsma • HMI, Dept. of CS, University of Twente • Amsterdam, The Netherlands • zsofi@cs.utwente.nl

  2. Overview • RVT usage • Related work • RVT technological challenges • Architecture • Integration of reactive and proactive actions • Multi-modal sync • A close look at clapping - demos

  3. RVT usage • RVT = IVA with the expert and psychological knowledge of a real physiotherapist, to be used e.g. to: • prevent RSI for computer workers • preserve/restore weight and physical condition as (personal) trainer • act as physiotherapist to cure illnesses affecting motion • RVT is medium and empathic consultant • Relevance for society • ageing population, unhealthy life-style • human experts: low in number, expensive, available only at certain locations • RVT usage context • PC + 1-2 cameras in a normal setting (homes, offices) • ‘instructed’ by an authorized person (may be the user, as well as the developer) • can be adapted/extended

  4. Related work

  5. Own related work – Virtual Rap Dancer

  6. Own related work – Virtual Conductor

  7. RVT technological challenges • Vision-based perception, may be extended with biosignals • Reactive to exercise performance, physical state, overall performance • Small talk, exercise correction, plan revision • RVT body and motion parameters adaptable/calibrated • Authoring by human • Extensible by expert (new exercises) • Motion with music, speech or clapping (also as input for tempo) • Playground for multi-modal output generation • “Exercise motion intelligence”: timing, concatenation, idle poses, …

  8. RVT architecture (diagram; labels only) • Actors: human expert, user, VT • Stages: monitoring the user, planning action of VT, presentation of feedback of VT • Modules: calibration of user, authoring scenario, multi-modal feedback, exercise scenario revision, motion interpretation, multi-sensor integration, motion specification, motion demonstration • Interfaces: biosensing module(s), optical motion tracking, acoustic beat tracking

  9. Multi-modal sync • Exercises are executed using several modalities • Body movement • Speech • Music • Sound (clap, foot tap) • Challenges • Synchronization • Monitoring user => real time (re)planning • Exaggeration to point out details • Speed up / slow down • Feedback/correction • …

  10. Synchronization: related work • Classic approach in speech/gesture synchronization: • Speech leads, gesture follows • MURML (Kopp et al.) • No leading modality • Planning in sequential chunks containing one piece of speech and one aligned gesture • Co-articulation at the border of chunks • BML (Kopp, Krenn, Marsella, Marshall, Pelachaud, Pirker, Thórisson, Vilhjalmsson) • No leading modality • Synchronized alignment points in behavior phases • For now, aimed mainly at speech/gesture synchronization • In development

  11. Synchronization: own previous work • Virtual Dancer • Synchronization between music (beats) and dance animation • Dance move selection by user interaction • Virtual Presenter • Synchronization between speech, gesture, posture and sheet display • Leading modality can change over time • GESTYLE markup language with par/seq and wait constructs
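
The par/seq and wait constructs mentioned on this slide can be illustrated with a small scheduler. This is only a sketch of the general idea of parallel and sequential composition, not GESTYLE syntax or its implementation: the class names `Act`, `Wait`, `Seq`, `Par` and the function `schedule_tree` are hypothetical, and fixed durations per action are an assumption.

```python
from dataclasses import dataclass


@dataclass
class Act:
    """A single behavior (speech, gesture, ...) with an assumed fixed duration."""
    name: str
    duration: float


@dataclass
class Wait:
    """A pause that produces no output."""
    duration: float


@dataclass
class Seq:
    """Children run one after another."""
    children: list


@dataclass
class Par:
    """Children all start at the same time."""
    children: list


def schedule_tree(node, start=0.0, out=None):
    """Return (end_time, out), where out lists (name, start, end) per Act."""
    if out is None:
        out = []
    if isinstance(node, Act):
        out.append((node.name, start, start + node.duration))
        return start + node.duration, out
    if isinstance(node, Wait):
        return start + node.duration, out
    if isinstance(node, Seq):
        t = start
        for child in node.children:
            t, _ = schedule_tree(child, t, out)
        return t, out
    if isinstance(node, Par):
        end = start
        for child in node.children:
            child_end, _ = schedule_tree(child, start, out)
            end = max(end, child_end)
        return end, out
    raise TypeError(f"unknown node type: {type(node).__name__}")


# Speech runs for 2 s; in parallel, a gesture starts after a 0.5 s wait.
tree = Par([Act("speech", 2.0), Seq([Wait(0.5), Act("gesture", 1.0)])])
end_time, timeline = schedule_tree(tree)
```

A static tree like this has no leading modality built in; a changing leader, as the slide describes, would need durations that are decided at run time rather than fixed in advance.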

  12. Close look at clapping • Phases of a clap: stroke, (hold), retraction, (hold)

  13. Clapping Exercise

  14. Close look at clapping • Start with a simple clap exercise and see what we run into • The clap exercise: • Clap to the tempo of the beat of a metronome (later: of music) • When the palms touch, a clap sound is heard • Count while clapping, using speech synthesis • Possible alignment at: word start/end, phonological peak start/center/end • For now, we pick the center of the phonological peak, but we do generate the other alignment points for easy adaptation
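
The alignment described on this slide can be sketched as a small timing calculation: each metronome beat is the moment the palms touch, and each counted word is shifted so that the center of its peak falls on that beat. Everything named here (`Word`, `beat_times`, `schedule_claps`, the peak offsets, the stroke duration) is a hypothetical illustration, not the authors' planner; real peak timings would come from the speech synthesizer.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    peak_start: float  # offset of the peak start from the word start, in seconds
    peak_end: float    # offset of the peak end from the word start, in seconds

    def peak_center(self) -> float:
        return (self.peak_start + self.peak_end) / 2.0


def beat_times(bpm: float, n_beats: int, t0: float) -> list:
    """Times of n_beats metronome beats, the first one at t0."""
    period = 60.0 / bpm
    return [t0 + i * period for i in range(n_beats)]


def schedule_claps(words, bpm, stroke_duration, t0=1.0):
    """Align the end of each stroke and each word's peak center with a beat."""
    plan = []
    for word, beat in zip(words, beat_times(bpm, len(words), t0)):
        plan.append({
            "beat": beat,                            # palms touch, clap sound
            "stroke_start": beat - stroke_duration,  # hands start closing
            "word": word.text,
            "word_start": beat - word.peak_center(), # speech starts early
        })
    return plan


words = [Word("one", 0.1, 0.3), Word("two", 0.1, 0.3)]
plan = schedule_claps(words, bpm=60, stroke_duration=0.25)
```

Switching to another alignment point (word start, peak start, peak end) only changes the offset subtracted in `word_start`, which is why generating all alignment points up front makes adaptation easy.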

  15. Two examples for multi-modal sync • Specification in BMLT • Planning in real-time – under/overspecification!

  16. What if we speed up the tempo? • The clapping animation should be faster • Possibilities: • Lower amplitude? • Linear speedup? • Speedup of stroke? • Speedup of retraction? • A combination of above?
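
The speed-up options listed on this slide can be compared with a minimal sketch that fits one clap cycle (stroke plus retraction, no holds) into a shorter beat period. The function names, the strategy labels, and the constant-average-speed rule for lowering amplitude are assumptions made for illustration; the slide leaves open which option is appropriate.

```python
def fit_cycle(stroke, retraction, period, strategy="linear"):
    """Return (stroke, retraction) durations that sum to `period`."""
    if strategy == "linear":
        # Scale both phases by the same factor.
        scale = period / (stroke + retraction)
        return stroke * scale, retraction * scale
    if strategy == "stroke":
        # Keep the retraction as is; only the stroke absorbs the change.
        new_stroke = period - retraction
        if new_stroke <= 0:
            raise ValueError("period too short to keep the retraction unchanged")
        return new_stroke, retraction
    if strategy == "retraction":
        # Keep the stroke as is; only the retraction absorbs the change.
        new_retraction = period - stroke
        if new_retraction <= 0:
            raise ValueError("period too short to keep the stroke unchanged")
        return stroke, new_retraction
    raise ValueError(f"unknown strategy: {strategy}")


def scaled_amplitude(amplitude, old_period, new_period):
    """Amplitude that keeps the average hand speed constant (an assumption)."""
    return amplitude * new_period / old_period


# A 0.8 s cycle (0.3 s stroke, 0.5 s retraction) squeezed into 0.6 s.
linear = fit_cycle(0.3, 0.5, 0.6, "linear")
stroke_only = fit_cycle(0.3, 0.5, 0.6, "stroke")
retraction_only = fit_cycle(0.3, 0.5, 0.6, "retraction")
```

The single-phase strategies fail once the period drops below the duration of the untouched phase, which is one reason a combination of options may be needed at high tempos.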

  17. What if we slow down the metronome? • Slower clapping? (movies here) • Linear slowdown? • Slowdown of stroke? • Slowdown of retraction? • Hold at end of retraction (hands open)? • Hold after stroke (clap)? • A combination of above? • Back to idle position?
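
The hold-based slow-down options on this slide can be sketched by keeping the stroke and retraction at their natural durations and distributing the leftover time over the two holds. The function name `slow_cycle`, the phase labels, and the `closed_fraction` parameter are hypothetical; this is one possible way to frame the choice, not a statement of what the system does.

```python
def slow_cycle(stroke, retraction, period, closed_fraction=0.0):
    """Build one clap cycle of length `period` as a list of (phase, duration).

    closed_fraction is the share of the spare time spent holding after the
    stroke (hands together); the rest is held at the end of the retraction
    (hands open).
    """
    slack = period - (stroke + retraction)
    if slack < 0:
        raise ValueError("period is shorter than the natural clap cycle")
    if not 0.0 <= closed_fraction <= 1.0:
        raise ValueError("closed_fraction must lie between 0 and 1")
    hold_closed = slack * closed_fraction
    hold_open = slack - hold_closed
    return [
        ("stroke", stroke),
        ("hold_closed", hold_closed),
        ("retraction", retraction),
        ("hold_open", hold_open),
    ]


# A 0.8 s natural cycle stretched to a 1.0 s beat period,
# with a quarter of the spare 0.2 s spent with the hands together.
cycle = slow_cycle(0.3, 0.5, 1.0, closed_fraction=0.25)
```

Setting `closed_fraction` to 0 or 1 reproduces the two pure hold options from the slide; a linear slowdown would instead stretch the stroke and retraction themselves and leave both holds at zero.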

  18. Open issues on planning • What do real humans do? • Do the semantics of a motion (clap) change if we change its amplitude or velocity profile? E.g. emotions, individual features • Smooth tempo changes • Automatic concatenation and inserted idle poses • Appropriate high-level parameters • Related (e.g. amplitude/speed)? • Different from the parameters for communicative gestures (e.g. by Pelachaud)? • Amplitude and motion path specification • Is our synchronization system capable of re-planning in real time?
