Recognition of Spatial Dynamics for Predicting Social Interaction
The Interaction Lab
Ross Mead, Amin Atrash, Maja J Matarić
{rossmead, atrash, mataric}@usc.edu

Proxemics
• Proxemics is the study of the interpretation, manipulation, and dynamics of human spatial behavior in face-to-face social interactions [1].
• People use proxemic signals, such as distance, stance, hip and shoulder orientation, head pose, and eye gaze, to communicate an interest in initiating, accepting, maintaining, terminating, or avoiding social interaction [6].
• People also manipulate space to direct attention to an external stimulus or to guide a social partner to another location [5].
• These cues are subtle, and prior analyses of them have been coarse [7].

Objective
To capture the social cues that signify transitions into, during, and out of dyadic and triadic interactions. Such cues are targeted, and occur sequentially or in parallel.
• An initiation cue is a behavior that attempts to engage a potential social partner in discourse (also referred to as a “sociopetal” cue) [3, 4].
• An acceptance cue signifies acknowledgment of social presence and registration into an interacting party (e.g., accepting a third party into an ongoing interaction).
• A termination cue proposes the end of an interaction in a socially appropriate manner (also referred to as a “sociofugal” cue) [3, 4].

Experimental Setup
• The study was conducted in a 20' × 20' room in USC's Interaction Lab.
• A “presenter” and a participant engaged in an interaction loosely focused on a common object of interest: a static, non-interactive humanoid robot (Bandit).
• The interaction was open-ended and lasted 5-6 minutes.
• Interactees were monitored by the PrimeSensor™ markerless motion capture system, an overhead color camera, and an omnidirectional microphone.

Data Set
• A total of 18 participants were involved in the study. Their interactions provided the following individual and group features, drawn from selected literature (the interpersonal geometry and Hall's distance coding are sketched after the Approach and Results section below):
• Individual [8]: stance pose, hip pose, torso pose, shoulder pose, left arm pose, right arm pose
• Interpersonal [6]: total distance, straight-ahead distance, lateral distance, relative body orientation
• Proxemic [1]: sex code, posture code, SFP axis code, kinesthetic code, touch code, visual code, thermal code, olfaction code, voice loudness code, distance code
• Attentional [5]: left arm deictic referent, right arm deictic referent
• Structural [2]: vis-à-vis, L-shape, side-by-side

Approach and Results
• As a proof of concept, a Hidden Markov Model (HMM) was trained to recognize each of the 6 spatial interaction cues: dyadic initiation, triadic initiation, dyadic acceptance, triadic acceptance, dyadic termination, and triadic termination.
• The models were trained over 15 features (5 per dyad): total distance (i to j), orientation (i to j), orientation (j to i), vision code (i to j), and vision code (j to i).
• The models were evaluated using leave-one-out cross-validation; a training and evaluation sketch follows below.
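To make the Interpersonal feature set concrete, here is a minimal sketch of how the four Mehrabian-style measures [6] might be computed, assuming each interactant's pose is reduced to a 2-D position plus a body heading angle; the function and variable names are illustrative, not the system's actual implementation.

```python
import numpy as np

def interpersonal_features(pos_i, theta_i, pos_j, theta_j):
    """Mehrabian-style interpersonal measures [6] for interactants i and j.

    pos_i, pos_j:     2-D positions (e.g., meters, world frame).
    theta_i, theta_j: body heading angles (radians, world frame).
    """
    d = np.asarray(pos_j, dtype=float) - np.asarray(pos_i, dtype=float)
    total = float(np.linalg.norm(d))   # total distance

    # Decompose the offset from i to j in i's body frame.
    ahead = np.array([np.cos(theta_i), np.sin(theta_i)])
    side = np.array([-np.sin(theta_i), np.cos(theta_i)])
    straight_ahead = float(d @ ahead)  # distance along i's facing direction
    lateral = float(d @ side)          # distance to i's left/right

    # Relative body orientation: deviation of j's heading from directly
    # facing i (0 = face-to-face), wrapped to [-pi, pi).
    facing_i = np.arctan2(-d[1], -d[0])
    relative_orientation = float((theta_j - facing_i + np.pi) % (2 * np.pi) - np.pi)
    return total, straight_ahead, lateral, relative_orientation
```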
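Similarly, the Proxemic distance code [1] discretizes interpersonal distance into Hall's four zones. A short sketch, assuming the standard zone boundaries published in The Hidden Dimension:

```python
# Hall's proxemic distance zones [1]; upper boundaries in meters.
HALL_ZONES = [
    (0.46, "intimate"),
    (1.22, "personal"),
    (3.70, "social"),
    (float("inf"), "public"),
]

def distance_code(total_distance_m):
    """Map a metric interpersonal distance to Hall's distance code."""
    for upper_bound, zone in HALL_ZONES:
        if total_distance_m <= upper_bound:
            return zone
```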
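The poster does not give the HMM's implementation details, so the following is a minimal sketch of one plausible setup: one Gaussian-emission HMM per cue class (here via the hmmlearn library), with each held-out sequence classified by maximum log-likelihood under leave-one-out cross-validation. The 3-state topology, diagonal covariances, and all names are assumptions.

```python
import numpy as np
from hmmlearn.hmm import GaussianHMM  # pip install hmmlearn

# The six spatial interaction cues named on the poster.
CUES = ["dyadic initiation", "triadic initiation",
        "dyadic acceptance", "triadic acceptance",
        "dyadic termination", "triadic termination"]

def loocv_accuracy(sequences, labels, n_states=3):
    """Leave-one-out cross-validation over labeled feature sequences.

    sequences: list of (T_k, 15) arrays of the 15 per-frame features
               (5 per dyad: total distance, both orientations, and
               both vision codes).
    labels:    the cue name for each sequence.
    n_states:  hidden-state count (an assumption, not from the poster).
    """
    correct = 0
    for k in range(len(sequences)):
        models = {}
        for cue in CUES:
            # Train one HMM per cue on all sequences of that cue,
            # excluding the held-out sequence k.
            train = [s for i, (s, l) in enumerate(zip(sequences, labels))
                     if l == cue and i != k]
            hmm = GaussianHMM(n_components=n_states,
                              covariance_type="diag", n_iter=50)
            hmm.fit(np.vstack(train), lengths=[len(s) for s in train])
            models[cue] = hmm
        # Classify the held-out sequence by maximum log-likelihood.
        predicted = max(CUES, key=lambda c: models[c].score(sequences[k]))
        correct += predicted == labels[k]
    return correct / len(sequences)
```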
Discussion and Future Work
• While these results are preliminary, the performance of the HMM suggests a relationship between the feature set and the trained cues.
• This initial attempt used a limited data set for training, given the complexity of the domain and the similarity of many of the dynamics involved.
• Future work will address the accuracy of the system through further data collection, and will consider alternative recognition techniques that take advantage of the multimodal aspects of the domain.
• The long-term objective of this research is to provide a social robot with the ability to recognize and produce behavior in a socially appropriate manner.
• Future work will also explore action selection using this information, and the generation of believable social behaviors based on the models.

Acknowledgments
• This work is supported in part by the NSF Graduate Research Fellowship Program, ONR MURI grant N00014-09-1-1031, NSF grants CNS-0709296 and IIS-0803565, and the Dan Marino Foundation. We thank Louis-Philippe Morency for his insight into the experimental design, and Mark Bolas and Evan Suma for their assistance with the PrimeSensor™.

Selected References
• [1] Hall, E.T. 1966. The Hidden Dimension. Doubleday, New York, NY.
• [2] Kendon, A. 1990. Conducting Interaction: Patterns of Behavior in Focused Encounters. Cambridge University Press, New York, NY.
• [3] Lawson, B. 2001. Sociofugal and Sociopetal Space: The Language of Space. Architectural Press, Oxford, UK.
• [4] Low, S.M. and Lawrence-Zúñiga, D. 2003. The Anthropology of Space and Place: Locating Culture. Blackwell Publishing, Oxford, UK.
• [5] McNeill, D. 2005. Gesture and Thought. University of Chicago Press, Chicago, IL.
• [6] Mehrabian, A. 1972. Nonverbal Communication. Aldine Transaction, Piscataway, NJ.
• [7] Oosterhout, T.V. and Visser, A. 2008. A visual method for robot proxemics measurements. In Proceedings of Metrics for Human-Robot Interaction, a Workshop at the 3rd ACM/IEEE International Conference on Human-Robot Interaction, Technical Report 471, Amsterdam, Netherlands, 61-68.
• [8] Schegloff, E.A. 1998. Body torque. Social Research, vol. 65, no. 3, 535-596.