
Recognition of meeting actions using information obtained from different modalities

This study explores the social psychology aspect of joint activities in meetings and proposes a semantic approach to modeling meeting actions using information from different modalities. The lexicon of meeting actions is defined and other aspects of meetings, such as user profiles and background knowledge, are taken into consideration. The goal is to improve the recognition and understanding of meeting actions in order to enhance meeting outcomes.


Presentation Transcript


  1. Recognition of meeting actions using information obtained from different modalities Natasa Jovanovic TKI University of Twente

  2. Outline • Social psychology aspect of joint activities, joint and individual actions • Meeting as a sequence of meeting actions • Semantic approach in modeling meetings • Lexicon of meeting actions • Other aspects of meetings • Semantic model • Conclusions and future directions

  3. Joint activities (Social psychology aspect) • Activity types: a time-bounded event (football game) or an ongoing process (teaching) • Joint activity: an activity with more than one participant • Examples: discourse (language has the dominant role), football game, wedding ceremony, meeting • Dimensions of joint activities: formality, scriptedness, verbalness, cooperativeness • Aspects of joint activities: participants, activity roles, public goals, private goals, hierarchies, boundaries, dynamics, etc. • A joint activity advances through joint actions

  4. Individual and joint actions (Social psychology aspect) • Joint action: a group of people doing things in coordination (e.g. speaking and listening, passing a ball in basketball, etc.) • Coordination of both content and processes • Individual actions: • Autonomous actions • Participatory actions (individual acts performed only as part of a joint action) • A person's processes may be very different in individual and joint actions even when they appear identical • In joint actions, participants often perform different individual actions

  5. Meeting as a sequence of meeting actions (I) • A meeting is a dynamic process that consists of group interactions (joint actions) between meeting participants: the meeting actions (meeting events) • Meeting actions: monologue, discussion, note taking, presentation, consensus, disagreement, etc. • Meeting actions are determined by the participants' individual actions • Beh = f(P, E), where P is the person and E is the environment

  6. Meeting as a sequence of meeting actions (II) • Multimodal human-human interaction in the meeting (natural human behavior) • Communication channels: speech, facial expressions, gestures, body movements, gaze, etc. • Combination of verbal and non-verbal elements

  7. Semantic approach in modeling meetings (I) • Our idea: a semantic approach to modeling a meeting as a sequence of meeting actions, using information obtained from different modalities • Why do we need a semantic approach?

  8. Semantic approach in modeling meetings (II) • Meeting modeling is a multidimensional (multilevel) problem • Participant level: integration of information obtained from different modalities in order to recognize the participants' multimodal behavior • Meeting action level: recognition of meeting actions as a combination of the participants' multimodal behavior

  9. Lexicon of meeting actions (I) • The first step in meeting modeling is to describe a lexicon of meeting actions • Each meeting action has something like a micro-grammar • Structure of the lexicon: • definition of a meeting action • characteristics: number of speakers, time, boundaries, topics, speaker behavior, participants' behavior, duration constraint, etc.
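
A lexicon entry of this kind could be represented as a small record holding the definition plus a few of the listed characteristics. The sketch below is only an illustration: the class name, the field names, the `admits` check, and all values in the `monologue` example are assumptions, not taken from the slides.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MeetingActionEntry:
    """One lexicon entry (hypothetical field names, for illustration only)."""
    name: str
    definition: str
    group: str                          # e.g. "single-speaker", "multi-speaker", "non-verbal"
    min_speakers: int = 0
    max_speakers: Optional[int] = None  # None means no upper bound
    min_duration_s: float = 0.0         # duration constraint, in seconds
    speaker_behavior: List[str] = field(default_factory=list)

    def admits(self, n_speakers: int, duration_s: float) -> bool:
        """Check the speaker-count and duration constraints of this entry."""
        if n_speakers < self.min_speakers:
            return False
        if self.max_speakers is not None and n_speakers > self.max_speakers:
            return False
        return duration_s >= self.min_duration_s


# Placeholder values; the slides do not give concrete constraints.
monologue = MeetingActionEntry(
    name="monologue",
    definition="placeholder definition",
    group="single-speaker",
    min_speakers=1,
    max_speakers=1,
    min_duration_s=10.0,
    speaker_behavior=["talking"],
)
```

With these placeholder constraints, `monologue.admits(1, 30.0)` holds, while a segment with three speakers or one lasting only two seconds is rejected.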

  10. Lexicon of meeting actions (II) • Set of 17 meeting actions divided into three groups: • Single-speaker dominated meeting actions • Multi-speaker meeting actions • Non-verbal dominated meeting actions • Hierarchical organization of meeting actions

  11. Lexicon of meeting actions (III) • Diagram: hierarchy of meeting actions • Groups: single-speaker dominated, multi-speaker, non-verbal dominated • Actions shown: Introduction, Ending, Discussion, Multi discussion, Break, Vote, Monologue, Presentation, Lecturing, White-board, Note taking, Applause, Laugh, Silence, Opening, Consensus, Disagreement

  12. Other aspects of meetings (User profile) • A meeting is more than a sequence of meeting actions • User profile: age, gender, native English speaker, profession, membership of a specific group, role, speech style, etc. • The user profile can be specified explicitly during the registration process or learned while processing the recorded meetings • Knowledge about the user may be useful at both the individual and the group level of meeting modeling

  13. Other aspects of meetings (Background knowledge) • Background knowledge plays an important role at each level of abstraction • Background knowledge may include: agenda, written notes, presentation slides, content of the white-board, number of meeting participants, etc.

  14. Other aspects of meetings (Target detection) • The query "What did John say to Peter about the programming standards?" contains three very important aspects of the meeting: • source of the message (John) • discussed topic (programming standards) • target (addressee) of the message (Peter)

  15. Other aspects of meetings (Target detection) • Target (addressee) detection needs a multimodal approach (speech, gaze, gesture) • Example: in "What do you think about my idea?" the word "you" is ambiguous; gaze detection (the speaker's focus of attention) or pointing at a person may help to resolve this target ambiguity • Name detection as a method for target detection • The target of the message can be a particular person, a group of participants, or all participants
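
The combination of name detection and gaze described on this slide can be sketched as a small rule cascade. This is an illustrative assumption, not the system's actual method: the function name `detect_addressee`, the rule order (name first, then gaze, then everyone else), and the participant names are all hypothetical.

```python
import re
from typing import List, Optional, Sequence


def detect_addressee(utterance: str,
                     speaker: str,
                     participants: Sequence[str],
                     gaze_target: Optional[str] = None) -> List[str]:
    """Return the likely addressees of an utterance (hypothetical rules)."""
    others = [p for p in participants if p != speaker]

    # Rule 1: name detection. A participant named in the utterance is the target.
    words = set(re.findall(r"[a-z]+", utterance.lower()))
    named = [p for p in others if p.lower() in words]
    if named:
        return named

    # Rule 2: gaze. The speaker's focus of attention resolves an ambiguous "you".
    if gaze_target in others:
        return [gaze_target]

    # Rule 3: no cue available, so assume the message targets all other participants.
    return others


people = ["John", "Peter", "Mary"]
by_gaze = detect_addressee("What do you think about my idea?", "John", people,
                           gaze_target="Peter")
by_name = detect_addressee("Mary, what do you think?", "John", people)
no_cue = detect_addressee("What do you think about my idea?", "John", people)
```

Here `by_gaze` is `["Peter"]`, `by_name` is `["Mary"]`, and `no_cue` falls back to `["Peter", "Mary"]`. A real system would replace the hard rules with confidence-weighted evidence from each modality.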

  16. Other aspects of meetings (Target detection) • Herbert H. Clark, Using Language • Diagram of listener roles: speaker, addressee, side participant, bystander, eavesdropper • Groupings shown: all participants, all listeners

  17. Semantic model • Our idea is to develop a modular multimodal system that uses a semantic approach at the participant level and at the meeting action level • Inputs: results of the recognition processes (WP2) • Speech recognition • Gesture/action recognition • Gaze detection • Emotion detection • Multimodal person identification and tracking • Output: annotated sequence of meeting actions

  18. Semantic model • Diagram of the processing pipeline, from input to output: Video and Audio → recognizers (Gaze detection, Action/Gesture Recognition, Speech Recognition, Person/Speaker ID and Tracking) → Unimodal Interpreters → Modality units → Multimodal Interpreters → Participants' multimodal behavior → Meeting Actions Recognition Module → Sequence of meeting actions • Background Knowledge appears as an additional input box

  19. Multimodal fusion on a participant level • Diagram: Gaze detection, Action/Gesture Recognition, Speech Recognition, and Person/Speaker ID and Tracking feed the Unimodal Interpreters (Gaze Interpreter, Action/Gesture Interpreter, Speech Interpreter) → Modality units → Multimodal Interpreter (Modality Fusion, Additional Inference) → Participants' multimodal behavior

  20. Multimodal fusion on a participant level • Unimodal Interpreters → modality units 1) Action/Gesture Interpreter • participant states (sitting, standing, walking, etc.) • activities (silent, talking, laughing, voting, etc.) 2) Gaze Interpreter (look at X, look away) 3) Speech Interpreter • turn-taking behavior is a basis for social interaction • meaning representation at the turn level (turn array level) • features of an array: topic (subtopics), dialog acts (DAMSL), addressees, key words, speech form, overlapping indicator, etc.

  21. Multimodal fusion on a participant level • Multimodal Interpreter → participants' multimodal behavior 1) Modality fusion (semantic level) • Typed feature structures for meaning representation • Unification and/or rule-based approach for fusion 2) Additional inference • Use additional information from the user profile or background knowledge in order to obtain missing data or resolve ambiguity
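
Unification-based fusion can be illustrated with feature structures written as nested dictionaries. The sketch below is a deliberate simplification: it is untyped (the slide mentions typed feature structures), it assumes atomic values are never `None`, and the feature names and example values are hypothetical.

```python
def unify(a, b):
    """Unify two feature structures; return None if they conflict.

    Structures are nested dicts with atomic leaves. Atomic values unify
    only when equal; dicts unify feature by feature.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for key, b_val in b.items():
            if key in result:
                merged = unify(result[key], b_val)
                if merged is None:
                    return None      # conflicting values for the same feature
                result[key] = merged
            else:
                result[key] = b_val  # feature known to only one modality
        return result
    return a if a == b else None


# Hypothetical modality units for one participant at one moment.
speech_unit = {"participant": "John", "activity": "talking",
               "addressee": {"kind": "person"}}
gaze_unit = {"participant": "John",
             "addressee": {"kind": "person", "name": "Peter"}}

fused = unify(speech_unit, gaze_unit)
conflict = unify({"activity": "talking"}, {"activity": "silent"})
```

Here `fused` combines both units, with the gaze unit filling in the addressee's name, while `conflict` is `None`. In a fuller design a failed unification would be the point where rule-based fusion or the additional inference step takes over.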

  22. Meeting actions recognition module • Hidden Markov Models • states: meeting actions • observations: semantic features from the representation of the participants' behavior • Participant-dependent features (state, activity, talking duration, dialogue acts, etc.) and common features (previous dialogue act, previous key words, etc.) • IDIAP meeting data corpus
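
With meeting actions as hidden states, the most likely action sequence for a series of observations can be recovered by Viterbi decoding. The following is a minimal sketch under strong assumptions: only two states, a single discrete observation symbol per time step instead of a semantic feature vector, and made-up probabilities that do not come from any corpus.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely state sequence (log-space Viterbi).

    All probabilities used here must be non-zero, since logs are taken.
    """
    scores = {s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
              for s in states}
    backpointers = []
    for obs in observations[1:]:
        new_scores, pointers = {}, {}
        for s in states:
            # Best predecessor of state s at this time step.
            prev = max(states, key=lambda p: scores[p] + math.log(trans_p[p][s]))
            new_scores[s] = (scores[prev] + math.log(trans_p[prev][s])
                             + math.log(emit_p[s][obs]))
            pointers[s] = prev
        scores = new_scores
        backpointers.append(pointers)

    # Backtrack from the best final state.
    path = [max(states, key=lambda s: scores[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Illustrative model: all numbers below are invented for the example.
states = ["monologue", "discussion"]
start_p = {"monologue": 0.5, "discussion": 0.5}
trans_p = {"monologue": {"monologue": 0.8, "discussion": 0.2},
           "discussion": {"monologue": 0.2, "discussion": 0.8}}
emit_p = {"monologue": {"one_speaker": 0.9, "many_speakers": 0.1},
          "discussion": {"one_speaker": 0.2, "many_speakers": 0.8}}

observed = ["one_speaker", "one_speaker", "many_speakers", "many_speakers"]
decoded = viterbi(observed, states, start_p, trans_p, emit_p)
```

For this toy input, `decoded` is `["monologue", "monologue", "discussion", "discussion"]`. In practice the transition and emission parameters would be estimated from annotated meeting data rather than set by hand.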

  23. Conclusions and future directions • The main goal of our approach is to encode more semantic detail at each level in order to enable browsing and querying of an archive of recorded meetings • A larger and more natural meeting data corpus is needed to validate our approach for low-level and high-level meeting actions • Extraction of a set of semantic features • Testing the approach with techniques other than HMMs
