





  1. Deducing the Visual Focus of Attention from Head Pose Estimation in Dynamic Multi-View Meeting Scenarios Michael Voit Interactive Analysis and Diagnosis Fraunhofer IITB Karlsruhe, Germany Rainer Stiefelhagen Interactive Systems Labs Universität Karlsruhe, Germany

  2. Motivation • Determine the target a person is looking at • Important cue to recognize a person's occupation and actions

  3. Difficulties • Ideal method: track eye gaze • Recent developments in HCI / CV: • Unobtrusive sensor setups ("Smart Rooms") • Complex environments (many interaction objects) & "everyday situations" • Surveillance → Pupils not always visible in natural environments!

  4. Current State of the Art • vFoA using head orientation: • Stiefelhagen et al. Tracking Focus of Attention in Meetings. 2002. • Ba et al. Multi-Party Focus of Attention Recognition in Meetings from Head Pose and Multimodal Interfaces. 2006. • Ba et al. A Cognitive and Unsupervised Map Adaptation Approach to the Recognition of Focus of Attention from Head Pose. 2007. • … • Drawbacks: • vFoA has only been tracked for very controlled, restricted scenarios: • Meetings with a fixed set of participants • No movement, no dynamics • Datasets only include a small number of allowed targets (mostly the meeting participants)

  5. This Work • Introduce dynamic scenes • Moving people and objects • Variable number of participants • Use an unobtrusive sensor setup • Collect the first dataset including dynamic scenes • Present our first system to estimate vFoA under these new conditions • Adaptive mapping from head pose to likely gaze

  6. Data Collection: Setup • 10 meetings • Each ~10 minutes long • Unobtrusive sensor setup • 4 wide-angle cameras in the room's corners • Fisheye camera on the ceiling • T-shaped microphone arrays • Table-top microphones

  7. Data Collection: Dynamics • Each meeting consists of 4-5 participants: • 3 actors • 1-2 "unaware" colleagues • Predefined script • All objects are possible focus targets

  8. Data Collection: Annotations (1) • Head orientation: • Magnetic pose sensor • Annotations: • All objects in the room • Observed focus targets • Head bounding box • Currently in progress: • Upper-body orientation • Speaker activity

  9. Data Collection: Annotations (2)

  10. Multi-View Head Pose Estimation
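This slide's figures did not survive the transcript. As a rough illustration only (not the authors' actual estimator), per-camera pan estimates can be fused into a single room-coordinate head pose by rotating each view's relative estimate into the world frame and taking a circular mean; the camera azimuths and helper names below are hypothetical.

```python
import math

def to_world_frame(relative_pan, camera_azimuth):
    """A camera reports pan relative to its own optical axis;
    add the camera's azimuth and wrap to express it in room coordinates."""
    a = relative_pan + camera_azimuth
    return math.atan2(math.sin(a), math.cos(a))

def fuse_head_pan(per_view_estimates):
    """Fuse per-camera head-pan estimates (radians, world frame)
    into one room-coordinate pan via the circular mean."""
    s = sum(math.sin(a) for a in per_view_estimates)
    c = sum(math.cos(a) for a in per_view_estimates)
    return math.atan2(s, c)

# Four corner cameras (hypothetical azimuths), each with a noisy
# relative-pan estimate of the same head orientation (~0.52 rad):
cams = [0.0, math.pi / 2, math.pi, -math.pi / 2]
rel = [0.52, -1.05, -2.62, 2.09]
world = [to_world_frame(r, a) for r, a in zip(rel, cams)]
fused = fuse_head_pan(world)
```

The circular mean avoids the wrap-around artifacts a plain arithmetic mean would produce near ±180°.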

  11. From Head Pose to Gaze • Neurophysiology (e.g. Freedman and Sparks): Mapping not constant!
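Freedman and Sparks' finding means the head carries only a fraction of each gaze shift, and that fraction varies with context. A minimal sketch of the underlying linear mapping assumption (the factor k and the function below are illustrative, not the exact model from the talk):

```python
def gaze_from_head(head_pan, k):
    """Predict the gaze direction from an observed head pan.
    k is the (variable) fraction of the gaze shift carried by
    the head, so gaze is approximately head_pan / k, k in (0, 1]."""
    if not 0.0 < k <= 1.0:
        raise ValueError("k must lie in (0, 1]")
    return head_pan / k

# If the head covers 60% of the shift, a 30-degree head turn
# suggests roughly a 50-degree gaze shift:
predicted = gaze_from_head(30.0, 0.6)
```

Because k is not constant, a fixed choice misattributes gaze whenever a person switches between, say, small glances and full turns; the adaptive scheme on the next slide addresses exactly this.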

  12. Reasons for Quick Focus Changes • Interaction: • Speech, discussion • Presentation • Introduction of new targets (gestures, …) • Occupation change: • Focus changes from whiteboard to notebook while writing down notes • Curiosity / movement: • A target passes by, focus follows that person • A person enters the scene • Noise coming from a target (general disturbance)

  13. Idea: Adapt the mapping factor • Build a histogram over a discretized set of mapping factors k • Adapt histogram bins according to the targets' posterior likelihoods • Model focus shifts in the priors
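The slide's equations did not survive the transcript, so the following is only a plausible reading of the idea: keep a histogram over discretized mapping factors k, map the observed head pan through each candidate k, score the candidate targets with a likelihood around the mapped gaze, and reweight each bin by how well its k explains the best-matching target. The Gaussian likelihood, the bin values, and all parameters below are assumptions.

```python
import math

K_BINS = [0.4, 0.5, 0.6, 0.7, 0.8]          # discretized mapping factors
hist = [1.0 / len(K_BINS)] * len(K_BINS)    # uniform prior over k

def gaussian(x, mu, sigma):
    """Unnormalized Gaussian likelihood."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2)

def update(head_pan, target_angles, sigma=10.0):
    """Reweight each k-bin by the likelihood of the best-explained
    target under gaze = head_pan / k, then renormalize."""
    global hist
    for i, k in enumerate(K_BINS):
        gaze = head_pan / k
        best = max(gaussian(gaze, t, sigma) for t in target_angles)
        hist[i] *= best + 1e-6               # avoid zeroing a bin forever
    total = sum(hist)
    hist = [h / total for h in hist]

# Targets at 0 and 50 degrees; repeated 30-degree head pans
# concentrate the histogram mass on k = 0.6:
for _ in range(5):
    update(30.0, [0.0, 50.0])
best_k = K_BINS[hist.index(max(hist))]
```

The slide's "model focus shifts in priors" would then enter as a transition prior over targets between frames, which this sketch omits.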

  14. Experiments (so far…) • Video data only • Current experiments on dominant targets only: • Persons (~60%) • Whiteboard (~12%) • Table (incl. notebooks, ~12%) • Experiments on the person wearing the sensor only • Upper-body orientation fixed • Body orientation of the remaining participants is currently being annotated manually

  15. Results

  16. Conclusions • vFoA is an important cue to determine people's occupations • First dataset for dynamic meeting scenes: • Variable number of targets • Moving targets • Large number of potential targets • Constant mapping from head pose to gaze is suboptimal • Adaptive mapping enhances results (10%+) • vFoA is VERY ambiguous without regarding further modalities: • Gestures, speech, … • Defining the necessary "level of detail" • Current work includes estimating upper-body orientation: • Extend system to all participants • Compute joint viewing areas • Future work: • Completing annotations • Include more modalities

  17. Thank You!

  18. From Head Pose to Gaze (2)

  19. Multi-View Head Pose Estimation
