CAMEO: Meeting Understanding. Prof. Manuela M. Veloso, Prof. Takeo Kanade Dr. Paul E. Rybski, Dr. Fernando de la Torre, Dr. Brett Browning, Raju Patil, Carlos Vallespi, Scott Lenser, Betsy Ricker, Francesco Tamburrino, Colin McMillen, Sonia Chernova CALO: Physical Awareness
Computer Science Department / The Robotics Institute
School of Computer Science, Carnegie Mellon University
CAMEO: Camera Assisted Meeting Event Observer
• Robust multi-person PA capture device
• Contributions
  • Mosaic generation
  • Person tracking
  • Face recognition
  • Activity recognition
  • Logging/modeling
Must effectively operate in unstructured environments.
Each camera is hand-calibrated only once to compensate for radial distortion.
Person Tracking: Mean Shift Based Color Tracking
• Register New Person: person ID, face histogram, face center (x,y), face width, face height
• “Omega” head and shoulder template
• Additional filtering based on shape and color templates
References:
Henry Schneiderman. “Feature-Centric Evaluation for Efficient Cascaded Object Detection.” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2004.
Henry Schneiderman. “Learning a Restricted Bayesian Network for Object Detection.” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2004.
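The mean shift step of the tracker can be illustrated with a minimal sketch. It assumes the face histogram has already been back-projected onto the frame to give a per-pixel weight map; the function name `mean_shift`, the fixed window size, and the stopping rule are illustrative assumptions, not CAMEO's implementation.

```python
import math

def mean_shift(weights, window, max_iter=20):
    """Move a fixed-size window (x, y, w, h) toward the weighted centroid
    of `weights`, a 2D list of per-pixel face-color likelihoods."""
    rows, cols = len(weights), len(weights[0])
    x, y, w, h = window
    for _ in range(max_iter):
        total = sum_x = sum_y = 0.0
        for r in range(max(y, 0), min(y + h, rows)):
            for c in range(max(x, 0), min(x + w, cols)):
                p = weights[r][c]
                total += p
                sum_x += p * c
                sum_y += p * r
        if total == 0:
            break  # nothing face-colored under the window; keep last position
        # Re-center the window on the centroid (round half up).
        new_x = int(math.floor(sum_x / total - (w - 1) / 2 + 0.5))
        new_y = int(math.floor(sum_y / total - (h - 1) / 2 + 0.5))
        # Keep the window inside the image.
        new_x = min(max(new_x, 0), cols - w)
        new_y = min(max(new_y, 0), rows - h)
        if (new_x, new_y) == (x, y):
            break  # converged: the window no longer moves
        x, y = new_x, new_y
    return x, y, w, h

# Toy weight map: a 3x3 blob of face-colored pixels in a 10x10 image.
frame = [[0.0] * 10 for _ in range(10)]
for r in range(5, 8):
    for c in range(6, 9):
        frame[r][c] = 1.0
tracked = mean_shift(frame, (4, 3, 3, 3))
```

Starting from a window that only partly overlaps the blob, each iteration pulls the window toward the region where the color model responds most strongly.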
Face Recognition: Training
1) Capture visual data
2) Normalize for geometry and illumination
3) Cluster the most discriminating face examples
• Multiple face discrimination
• Real-time performance
Face Recognition: Non-Linear Oriented Discriminant Analysis
• Research challenge: Face subspace is multi-modal
• Project clustered images into a lower-dimensional subspace to speed recognition
• Find transformation matrix B that maximizes the Kullback-Leibler divergence between clusters among classes
• Use Iterative Majorization to approximate B
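For reference, the quantity being maximized can be written out under one common assumption: each cluster i is modeled as a Gaussian with mean \mu_i and covariance \Sigma_i, and B projects d-dimensional images to k dimensions. The slides do not state the cluster model or the exact objective, so the symbols below are illustrative.

```latex
% Projected cluster statistics (assumed Gaussian clusters)
\tilde{\mu}_i = B^{\top}\mu_i, \qquad \tilde{\Sigma}_i = B^{\top}\Sigma_i B

% KL divergence between two projected clusters i and j
\mathrm{KL}_{ij}(B) = \tfrac{1}{2}\Big[
  \operatorname{tr}\big(\tilde{\Sigma}_j^{-1}\tilde{\Sigma}_i\big)
  + (\tilde{\mu}_j - \tilde{\mu}_i)^{\top}\tilde{\Sigma}_j^{-1}(\tilde{\mu}_j - \tilde{\mu}_i)
  - k + \ln\frac{\det\tilde{\Sigma}_j}{\det\tilde{\Sigma}_i}
\Big]

% One possible objective: sum over cluster pairs drawn from different classes
B^{*} = \arg\max_{B} \sum_{(i,j):\, \mathrm{class}(i)\neq \mathrm{class}(j)} \mathrm{KL}_{ij}(B)
```

Because the objective has no closed-form maximizer in B, an iterative scheme such as the iterative majorization named on the slide is needed to approximate it.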
Face Recognition: Results
• Each new face is projected into subspace and compared against the trained examples
• Closest match, via Mahalanobis distance, determines class membership
• 95% recognition rate with training database of 11 subjects
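The closest-match rule can be sketched as follows. This is a simplified illustration, assuming each trained class is summarized by a mean and a diagonal covariance in the subspace; the class names, the two-dimensional features, and the helper names are hypothetical, and a full covariance matrix would need a matrix inverse in place of the per-dimension division.

```python
import math

def mahalanobis_diag(x, mean, var):
    """Mahalanobis distance for a diagonal covariance (per-dimension variances)."""
    return math.sqrt(sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var)))

def classify(x, classes):
    """classes maps a label to (mean, var); return (closest label, distance)."""
    best = min(classes, key=lambda name: mahalanobis_diag(x, *classes[name]))
    return best, mahalanobis_diag(x, *classes[best])

# Hypothetical trained classes in a 2-D subspace.
classes = {
    "person_a": ([0.0, 0.0], [1.0, 1.0]),   # tight cluster
    "person_b": ([4.0, 0.0], [16.0, 1.0]),  # wide spread along the first axis
}
label, dist = classify([1.5, 0.0], classes)
```

The query point is closer to `person_a` in Euclidean terms, but the wide spread of `person_b` along the first axis makes it the closer class under the Mahalanobis distance, which is why the metric suits clusters of unequal shape.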
Inferring Activity from Observation
• Face tracker captures the (x,y) positions of faces in the image over time.
• Person action sequences can be represented as a simple finite state machine.
• State transitions are encoded as a dynamic Bayesian network in an HMM structure.
• Current person state is a function of observed human activity and previous state.
• Global meeting state is inferred from the aggregate of person activity.
[Figure: inferred state from classifier]
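The dependence of the current state on the previous state and the current observation can be sketched as one step of HMM forward filtering. The two states, the discretized face-height observation, and all probabilities below are made-up values for illustration; the slides do not give CAMEO's actual state set or parameters.

```python
def forward_step(belief, obs, trans, emit):
    """One HMM filtering step: predict with the transition model, then
    weight by the likelihood of the observation and renormalize."""
    states = list(belief)
    predicted = {s: sum(belief[p] * trans[p][s] for p in states) for s in states}
    unnorm = {s: predicted[s] * emit[s][obs] for s in states}
    z = sum(unnorm.values())
    if z == 0:
        raise ValueError("observation has zero probability under every state")
    return {s: v / z for s, v in unnorm.items()}

# Hypothetical model: face height in the image is discretized to "low"/"high".
trans = {"sit": {"sit": 0.9, "stand": 0.1},
         "stand": {"sit": 0.1, "stand": 0.9}}
emit = {"sit": {"low": 0.9, "high": 0.1},
        "stand": {"low": 0.2, "high": 0.8}}

belief = {"sit": 0.9, "stand": 0.1}
belief = forward_step(belief, "high", trans, emit)
```

Calling `forward_step` once per frame keeps a running belief over each person's state; a meeting-level state could then be derived from the per-person beliefs.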
Logging/Replay/Towards Learning
• Tracked person data is recorded for off-line activity analysis and learning of dynamics.
• The recorded logs can be replayed through CAMEO.
• Model-based simulation generates high-level state descriptions of group activities.
• Data-based simulation generates low-level “frame-by-frame” individual person activity state descriptions.

bring [carlos, computer]
bring [carlos, cameo]
set_up [carlos, computer]
use [carlos, computer]
set_up [carlos, cameo]
use [carlos, cameo]
give_demo [carlos, face_recognition]
ask_question [fernando, face_recognition]
answer_question [carlos, fernando, face_recognition]
give_demo [raju, tracking]
ask_question [carlos, tracking]
answer_question [raju, carlos, tracking]
give_demo [carlos, face_detection]
ask_question [raju, face_detection]
answer_question [carlos, raju, face_detection]
ask_question [fernando, face_detection]
answer_question [raju, fernando, face_detection]
remove [carlos, computer]
remove [carlos, cameo]
leave [jon]
leave [raju]
leave [fernando]
leave [carlos]
leave [daniel]
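For off-line analysis, the bracketed event lines above can be read into structured records. The sketch below assumes only the `verb [arg, arg, ...]` shape visible in the log; the function name `parse_events` and the tuple layout are illustrative choices, not part of CAMEO.

```python
import re

# An event is a verb followed by a bracketed, comma-separated argument list.
EVENT_RE = re.compile(r"(\w+)\s*\[([^\]]*)\]")

def parse_events(text):
    """Return a list of (verb, [args]) tuples in the order they appear."""
    return [(verb, [arg.strip() for arg in args.split(",")])
            for verb, args in EVENT_RE.findall(text)]

log = """bring [carlos, computer]
give_demo [carlos, face_recognition]
answer_question [carlos, fernando, face_recognition]
leave [jon]"""
events = parse_events(log)
questions_answered = [args for verb, args in events if verb == "answer_question"]
```

Because the pattern does not depend on line breaks, it also handles logs where several events share one line.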