Automated Face Tracking and Recognition

Presentation Transcript

  1. Automated Face Tracking and Recognition • Curt Hesher, Anuj Srivastava, Gordon Erlebacher

  2. Overview • Review of Past Research in Face Tracking and Recognition • Data Acquisition and Representation • Face Tracking Using Images Generated from Geometry • Face Recognition Using Range Images • Conclusions and future work.

  3. A Review of Face Tracking and Recognition • Survey papers • Past research • Commercial implementations • Persistent challenges

  4. Survey Papers • Nonconnectionist (Samal and Iyengar) – Approaches dealing with the relative position of feature points (distance between eyes, corners of the mouth, etc.) derived from certain pixel values • Connectionist (Valentin et al.) – Approaches that derive characteristics from the whole face image (e.g., PCA) • General (Chellappa et al., Barrett, Zhao et al.) – Approaches categorized as neural, statistical, and feature based

  5. Past Research • Start with 2D images • LDA, KDA, PCA, SVM, EBGM • Neural, statistical, feature analysis

  6. Commercial Implementations • Numerous implementations • Statistical, neural, and feature based • Government sponsored tests (FRVT 2000 and 2002) show accuracy between 20% and 90% depending on the environment • Robust face recognition is still unsolved

  7. Persistent Challenges • Variation from pose • Variation from lighting • Occlusions • Poor image quality • Techniques beginning with 2D data have been heavily researched. A new imaging modality should be researched: 3D Imaging

  8. A Novel Approach • Start with 3D data • Use the additional information present in 3D data for tracking and recognition

  9. Data Acquisition and Representation • Minolta Vivid 700 3D scanner • Meshes captured using 3D camera • ½ second capture time • Subject motion avoided • Light independent data capture of geometry

  10. Data Acquisition and Representation • Sample points on the surface of an object and connect them via lines to form a mesh • 200x200 geometry res. • 400x400 texture res. • About 10K points sampled from a face • About 40K pixels sampled from a face

  11. Tracking • Algorithm • Experiment • Conclusions

  12. Algorithm

  13. Algorithm • Segmentation and recognition are not addressed • Mesh is manually chosen • Video is manually chosen (subject is face forward in the first frame and at a reasonable distance from the camera)

  14. Algorithm • Tracking through synthesis • Cost function (C) indicates likeness of estimate (E) to target (T) • Follow the gradient of the cost function to achieve alignment
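
The synthesis loop on this slide can be illustrated with a small sketch. It rests on assumptions that are not in the talk: the cost C is taken to be the sum of squared pixel differences between E and T, the gradient is estimated by central finite differences, and `render` is a hypothetical stand-in that draws a Gaussian blob where the real system would render the face mesh at a given pose. The names `render`, `cost`, `numerical_gradient`, and `track_frame` are illustrative only.

```python
import numpy as np


def render(pose):
    """Hypothetical renderer: a Gaussian blob centred at pose = (x, y).

    In the real system this would synthesise an image of the mesh at the pose.
    """
    x, y = pose
    yy, xx = np.mgrid[0:32, 0:32]
    return np.exp(-((xx - x) ** 2 + (yy - y) ** 2) / (2 * 4.0 ** 2))


def cost(pose, target):
    """C: sum of squared differences between estimate E and target T (assumed form)."""
    return float(np.sum((render(pose) - target) ** 2))


def numerical_gradient(pose, target, eps=1e-3):
    """Central finite-difference estimate of dC/dpose."""
    grad = np.zeros_like(pose)
    for i in range(pose.size):
        step = np.zeros_like(pose)
        step[i] = eps
        grad[i] = (cost(pose + step, target) - cost(pose - step, target)) / (2 * eps)
    return grad


def track_frame(initial_pose, target, step_size=0.05, iterations=200):
    """Follow the negative gradient of C until the estimate aligns with the target."""
    pose = np.asarray(initial_pose, dtype=float).copy()
    for _ in range(iterations):
        pose -= step_size * numerical_gradient(pose, target)
    return pose


# One frame: the target is the blob at (18, 14); tracking starts from (16, 16).
target = render(np.array([18.0, 14.0]))
estimate = track_frame([16.0, 16.0], target)
```

For a video, the pose recovered for one frame would seed the search in the next. The step size and iteration count here are hand-picked for the toy problem, much as the slides describe the tracking parameters being chosen manually.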

  15. Experiment • Synthetic and real target video • Synthetic target initially used to avoid nuisance variables (e.g., lighting, noise) • Parameters for tracking are chosen manually and refined by observation • (add video tracking example) • Successfully tracks around 20 to 50 frames before failing

  16. Experiment • Successfully tracks around 20 to 50 frames before failing

  17. Conclusions • Does not handle background clutter • Does not handle lighting variations • Computationally expensive

  18. Principal Component Analysis of Range Images for Face Recognition

  19. Facial Identification • Many current modalities of investigation (intra-feature distance, geometrical parameterization, reflectance) • Outstanding issues in previous modalities (reflectance, orientation) • New modality: range imaging

  20. What are Range Images • Range Images are generated from a mesh • Meshes captured using Minolta Vivid 700 3D camera

  21. Data Collected • 115 persons • 6 facial expressions per person • 690 3D facial images • Subset of 37 persons under 6 expressions used in current experiment • Some manual correction to data (hole patching)

  22. Range Image Generation • Traverse each triangle in the mesh • Orthographically project depth values onto the range image plane
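
A minimal sketch of the two steps on this slide, under stated assumptions: x and y are scaled uniformly onto the pixel grid, depth inside each triangle is interpolated with barycentric weights, and where triangles overlap the larger z is kept (that is, larger z is assumed to be nearer the viewer). None of these choices is given in the talk, and `range_image` is an illustrative name.

```python
import numpy as np


def range_image(vertices, triangles, size=64):
    """Orthographically project a triangle mesh onto a size x size depth image.

    vertices: (N, 3) array of x, y, z; triangles: (M, 3) array of vertex indices.
    Pixels not covered by any triangle stay NaN.
    """
    vertices = np.asarray(vertices, dtype=float)
    triangles = np.asarray(triangles, dtype=int)
    image = np.full((size, size), np.nan)

    # Orthographic projection: drop z and scale x, y into [0, size - 1].
    xy = vertices[:, :2]
    lo, hi = xy.min(axis=0), xy.max(axis=0)
    px = (xy - lo) * (size - 1) / np.max(hi - lo)
    z = vertices[:, 2]

    for tri in triangles:
        (x0, y0), (x1, y1), (x2, y2) = px[tri]
        area = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)
        if abs(area) < 1e-12:
            continue  # skip degenerate triangles
        c_min = max(int(np.floor(min(x0, x1, x2))), 0)
        c_max = min(int(np.ceil(max(x0, x1, x2))), size - 1)
        r_min = max(int(np.floor(min(y0, y1, y2))), 0)
        r_max = min(int(np.ceil(max(y0, y1, y2))), size - 1)
        for r in range(r_min, r_max + 1):
            for c in range(c_min, c_max + 1):
                # Barycentric weights of pixel (c, r) inside the triangle.
                w1 = ((c - x0) * (y2 - y0) - (x2 - x0) * (r - y0)) / area
                w2 = ((x1 - x0) * (r - y0) - (c - x0) * (y1 - y0)) / area
                w0 = 1.0 - w1 - w2
                if min(w0, w1, w2) < -1e-9:
                    continue  # pixel lies outside this triangle
                depth = w0 * z[tri[0]] + w1 * z[tri[1]] + w2 * z[tri[2]]
                if np.isnan(image[r, c]) or depth > image[r, c]:
                    image[r, c] = depth  # keep the nearest surface (assumed)
    return image


# A unit square tilted so that depth equals x, split into two triangles.
square = [[0, 0, 0], [1, 0, 1], [1, 1, 1], [0, 1, 0]]
faces = [[0, 1, 2], [0, 2, 3]]
depth_map = range_image(square, faces, size=8)
```

The pixel loops are written for clarity, not speed; a mesh of roughly 10K points would call for a vectorised or hardware rasteriser.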

  23. Range Image Registration Automatic Preprocessing • Orientation – rotation in the image plane • Translation – translation in the image plane • Depth – translation perpendicular to the image plane

  24. Recognition using Range Images • Training data – a subset of the experimental data set is used to learn the variability in facial range images • Testing data – remaining faces used in attempted recognition • Dimension reduction – Principal Component Analysis (PCA) used to reduce facial range images to 10-dimensional vectors
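
The dimension-reduction step can be sketched as below, assuming each registered range image is flattened to a vector and PCA is computed from the singular value decomposition of the mean-centred training matrix. The function names and the random stand-in data are illustrative; the talk does not say how the decomposition was computed.

```python
import numpy as np


def fit_pca(train_images, n_components=10):
    """Learn the mean face and the leading eigenvectors from training images."""
    X = train_images.reshape(len(train_images), -1).astype(float)
    mean = X.mean(axis=0)
    # Rows of vt are eigenvectors of the sample covariance, ordered by
    # decreasing eigenvalue; s**2 / (n - 1) gives the eigenvalues themselves.
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    eigenvalues = s ** 2 / (len(X) - 1)
    return mean, vt[:n_components], eigenvalues


def project(images, mean, components):
    """Reduce each image to its coefficients in the dominant subspace."""
    X = images.reshape(len(images), -1).astype(float)
    return (X - mean) @ components.T


# Stand-in data: 30 random 16 x 16 "range images" (not the authors' data).
rng = np.random.default_rng(0)
images = rng.normal(size=(30, 16, 16))
mean, components, eigenvalues = fit_pca(images)
coeffs = project(images, mean, components)  # one 10-dimensional vector per image
```

Test images are projected with the same `mean` and `components` learned from the training set, so the two sets share one coordinate system.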

  25. Dimension Reduction • Twenty largest eigenvalues (above) • Three eigenvectors corresponding to the three largest eigenvalues (right)

  26. Testing: Nearest Neighbor Algorithm • Use the Euclidean distance between coefficients (projection of the image in dominant subspace – first ten eigenvectors) • Nearest neighbor (image from training set with most similar projection) chosen as match
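
A small sketch of the matching rule, assuming the coefficient vectors have already been computed and each training vector carries an identity label. The two-dimensional toy coefficients and the names below are illustrative only; the slides use ten coefficients per image.

```python
import numpy as np


def nearest_neighbor(test_coeffs, train_coeffs, train_labels):
    """Match each test vector to the training vector at the smallest Euclidean distance."""
    test_coeffs = np.asarray(test_coeffs, dtype=float)
    train_coeffs = np.asarray(train_coeffs, dtype=float)
    train_labels = np.asarray(train_labels)
    # distances[i, j] = Euclidean distance from test vector i to training vector j.
    diffs = test_coeffs[:, None, :] - train_coeffs[None, :, :]
    distances = np.linalg.norm(diffs, axis=2)
    nearest = distances.argmin(axis=1)
    return train_labels[nearest], distances[np.arange(len(nearest)), nearest]


# Toy example with 2-dimensional coefficients for three enrolled subjects.
train = [[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]]
labels = ["subject_a", "subject_b", "subject_c"]
matches, match_distances = nearest_neighbor([[1.0, 1.0], [9.0, 1.0]], train, labels)
```

Returning the distance alongside the label leaves room for rejecting weak matches with a threshold, which the slides do not discuss.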

  27. Identification Results • Correct identification

  28. Identification Results • Incorrect identification

  29. Identification Results • Incorrect identification

  30. Identification Results Training Faces

  31. Future Research • Other projection techniques (Fisher Discrimination Method) • Joint recognition using range and texture images

