370 likes | 657 Views
Face Synthesis. M. L. Gavrilova. Outline. Face Synthesis From Modeling to Synthesis Facial Expression Synthesis Conclusions. Face Synthesis. How to synthesize photorealistic images of human faces has been a fascinating yet difficult problem in computer graphics.
E N D
Face Synthesis M. L. Gavrilova
Outline • Face Synthesis • From Modeling to Synthesis • Facial Expression Synthesis • Conclusions
Face Synthesis How to synthesize photorealistic images of human faces has been a fascinating yet difficult problem in computer graphics. Here, the term “face synthesis” refers to synthesis of still images as well as synthesis of facial animations. For example, the technique of synthesizing facial expression images can be directly used for generating facial animations, and most of the facial animation systems involve the synthesis of still images.
Face Synthesis Face synthesis has many interesting applications. • In the film industry, people would like to create virtual human characters that are indistinguishable from the real ones. • In games, people have been trying to create human characters that are interactive and realistic. • There are commercially available products that allow people to create realistic looking avatars that can be used in chatting rooms, e-mails, greeting cards, and teleconferencing. • Many human-machine dialog systems use realistic-looking human faces as visual representation of the computer agent that interacts with the human user.
Face Modeling from an Image Sequence Face modeling: image matching, structure from motion, and model fitting. First, two or three relatively frontal views are selected, and some image matching algorithms are used to compute point correspondences. Point correspondences are computed either by using dense matching techniques such as optimal flow or feature-based corner matching. Second, one needs to compute the head motion and the 3D structures of the tracked points. Finally, a face model is fitted to the reconstructed 3D points. People have used different types of face model representations including parametric surfaces, linear face scans, and linear deformation vectors.
Face Modeling from an Image Sequence Liu et al. developed a face modeling system that allows an untrained user with a personal computer and an ordinary video camera to create an instantly animate his or her face model. After the matching is done, they used both the corner points from the image matching and the five feature points clicked by the user to estimate the camera motion.
Face Modeling from an Image Sequence Shan et al. proposed an algorithm, called model-based bundle adjustment, that combines the motion estimation and model fitting into a single formulation. Their main idea was to directly use the model space as a search space.
Face Modeling from an Image Sequence On the top are the front views, and on the bottom are the side views. On each row, the one in the middle is the ground truth, on the left is the result from the traditional bundle adjustment, and on the right is the result from the model-based bundle adjustment. The result of the model-based bundle adjustment is much closer to the ground truth mesh.
Face Modeling from 2 Orthogonal Views A number of researchers have proposed that we create face models from two orthogonal views: one frontal view and one side view. The frontal view provides the information relative to the horizontal and vertical axis, and the side view provides depth information. The user needs to manually mark a number of feature points on both images. The feature points are typically the points around the face features, including eyebrows, eyes, nose and mouth. The more feature points, the better the model, but one needs to balance between the amount of manual work required from the user and the quality of the model.
Face Modeling from a Single Image Liu developed a fully automatic system to construct 3D face models from a single frontal image. They first used a face detection algorithm to find a face and then a feature alignment algorithm to find face features. By assuming an orthogonal projection, they fit a 3D face model by using the linear space of face geometries. Given that there are existing face detection and feature alignment systems, implementing this system is simple. The main drawback of this system is that the depth of the reconstructed model is in general not accurate. For small head rotations, however, the model is recognizable.
Example of model generation Figure (top) shows an example where the left is the input image and the right is the feature alignment result. Figure (middle) shows the different views of the reconstructed 3D model. Figure (bottom) shows the results of making expressions for the reconstructed face model.
Outline • Face Synthesis • Face Modeling • Facial Expression Synthesis • Conclusions
Facial Expression Synthesis Physically Based Facial Expression Synthesis • One of the early physically based approaches is the work by Badler and Platt, who used a mass and spring model to simulate the skin. They introduced a set of muscles. Each muscle is attached to a number of vertices of the skin mesh. When the muscle contracts, it generates forces on the skin vertices, thereby deforming the skin mesh. A user generates facial expressions by controlling the muscle actions. • Waters introduced two types of muscles: linear and sphincter. The lips and eye regions are better modeled by the sphincter muscles. To gain better control, they defined an influence zone for each muscle so the influence of a muscle diminishes as the vertices are farther away from the muscle attachment point. Morph_Based Facial Expression Synthesis • Given a set of 2D or 3D expressions, one could blend these expressions to generate new expressions. This technique is called morphing or interpolation. This technique was first reported in Parke’s pioneer work. Beier and Neely developed a feature-based image morphing technique to blend 2D images of facial expressions. Bregler et al. applied the morphing technique to mouth regions to generate lip-synch animations.
Facial Expression Synthesis Expression Mapping • Expression mapping (also called performance-driven animation) has been a popular technique for generating realistic facial expressions. This technique applies to both 2D and 3D cases. Given an image of a person’s neutral face and another image of the same person’s face with an expression, the positions of the face features (e.g. eyes, eyebrows, mouths) on both images are located either manually or through some automated method. • Noh and Neumann developed a technique to automatically find a correspondence between two face meshes based on a small number of user-specified correspondences. They also developed a new motion mapping technique. Instead of directly mapping the vertex difference, this technique adjusts both the direction and magnitude of the motion vector based on the local geometries of the source and target model. • Liu et al. proposed a technique to map one person’s facial expression details to a different person. Facial expression details are subtle changes in illumination and appearance due to skin deformations. The expression details are important visual cues, but they are difficult to model.
Facial Expression Synthesis Expression ratio image. Left: neutral face. Middle: expression face. Right: expression Ratio image. The ratios of the RGB components are converted to colors for display purpose. Mapping a smile to Mona Lisa’s face. Left: neutral face. Middle: result from geometric warping. Right: result from ERI.
Geometry-Driven Expression Synthesis Mapping expressions to statues. A. Left: original statue. Right: result from ERI. B. Left: another statue. Right: result from ERI (Z. Zhang, MSR).
Geometry-Driven Expression Synthesis To increase the space of all possible expressions, the face is subdivided into a number of subregions. For each subregion, the geometry associated with the subregion is used to compute the subregion texture image. The final expression is then obtained by blending these subregion images together. The figure is an overview of the system. Geometry-driven expression synthesis system.
Geometry-Driven Expression Synthesis The function MotionPropagationFeaturePointSet is defined as follows:
Geometry-Driven Expression Synthesis a. Feature points. b. Face region subdivision
Geometry-Driven Expression Synthesis Example images of the male subject.
Geometry-Driven Expression Synthesis In addition to expression mapping, Zhang et al. applied their techniques to expression editing. They developed an interactive expression editing system that allows a user to drag a face feature point, and the system interactively displays the resulting image with expression details. The figure shows some of the expressions generated by the expression editing system. Expressions generated by the expression editing system.
3D synthesis Generic face mesh Placement of Control Points Sibson Coordinates Computation Deformation With Control points Target pictures
System Description • Placement of Control Points: • Very important for the final synthesis result; • Only the vertices inside are deformable • How many points? • Where to put those points? • Important features must be controlled in a specific way (such as the corners of the eyes, the nose, the mouth etc..) • Include as many facial vertices as possible • Acceptable to be pointed out manually • Computational expenses
System Description 18 control points: the fewest number to indicate those importance features, such as: eyes, eyebrows, nose, mouth etc..
System Description 28 morphing zones: Advantages: increase computational efficiency; local morphing effects: preventing incorrect deformation effects from other areas;
System Description What we know? The original 3D control points position in the generic face mesh What we get? The displacement of these control points
System Description Sibson Coordinates Computation: DFFD displacement relation: In order to strengthen or alleviate the displacement effects from different Sibson neighbors, different weights wi to Sibson coordinates. Final Weighted DFFD relation is:
System Description 2D Sibson Coordinates Computation: 3D Sibson Coordinates Computation:
System Description Deformation With Control points:
Synthesis Results Implement in Matlab Input images, generic face mesh, texture all from: FaceGen platform Generic face mesh: 7177 vertices, 6179 facets Input Image size: 400x477 pixels, 96dpi Experiments on Intel Pentium 4 CPU: 2.0 GHz with 256MB of RAM: 180s
Conclusions for Face Synthesis • One problem is how to generate face models with fine geometric details. Many 3D face modeling techniques use some type of model space to constrain the search, thereby improving the robustness. The resulting face models in general do not have the geometric details, such as creases and wrinkles. Geometric details are important visual cues for human perception. With geometric details, the models look more realistic; and for personalized face models, they look more recognizable to human users. Geometric details can potentially improve computer face recognition performance as well. • Another problem is how to handle non-Lambertian reflections. The reflection of human face skin is approximately specular when the angle between the view direction and lighting direction is close to 900.Therefore, given any face image, it is likely that there are some points on the face whose reflection is not Lambertian. It is desirable to identify the non-Lambertian reflections and use different techniques for them during relighting. • How to handle facial expressions in face modeling and face relighting is another interesting problem. Can we reconstruct 3D face models from expression images? One would need a way to identify and undo the skin deformations caused by the expression. To apply face relighting techniques on expression face images, we would need to know the 3d geometry of the expression face to generate correct illumination for the areas with strong deformations.
Future Work • Conclusion: A method for 3D facial model synthesis is proposed. With two orthogonal views of an individual face image as the input, 18 feature points are defined on the two images. Then, a Voronoi based interpolation technique, DFFDs, is utilized to deform a generic face mesh to fit the input face images. With the synthesized facial models, the same animation technique can be used to generate individual facial animation. • Future work: 1. Increase control points number, automate feature points extraction: image segmentation, edge detection techniques etc.. 2. Analyze real facial expression video data, construct a common facial expression database to drive the animation
References • S. Z. Li and A. K. Jain. Handbook of Face recognition, 2005 • Y.-L. Tian, T. Kanade, and J. Cohn. Evaluation of gabor-wavelet-based facial action unit recognition in image sequences of increasing complexity. In Proc. Of IEEE Int. Conf. on automatic face and gesture recognition, 2002 • Z. Wen and T. Huang. Capturing subtle facial motions in 3D face tracking. In Proc. Of Int. Conf. On Computer Vision, 2003 • J. Xiao, T. Kanade, and J. Cohn. Robust full motion recovery of head dynamic templates and registration techniques. In Proc. Of Int. Conf. On automatic face and gesture recognition, 2002 • Z. Liu. A fully automatic system to model faces from a single image. Microsoft research technical report, 2003 • Z. Liu, Y. Shan, and Z. Zhang. Expressive expression mapping with ratio images. In Siggraph 2001 • Q. Zhang, Z. Liu, B. Guo, and H. Shum. Geometry-driven photorealistic facial expression synthesis. In SCA 2003.