
Digital Image Processing

Digital Image Processing, Chapter 16: Face Detection. Prepared by Eng. Mohamed Hassan; supervised by Dr. Ashraf Aboshosha. http://www.icgst.com/A_Aboshosha.html, editor@icgst.com, Tel.: 0020-122-1804952, Fax: 0020-2-24115475.


Presentation Transcript


  1. Digital Image Processing Chapter 16: Face Detection Prepared by: Eng. Mohamed Hassan Supervised by: Dr. Ashraf Aboshosha http://www.icgst.com/A_Aboshosha.html editor@icgst.com Tel.: 0020-122-1804952 Fax.: 0020-2-24115475

  2. The face detection techniques
  • Feature-Based Approach
    • Uses skin color and face geometry
    • Detection is accomplished using distances, angles and areas of visual features
  • Image-Based Approach
    • Treats detection as a general pattern-recognition problem

  3. The face detection techniques
  • Feature-Based Approach
    • Low-Level Analysis: segmentation of visual features
    • Feature Analysis: organizing the features into
      1. a global concept
      2. facial features
    • Active Shape Models: extracting complex, non-rigid features, e.g. eye pupils, lip tracking

  4. Low-Level Analysis: Segmentation of visual features
  • Edges (the most primitive feature)
    • Trace the outline of a human head
    • Provide information about the shape and position of the face
  • Edge operators
    • Sobel
    • Marr-Hildreth
    • First and second derivatives of Gaussians
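As a sketch of the edge-operator idea, here is the Sobel operator applied to a tiny synthetic image with a vertical step edge (the image, helper names, and the naive convolution are mine; only the Sobel masks themselves come from the literature):

```python
import numpy as np

KX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)  # horizontal gradient mask
KY = KX.T                                                          # vertical gradient mask

def convolve2d(img, k):
    """Naive 'valid' sliding-window correlation, as edge detectors use it."""
    h, w = img.shape
    kh, kw = k.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(img[y:y + kh, x:x + kw] * k)
    return out

def sobel_magnitude(img):
    gx = convolve2d(img, KX)
    gy = convolve2d(img, KY)
    return np.hypot(gx, gy)

# A vertical step edge: the gradient magnitude peaks on the edge columns.
step = np.zeros((5, 6))
step[:, 3:] = 1.0
mag = sobel_magnitude(step)
```

The response is zero in flat regions and large where intensity changes, which is exactly the property used to trace a head outline.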

  5. Low-Level Analysis: Segmentation of visual features
  • Steerable filtering
    1. Detect edges
    2. Determine the orientation
    3. Track the neighboring edges
  • An edge-detection system
    1. Labels the edges
    2. Matches them to a face model
    3. Applies the golden ratio (ideal face proportions)

  6. Low-Level Analysis: Segmentation of visual features
  • Gray-level information
    • Facial features (eyebrows, pupils, ...) generally appear darker than the surrounding region
  • Applications
    • Searching for an eye pair
    • Finding bright pixels (nose tips)
  • Mosaic (pyramid) images

  7. Segmentation of visual features: Color-Based Segmentation
  • Color information
    • Do different races matter?
    • Skin colors of different races form a tight cluster in color space
  • Color models
    • Normalized RGB colors
      • A color histogram for the face is built by comparing each pixel's normalized r and g values
      • Why normalized? To factor out brightness changes
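The normalization step can be sketched in a few lines (function name and sample values are mine): dividing by R+G+B removes the overall brightness, so a dim and a bright view of the same skin map to the same (r, g) point.

```python
def normalized_rg(R, G, B):
    """Normalized (r, g) chromaticity: r = R/(R+G+B), g = G/(R+G+B)."""
    s = R + G + B
    if s == 0:
        return 0.0, 0.0
    return R / s, G / s

# Doubling the illumination leaves (r, g) unchanged:
rg_dim = normalized_rg(110, 70, 40)
rg_bright = normalized_rg(220, 140, 80)
```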

  8. Low-Level Analysis: Segmentation of visual features
  • HSI color model
    • Large variance among facial feature clusters [106]
    • Used to extract lips, eyes, and eyebrows
    • Also used in face segmentation
  • YIQ
    • Colors ranging from orange to cyan
    • Enhances the skin region of Asian faces [29]
  • Other color models
    • HSV, YES, CIE-xyz, ...
    • Comparative study of color spaces [Terrillon 188]

  9. Low-Level Analysis: Segmentation of visual features
  • Color segmentation using color thresholds
  • Skin color is modeled through
    • Histograms or charts (simple)
    • Statistical measures (complex), e.g. the skin color cluster can be represented as a Gaussian distribution [215]
  • Advantages of the statistical color model
    • The model can be updated
    • More robust against changes in the environment
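A minimal sketch of the Gaussian skin model (the sample chromaticities and the threshold are made up for illustration): fit a mean and covariance to skin samples, then classify a pixel by its Mahalanobis distance. Updating the model is just re-estimating the mean and covariance from new samples, which is why the statistical model adapts to environment changes.

```python
import numpy as np

# Made-up (r, g) chromaticities of labelled skin pixels
skin_rg = np.array([[0.46, 0.30], [0.44, 0.31], [0.45, 0.29],
                    [0.47, 0.31], [0.43, 0.30], [0.46, 0.32]])

mean = skin_rg.mean(axis=0)
cov = np.cov(skin_rg, rowvar=False)
cov_inv = np.linalg.inv(cov)

def mahalanobis2(x):
    """Squared Mahalanobis distance of x from the skin cluster."""
    d = x - mean
    return float(d @ cov_inv @ d)

def is_skin(rg, thresh=9.0):   # thresh ~ 3 sigma; an assumption
    return mahalanobis2(np.asarray(rg)) < thresh
```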

  10. Low-Level Analysis: Segmentation of visual features
  • The disadvantage:
    • Not robust under varying lighting conditions

  11. Color based segmentation: Skin model construction (Example) The original image was taken from http://nn.csie.nctu.edu.tw/face-detection/ppframe.htm

  12. Color based segmentation: Skin model construction (Example) The original image was taken from http://nn.csie.nctu.edu.tw/face-detection/ppframe.htm

  13. Low-Level Analysis: Segmentation of visual features
  • Motion information
    • A face is almost always moving
    • Disadvantage: other objects may also be moving in the background
  • Four steps for detection
    1. Frame differencing
    2. Thresholding
    3. Noise removal
    4. Locating the face
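The four steps can be sketched on two synthetic frames (the frames, threshold, and helper names are made up; step 3 is only noted, since the toy mask is already clean):

```python
import numpy as np

def moving_mask(prev, curr, thresh=20):
    diff = np.abs(curr.astype(int) - prev.astype(int))   # 1. frame differencing
    mask = diff > thresh                                 # 2. thresholding
    # 3. noise removal would go here (e.g. a morphological opening)
    return mask

def bounding_box(mask):                                  # 4. locate the face
    ys, xs = np.nonzero(mask)
    return int(ys.min()), int(ys.max()), int(xs.min()), int(xs.max())

prev = np.zeros((8, 8), dtype=np.uint8)
curr = prev.copy()
curr[2:5, 3:6] = 200        # a bright "face" region appears in the new frame
mask = moving_mask(prev, curr)
box = bounding_box(mask)    # rows 2-4, cols 3-5
```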

  14. Motion-Based Segmentation
  • Motion estimation
    • People are always moving
    • Used for focusing attention: discard the cluttered, static background
  • A spatio-temporal Gaussian filter can be used to detect moving boundaries of faces

  15. Motion-Based Segmentation

  16. The face detection techniques
  • Image-Based Approach
    • Linear Subspace Methods
    • Neural Networks
    • Statistical Approaches

  17. The face detection techniques
  • Face interface: face detection followed by face recognition
    • Face detection locates faces; face recognition matches them against a face database
    • Output: Mr. Chan, Prof. Cheng

  18. Face detection
  • Goal: detect faces in an image (not yet recognize them)
  • Challenges
    • A picture may contain 0, 1 or many faces
    • Faces are not all the same: spectacles, mustaches, etc.
    • Sizes of faces vary
  • Available in most digital cameras nowadays
  • The simple method
    • Slide a window across the image and test each window for a face
    • Too slow: pictures have too many pixels (1280x1024 = 1.3M pixels)

  19. Evaluation of face detection
  • Detection rate
    • Positive results in locations where faces exist
    • Should be high, > 95%
  • False positive rate
    • The detector output is positive but wrong (there is actually no face)
    • Definition of a false positive: a result that is erroneously positive when the situation is normal. Example: a test designed to detect cancer of the toenail is positive, but the person does not have toenail cancer. (http://www.medterms.com/script/main/art.asp?articlekey=3377)
    • Should be low, < 10^-6
  • A good system has a high detection rate and a low false positive rate

  20. Exercise
  • What are the detection rate and false detection rate here?
  • There are 9 faces in the picture; 8 are correctly detected, and 1 window reported as a face is in fact not a face.
  • Answer
    • Detection rate = (8/9) * 100%
    • False detection rate = (1/9) * 100%
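The exercise's arithmetic can be checked with a short sketch (function and variable names are mine; note the slide measures the false-detection rate against the number of true faces):

```python
def rates(n_faces, n_detected, n_false):
    """Detection rate and false-detection rate, both relative to the
    number of true faces, following the slide's convention."""
    return n_detected / n_faces, n_false / n_faces

# 9 faces, 8 detected, 1 spurious window reported as a face
det, fp = rates(n_faces=9, n_detected=8, n_false=1)
```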

  21. The Viola and Jones method
  • The most famous method
  • Training may take weeks; recognition is very fast, e.g. real time in digital cameras
  • Techniques
    • Integral image for feature extraction
    • AdaBoost for feature selection
    • Attentional cascade for fast rejection of non-face sub-windows

  22. Image Features [3]
  • "Rectangle filters"
  • Rectangle_Feature_value f = ∑(pixels in white area) - ∑(pixels in shaded area)

  23. Exercise
  • Find the Rectangle_Feature_value (f) of the box enclosed by the dotted line
  • f = ∑(pixels in white area) - ∑(pixels in shaded area)
      = (8+7) - (0+1) = 15 - 1 = 14
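The exercise's feature value can be reproduced directly on a toy 2x2 patch holding the four values above (the assignment of the left column to the white area and the right column to the shaded area is my assumption about the figure):

```python
import numpy as np

patch = np.array([[8, 0],
                  [7, 1]])
white = patch[:, 0]    # assumed white rectangle
shaded = patch[:, 1]   # assumed shaded rectangle
f = int(white.sum() - shaded.sum())   # (8+7) - (0+1) = 14
```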

  24. Example: a simple face detection method using one feature
  • f = ∑(pixels in white area) - ∑(pixels in shaded area)
  • If f is large, it is a face, i.e.
    • if f > threshold, then face, else non-face
  • Result: for a face, the eye region is dark and the nose region is bright, so f is large, hence it is a face; for a non-face, f is small

  25. How to find features faster: integral images, a fast calculation method [Lazebnik09]
  • The integral image at (x, y) = sum of all pixel values above and to the left of (x, y)
  • Can be computed very quickly

  26. Computing the integral image [Lazebnik09 ]

  27. Computing the integral image [Lazebnik09]
  • Cumulative row sum: s(x, y) = s(x-1, y) + i(x, y)
  • Integral image: ii(x, y) = ii(x, y-1) + s(x, y)
  • MATLAB: ii = cumsum(cumsum(double(i)), 2);
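The same two recurrences in Python/NumPy, as a translation of the slide's MATLAB one-liner (function name is mine; two cumulative sums, one per axis, compose to give the integral image):

```python
import numpy as np

def integral_image(i):
    """ii(x, y) = sum of i over all pixels above and to the left of (x, y)."""
    i = np.asarray(i, dtype=float)
    s = np.cumsum(i, axis=1)    # cumulative row sum s(x, y)
    ii = np.cumsum(s, axis=0)   # ii(x, y) = ii(x, y-1) + s(x, y)
    return ii

img = np.array([[1, 2],
                [3, 4]])
ii = integral_image(img)        # [[1, 3], [4, 10]]
```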

  28. Calculate the sum within a rectangle
  • A, B, C, D are the values of the integral image at the corners of the rectangle R
  • The sum of image values inside R is: Area_R = A - B - C + D
  • If A, B, C, D have been found, only 3 additions are needed to find Area_R
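The corner identity can be checked on a small image (the index convention below, rows y0..y1 and columns x0..x1 inclusive, is mine; the four ii lookups play the roles of the slide's A, B, C, D):

```python
import numpy as np

def rect_sum(ii, y0, y1, x0, x1):
    """Sum of pixels in rows y0..y1, cols x0..x1 from 4 integral-image lookups."""
    total = ii[y1, x1]                       # A
    if y0 > 0:
        total -= ii[y0 - 1, x1]              # B
    if x0 > 0:
        total -= ii[y1, x0 - 1]              # C
    if y0 > 0 and x0 > 0:
        total += ii[y0 - 1, x0 - 1]          # D
    return total

img = np.arange(1, 17).reshape(4, 4)         # values 1..16
ii = np.cumsum(np.cumsum(img, axis=1), axis=0)
s = rect_sum(ii, 1, 2, 1, 2)                 # 6 + 7 + 10 + 11 = 34
```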

  29. Why do we need the pixel sums of rectangles? Answer: to build face features
  • You may consider these as face features:
    • Two eyes = Area_A - Area_B
    • Nose = Area_C + Area_E - Area_D
    • Mouth = Area_F + Area_H - Area_G
  • They can have different sizes, polarities and aspect ratios

  30. Face feature and example
  • F = Feat_val = pixel sum in white area - pixel sum in shaded area (computed from the integral image)
  • Example
    • Pixel sum in white area = 216+102+78+129+210+111 = 846
    • Pixel sum in shaded area = 10+20+4+7+45+7 = 93
    • Feat_val = F = 846 - 93 = 753
  • If F > threshold, feature = +1; else feature = -1
  • If we choose a threshold of, say, 700, then F = 753 > 700, so the feature is +1: a face

  31. Exercise 1
  • Definition: Area at X = pixel sum of the area from the top-left corner to X = Area_X
  • Find the feature output of this image
    • Area_D = 1
    • Area_B = 1+2+3 = 6
    • Area_C = 1+3 = 4
    • Area_A = 1+2+3+3+4+6 = 19
    • Area_E = ? 1+3+5 = 9
    • Area_F = ? 1+2+3+3+4+6+5+2+4 = 30
    • Pixel sum of the area inside box ABDC = Area_A - Area_B - Area_C + Area_D = ? 19-6-4+1 = 10
    • Pixel sum of the area inside box EFAC = ? Area_F - Area_A - Area_E + Area_C = 30-19-9+4 = 6
    • Feature of EFBD = (white area - shaded area) = ?

  32. Five basic types of features for white_area - gray_area
  • Type) rows x columns
    • Type 1) 1x2
    • Type 2) 2x1
    • Type 3) 1x3
    • Type 4) 3x1
    • Type 5) 2x2
  • Each basic type can have different sizes and aspect ratios

  33. Feature selection [Lazebnik09]
  • For a 24x24 detection region, the number of possible rectangle features is ~160,000!
  • Exercise: name the type (1, 2, 3, 4 or 5) of each feature in the left figure; fill in the types for the 2nd and 3rd rows
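The ~160,000 figure can be checked by brute-force counting. The sketch below enumerates every integer scaling and position of the 5 base shapes inside a 24x24 window; this convention reproduces the 162,336 total quoted on a later slide (other counting conventions give different totals):

```python
def count_features(n, shapes=((1, 2), (2, 1), (1, 3), (3, 1), (2, 2))):
    """Count all scaled, shifted copies of the base shapes in an n x n window."""
    total = 0
    for h, w in shapes:                           # base shape: h rows x w cols
        for sh in range(1, n // h + 1):           # vertical scale factor
            for sw in range(1, n // w + 1):       # horizontal scale factor
                fh, fw = h * sh, w * sw           # scaled feature size
                total += (n - fh + 1) * (n - fw + 1)   # valid positions
    return total

n24 = count_features(24)   # 162336
```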

  34. Exercise 2
  • Still keeping the 5 basic feature types (1, 2, 3, 4, 5)
  • Find the number of features for a resolution of 36x36 windows
  • Answer: 704004; explain your answer.

  35. The detection challenge
  • Use a 24x24 base window
  • For y = 1; y <= 1024; y++
      For x = 1; x <= 1280; x++ {
        Set (x, y) = the top-left corner of the 24x24 sub-window
        For this 24x24 sub-window, extract the 162,336 features and see whether they combine to form a face or not
      }
  • Conclusion: too slow

  36. Solution to make it efficient
  • The whole 162,336-feature set is too large
  • Solution: select good features to make it more efficient
  • Use "boosting"
    • Combine many small weak classifiers into a strong classifier
    • Training is needed

  37. Boosting for face detection
  • Define weak learners based on rectangle features:
    h_t(x) = 1 if p_t f_t(x) < p_t θ_t, else 0
    where x is a 24x24 sub-window, f_t is the value of a rectangle feature, θ_t is a threshold, and p_t is a polarity in {+1, -1}
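The weak learner above can be written directly (function and argument names are mine; the inequality is the standard Viola-Jones form). The polarity decides on which side of the threshold the classifier fires:

```python
def weak_classifier(f_value, theta, polarity):
    """h(x) = 1 if p * f(x) < p * theta else 0."""
    return 1 if polarity * f_value < polarity * theta else 0

a = weak_classifier(5.0, 10.0, +1)    # fires: feature value below threshold
b = weak_classifier(5.0, 10.0, -1)    # does not fire
c = weak_classifier(15.0, 10.0, -1)   # fires: feature value above threshold
```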

  38. Face detection using AdaBoost
  • Training
    • E.g. collect 5000 faces and 9400 non-faces, at different scales
    • Use AdaBoost for training to build a strong classifier
    • Pick suitable features of different scales and positions; keep the best few (takes months; details are in the [Viola 2004] paper)
  • Testing
    • Scan through the image, pick a window and rescale it to 24x24
    • Pass it to the strong classifier for detection
    • Report a face if the output is positive
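A minimal, self-contained sketch of the AdaBoost loop on made-up 1-D "feature values" (toy data and three rounds, not the real 5000-face training set described above): each round picks the decision stump with the lowest weighted error, then re-weights so misclassified samples count more.

```python
import math

X = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]     # toy feature values
Y = [1, 1, 1, 0, 0, 0]                 # 1 = face, 0 = non-face

def stump_error(theta, p, w):
    """Weighted error and predictions of the stump (theta, p)."""
    preds = [1 if p * x < p * theta else 0 for x in X]
    err = sum(wi for wi, pr, y in zip(w, preds, Y) if pr != y)
    return err, preds

def train(rounds=3):
    w = [1.0 / len(X)] * len(X)        # uniform initial sample weights
    clf = []                           # list of (theta, p, alpha)
    for _ in range(rounds):
        # pick the stump with the lowest weighted error
        best = min(((stump_error(t, p, w), t, p)
                    for t in X for p in (+1, -1)), key=lambda z: z[0][0])
        (err, preds), theta, p = best
        err = max(err, 1e-10)                       # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)     # stump weight
        clf.append((theta, p, alpha))
        # boost the weights of misclassified samples, then renormalize
        w = [wi * math.exp(-alpha if pr == y else alpha)
             for wi, pr, y in zip(w, preds, Y)]
        s = sum(w)
        w = [wi / s for wi in w]
    return clf

def strong_classify(clf, x):
    vote = sum(a * (1 if p * x < p * theta else -1) for theta, p, a in clf)
    return 1 if vote > 0 else 0

clf = train()
```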

  39. To improve the false positive rate: attentional cascade
  • A cascade of many AdaBoost strong classifiers
  • Begin with simple classifiers that reject many negative sub-windows
  • Many non-faces are rejected in the first few stages, so the system is efficient enough for real-time processing
  • Input image → AdaBoost Classifier 1 → (True) → AdaBoost Classifier 2 → (True) → AdaBoost Classifier 3 → (True) → Face found; a False output at any stage rejects the sub-window as a non-face

  40. An example
  • Input image → AdaBoost Classifier 1 → AdaBoost Classifier 2 → AdaBoost Classifier 3 → Face found; a False output at any stage rejects the sub-window as a non-face
  • More features for later stages in the cascade [Viola 2004]: 2 features, 10 features, 25 features, 50 features, ...

  41. Attentional cascade
  • Chain classifiers that are progressively more complex and have lower false positive rates
  • The trade-off between false positives and false negatives at each stage is determined by a point on its receiver operating characteristic (ROC) curve (% detection vs. % false positives)
  • Input image → AdaBoost Classifier 1 → AdaBoost Classifier 2 → AdaBoost Classifier 3 → Face found; a False output at any stage rejects the sub-window as a non-face

  42. Attentional cascade [Viola 2004]
  • The detection rate for each stage is 0.99; for 10 stages, the overall detection rate is 0.99^10 ≈ 0.9
  • The false positive rate at each stage is 0.3; for 10 stages, the overall false positive rate is 0.3^10 ≈ 6×10^-6
  • Input image → AdaBoost Classifier 1 → AdaBoost Classifier 2 → AdaBoost Classifier 3 → Face found; a False output at any stage rejects the sub-window as a non-face
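The cascade arithmetic multiplies out per stage, since a window must pass every stage; variable names below are mine:

```python
stages = 10
det_per_stage = 0.99   # per-stage detection rate
fp_per_stage = 0.3     # per-stage false positive rate

overall_det = det_per_stage ** stages   # about 0.904
overall_fp = fp_per_stage ** stages     # about 5.9e-6
```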

  43. Detection process in practice [smyth2007]
  • Use a 24x24 sub-window
  • Scaling
    • Scale the detector (not the input image)
    • Features are evaluated at scales differing by a factor of 1.25 per level
  • Location: move the detector around the image (in 1-pixel increments)
  • Final detections
    • A real face may produce multiple nearby detections; merge them to obtain the final result

  44. Skin detection
  • Skin pixels have a distinctive range of colors
    • Corresponds to region(s) in RGB color space
  • Skin classifier
    • A pixel X = (R, G, B) is skin if it lies in the skin (color) region
    • How do we find this region?

  45. Skin detection
  • Learn the skin region from examples
    • Manually label skin / non-skin pixels in one or more "training images"
    • Plot the training data in RGB space
      • Skin pixels shown in orange, non-skin pixels shown in gray
      • Some skin pixels may fall outside the region, and some non-skin pixels inside it

  46. Skin classifier
  • Given X = (R, G, B): how do we determine whether it is skin or not?
  • Nearest neighbor: find the labeled pixel closest to X
  • Find a plane/curve that separates the two classes
    • Popular approach: Support Vector Machines (SVM)
  • Data modeling: fit a probability density/distribution model to each class
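The nearest-neighbor option can be sketched in a few lines (the labelled training pixels and the query pixels are made up; distance is plain squared Euclidean distance in RGB):

```python
def nearest_label(x, labelled):
    """Return the label of the training pixel closest to x.
    labelled: list of ((R, G, B), is_skin) pairs."""
    def d2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(labelled, key=lambda t: d2(x, t[0]))[1]

train = [((220, 170, 140), True), ((200, 150, 120), True),
         ((40, 90, 200), False), ((30, 30, 30), False)]

skin = nearest_label((210, 160, 130), train)
not_skin = nearest_label((50, 80, 190), train)
```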

  47. Probability
  • X is a random variable
  • P(X) is the probability that X takes a certain value
  • Called a PDF: probability distribution/density function
    • X may be continuous or discrete
  • A 2D PDF is a surface; a 3D PDF is a volume

  48. Probabilistic skin classification
  • Model the PDF / uncertainty: each pixel has a probability of being skin or not skin
  • Skin classifier
    • Given X = (R, G, B): how do we determine whether it is skin or not?
    • Choose the interpretation of highest probability
  • Where do we get these probabilities?

  49. Learning conditional PDFs
  • We can calculate P(R | skin) from a set of training images
    • It is simply a histogram over the pixels in the training images: each bin Ri contains the proportion of skin pixels with color Ri
  • This doesn't work as well in higher-dimensional spaces. Why not?
  • Approach: fit parametric PDF functions
    • A common choice is a rotated Gaussian, with a center (mean) and a covariance

  50. Learning conditional PDFs
  • We can calculate P(R | skin) from a set of training images
  • But this isn't quite what we want. Why not?
    • To determine whether a pixel is skin, we want P(skin | R), not P(R | skin)
  • How can we get it?
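One standard answer, sketched on made-up training pixels: estimate the class-conditional histograms P(R | skin) and P(R | not skin) as on the previous slide, then apply Bayes' rule, P(skin | R) = P(R | skin) P(skin) / (P(R | skin) P(skin) + P(R | ~skin) P(~skin)). The samples and the 0.5 prior below are assumptions for illustration.

```python
from collections import Counter

skin_R = [200, 210, 200, 190, 210, 200]    # made-up red values of skin pixels
nonskin_R = [40, 60, 40, 200, 50, 60]      # made-up red values of non-skin pixels

def hist_pdf(samples):
    """Histogram estimate of P(R | class): proportion of samples per value."""
    c = Counter(samples)
    n = len(samples)
    return {v: k / n for v, k in c.items()}

p_r_skin = hist_pdf(skin_R)
p_r_non = hist_pdf(nonskin_R)

def p_skin_given_r(r, prior_skin=0.5):
    """Posterior P(skin | R = r) via Bayes' rule."""
    num = p_r_skin.get(r, 0.0) * prior_skin
    den = num + p_r_non.get(r, 0.0) * (1 - prior_skin)
    return num / den if den else 0.0

post = p_skin_given_r(200)   # 200 occurs in both classes
```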
