This study approaches object detection, that is, finding all instances of an object, by exploiting a sparse structure of statistical dependency among chosen variables. The proposed model is a semi-naïve Bayes classifier whose subsets of input variables are selected automatically. Pair-wise measurements between variables guide the automatic grouping of variables into candidate subsets.
Object Detection Using Semi-Naïve Bayes to Model Sparse Structure • Henry Schneiderman • Robotics Institute, Carnegie Mellon University
Object Detection • Find all instances of object X (e.g. X = human faces)
Sparse Structure of Statistical Dependency • [Figures: dependency of a chosen variable; dependency of a chosen coefficient]
Detection using a Classifier • Classifier output: “Object is present” or “Object is NOT present” (at fixed size and alignment)
Proposed Model: Semi-Naïve Bayes • Subsets of input variables, e.g. S1 = (x21, x34, x65, x73, x123), S2 = (x3, x8, x17, x65, x73, x111) • Kononenko (1991), Pazzani (1996), Domingos and Pazzani (1997), Rokach and Maimon (2001)
Goal: Automatic subset grouping S1 = (x21, x34, x65, x73, x123) S2 = (x3, x8, x17, x65, x73, x111) . . . Sn = (x14, x16, x17, x23, x85, x101, x103, x107)
Approach: Selection by Competition
• Input variables: x1, x2, x3, . . ., xm
• Generate q candidate subsets: S1, S2, . . ., Sq
• Train q log-likelihood functions: log [p1(S1|w1) / p1(S1|w2)], log [p2(S2|w1) / p2(S2|w2)], . . ., log [pq(Sq|w1) / pq(Sq|w2)]
• Select a combination of n candidates (n functions out of q, n << q): log [pj1(Sj1|w1) / pj1(Sj1|w2)], log [pj2(Sj2|w1) / pj2(Sj2|w2)], . . ., log [pjn(Sjn|w1) / pjn(Sjn|w2)]
• H(x1,…,xr) = log [pj1(Sj1|w1) / pj1(Sj1|w2)] + log [pj2(Sj2|w1) / pj2(Sj2|w2)] + . . . + log [pjn(Sjn|w1) / pjn(Sjn|w2)]
Generation of Subsets • “Modeling error for assuming independence” • q is the size of the subset
Generation of Subsets • Selection of variables: “discrimination power” • q is the size of the subset
Pair-Wise Measurement • pair-wise measurements Pair-affinity
Measure over a Subset Subset-affinity
Generation of Candidate Subsets
• Input variables: x1, x2, x3, . . ., xm
• Pair-wise measurements: C(x1, x2), C(x1, x3), . . ., C(xm-1, xm)
• Heuristic search and selective evaluation of D(Si)
• Candidate subsets: S1, S2, . . ., Sp
Subset size vs. modeling power • Model complexity is limited by the number of training examples, etc. • Examples of limited modeling power: • 5 modes in a mixture model • 7 projections onto principal components
Log-likelihood function = Table • Si = (xi1, xi2, . . ., xiq) → vector quantization → table look-up
Sub-Classifier Training by Counting • Counts of each value of fi give Pi(fi|w1) and Pi(fi|w2)
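Training by counting can be sketched as follows. This is a minimal illustration, not the author's implementation: the function name, the `num_bins` parameter, and the Laplace smoothing constant `alpha` are assumptions introduced here, and w1 is assumed to be the object class and w2 the non-object class.

```python
import math
from collections import Counter

def train_log_likelihood_table(object_features, nonobject_features, num_bins, alpha=1.0):
    """Estimate log [P(f|w1) / P(f|w2)] for every quantized feature value f.

    object_features / nonobject_features: lists of integer feature values in
    [0, num_bins). alpha is an assumed smoothing constant that keeps values
    never seen in training from producing log(0).
    """
    counts_w1 = Counter(object_features)       # occurrences under w1 (assumed: object)
    counts_w2 = Counter(nonobject_features)    # occurrences under w2 (assumed: non-object)
    total_w1 = len(object_features) + alpha * num_bins
    total_w2 = len(nonobject_features) + alpha * num_bins
    table = []
    for f in range(num_bins):
        p_w1 = (counts_w1[f] + alpha) / total_w1
        p_w2 = (counts_w2[f] + alpha) / total_w2
        table.append(math.log(p_w1 / p_w2))
    return table
```

The result is one table per subset, indexed directly by the feature value, so evaluation later needs only a look-up.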
Example of VQ
• Subset: xi1, xi2, xi3, . . ., xiq
• Projection onto 3 principal components: c1, c2, c3
• Quantization to m levels: z1, z2, z3
• f = z1·m^0 + z2·m^1 + z3·m^2
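The index computation in this example can be sketched in a few lines. The sketch assumes the principal components have already been learned and that each coefficient is quantized with uniform bins on a fixed range `[lo, hi]`; the function names, the uniform binning, and the omission of mean subtraction are illustrative assumptions, not details given on the slide.

```python
def quantize(value, lo, hi, levels):
    """Map value to an integer level in [0, levels - 1] using uniform bins on [lo, hi]."""
    if value <= lo:
        return 0
    if value >= hi:
        return levels - 1
    return int((value - lo) / (hi - lo) * levels)

def vq_index(subset_values, components, lo, hi, m):
    """Project the subset onto each component, quantize to m levels, and
    combine the levels into a single table index f = sum_k z_k * m**k."""
    f = 0
    for k, component in enumerate(components):
        # Projection coefficient c_k (dot product with the k-th component).
        c_k = sum(w * x for w, x in zip(component, subset_values))
        z_k = quantize(c_k, lo, hi, m)
        f += z_k * m ** k
    return f
```

With 3 components and m levels, f ranges over m^3 values, which is the size of the look-up table for that subset.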
• Candidate log-likelihood functions: h1(S1), h2(S2), . . ., hP(SP)
• Evaluate on training data: E1,w1, E1,w2, E2,w1, E2,w2, . . ., EP,w1, EP,w2
• Evaluate ROCs: ROC1, ROC2, . . ., ROCP
• Order top Q log-likelihood functions: hj1(Sj1), hj2(Sj2), . . ., hjQ(SjQ)
• Form PQ pairs of log-likelihood functions: hj1(Sj1) + h1(S1), . . ., hjQ(SjQ) + hP(SP)
• Sum evaluations: Ej1,w1 + E1,w1, Ej1,w2 + E1,w2, . . ., EjQ,w1 + EP,w1, EjQ,w2 + EP,w2
• Evaluate ROCs: ROC1, . . ., ROCQP
• Order top Q pairs of log-likelihood functions: hk1,1(Sk1,1) + hk1,2(Sk1,2), . . ., hkQ,1(SkQ,1) + hkQ,2(SkQ,2)
• Repeat for n iterations
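The competition among candidates resembles a beam search over combinations, which might be sketched as below. Several choices here are assumptions rather than details from the slides: each ROC is summarized by its area under the curve, a candidate is not paired with itself, and the names `auc` and `select_by_competition` are hypothetical.

```python
def auc(pos_scores, neg_scores):
    """Area under the ROC curve: fraction of (object, non-object) pairs ranked correctly."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

def select_by_competition(evals_w1, evals_w2, Q, n):
    """evals_w1[i][k]: output of candidate h_i on object example k;
    evals_w2[i][k]: the same for non-object example k.
    Returns up to Q combinations (tuples of candidate indices), best first."""
    P = len(evals_w1)

    def score(combo):
        # Summing stored evaluations avoids re-running the sub-classifiers.
        pos = [sum(evals_w1[i][k] for i in combo) for k in range(len(evals_w1[0]))]
        neg = [sum(evals_w2[i][k] for i in combo) for k in range(len(evals_w2[0]))]
        return auc(pos, neg)

    def top_q(combos):
        return sorted(combos, key=lambda c: (-score(c), c))[:Q]

    beam = top_q([(i,) for i in range(P)])      # order the top Q single functions
    for _ in range(n - 1):                      # grow each survivor by one function
        extended = {tuple(sorted(c + (i,))) for c in beam for i in range(P) if i not in c}
        beam = top_q(extended)
    return beam
```

Keeping Q survivors at each iteration, rather than only the single best, leaves several complete classifiers to compare at the end.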
Cross-Validation Selects Classifier
• Q candidates:
• H1(x1, x2, . . ., xr) = hk1,1(Sk1,1) + hk1,2(Sk1,2) + . . . + hk1,n(Sk1,n)
• . . .
• HQ(x1, x2, . . ., xr) = hkQ,1(SkQ,1) + hkQ,2(SkQ,2) + . . . + hkQ,n(SkQ,n)
• Cross-validation over H1(x1, x2, . . ., xr), . . ., HQ(x1, x2, . . ., xr) selects H*(x1, x2, . . ., xr)
Evaluation of Classifier • Classifier output: “Object is present” or “Object is NOT present” (at fixed size and alignment)
1) Compute feature values • f1 = #5710, f2 = #3214, . . ., fn = #723
2) Look-Up Log-Likelihoods
• f1 = #5710: log [P1(#5710 | w1) / P1(#5710 | w2)] = 0.53
• f2 = #3214: log [P2(#3214 | w1) / P2(#3214 | w2)] = 0.03
• . . .
• fn = #723: log [Pn(#723 | w1) / Pn(#723 | w2)] = 0.23
3) Make Decision
• log [P1(#5710 | w1) / P1(#5710 | w2)] = 0.53
• log [P2(#3214 | w1) / P2(#3214 | w2)] = 0.03
• . . .
• log [Pn(#723 | w1) / Pn(#723 | w2)] = 0.23
• Σ = 0.53 + 0.03 + . . . + 0.23; decide by whether Σ > λ or Σ < λ
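The three evaluation steps reduce to table look-ups and a sum. The sketch below uses toy tables that contain only the three entries shown in the worked example (the elided terms are left out), and the threshold value 0.5 is an arbitrary assumption chosen for illustration; the function name `classify` is hypothetical, and a sum above the threshold is assumed to mean "object is present".

```python
def classify(feature_values, log_likelihood_tables, threshold):
    """Sum the looked-up log-likelihood ratios and compare with the threshold (lambda).

    Returns (decision, total); decision is True when total > threshold.
    """
    total = sum(table[f] for table, f in zip(log_likelihood_tables, feature_values))
    return total > threshold, total

# Toy tables: one dictionary per sub-classifier, holding only the example entries.
tables = [{5710: 0.53}, {3214: 0.03}, {723: 0.23}]
present, total = classify([5710, 3214, 723], tables, threshold=0.5)
```

No probabilities are multiplied at run time; the logarithms are stored in the tables, so the decision costs one look-up and one addition per subset.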
View-based Classifiers • Face Classifier #1 • Face Classifier #2 • Face Classifier #3
Detection: Apply Classifier Exhaustively • Search in position • Search in scale
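Exhaustive search in position and scale is commonly implemented by sliding a fixed-size window over repeatedly downscaled copies of the image. The sketch below only enumerates the window coordinates; the window size, stride, and scale step are assumed values, not parameters reported in the slides.

```python
def sliding_windows(image_w, image_h, window=32, stride=4, scale_step=1.2):
    """Yield (x, y, scale) for every window position at every scale.

    Shrinking the image by `scale` lets a classifier trained at a fixed size
    respond to larger objects in the original image.
    """
    scale = 1.0
    while image_w / scale >= window and image_h / scale >= window:
        scaled_w = int(image_w / scale)
        scaled_h = int(image_h / scale)
        for y in range(0, scaled_h - window + 1, stride):      # search in position
            for x in range(0, scaled_w - window + 1, stride):
                yield x, y, scale
        scale *= scale_step                                    # search in scale
```

Each yielded triple identifies one window at which the classifier would be evaluated.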
Decision can be made by partial evaluation
• log [P1(#5710 | w1) / P1(#5710 | w2)] = 0.53
• log [P2(#3214 | w1) / P2(#3214 | w2)] = 0.03
• . . .
• log [Pn(#723 | w1) / Pn(#723 | w2)] = 0.23
• Σ = 0.53 + 0.03 + . . . + 0.23; decide by whether Σ > λ or Σ < λ
Detection Computational Strategy
• Apply log [p1(S1|w1) / p1(S1|w2)] exhaustively to the scaled input image
• Apply log [p2(S2|w1) / p2(S2|w2)] to the reduced search space
• Apply log [p3(S3|w1) / p3(S3|w2)] to the further reduced search space
• Computational strategy changes with the size of the search space
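One way to read partial evaluation is as early rejection: a candidate is dropped as soon as its running sum falls below a threshold, so later log-likelihood functions run on a shrinking search space. The per-step thresholds and the function name below are assumptions made for this sketch.

```python
def evaluate_with_early_rejection(feature_values, tables, stage_thresholds):
    """Accumulate log-likelihood ratios and stop once the partial sum is too low.

    Returns (accepted, number_of_terms_evaluated). The three input sequences
    are assumed to have equal length.
    """
    partial = 0.0
    evaluated = 0
    for f, table, threshold in zip(feature_values, tables, stage_thresholds):
        partial += table[f]
        evaluated += 1
        if partial < threshold:
            return False, evaluated     # rejected without evaluating the rest
    return True, evaluated
```

Most windows in an image are background, so rejecting them after one or two terms removes most of the work.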
Candidate-Based Evaluation • Compute M^2 feature values • Look up M^2 log-likelihood values • Repeat for N^2 candidates
Feature-Based Evaluation • Compute N^2 + M^2 + 2MN feature values • Look up M^2 log-likelihood values • Repeat for N^2 candidates
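The two strategies can be compared by counting feature computations. The reading assumed here is that candidate-based evaluation recomputes M^2 features for each of the N^2 candidates, while feature-based evaluation computes N^2 + M^2 + 2MN = (N + M)^2 features once and shares them among overlapping candidates; the slides do not spell this out, so treat the comparison as an interpretation.

```python
def feature_computations(N, M):
    """Return (candidate_based, feature_based) counts of feature computations."""
    candidate_based = (M ** 2) * (N ** 2)          # M^2 features for each of N^2 candidates
    feature_based = N ** 2 + M ** 2 + 2 * M * N    # computed once; equals (N + M)^2
    return candidate_based, feature_based
```

For N = 10 and M = 4 this gives 1600 versus 196 feature computations, while the number of table look-ups (M^2 per candidate) is the same in both cases. For a very small search space, such as a single candidate, the candidate-based count is the lower one.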
Cascade Implementation
• Inputs: training images of object, training images of non-object, cross-validation images, images that do not contain the object
• Create candidate subsets
• Train candidate log-likelihood functions
• Select log-likelihood functions
• Retrain selected log-likelihood functions using AdaBoost with confidence-rated predictions [Schapire and Singer, 1999]
• Determine detection threshold
• Automatically select non-object examples for the next stage: bootstrapping [Sung and Poggio, 1995]
• Increment stage
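The bootstrapping step can be illustrated as collecting the non-object windows that the current stage wrongly accepts, so that the next stage trains on harder negatives. The function name, the `score_fn` callback, and the `max_examples` cap are hypothetical names introduced for this sketch.

```python
def bootstrap_nonobject_examples(score_fn, nonobject_windows, threshold, max_examples):
    """Collect non-object windows whose score exceeds the detection threshold.

    score_fn: the current classifier's scoring function (assumed interface).
    nonobject_windows: windows drawn from images known not to contain the object.
    """
    hard_negatives = []
    for window in nonobject_windows:
        if score_fn(window) > threshold:       # false positive of the current stage
            hard_negatives.append(window)
            if len(hard_negatives) >= max_examples:
                break
    return hard_negatives
```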
Frontal Face Detection • MIT-CMU Frontal Face Test Set [Sung and Poggio, 1995; Rowley, Baluja and Kanade, 1997] • 180 ms for a 300x200 image • 400 ms for a 300x500 image • Timings on an AMD Athlon 1.2 GHz • Top rank, Video TREC 2002 face detection • Top rank, 2002 ARDA VACE face detection algorithm evaluation
Face & Eye Detection for Red-Eye Removal from Consumer Photos CMU Face Detector
Eye Detection • Experiments performed independently at NIST • Sequestered data set: 29,627 mugshots • Eyes correctly located (within a radius of 15 pixels): 98.2% (assumed one face per image) • Thanks to Jonathon Phillips, Patrick Grother, and Sam Trahan for their assistance in running these experiments
Realistic Facial Manipulation: Earring Example • With Jason Pinto
Summary of Classifier Design
• Sparse structure of statistical dependency in many image classification problems
• Semi-naïve Bayes model
• Automatic learning of the structure of the semi-naïve Bayes classifier:
• Generation of many candidate subsets
• Competition among many log-likelihood functions to find the best combination
• CMU on-line face detector: http://www.vasc.ri.cmu.edu/cgi-bin/demos/findface.cgi