This thesis explores detecting and segmenting hands in images using line- and curve-finding techniques, culminating in pose recognition. It covers the challenges hands present for computer vision research due to their nonrigidity and many degrees of freedom, and it presents the detection and segmentation process, the pose recognition methodology, and the resulting discussions and conclusions.
Detection, Segmentation, and Pose Recognition of Hands in Images by Christopher Schwarz. Thesis Chair: Dr. Niels da Vitoria Lobo
Outline • Introduction • Detection and Segmentation (Line Finding, Curve Finding, Detection, Grouping, Results) • Pose Recognition (Preprocessing, Matching, Results) • Discussions and Conclusions
Introduction • Hands present an exciting challenge for Computer Vision researchers. • They foil traditional object detection due to nonrigidity and 21 degrees of freedom (DoF) • Uses: • Surveillance applications: gang signs, obscene gestures, drawing of a weapon • Human-Computer Interaction: alternative input devices, motion capture, augmented reality.
Terminology • Detection: Find presence of target • Segmentation: Separate known target from background • Pose Recognition: Determine what pose or posture a hand is in.
Related Work • Huang [2000] • Athitsos and Sclaroff [2003] • Kölsch and Turk [2004] • Baris Caglar [2005]
Part 1: Detection and Segmentation Detection and Segmentation Outline • Pipeline: Input Image → Generate Line Sketch → Find Curves → Find Candidate Fingers → Group and Revisit • Input image assumptions: • High-resolution images • Monochromatic images • Straight fingers • Open fingers
Part 1: Detection and Segmentation Line Sketch Image • Use a customized Line Finder • Modified Burns line finder • Replace line combination with an iterative method • Add a “cost of fit” measure per line • Union the results of running the Line Finder over 5 varying inputs to obtain the Line Sketch • 4 at varying scales • 1 “Double Canny” input • Large-Gaussian Canny over the output of a small-Gaussian Canny, to divide textured regions from untextured regions
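A minimal sketch of how the “Double Canny” input could be generated, assuming OpenCV; the Gaussian sigmas and Canny thresholds are illustrative placeholders, not values from the thesis.

    import cv2

    def double_canny(gray, small_sigma=1.0, large_sigma=4.0):
        """Run Canny after a small Gaussian blur, then run Canny again over a
        heavily blurred version of that edge map, so densely textured regions
        merge into solid areas while untextured regions stay mostly empty."""
        small = cv2.GaussianBlur(gray, (0, 0), small_sigma)
        edges = cv2.Canny(small, 50, 150)
        # Blurring the edge map spreads edge responses across textured regions.
        spread = cv2.GaussianBlur(edges, (0, 0), large_sigma)
        return cv2.Canny(spread, 50, 150)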
Part 1: Detection and Segmentation Line Finder • Iterative Joining of Lines • Find line segments • Find nearby, almost-parallel line pairs • If pair meets thresholds, combine them • Rejoins lines split from angle thresholds or gaps in the edge input.
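A sketch of the iterative joining step, assuming each line is stored as a pair of endpoints; the angle and gap thresholds are placeholders rather than the thesis values.

    import math

    def orientation(line):
        (x1, y1), (x2, y2) = line
        return math.atan2(y2 - y1, x2 - x1) % math.pi  # undirected angle in [0, pi)

    def endpoint_gap(a, b):
        return min(math.dist(p, q) for p in a for q in b)

    def join(a, b):
        # Replace the pair with the longest segment spanned by their endpoints.
        pts = list(a) + list(b)
        return max(((p, q) for p in pts for q in pts), key=lambda s: math.dist(*s))

    def iterative_join(lines, max_angle=0.1, max_gap=5.0):
        """Repeatedly merge nearby, almost-parallel line pairs until no pair
        meets the thresholds; this rejoins lines split by gaps in the edges."""
        merged = True
        while merged:
            merged = False
            for i, a in enumerate(lines):
                for j, b in enumerate(lines[i + 1:], i + 1):
                    diff = abs(orientation(a) - orientation(b))
                    diff = min(diff, math.pi - diff)
                    if diff < max_angle and endpoint_gap(a, b) < max_gap:
                        lines[i] = join(a, b)
                        del lines[j]
                        merged = True
                        break
                if merged:
                    break
        return lines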
Part 1: Detection and Segmentation Line Finder • A Cost of Fit measure is output with each line • The cost of fitting the line model to the underlying data • (Figure: example lines that would receive a higher Cost of Fit)
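One plausible form for such a cost, assuming the edge pixels that support each line are available: the root-mean-square perpendicular distance from those pixels to the fitted line (a sketch, not necessarily the exact measure used in the thesis).

    import numpy as np

    def cost_of_fit(line, support_pixels):
        """RMS perpendicular distance from the supporting edge pixels to the
        line: straight supports give a low cost, curved supports a higher one."""
        (x1, y1), (x2, y2) = line
        d = np.array([x2 - x1, y2 - y1], dtype=float)
        n = np.array([-d[1], d[0]]) / np.linalg.norm(d)  # unit normal to the line
        pts = np.asarray(support_pixels, dtype=float)
        dists = (pts - np.array([x1, y1], dtype=float)) @ n
        return float(np.sqrt(np.mean(dists ** 2)))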
Part 1: Detection and Segmentation Line Sketch • (Figure: input image and the unioned components from Blur 0, Blur 1, Blur 2, Half-Size, and Double Canny; unioned lines of length >= 15)
Part 1: Detection and Segmentation Line Sketch Examples
Part 1: Detection and Segmentation Line Sketch Examples
Part 1: Detection and Segmentation Curve Finder • Second input to the algorithm • Discovers curves that may represent fingertips • See Jan Prokaj’s thesis: Scale Space Based Grammar for Hand Detection • Model: (figure of the fingertip curve model)
Part 1: Detection and Segmentation Curve Finder Examples
Part 1: Detection and Segmentation FingerFinder Pseudocode
    for each pair of lines:
        if the pair meets the criteria:
            for each curve nearby:
                if the curve meets the criteria:
                    add a finger candidate
Part 1: Detection and Segmentation Finger Candidate Criteria • “Finger Score” based on empirically found thresholds • Criteria • Geometric • Other
Part 1: Detection and Segmentation Geometric Criteria • 11 tests measuring how well a line pair and a curve approximate the target configuration (illustrated in the slide figure)
Part 1: Detection and Segmentation Non-Geometric Criteria • Line Inaccuracy: measure of line curvature found during line finding • Canny Density: amount of edge pixels detected in the area • Variance in Canny Density: distinguishes sparse finger regions from a cluttered background
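A sketch of the two Canny-based criteria, assuming a binary Canny edge map and an axis-aligned candidate region; the sub-block count used for the variance is an assumption.

    import numpy as np

    def canny_density(edges, x0, y0, x1, y1):
        """Fraction of pixels inside the candidate region that are edge pixels."""
        return float((edges[y0:y1, x0:x1] > 0).mean())

    def canny_density_variance(edges, x0, y0, x1, y1, blocks=4):
        """Variance of edge density over sub-blocks of the region; a high value
        suggests a sparse finger region next to a cluttered background."""
        roi = edges[y0:y1, x0:x1] > 0
        densities = [block.mean()
                     for row in np.array_split(roi, blocks, axis=0)
                     for block in np.array_split(row, blocks, axis=1)]
        return float(np.var(densities))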
Part 1: Detection and Segmentation Results First row: Input images Second row: Detected candidates
Part 1: Detection and Segmentation Grouping Candidate Fingers • Find finger groups possibly within the same hand using: • Locations, using Euclidean distance • Region intensities, comparing median values • Revisit weaker candidates to reinstate them if supported by neighbors
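A minimal grouping sketch, assuming each candidate carries a centroid and the median intensity of its region; the distance and intensity thresholds are placeholders.

    import math

    def group_fingers(candidates, max_dist=120.0, max_intensity_diff=30.0):
        """Greedy single-link grouping: two candidates share a group when their
        centroids are close (Euclidean distance) and their median region
        intensities are similar. Each candidate is a dict with keys
        "centroid" (x, y) and "median" (intensity)."""
        groups = []
        for cand in candidates:
            placed = None
            for g in groups:
                if any(math.dist(cand["centroid"], other["centroid"]) < max_dist
                       and abs(cand["median"] - other["median"]) < max_intensity_diff
                       for other in g):
                    if placed is None:
                        g.append(cand)
                        placed = g
                    else:           # the candidate links two groups: merge them
                        placed.extend(g)
                        g.clear()
            groups = [g for g in groups if g]
            if placed is None:
                groups.append([cand])
        return groups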
Part 1: Detection and Segmentation Results First row: Input images Second row: "Strong" candidates before grouping Third row: Detected fingers, including those re-added during grouping
Part 1: Detection and Segmentation Grouping Result Breakdown • Results show detections from all groups • Often, individual groups divide false from true positives
Part 1: Detection and Segmentation Grouping Result Breakdown
Part 2: Pose Recognition Pose Recognition Goals Segmentation-based method using a database and an input contour • Assumes: • High-resolution • Open fingers
Part 2: Pose Recognition Flowchart of Our Method
Part 2: Pose Recognition Preprocessing • Preprocessing is identical for the test image and every database image • Erode • Dilate • Compare with the original to find protrusions • (Figure: input contour silhouette)
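A sketch of the erode/dilate/compare step, assuming OpenCV and a binary (0/255) hand silhouette; the structuring-element size is an assumption.

    import cv2
    import numpy as np

    def find_protrusions(silhouette, kernel_size=25):
        """Erode then dilate (a morphological opening) removes finger-width
        protrusions, leaving an approximate palm; subtracting that from the
        original silhouette isolates the protrusions as connected components."""
        kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,
                                           (kernel_size, kernel_size))
        palm_approx = cv2.dilate(cv2.erode(silhouette, kernel), kernel)
        protrusions = cv2.subtract(silhouette, palm_approx)
        count, labels = cv2.connectedComponents(protrusions)
        return palm_approx, [(labels == i).astype(np.uint8) for i in range(1, count)]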
Part 2: Pose Recognition Preprocessing • Ignore tiny protrusions (treat them as part of the palm) • Remove the palm • Use K-Means clustering to find the center of the palm from the wrist-palm segment • Count “finger” segments and find their average direction
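One possible reading of the K-Means step (an assumption, not necessarily the thesis procedure): cluster the pixel coordinates of the wrist-palm segment and take the center of the largest cluster as the palm center.

    import numpy as np
    from sklearn.cluster import KMeans

    def palm_center(wrist_palm_mask, k=2):
        """Cluster the (x, y) coordinates of the wrist-palm region into k
        clusters and return the center of the largest cluster."""
        ys, xs = np.nonzero(wrist_palm_mask)
        pts = np.column_stack([xs, ys]).astype(float)
        km = KMeans(n_clusters=k, n_init=10).fit(pts)
        largest = int(np.argmax(np.bincount(km.labels_)))
        return tuple(km.cluster_centers_[largest])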
Part 2: Pose Recognition Preprocessing Examples • Matching takes the test image and a set of database images processed in this way
Part 2: Pose Recognition Matching Phase Overview • Matching via the sum of two distance measures: • Chamfer Distance • Segment-Based Matching
Part 2: Pose Recognition Chamfer Distance • Numerical similarity between edge images • For each point in X, find the distance to the nearest point in Y • The average of these distances is the chamfer distance
Part 2: Pose Recognition Chamfer Distance Direction • The chamfer distance is not symmetric: c(X,Y) != c(Y,X) (in the pictured X and Y, c(X,Y) < c(Y,X)) • “Undirected” Chamfer = c(X,Y) + c(Y,X)
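A brute-force sketch of the directed and undirected chamfer distances between two edge-point sets (in practice a distance transform is the usual efficient implementation).

    import numpy as np

    def chamfer(X, Y):
        """Directed chamfer distance c(X, Y): for each point of X, the distance
        to its nearest point in Y, averaged over X. Not symmetric in general."""
        X, Y = np.asarray(X, dtype=float), np.asarray(Y, dtype=float)
        pairwise = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=2)
        return float(pairwise.min(axis=1).mean())

    def undirected_chamfer(X, Y):
        """Symmetric version used for matching: c(X, Y) + c(Y, X)."""
        return chamfer(X, Y) + chamfer(Y, X)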
Part 2: Pose Recognition Segment-Based Matching: Overview • Generate a CODAA vector for every pair of test segment and model segment • The vector contains five segment comparators • Rank comparator vectors • Rank database images by the sum of comparator rankings
Part 2: Pose Recognition Segment-Based Matching: CODAA Vectors
Part 2: Pose Recognition Segment-Based Matching • Score each CODAA vector via progressive thresholds on the five values • Rank vectors according to scores • For each model image segment, find the match in the test image with the highest score • For each segment in the test image, find the match in the model image with the highest score • Sum the “forward” and “reverse” measures • Divide by the number of fingers • Rank model images by score
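A sketch of the forward/reverse bookkeeping, assuming a matrix of CODAA-vector scores between test and model segments has already been computed (higher score taken to mean a better segment match).

    import numpy as np

    def sbm_score(scores, num_fingers):
        """scores[i, j]: CODAA score between test segment i and model segment j.
        Sum the best match for every model segment ("forward") and for every
        test segment ("reverse"), then normalize by the number of fingers."""
        forward = scores.max(axis=0).sum()   # best test segment per model segment
        reverse = scores.max(axis=1).sum()   # best model segment per test segment
        return float(forward + reverse) / num_fingers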
Part 2: Pose Recognition Combination • Combine results of Chamfer Distance and SBM by summing the Log (base 2) of a model’s rank in each measure. • Rank models by this combined score • Filter known-incorrect models: • Incorrect finger count • Incorrect average finger angle
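A sketch of the rank combination, assuming per-model 1-based ranks from the two measures and caller-supplied predicates implementing the finger-count and finger-angle filters.

    import math

    def combine_rankings(chamfer_rank, sbm_rank, finger_count_ok, angle_ok):
        """chamfer_rank and sbm_rank map model id -> 1-based rank under each
        measure. Known-incorrect models are filtered out; a lower combined
        log2-rank score is better."""
        combined = {
            m: math.log2(chamfer_rank[m]) + math.log2(sbm_rank[m])
            for m in chamfer_rank
            if finger_count_ok(m) and angle_ok(m)
        }
        return sorted(combined, key=combined.get)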
Part 2: Pose Recognition Video Test Results • Use video frames as a “database” to find frames matching an input pose
Part 2: Pose Recognition Still-Image Test Results Use a standard database
Publications • Segment-Based Hand Pose Estimation. In IEEE CRV 2005. • Hand Detection and Segmentation for Pose Recognition in Monochromatic Images. In progress. • Line Sketch. To be written.
Future Work • Develop and test a bridge between the segmentation and recognition algorithms • It is feasible to convert finger candidate regions into the framework of SBM • Results would improve if the palm center could be reliably located
Acknowledgements • Thesis Committee • Dr. Niels da Vitoria Lobo • Dr. Charles Hughes • Dr. Mubarak Shah • Dr. Huaxin You • Support • NSF REU Program