Learning to Segment from Diverse Data
M. Pawan Kumar, Haithem Turki, Dan Preston, Daphne Koller
Aim
Learn accurate parameters for a segmentation model
• Segmentation without generic foreground or background classes
• Train using both strongly and weakly supervised data
Data in Vision
"Strong" supervision vs. "weak" supervision (e.g., the image-level label "Car")
Using only strongly supervised data is working with "one hand tied behind the back…"
Types of Data: PASCAL VOC Segmentation Datasets
Specific foreground classes, generic background class
Types of Data: Stanford Background Dataset
Specific background classes, generic foreground class
Types of Data: PASCAL VOC Detection Datasets
Bounding boxes for objects; thousands of freely available images
Current methods only use small, controlled datasets
Types of Data: ImageNet, Caltech, …
Image-level labels (e.g., "Car"); thousands of freely available images
Types of Data: Google Image Search, Flickr, Picasa, …
Noisy data from web search; millions of freely available images
Outline • Region-based Segmentation Model • Problem Formulation • Inference • Results
Region-based Segmentation Model
Pixels, regions, and object models
Outline • Region-based Segmentation Model • Problem Formulation • Inference • Results
Problem Formulation
Treat missing information as latent variables: image x, annotation y, complete annotation (y,h)
Joint feature vector Ψ(x,y,h): region features, detection features, pairwise contrast, pairwise context
Problem Formulation
Treat missing information as latent variables: image x, annotation y, complete annotation (y,h)
Latent Structural SVM: (y*,h*) = argmax_{y,h} wᵀΨ(x,y,h)
Trained by minimizing the overlap loss ∆
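The overlap loss ∆ measures how far a predicted segmentation is from the annotation. Below is a minimal sketch, assuming ∆ is one minus the intersection-over-union overlap averaged over classes; the exact definition used in the paper may differ in detail.

```python
import numpy as np

def overlap_loss(pred, gt, classes):
    """pred, gt: integer label maps of the same shape; classes: labels to score."""
    scores = []
    for c in classes:
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        # a class absent from both maps counts as perfectly predicted
        scores.append(inter / union if union > 0 else 1.0)
    return 1.0 - float(np.mean(scores))

# usage on two tiny label maps
pred = np.array([[1, 1, 0], [2, 2, 0]])
gt   = np.array([[1, 0, 0], [2, 2, 2]])
print(overlap_loss(pred, gt, classes=[0, 1, 2]))   # 0.5
```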
Self-Paced Learning (Kumar, Packer and Koller, 2010)
Start with an initial estimate w₀ and iterate:
• Annotation Consistent Inference: hᵢ = argmax_{h ∈ H} wₜᵀΨ(xᵢ, yᵢ, h)
• Update wₜ₊₁ by solving a biconvex problem:
  min ‖w‖² + C ∑ᵢ vᵢ ξᵢ − K ∑ᵢ vᵢ
  s.t. wᵀΨ(xᵢ, yᵢ, hᵢ) − wᵀΨ(xᵢ, y, h) ≥ ∆(yᵢ, y, h) − ξᵢ   (Loss Augmented Inference)
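Below is a minimal, runnable sketch of the self-paced schedule, with ridge regression standing in for the latent SSVM update (which in the paper requires annotation-consistent and loss-augmented inference over regions). Only the easy-example selection rule (vᵢ = 1 iff C·lossᵢ < K) and the annealing of K follow the objective above; the learner, data, and parameter values are toy stand-ins.

```python
import numpy as np

def self_paced_ridge(X, y, lam=0.1, C=1.0, K=0.2, mu=1.3, iters=20):
    # Toy stand-in for the biconvex update: fix w and select "easy"
    # examples (v_i = 1 iff C * loss_i < K), refit w on them, then
    # anneal K so that harder examples are admitted later.
    n, d = X.shape
    w = np.zeros(d)                                    # initial estimate w0
    for _ in range(iters):
        losses = (X @ w - y) ** 2
        v = (C * losses < K).astype(float)             # easy-example indicators
        if v.sum() == 0:
            v[np.argmin(losses)] = 1.0                 # always keep at least one example
        Xv, yv = X[v > 0], y[v > 0]
        w = np.linalg.solve(Xv.T @ Xv + lam * np.eye(d), Xv.T @ yv)
        K *= mu                                        # admit harder examples over time
    return w

# usage on synthetic data
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
w_true = rng.normal(size=5)
y_obs = X @ w_true + 0.1 * rng.normal(size=100)
print(self_paced_ridge(X, y_obs))
```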
Outline • Region-based Segmentation Model • Problem Formulation • Inference • Results
Generic Classes (Kumar and Koller, 2010)
Dictionary of regions D: merge and intersect current regions with over-segmentations to form putative regions
Select regions: min θᵀy s.t. y ∈ SELECT(D)
Iterate until convergence
Generic Classes
Binary variables: yᵣ(0) = 1 iff r is not selected, yᵣ(1) = 1 iff r is selected
Minimize the energy: min_y ∑ᵣ θᵣ(i) yᵣ(i) + ∑ᵣₛ θᵣₛ(i,j) yᵣₛ(i,j)
s.t. yᵣ(0) + yᵣ(1) = 1   (assign one label to r from L)
yᵣₛ(i,0) + yᵣₛ(i,1) = yᵣ(i) and yᵣₛ(0,j) + yᵣₛ(1,j) = yₛ(j)   (ensure yᵣₛ(i,j) = yᵣ(i) yₛ(j))
∑_{r "covers" u} yᵣ(1) = 1   (each super-pixel u is covered by exactly one selected region)
yᵣ(i), yᵣₛ(i,j) ∈ {0,1}   (binary variables)
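To make the selection problem concrete, here is a minimal sketch of the integer program on a toy instance using the PuLP solver. The regions, super-pixel coverage, and costs θ are invented for illustration, and the pairwise terms yᵣₛ(i,j) are omitted; only the label-assignment and coverage constraints mirror the formulation above.

```python
from pulp import LpProblem, LpVariable, LpMinimize, lpSum, LpBinary

regions = ["r1", "r2", "r3"]
# which candidate regions cover each super-pixel (toy data)
superpixels = {"u1": ["r1", "r2"], "u2": ["r2", "r3"]}
# toy theta_r(1) for selecting each region; theta_r(0) taken as 0 here
unary = {"r1": -2.0, "r2": 1.0, "r3": -0.5}

prob = LpProblem("region_selection", LpMinimize)
y = {(r, i): LpVariable(f"y_{r}_{i}", cat=LpBinary) for r in regions for i in (0, 1)}

# objective: unary part of the energy (pairwise terms omitted for brevity)
prob += lpSum(unary[r] * y[(r, 1)] for r in regions)

for r in regions:
    prob += y[(r, 0)] + y[(r, 1)] == 1                    # one label per region
for u, covering in superpixels.items():
    prob += lpSum(y[(r, 1)] for r in covering) == 1       # each super-pixel covered exactly once

prob.solve()
print({r: int(y[(r, 1)].value()) for r in regions})       # selects r1 and r3 on this toy instance
```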
Generic Classes (Kumar and Koller, 2010)
Simultaneous region selection and labeling
Dictionary of regions D: merge and intersect current regions with over-segmentations to form putative regions
Select regions: min θᵀy s.t. y ∈ SELECT(D), ∆new ≤ ∆prev
Iterate until convergence
[Examples: intermediate segmentations at iterations 1, 3, and 6]
Bounding Boxes
Each row and each column a of the bounding box must be covered by a region of the box's class c:
min θᵀy s.t. y ∈ SELECT(D), ∆new ≤ ∆prev + Kₐ (1 − zₐ)
zₐ ∈ {0,1}, zₐ ≤ ∑_{r "covers" a} yᵣ(c)
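A sketch of the bounding-box constraint in the same toy PuLP style is shown below: one binary slack zₐ per row/column of the box is tied to coverage by regions carrying the box's class, and a penalty Kₐ(1 − zₐ) is paid when a row or column is left uncovered. All names, costs, and coverage data are illustrative, not taken from the paper's code; the image-level constraint of the later slide works the same way, with a single z for the whole image.

```python
from pulp import LpProblem, LpVariable, LpMinimize, lpSum, LpBinary

# toy rows/columns of a bounding box and the regions (of the box's class) covering each
cells = ["row1", "row2", "col1"]
covers = {"row1": ["r1"], "row2": ["r1", "r2"], "col1": ["r2"]}
K = 10.0                                     # penalty per uncovered row/column
cost = {"r1": 0.5, "r2": 1.5}                # toy costs for selecting each region with class c

prob = LpProblem("box_constraint", LpMinimize)
y = {r: LpVariable(f"y_{r}", cat=LpBinary) for r in cost}
z = {a: LpVariable(f"z_{a}", cat=LpBinary) for a in cells}

prob += lpSum(cost[r] * y[r] for r in cost) + lpSum(K * (1 - z[a]) for a in cells)
for a in cells:
    prob += z[a] <= lpSum(y[r] for r in covers[a])        # z_a = 1 only if cell a is covered

prob.solve()
print({r: int(y[r].value()) for r in y}, {a: int(z[a].value()) for a in z})
```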
[Examples: intermediate segmentations at iterations 1, 2, and 4]
Image-Level Labels
The image must contain the specified object class c:
min θᵀy s.t. y ∈ SELECT(D), ∆new ≤ ∆prev + K (1 − z)
z ∈ {0,1}, z ≤ ∑ᵣ yᵣ(c)
Outline • Region-based Segmentation Model • Problem Formulation • Inference • Results
Datasets: PASCAL VOC 2009 + Stanford Background Dataset
PASCAL VOC 2009: 20 foreground classes, generic background class
Stanford Background Dataset: 7 background classes, generic foreground class
PASCAL VOC 2009: train 1274 images, validation 225 images, test 750 images
Stanford Background Dataset: train 572 images, validation 53 images, test 90 images
Baseline: closed-loop learning (CLL), Gould et al., 2009
Results
PASCAL VOC 2009: CLL 24.7%, LSVM 26.9% (improvement over CLL)
SBD: CLL 53.1%, LSVM 54.3% (improvement over CLL)
PASCAL VOC 2009 + 2010 and Stanford Background Dataset
PASCAL VOC: train 1274 images, validation 225 images, test 750 images, plus 1564 images with bounding boxes
Stanford Background Dataset: train 572 images, validation 53 images, test 90 images
Results
PASCAL VOC 2009: CLL 24.7%, LSVM 26.9%, BOX 28.3% (improvement over CLL)
SBD: CLL 53.1%, LSVM 54.3%, BOX 54.8% (improvement over CLL)
PASCAL VOC 2009 + 2010 and Stanford Background Dataset
PASCAL VOC: train 1274 images, validation 225 images, test 750 images, plus 1564 images with bounding boxes
Stanford Background Dataset: train 572 images, validation 53 images, test 90 images
Plus 1000 images with image-level labels (ImageNet)
Results
PASCAL VOC 2009: CLL 24.7%, LSVM 26.9%, BOX 28.3%, LABEL 28.8% (improvement over CLL)
SBD: CLL 53.1%, LSVM 54.3%, BOX 54.8%, LABEL 55.3% (improvement over CLL)
Types of Data: PASCAL VOC Segmentation Datasets
Specific foreground classes, generic background class
Types of Data: Stanford Background Dataset
Specific background classes, generic foreground class
Types of Data: PASCAL VOC Detection Datasets
Bounding boxes for objects; thousands of freely available images
Types of Data: ImageNet, Caltech, …
Image-level labels (e.g., "Car"); thousands of freely available images
Types of Data: Google Image Search, Flickr, Picasa, …
Noisy data from web search; millions of freely available images
Two Problems
• The "Noise" Problem: Self-Paced Learning
• The "Size" Problem: Self-Paced Learning