1. Voting objects, their shapes and more: A partial review of recent work by Bastian Leibe et al. Si, Zhangzhang
2. Contents 1. Background: Bag-of-words technique
2. Implicit shape model
3. Bottom-up proposal + Top-Down verification
4. Dealing with overlapping hypotheses
5. Tracking in dynamic scene
6. Beyond independent voting
3. Bag of Words An image is represented as a set of patterns (detected by an interest point detector and described by SIFT descriptors).
Inference is done by independent voting: each pattern observed in the image casts its own vote.
4. Bag of Words Different “types” (clusters) of patterns have different votes (weights).
Obtain the pattern types by vector quantization (clustering).
Learn the weights and the decision threshold with a support vector machine.
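The two steps above (quantize, then take a weighted vote) can be sketched as follows. This is a minimal illustration, not the pipeline from the reviewed papers: the 2-D toy descriptors stand in for SIFT vectors, and the codebook, `weights` and `bias` are made-up values that would normally come from clustering and SVM training.

```python
import math

def nearest_word(descriptor, codebook):
    """Vector quantization: index of the closest codebook center (Euclidean)."""
    return min(range(len(codebook)),
               key=lambda k: math.dist(descriptor, codebook[k]))

def bow_histogram(descriptors, codebook):
    """Count how many observed patterns fall into each pattern type."""
    hist = [0] * len(codebook)
    for d in descriptors:
        hist[nearest_word(d, codebook)] += 1
    return hist

def bow_score(hist, weights, bias):
    """Independent voting: each pattern adds the weight of its type."""
    return sum(w * h for w, h in zip(weights, hist)) + bias

# Hypothetical toy values (assumptions, not learned from data).
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
weights = [-0.5, 1.0, 2.0]   # would come from a linear SVM
bias = -1.0                  # plays the role of the threshold
descriptors = [(0.1, 0.0), (0.9, 1.2), (4.8, 5.1), (5.2, 4.9)]

hist = bow_histogram(descriptors, codebook)
score = bow_score(hist, weights, bias)
is_object = score > 0
```

With a linear classifier the decision is a sum of per-pattern contributions, which is why it can be read as voting.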
5. Implicit Shape Model Use patches themselves (scaled down to 25×25 pixels) as patterns.
A white-box way of voting:
A patch activates several codebook entries (clusters).
Each patch has one vote to cast. (fair?)
Cluster-i casts x% of its votes for category o, if x% of the patches in cluster-i are from category o.
The hypothesis that gets the most votes wins.
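A small sketch of this white-box voting rule. It assumes a patch splits its single vote evenly among the codebook entries it activates. The category names, cluster fractions and activations are invented for illustration.

```python
from collections import defaultdict

def category_votes(patch_activations, cluster_category_fraction):
    """patch_activations: for each patch, the list of activated cluster ids.
    cluster_category_fraction: cluster id -> {category: fraction of the
    cluster's training patches that came from that category}."""
    votes = defaultdict(float)
    for activated in patch_activations:
        if not activated:
            continue  # a patch that matches nothing casts no vote
        share = 1.0 / len(activated)  # one vote per patch, split evenly
        for cluster_id in activated:
            for category, fraction in cluster_category_fraction[cluster_id].items():
                votes[category] += share * fraction
    return dict(votes)

# Hypothetical codebook statistics and activations.
fractions = {
    0: {"car": 0.8, "cow": 0.2},
    1: {"car": 0.5, "cow": 0.5},
    2: {"cow": 1.0},
}
activations = [[0, 1], [0], [2, 1]]

votes = category_votes(activations, fractions)
winner = max(votes, key=votes.get)
```

Cluster 1 is split 50/50 between the two categories, so its votes do not help separate them.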
6. Implicit Shape Model Problem: uninformative patterns cast noisy votes, which overwhelm the informative votes.
Coping with voting noise:
Agglomerative clustering to ensure patches within a cluster are very similar. → Clutter patches form smaller, purer clusters.
Filter out noisy patches using segmentation.
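To illustrate the clustering idea, here is a minimal agglomerative clustering sketch that stops merging once the closest pair of clusters is farther apart than a cut-off. Average linkage, Euclidean distance on toy 2-D points, and the `max_distance` value are all assumptions; the slides do not state the measure actually used.

```python
import math

def average_link(a, b, points):
    """Mean pairwise distance between the members of clusters a and b."""
    total = sum(math.dist(points[i], points[j]) for i in a for j in b)
    return total / (len(a) * len(b))

def agglomerate(points, max_distance):
    """Merge the closest pair of clusters until none is within max_distance."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = average_link(clusters[x], clusters[y], points)
                if best is None or d < best[0]:
                    best = (d, x, y)
        if best[0] > max_distance:
            break  # remaining clusters are too dissimilar to merge
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters

# Three near-identical "patches" and two outliers (hypothetical data).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (3.0, 3.0), (9.0, 0.0)]
clusters = agglomerate(points, max_distance=0.5)
```

With a tight cut-off the outliers stay in singleton clusters instead of being absorbed into the compact group.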
7. Implicit Shape Model Vote not only for object categories, but also
1. object center, scale
2. pixel-wise figure/ground labels (segmentation) for each object hypothesis
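Voting for the object center and scale can be sketched as accumulation in a discretized voting space. Everything below is an illustrative assumption: the stored offsets, the 10-pixel cell size, and the even split of a patch's vote over its offsets. Figure/ground voting is left out for brevity.

```python
from collections import defaultdict

def cast_center_votes(observations, codebook_offsets, cell=10):
    """observations: (x, y, scale, cluster_id) for each patch found in the image.
    codebook_offsets: cluster_id -> list of (dx, dy) offsets from the patch to
    the object center, as seen on unit-scale training objects.
    Returns a sparse accumulator keyed by (cell_x, cell_y, scale)."""
    accumulator = defaultdict(float)
    for x, y, scale, cluster_id in observations:
        offsets = codebook_offsets.get(cluster_id, [])
        for dx, dy in offsets:
            cx, cy = x + scale * dx, y + scale * dy  # predicted object center
            key = (int(cx // cell), int(cy // cell), scale)
            accumulator[key] += 1.0 / len(offsets)   # split the patch's vote
    return accumulator

# Hypothetical codebook: entry 0 was seen left of the center; entry 1 was
# seen right of the center and, on another training example, above it.
codebook_offsets = {0: [(20, 0)], 1: [(-20, 0), (0, 30)]}
observations = [(30, 50, 1.0, 0), (70, 50, 1.0, 1), (10, 10, 1.0, 0)]

accumulator = cast_center_votes(observations, codebook_offsets)
best = max(accumulator, key=accumulator.get)
```

Two patches agree on a center near (50, 50), so that cell collects the most votes.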
8. Implicit Shape Model: Model details P(c_i | e) is uniform on activated codebook entries.
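One way to write this down, as a sketch: the notation C(e) for the set of entries activated by patch e, o for the category, and x for the object position is assumed here and does not come from the slide.

```latex
p(c_i \mid e) = \frac{1}{|C(e)|} \quad \text{for } c_i \in C(e),
\qquad
p(o, x \mid e) = \sum_{c_i \in C(e)} p(o, x \mid c_i)\, p(c_i \mid e)
```

Under this reading, a patch's vote for (o, x) is the average of the votes of the codebook entries it activates.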
9. Hypotheses as maxima in voting space
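A minimal sketch of picking hypotheses as maxima, assuming the votes are already accumulated in a 2-D grid. The slide does not say how the maxima are searched for, so the strict 8-neighbor test and the `threshold` parameter are illustrative choices.

```python
def local_maxima(grid, threshold):
    """Return (row, col, value) for cells that are at least `threshold`
    and strictly greater than all of their 8-connected neighbors."""
    rows, cols = len(grid), len(grid[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            v = grid[r][c]
            if v < threshold:
                continue
            neighbors = [grid[rr][cc]
                         for rr in range(max(0, r - 1), min(rows, r + 2))
                         for cc in range(max(0, c - 1), min(cols, c + 2))
                         if (rr, cc) != (r, c)]
            if all(v > n for n in neighbors):
                peaks.append((r, c, v))
    return sorted(peaks, key=lambda p: -p[2])  # strongest hypothesis first

# Hypothetical vote counts.
grid = [
    [0, 1, 0, 0],
    [1, 5, 1, 0],
    [0, 1, 0, 2],
    [0, 0, 2, 4],
]
hypotheses = local_maxima(grid, threshold=3)
```

Each surviving peak is one object hypothesis, ranked by the number of votes supporting it.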
10. Patch distance, Cluster distance
11. Summary: Voting as Bottom-up Proposals Voting in Implicit Shape Model: a straightforward estimation of P(Y | X). X is data (patches in the above context), Y is label (category, position, segmentation).
Space of X is large: vector quantization.
Restrictions:
Voting is independent.
No hierarchy is present.
12. Top-down verification ISM: patches are too local, so the votes carry no global consistency.
Given a silhouette, we want to match it to the image in order to verify the hypothesis.
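As an illustration of silhouette verification, the sketch below scores a hypothesis by the mean distance from each silhouette point to its nearest image edge point (a chamfer-style distance). The slide does not name the matching measure, so this choice, the brute-force nearest-point search, and the acceptance threshold are assumptions.

```python
import math

def chamfer_distance(silhouette_points, edge_points):
    """Mean distance from each silhouette point to the nearest edge point."""
    total = sum(min(math.dist(s, e) for e in edge_points)
                for s in silhouette_points)
    return total / len(silhouette_points)

def verify(silhouette, edge_points, center, max_mean_distance):
    """Place the silhouette (offsets from the object center) at the
    hypothesized center and accept if it lies close to image edges."""
    placed = [(center[0] + dx, center[1] + dy) for dx, dy in silhouette]
    d = chamfer_distance(placed, edge_points)
    return d <= max_mean_distance, d

# Hypothetical silhouette (corners of a small square) and image edge points.
silhouette = [(-1, -1), (1, -1), (1, 1), (-1, 1)]
edge_points = [(9, 9), (11, 9), (11, 11), (9, 11), (0, 0)]

ok_good, d_good = verify(silhouette, edge_points, center=(10, 10), max_mean_distance=1.0)
ok_bad, d_bad = verify(silhouette, edge_points, center=(5, 5), max_mean_distance=1.0)
```

A hypothesis whose silhouette does not line up with image edges gets a large mean distance and is rejected, which is the global consistency check that local patch votes lack.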
13. Dealing with overlapping hypotheses
14. Dealing with overlapping hypotheses
15. Tracking in Dynamic Scene A system integrating Structure-from-Motion, object detection, and tracking components.
16. 2D Hypotheses using ISM detectors
17. Ground plane constraints
18. 3D hypothesis
19. Adding temporal information
20. Trajectory hypotheses
21. Beyond independent voting
22. Take-home messages Combine different cues (3D geometry + 2D appearance, static image + motion, bottom-up proposal + top-down verification, shape + color).
Codebook entries should be compact to reduce noisy votes (proposals).