Rotational Rectification Network (R2N): Enabling Pedestrian Detection for Mobile Vision
Xinshuo Weng¹, Shangxuan Wu¹, Fares Beainy², Kris M. Kitani¹
¹Carnegie Mellon University, ²Volvo Construction Equipment
WACV 2018, Lake Tahoe
Pedestrian Detection
• Results on the Caltech dataset
Zhang et al. Is Faster R-CNN Doing Well for Pedestrian Detection? ECCV, 2016.
Arbitrary-Oriented Pedestrian Detection
• Random failure cases on the Caltech dataset.
Why is it interesting?
Imagine the cases:
• Mobile phones
• UAVs/drones
• Construction vehicles on rugged terrain
• Wearable cameras
• …
In the real world, camera orientation with respect to the ground can vary widely.
Modelling Rotation Invariance/Equivariance
Rotating the inputs
• Data augmentation
• TI-Pooling [Laptev et al., CVPR'16]
• …
• Cons: low efficiency, more parameters
Rotating the filters
• RotEqNet [Marcos et al., ICCV'17]
• ORNs [Zhou et al., CVPR'17]
• …
• Cons: approximated rotations, memory issues
Changing sampling grids
• Spatial Transformer [Jaderberg et al., NIPS'15]
• Deformable ConvNets [Dai et al., ICCV'17]
• GPPooling (Ours)
• …
Global Polar Pooling (GPPooling)
[Figure: inputs and the corresponding activations]
GPPooling vs Pooling
[Figure: GPPooling compared with standard pooling]
Noh et al. Learning Deconvolution Network for Semantic Segmentation. ICCV, 2015.
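The slides do not spell out how GPPooling is computed, so the following is only a minimal sketch of one plausible reading of "pooling on a polar sampling grid", not the authors' implementation. It assumes a single-channel square feature map, max-pooling, and hypothetical names and bin counts (`polar_pool`, `n_radii`, `n_angles`). The point it illustrates is that a rotation of the input about its center shows up as a circular shift along the angle axis of the pooled output.

```python
import numpy as np

def polar_pool(feat, n_radii=4, n_angles=8):
    """Max-pool a 2D feature map over (radius, angle) bins around its center.

    Hypothetical illustration only: bin counts and the max operator are assumptions.
    """
    h, w = feat.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    dy, dx = ys - cy, xs - cx

    # Polar coordinates of every pixel relative to the map center.
    radius = np.hypot(dx, dy)
    angle = np.arctan2(dy, dx) % (2.0 * np.pi)

    # Assign each pixel to a radius bin and an angle bin.
    r_max = min(h, w) / 2.0
    r_bin = np.floor(radius / r_max * n_radii).astype(int)
    a_bin = np.floor(angle / (2.0 * np.pi) * n_angles).astype(int) % n_angles

    # Pixels beyond r_max (the corners) are dropped so the pooled region is a disc.
    inside = r_bin < n_radii
    pooled = np.full((n_radii, n_angles), -np.inf)
    np.maximum.at(pooled, (r_bin[inside], a_bin[inside]), feat[inside])

    # Bins that received no pixel are set to zero (assumes non-negative activations).
    pooled[np.isinf(pooled)] = 0.0
    return pooled

# Toy input: a single active location, then the same map rotated by 90 degrees.
feat = np.zeros((9, 9))
feat[5, 7] = 1.0
pooled = polar_pool(feat)
pooled_rot = polar_pool(np.rot90(feat))

# With 8 angle bins, a 90-degree rotation moves the response by 2 bins.
shift_matches = np.array_equal(pooled_rot, np.roll(pooled, -2, axis=1))
```

In this toy setup the shift is exact only for rotations that are multiples of the bin width and for pixels away from bin boundaries; a learned module would have to tolerate the in-between cases.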
What is the Rotational Rectification Network (R2N)?
R2N = Rotation Estimation Module (including GPPooling) + Spatial Transformer
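The rectification half of this decomposition can be sketched as resampling the image on a rotated sampling grid, in the spirit of a rotation-only spatial transformer. This is an illustrative sketch, not the R2N code: the rotation estimation module is not implemented, the estimated angle is simply supplied by hand, and the function names (`bilinear_sample`, `rotate_about_center`, `rectify`) are hypothetical.

```python
import numpy as np

def bilinear_sample(image, src_y, src_x):
    """Sample `image` at fractional (src_y, src_x) positions, zero outside the image."""
    h, w = image.shape
    padded = np.pad(image, 1)  # one-pixel zero border
    # Shift into padded coordinates and clip so all indices stay valid.
    y = np.clip(src_y + 1.0, 0.0, h + 1.0)
    x = np.clip(src_x + 1.0, 0.0, w + 1.0)
    y0 = np.minimum(np.floor(y).astype(int), h)
    x0 = np.minimum(np.floor(x).astype(int), w)
    wy, wx = y - y0, x - x0
    top = (1 - wx) * padded[y0, x0] + wx * padded[y0, x0 + 1]
    bottom = (1 - wx) * padded[y0 + 1, x0] + wx * padded[y0 + 1, x0 + 1]
    return (1 - wy) * top + wy * bottom

def rotate_about_center(image, angle_rad):
    """Resample `image` on a grid rotated by `angle_rad` about the image center."""
    h, w = image.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    dy, dx = ys - cy, xs - cx
    cos_a, sin_a = np.cos(angle_rad), np.sin(angle_rad)
    # For every output pixel, the source location is the rotated grid point.
    src_x = cos_a * dx - sin_a * dy + cx
    src_y = sin_a * dx + cos_a * dy + cy
    return bilinear_sample(image, src_y, src_x)

def rectify(image, estimated_angle_rad):
    """Undo an estimated in-plane rotation (the estimate is assumed to be given)."""
    return rotate_about_center(image, -estimated_angle_rad)

# Toy round trip: tilt an image by 90 degrees, then rectify with the same angle.
image = np.arange(25, dtype=float).reshape(5, 5)
tilted = rotate_about_center(image, np.pi / 2)
upright = rectify(tilted, np.pi / 2)
```

In a full pipeline the angle passed to `rectify` would come from the rotation estimation module, and the rectified image (or feature map) would then go to an ordinary upright pedestrian detector.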
Take Home Messages
• GPPooling can be used to model global rotation equivariance/invariance in general CNNs.
• R2N is easy to plug in and improves performance on oriented detection without bells and whistles.