This paper discusses scene interpretation in computer vision and robotics, including scene reconstruction and recognition. It also presents a convex optimization-based algorithm for 3D reconstruction and monocular terrain recognition using a single camera.
Scene Interpretation in images and videos Chetan Jakkoju 200402009 CVIT
Scene interpretation A human can answer: • How many taxis? • How many cars? • What type of cars? • How many buildings? • How tall are the buildings? • What type of road junction? But a machine cannot!
Scene Interpretation Computer vision Robotics
Our interests(1) • Scene reconstruction ( planar scenes )
Our interests(2) • Scene recognition ( Outdoor roads )
Piecewise Planar Reconstruction using Convex Optimization (ACCV 2009)
Road Map • Introduction • Applications • Existing Solutions & Issues • New formulation using Convex Optimization
Introduction • Input: A set of images of a piecewise planar scene. • Output: A 3D model (plane normal, perpendicular distance) and camera parameters (rotation Ri, translation ti).
Applications • Robot navigation • Path planning • Inserting virtual objects • 3D reconstruction • A. Davison, I. Reid, N. Molton, and O. Stasse. MonoSLAM: Real-Time Single Camera SLAM. PAMI 2007. • R. Azuma, Y. Baillot, R. Behringer, S. Feiner, S. Julier, and B. MacIntyre. Recent advances in augmented reality. IEEE Computer Graphics and Applications, 21(6):34–47, 2001. • N. Snavely, S. M. Seitz, and R. Szeliski. Photo tourism: Exploring photo collections in 3D. SIGGRAPH 2006.
Homography • Simple scenario
Existing solutions • SVD-based methods (decompose the homography matrix) • Faugeras and Zhang methods • Problem: highly sensitive to noise • Bundle adjustment methods • Problems: • Iterative non-linear method • Large time and space requirements, in addition to correctness concerns.
Our Solution • New formulation in a convex optimization framework. • Advantages • Better solution than bundle adjustment. • Standard efficient solvers exist (proposed in the past 5 years).
Advances in Vision using Convex Optimization • Optimization algorithms in vision (MVG) • Optimal solutions exist for • H from point correspondences • Pose from the essential matrix • Convex optimization is mature enough! • F. Kahl. Multiple view geometry and the L-infinity norm. ICCV 2005. • R. Hartley and F. Kahl. Global optimization through searching rotation space and optimal estimation of the essential matrix. ICCV 2007. • S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, New York, NY, USA, 2004.
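Basic formulation • H matrix: $H = dR - t\,n^{\top}$ • Highly non-linear in the full parameter set (R, t, n, d). • Observation: fixing either the pose parameters (R, t) or the plane parameters (n, d) makes H linear in the remaining parameters.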
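The linearity observation can be checked numerically. The snippet below is a minimal sketch, assuming the slide's form $H = dR - t\,n^{\top}$; the helper name `compose_homography` and all numeric values are hypothetical, chosen only for illustration.

```python
import numpy as np

def compose_homography(R, t, n, d):
    """Plane-induced homography in the slide's form: H = d*R - t n^T."""
    return d * R - np.outer(t, n)

# Hypothetical pose: small rotation about the z-axis plus a translation.
theta = 0.1
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
t = np.array([0.2, -0.1, 0.05])

# Two hypothetical planes (normal, distance) and mixing weights.
n1, d1 = np.array([0.0, 0.0, 1.0]), 2.0
n2, d2 = np.array([0.0, 1.0, 0.0]), 3.0
a, b = 0.7, -1.3

# With (R, t) fixed, H is linear in (n, d).
lhs = compose_homography(R, t, a * n1 + b * n2, a * d1 + b * d2)
rhs = (a * compose_homography(R, t, n1, d1)
       + b * compose_homography(R, t, n2, d2))
linear_in_plane = np.allclose(lhs, rhs)

# With (n, d) fixed, H is linear in (R, t). This check ignores the
# constraint that R must stay a rotation matrix.
R2, t2 = np.eye(3), np.array([0.0, 0.3, 0.0])
lhs2 = compose_homography(a * R + b * R2, a * t + b * t2, n1, d1)
rhs2 = (a * compose_homography(R, t, n1, d1)
        + b * compose_homography(R2, t2, n1, d1))
linear_in_pose = np.allclose(lhs2, rhs2)
```

Both flags evaluate to true, which is what lets each half of the problem be handled by a linear or convex solver while the other half is held fixed.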
Formulation • Given H, decompose it into (n, d) and (R, t). • Calculate $H' = dR - t\,n^{\top}$ from the decomposed parameters. • $H \neq H'$ in general. • Goal: vary (n, d) and (R, t) so that H' comes close to H.
Algorithm • Given H • Decompose H to R,t,n,d • While • Optimize F(n,d) (update n,d) • Optimize F(R,t) (update t) • end
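The alternating loop can be sketched as follows. This is not the convex program from the paper: it swaps in plain least-squares updates with a Frobenius-norm residual, purely to illustrate the alternation. The function names, the fixed iteration count, and the starting values are all assumptions, and following the slide only t is updated in the pose step.

```python
import numpy as np

def compose(R, t, n, d):
    """H' = d*R - t n^T, the form used on the slides."""
    return d * R - np.outer(t, n)

def update_plane(H, R, t):
    """Fix (R, t); solve the linear least-squares problem for (n, d)."""
    A = np.zeros((9, 4))
    for i in range(3):
        for j in range(3):
            A[3 * i + j, j] = -t[i]     # coefficient of n[j] in H[i, j]
            A[3 * i + j, 3] = R[i, j]   # coefficient of d in H[i, j]
    x, *_ = np.linalg.lstsq(A, H.reshape(-1), rcond=None)
    return x[:3], x[3]

def update_translation(H, R, n, d):
    """Fix (R, n, d); the best t for t n^T ~ (d*R - H) in closed form."""
    return (d * R - H) @ n / (n @ n)

def alternate(H, R, t, n, d, iters=20):
    """Alternate the two updates and record the residual ||H - H'||."""
    residuals = [np.linalg.norm(H - compose(R, t, n, d))]
    for _ in range(iters):
        n, d = update_plane(H, R, t)
        t = update_translation(H, R, n, d)
        residuals.append(np.linalg.norm(H - compose(R, t, n, d)))
    return t, n, d, residuals

# Hypothetical ground truth and a perturbed starting point.
theta = 0.1
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
t_true = np.array([0.2, -0.1, 0.05])
n_true, d_true = np.array([0.0, 0.0, 1.0]), 2.0
H = compose(R, t_true, n_true, d_true)

rng = np.random.default_rng(0)
t0 = t_true + 0.05 * rng.standard_normal(3)
n0 = n_true + 0.05 * rng.standard_normal(3)
t_est, n_est, d_est, residuals = alternate(H, R, t0, n0, d_true + 0.05)
```

Because each step exactly minimizes the residual over its own block of variables, the recorded residuals never increase. A real implementation would also need a stopping criterion and a fixed scale for H.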
Extensions • Extension to multiple views • Not all planes may be visible in all views! • Solution: we use inter-view homographies (H23, H34, …)
Sample reconstructions • Synthetic house (showing “visual accuracy”) • Oxford model house • Baity Hill
Summary • Presented a convex-optimization-based algorithm for reconstruction. • Applicable to videos. • Synthetic and real experiments show promising results. • Future work: better optimization frameworks.
Problem • Classify each image region as one of: • Grass • Mud • Hard mud • Road • Other
Applications • Autonomous robot navigation • Path planning • Advanced driver assistance systems (figure labels: “Obstacle @ 18 m”, “Obstacle @ 10 m”)
Existing solutions (1) (in robotics) • Solve only a sub-problem: obstacle vs. non-obstacle • Use multiple costly sensors (lasers, ladars, etc.) • Though they perform well, they cannot “feel” the terrain surface.
Existing solutions (2) (in robotics) • A good solution is to use IMU sensors • Advantages: • They solve the much wider problem of recognizing various types of terrain. • Problems: • They can recognize the terrain only after traversing it (“short-sightedness”). • IMU sensors are also costly.
Ultimate goal • Solve the terrain recognition problem without costly sensors • Use just a single camera • Advantages: • Lightweight • Low power • No “short-sightedness” • Direct applications • in mini-robots • in driver assistance systems.
Dataset collection • Camera mounted on top of a car
Sample dataset • 25 videos, each 1 min long, covering different kinds of scenarios
Base method • Prepare a training set and a testing set. • In each image, every 16×16 image block acts as a training sample. • Extract a feature F from each block and train a classifier C.
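A minimal sketch of the base pipeline: cut an image into 16×16 blocks, extract one feature per block, and train a classifier. The mean-colour feature and the nearest-centroid classifier are stand-ins chosen for brevity, not the feature F or classifier C evaluated in the talk. The names `block_features` and `NearestCentroid` and the synthetic image are hypothetical.

```python
import numpy as np

BLOCK = 16  # block side length in pixels, as on the slide

def block_features(image):
    """Mean colour of each non-overlapping 16x16 block, in row-major order."""
    h, w, c = image.shape
    rows, cols = h // BLOCK, w // BLOCK
    cropped = image[:rows * BLOCK, :cols * BLOCK]
    blocks = cropped.reshape(rows, BLOCK, cols, BLOCK, c)
    return blocks.mean(axis=(1, 3)).reshape(rows * cols, c)

class NearestCentroid:
    """Assigns each sample the label of the closest class mean."""

    def fit(self, X, y):
        self.labels_ = np.unique(y)
        self.centroids_ = np.stack(
            [X[y == k].mean(axis=0) for k in self.labels_])
        return self

    def predict(self, X):
        dists = np.linalg.norm(
            X[:, None, :] - self.centroids_[None, :, :], axis=2)
        return self.labels_[dists.argmin(axis=1)]

# Synthetic 32x64 image: green "grass" on top, grey "road" below.
image = np.zeros((32, 64, 3))
image[:16] = [0.2, 0.7, 0.2]
image[16:] = [0.5, 0.5, 0.5]

X = block_features(image)            # one row per block
y = np.array([0] * 4 + [1] * 4)      # 0 = grass, 1 = road (hypothetical ids)
clf = NearestCentroid().fit(X, y)
pred = clf.predict(X)
```

Any of the classifiers listed on the next slide could replace `NearestCentroid`, as long as it exposes the same `fit` and `predict` steps.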
Base method • Error rates on colour features with base classifiers • Naïve Bayes (NB) • Artificial neural networks • K-nearest neighbours • Support vector machines, linear (SVM-L) • Support vector machines, kernel (SVM-K) • Random forest (RF)
Interesting observations about the data • Relative position of different terrains • E.g., the probability of finding grass next to mud is greater than the probability of finding grass next to road. • The scale of texture varies mainly in the vertical direction.
Proposed method • Previously we trained one classifier on the whole image. • Training a different classifier on each image partition should “capture” the previous observations. • Note: the number of partitions increases in squares {2², 3², 4², …}
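The partition idea can be sketched as below: split the block grid into k×k partitions and fit one classifier per partition, so the same appearance can map to different terrains in different image regions. The nearest-centroid model, the helper names, and the toy data are assumptions for illustration only, not the classifiers used in the talk.

```python
import numpy as np

def partition_ids(rows, cols, k):
    """Map each block of a rows x cols grid to one of k*k partitions."""
    r = np.arange(rows) * k // rows
    c = np.arange(cols) * k // cols
    return r[:, None] * k + c[None, :]

def fit_centroids(X, y):
    """Stand-in classifier: one mean feature vector per class."""
    labels = np.unique(y)
    return labels, np.stack([X[y == c].mean(axis=0) for c in labels])

def predict_centroids(model, X):
    labels, centroids = model
    dists = np.linalg.norm(X[:, None] - centroids[None], axis=2)
    return labels[dists.argmin(axis=1)]

def train_partitioned(X, y, ids):
    """Train a separate model on the samples of each partition."""
    return {int(p): fit_centroids(X[ids == p], y[ids == p])
            for p in np.unique(ids)}

def predict_partitioned(models, X, ids):
    """Classify each sample with the model of its own partition."""
    out = np.empty(len(X), dtype=int)
    for p, model in models.items():
        mask = ids == p
        if mask.any():
            out[mask] = predict_centroids(model, X[mask])
    return out

# Toy 4x4 block grid with 2^2 partitions. The feature value 0.5 means
# class 1 in the top half but class 0 in the bottom half.
rows, cols, k = 4, 4, 2
grid = partition_ids(rows, cols, k)
ids = grid.reshape(-1)
y = np.tile(np.arange(cols), rows) % 2
top = ids < k
X = (np.where(top, 0.1, 0.5) + 0.4 * y)[:, None]

models = train_partitioned(X, y, ids)
pred = predict_partitioned(models, X, ids)
```

In this toy data a single global classifier sees the value 0.5 under both labels, whereas the per-partition models separate it cleanly.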
Experiment-1 • The partition-based method always decreases the error by ~10%!
Experiment-2 • Error decreased from 25% to 15%! • (Using 4-8 classifier sets is desirable)
Other enhancement: Label Transfer • Track features from previous frames using optical flow • Transfer the labels • Result: ~45% of the image is transferred
Cons • Memoryless • Does not perform well when the appearance of the terrain varies.
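Label transfer can be sketched on a block grid as below. This assumes a per-block flow field has already been estimated by some optical-flow tracker, which is not shown here. The function name, the `UNLABELLED` marker, and the block-unit flow representation are hypothetical.

```python
import numpy as np

UNLABELLED = -1  # marker for blocks that still need the classifier

def transfer_labels(prev_labels, flow):
    """Move block labels from the previous frame along the flow field.

    prev_labels: (rows, cols) integer label grid of the previous frame.
    flow: (rows, cols, 2) displacement of each block, in block units,
          stored as (d_row, d_col).
    If two source blocks land on the same target, the later one wins.
    """
    rows, cols = prev_labels.shape
    new_labels = np.full((rows, cols), UNLABELLED, dtype=int)
    for r in range(rows):
        for c in range(cols):
            if prev_labels[r, c] == UNLABELLED:
                continue
            nr = r + int(round(float(flow[r, c, 0])))
            nc = c + int(round(float(flow[r, c, 1])))
            if 0 <= nr < rows and 0 <= nc < cols:
                new_labels[nr, nc] = prev_labels[r, c]
    return new_labels

# Toy example: every block moves down by one row between frames.
prev = np.array([[0, 0, 0],
                 [1, 1, 1],
                 [2, 2, 2]])
flow = np.zeros((3, 3, 2))
flow[..., 0] = 1.0

new = transfer_labels(prev, flow)
transferred_fraction = (new != UNLABELLED).mean()
```

Blocks left as `UNLABELLED` are the ones that still need to be classified from scratch in the new frame.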
Adaptive algorithm • Track patches across the recent frames. • Use the tracked patches as new training data.
Experiment • Closed-loop test • ~5% absolute decrease in error, i.e., ~20% relative error-rate reduction • Figure: Road, Run 1 and Run 2
Summary • Presented a fast terrain-classification method. • Extended the method to adapt online. • Future work: more video processing methods.
Conclusions and Future work • New techniques in scene reconstruction and scene recognition. • Reconstruction of piecewise planar scenes. • Main advantages • Handles scenes in which not all planes are visible in all views. • We also add inter-view homographies to our framework. • Next, we address terrain recognition. • Our own challenging dataset. • We conducted various empirical studies. • Proposed two algorithms: the partition-based method and the adaptive algorithm. • Conducted several experiments to validate them.
Conclusions and Future work • Move from quasi-convex objective functions to convex objective functions. • Handling outliers. • In the partition-based algorithm, one could replace the simple mode operator with a weighted map. • The adaptive algorithm could be enhanced using state-of-the-art semi-supervised ML algorithms.
Publications • Visesh Chari, Anil Nelakanti, Chetan Jakkoju and C. V. Jawahar. “Piecewise Planar Reconstruction using Convex Optimization.” In Proceedings of the Asian Conference on Computer Vision (ACCV 2009). • Chetan J., Madhava Krishna and C. V. Jawahar. “Fast and Spatially-smooth Terrain Classification using Monocular Camera.” In Proceedings of the International Conference on Pattern Recognition (ICPR 2010). • Chetan J., Madhava Krishna and C. V. Jawahar. “An Adaptive Outdoor Terrain Classification Methodology using Monocular Camera.” In Proceedings of the International Conference on Intelligent Robots and Systems (IROS 2010).
Thank you chetan@research.iiit.ac.in