Scene Interpretation in images and videos

Scene Interpretation in images and videos Chetan Jakkoju 200402009 CVIT

Scene interpretation A human can answer: • How many taxis? • How many cars? • What type of cars? • How many buildings? • How tall are the buildings? • What type of road junction? But a machine cannot!

Scene Interpretation Computer vision Robotics

Some aspects

Our interests (1) • Scene reconstruction (planar scenes)

Our interests (2) • Scene recognition (outdoor roads)

Piecewise Planar Reconstruction using Convex Optimization (ACCV 2009)

Road Map • Introduction • Applications • Existing Solutions & Issues • New formulation using Convex Optimization

Introduction • Input: a set of images of a piecewise planar scene. • Output: a 3D model (normal and perpendicular distance of each plane) and camera parameters (rotation and translation, (Ri, ti)).

Applications • Robot navigation • Path planning • Inserting virtual objects • 3D reconstruction References: • A. Davison, I. Reid, N. Molton, and O. Stasse. MonoSLAM: Real-Time Single Camera SLAM. PAMI 2007. • R. Azuma, Y. Baillot, R. Behringer, S. Feiner, S. Julier, and B. MacIntyre. Recent advances in augmented reality. IEEE Computer Graphics and Applications, 21(6):34–47, 2001. • N. Snavely, S. M. Seitz, and R. Szeliski. Photo tourism: Exploring photo collections in 3D. SIGGRAPH 2006.

Homography • Simple scenario

Existing solutions • SVD-based methods (decompose the homography matrix) • Faugeras and Zhang methods • Problem: highly sensitive to noise • Bundle adjustment methods • Problems: • Iterative non-linear method • Large time and space requirements, in addition to correctness concerns.

Our Solution • New formulation in a convex optimization framework. • Advantages • Better solution than bundle adjustment. • Standard, efficient solvers exist (proposed in the past 5 years).

Advances in Vision using Convex optimization • Optimization algorithms in vision (MVG) • Optimal solutions exist for • H from point correspondences • Pose from the essential matrix • Convex optimization is mature enough! References: • F. Kahl. Multiple view geometry and the L-infinity norm. ICCV 2005. • R. Hartley and F. Kahl. Global optimization through searching rotation space and optimal estimation of the essential matrix. ICCV 2007. • S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, New York, NY, USA, 2004.

Basic formulation • H matrix: H = d R - t n^T • Highly non-linear in the joint unknowns (R, t, n, d). • Observation: fixing either the pose parameters (R, t) or the plane parameters (n, d) makes H linear in the remaining parameters.
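
The observation on this slide can be checked numerically. The following is a minimal sketch, assuming NumPy and the form H = d R - t n^T given above; the helper names (`rot_z`, `plane_homography`) and all numeric values are hypothetical and chosen only for illustration. For the pose case, R and t are treated as free matrix and vector entries, since a linear combination of rotations is generally not a rotation.

```python
import numpy as np

def rot_z(theta):
    """Rotation by theta radians about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def plane_homography(R, t, n, d):
    """H = d*R - t n^T for pose (R, t) and plane (n, d)."""
    return d * R - np.outer(t, n)

# Illustrative values only (not from the slides).
R1, t1 = rot_z(0.1), np.array([0.2, -0.1, 0.05])
R2, t2 = rot_z(0.4), np.array([-0.3, 0.1, 0.2])
n1, d1 = np.array([0.0, 0.0, 1.0]), 2.0
n2, d2 = np.array([0.0, 1.0, 0.0]), 3.0
a, b = 0.7, -1.3

# Pose fixed: H is linear in the plane parameters (n, d).
plane_linear = np.allclose(
    plane_homography(R1, t1, a * n1 + b * n2, a * d1 + b * d2),
    a * plane_homography(R1, t1, n1, d1) + b * plane_homography(R1, t1, n2, d2),
)

# Plane fixed: H is linear in the entries of (R, t).
pose_linear = np.allclose(
    plane_homography(a * R1 + b * R2, a * t1 + b * t2, n1, d1),
    a * plane_homography(R1, t1, n1, d1) + b * plane_homography(R2, t2, n1, d1),
)

# All four varied together: scaling every parameter by 2 scales H by 4,
# so H is not linear in the joint unknowns.
H1 = plane_homography(R1, t1, n1, d1)
H_scaled = plane_homography(2 * R1, 2 * t1, 2 * n1, 2 * d1)
jointly_linear = np.allclose(H_scaled, 2 * H1)
```

Here `plane_linear` and `pose_linear` hold while `jointly_linear` does not, which is the structure that motivates optimizing one group of parameters at a time.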

Formulation • Given H, decompose it into (n, d) and (R, t). • Calculate H’ = d R - t n^T from the decomposed parameters. • H != H’ in general. • Goal: vary (n, d) and (R, t) so that H’ is close to H.

Algorithm • Given H • Decompose H into R, t, n, d • While • Optimize F(n,d) (update n, d) • Optimize F(R,t) (update t) • end
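
The alternating structure of this loop can be sketched as follows. This is only an illustration under stated assumptions: it uses ordinary least squares (Frobenius norm) as a stand-in for the convex solver used in the work, keeps R fixed (the slide only updates t in the pose step), runs a fixed number of iterations because the slide does not state a stopping rule, and starts from a hypothetical perturbed initial guess instead of a real homography decomposition. All function names are hypothetical.

```python
import numpy as np

def compose(R, t, n, d):
    """H' = d*R - t n^T."""
    return d * R - np.outer(t, n)

def update_plane(H, R, t):
    """With (R, t) fixed, solve the linear least-squares problem for (n, d)."""
    A = np.zeros((9, 4))
    for i in range(3):
        for j in range(3):
            A[3 * i + j, j] = -t[i]     # coefficient of n_j in H'_ij
            A[3 * i + j, 3] = R[i, j]   # coefficient of d in H'_ij
    x, *_ = np.linalg.lstsq(A, H.reshape(9), rcond=None)
    return x[:3], x[3]

def update_translation(H, R, n, d):
    """With (R, n, d) fixed, the least-squares t has a closed form."""
    return (d * R - H) @ n / (n @ n)

def alternate(H, R, t, n, d, iters=20):
    """Alternate the two linear updates and record the residual ||H - H'||."""
    residuals = [np.linalg.norm(H - compose(R, t, n, d))]
    for _ in range(iters):
        n, d = update_plane(H, R, t)
        t = update_translation(H, R, n, d)
        residuals.append(np.linalg.norm(H - compose(R, t, n, d)))
    return t, n, d, residuals

# Synthetic example with illustrative values.
c, s = np.cos(0.2), np.sin(0.2)
R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
t_true = np.array([0.3, -0.2, 0.1])
n_true = np.array([0.0, 0.6, 0.8])
d_true = 2.0
rng = np.random.default_rng(0)
H = compose(R, t_true, n_true, d_true) + 0.01 * rng.standard_normal((3, 3))

# Perturbed starting point standing in for a noisy decomposition of H.
t0 = t_true + 0.1
n0 = n_true + np.array([0.1, -0.1, 0.05])
d0 = 1.2 * d_true
t_est, n_est, d_est, residuals = alternate(H, R, t0, n0, d0)
```

Because each step exactly minimizes the residual over its own block of parameters, the recorded residual never increases from one step to the next.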

Extensions • Extension to multiple views • Not all planes may be visible in all views! • Solution: we use inter homographies (H23, H34, …)

Sample reconstructions • Synthetic house (showing “visual accuracy”) • Oxford model house • Baity Hill

Summary • Presented a convex-optimization-based algorithm for reconstruction. • Applicable to videos. • Synthetic and real experiments show promising results. • Much better optimization frameworks in the future.

Part 2: Monocular Terrain Recognition (ICPR 2010 & IROS 2010)

Problem: classify the terrain as • Grass • Mud • Hard mud • Road • Other

Applications • Autonomous robot navigation • Path planning • Advanced driver assistance systems (figure labels: obstacle at 18 m, obstacle at 10 m)

Existing solutions (1) (in robotics) • Solve only a sub-problem: obstacle vs. non-obstacle • Use multiple costly sensors (lasers, ladars, etc.) • Though they perform well, they can’t “feel” the terrain surface.

Existing solutions (2) (in robotics) • A good solution is to use IMU sensors • Advantages: • Solve the much wider problem of recognizing various types of terrain. • Problems: • They can only recognize the terrain after traversing it (“short-sightedness”). • IMU sensors are also costly.

Ultimate goal • Solve the terrain recognition problem without using costly sensors • Using just a single camera • Advantages: • Lightweight • Low power • No “short-sightedness” • Direct applications • in mini-robots • in driver assistance systems.

Dataset collection • Camera attached on top of the car

Sample dataset • 25 videos, each 1 minute long, covering different kinds of scenarios

Base method • Prepare a training set and a testing set. • In each image, every 16x16 image block acts as a training sample. • Extract a feature F from each block and train a classifier C.
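
A minimal sketch of this block-based pipeline, assuming NumPy. The feature (mean colour of each 16x16 block) and the classifier (nearest class centroid) are simple stand-ins chosen for brevity, not the features or classifiers evaluated in the work, and the synthetic two-class image and all function names are hypothetical.

```python
import numpy as np

BLOCK = 16  # block size in pixels, as on the slide

def block_features(image):
    """Split an H x W x 3 image into 16x16 blocks and return the mean
    colour of each block as an array of shape (rows, cols, 3)."""
    h, w, c = image.shape
    rows, cols = h // BLOCK, w // BLOCK
    cropped = image[:rows * BLOCK, :cols * BLOCK].astype(float)
    blocks = cropped.reshape(rows, BLOCK, cols, BLOCK, c)
    return blocks.mean(axis=(1, 3))

def fit_centroids(features, labels):
    """Stand-in classifier: store one mean feature vector per class."""
    classes = np.unique(labels)
    centroids = np.stack([features[labels == k].mean(axis=0) for k in classes])
    return classes, centroids

def predict(features, classes, centroids):
    """Assign each feature vector to the class with the nearest centroid."""
    dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]

# Synthetic training image: left half "grass" (label 0), right half "road" (label 1).
train = np.zeros((32, 64, 3), dtype=np.uint8)
train[:, :32] = (40, 160, 40)
train[:, 32:] = (120, 120, 120)
train_labels = np.array([[0, 0, 1, 1], [0, 0, 1, 1]]).reshape(-1)

feats = block_features(train)
classes, centroids = fit_centroids(feats.reshape(-1, 3), train_labels)

# Test image: the same scene with small random colour noise.
rng = np.random.default_rng(0)
noise = rng.integers(-10, 11, size=train.shape)
test_image = np.clip(train.astype(int) + noise, 0, 255).astype(np.uint8)
predicted = predict(block_features(test_image).reshape(-1, 3), classes, centroids)
```

Any of the classifiers listed on the next slide could replace the nearest-centroid step, since the block extraction and the feature array stay the same.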

Base method • Error rates on color features and base classifiers • Naïve Bayes (NB) • Artificial neural networks • K-nearest neighbours • Support vector machines (linear) (SVM-L) • Support vector machines (kernel) (SVM-K) • Random forest (RF)

Interesting observations of data • Relative position of different terrains • E.g., the probability of a grass area being near a mud area is greater than that of a grass area being near a road area. • The scale of texture varies mainly in the vertical direction.

Proposed method • Previously we trained one classifier on the whole image. • Training a different classifier on each partition should “capture” the previous observations. • Note: the number of partitions increases in squares {2², 3², 4², …}
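
The routing of blocks to per-partition classifiers can be sketched as below. This is an illustration only, assuming the image is divided into a regular k x k grid of partitions (so k² partitions, matching the note above); the function names and the sample data are hypothetical, and the per-partition classifier itself is left out.

```python
def partition_index(row, col, n_rows, n_cols, k):
    """Map a block at (row, col) in an n_rows x n_cols block grid to the
    index of its partition in a k x k grid, numbered row by row."""
    p_row = row * k // n_rows
    p_col = col * k // n_cols
    return p_row * k + p_col

def group_by_partition(samples, n_rows, n_cols, k):
    """Group (row, col, feature, label) samples so that each partition's
    samples can be used to train that partition's own classifier."""
    groups = {p: [] for p in range(k * k)}
    for row, col, feature, label in samples:
        groups[partition_index(row, col, n_rows, n_cols, k)].append((feature, label))
    return groups

# Illustrative samples on an 8 x 8 block grid: (row, col, feature, label).
samples = [
    (0, 0, [0.1], "other"),
    (0, 7, [0.2], "other"),
    (7, 0, [0.8], "grass"),
    (7, 7, [0.9], "road"),
    (6, 6, [0.7], "road"),
]
groups = group_by_partition(samples, n_rows=8, n_cols=8, k=2)
```

With k = 2 the bottom-right partition (index 3) receives both road samples, so its classifier only has to model what typically appears in that part of the image.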

Experiment-1 • Always decreases the error by ~10%!

Experiment-2 • Error decreased from 25% to 15%! • (Using 4-8 classifier sets is desirable)

Experiment-3 (Smoothness test)

Other enhancement: Label Transfer • Track features from previous frames using optical flow • Transfer the labels • Result: labels for ~45% of the image are transferred
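
The transfer step can be sketched on a block grid as follows. This is a simplified illustration, not the tracking method used in the work: the optical flow is assumed to be already available as one (dx, dy) pixel displacement per block, and the function name, grid size and flow values are hypothetical.

```python
def transfer_labels(prev_labels, flow, block=16):
    """Move each block label of the previous frame along its flow vector.

    prev_labels: grid (list of lists) of labels for the previous frame.
    flow: grid of (dx, dy) pixel displacements, one per block (assumed given).
    Returns the label grid for the current frame; blocks that receive no
    label stay None and still need to be classified.
    """
    n_rows, n_cols = len(prev_labels), len(prev_labels[0])
    current = [[None] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            dx, dy = flow[r][c]
            # Block centre in pixels, displaced by the flow.
            x = (c + 0.5) * block + dx
            y = (r + 0.5) * block + dy
            nr, nc = int(y // block), int(x // block)
            if 0 <= nr < n_rows and 0 <= nc < n_cols:
                # If two blocks land on the same target, the later one wins.
                current[nr][nc] = prev_labels[r][c]
    return current

# Illustrative 2 x 3 grid where every block moves down by one block.
prev_labels = [["grass", "road", "road"],
               ["mud", "road", "road"]]
flow = [[(0, 16)] * 3 for _ in range(2)]
current = transfer_labels(prev_labels, flow)

n_blocks = len(current) * len(current[0])
transferred = sum(label is not None for row in current for label in row)
fraction_transferred = transferred / n_blocks
```

In this toy case the top row moves into the bottom row and the old bottom row leaves the image, so half of the blocks receive a transferred label and the rest must be classified from scratch.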

Cons • Memoryless • Does not perform well when the appearance of the terrain varies.

Adaptive algorithm • Track patches in the recent frames. • New training data

Adaptive algorithm

Experiment • Closed-loop test • ~5% (absolute) decrease in error, i.e., a ~20% relative reduction in error rate • (Figure: road, run 1 and run 2)

Demo

Summary • Presented a fast terrain classification method. • Extended the method to adapt online. • More video processing methods in the future.

Conclusions and Future work • New techniques in scene reconstruction and scene recognition. • Reconstruction of piecewise planar scenes. • Main advantages • Handles the case where not all planes are visible in all views. • Inter homographies are also added to our framework. • Next, we addressed terrain recognition. • Collected our own challenging dataset. • Conducted various empirical studies. • Proposed two algorithms: the partition-based method and the adaptive algorithm. • Conducted several experiments to validate them.

Conclusions and Future work • Move from quasi-convex objective functions to convex objective functions. • Handle outliers. • In the partition-based algorithm, one could replace the simple mode operator with a weighted map. • The adaptive algorithm could be enhanced using state-of-the-art semi-supervised ML algorithms.

Publications • Visesh Chari, Anil Nelakanti, Chetan Jakkoju and C. V. Jawahar. “Piecewise Planar Reconstruction using Convex Optimization.” In Proceedings of the Asian Conference on Computer Vision (ACCV 2009). • Chetan J., Madhava Krishna and C. V. Jawahar. “Fast and Spatially-smooth Terrain Classification using Monocular Camera.” In Proceedings of the International Conference on Pattern Recognition (ICPR 2010). • Chetan J., Madhava Krishna and C. V. Jawahar. “An Adaptive Outdoor Terrain Classification Methodology using Monocular Camera.” In Proceedings of the International Conference on Intelligent Robots and Systems (IROS 2010).

Thank you  chetan@research.iiit.ac.in
