This course provides an overview of topics such as human vision, camera models, image transformations, camera calibration, pixel-based image analysis, contrast description, and object recognition.
Formation et Analyse d’Images Session 10 Daniela Hall 5 December 2005
Course Overview • Session 1 (19/09/05) • Overview • Human vision • Homogeneous coordinates • Camera models • Session 2 (26/09/05) • Tensor notation • Image transformations • Homography computation • Session 3 (3/10/05) • Camera calibration • Reflection models • Color spaces • Session 4 (10/10/05) • Pixel based image analysis • 17/10/05 course is replaced by Modelisation surfacique
Course overview • Session 5 + 6 (24/10/05) 9:45 – 12:45 • Contrast description • Hough transform • Session 7 (7/11/05) • Kalman filter • Session 8 (14/11/05) • Tracking of regions, pixels, and lines • Session 9 (21/11/05) • Gaussian filter operators • Session 10 (5/12/05) • Scale Space • Session 11 (12/12/05) • Stereo vision • Epipolar geometry • Session 12 (16/01/06): exercises and questions
Session Overview • Scale space • Application: object recognition • Application: logo detection in sports videos
Scalability of Gaussian derivatives • Gaussian derivatives can be scaled due to the explicit scale parameter sigma.
Example: derivatives at several scales (σ = 1, 2, 4, 8, 16)
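The role of the explicit scale parameter can be illustrated with a minimal 1D sketch that samples a Gaussian and its first two derivatives for a given sigma. The function name `gaussian_kernels` and the truncation parameter `radius_factor` are assumptions made for this illustration, not part of the course material.

```python
import math

def gaussian_kernels(sigma, radius_factor=4.0):
    """Sample G, G' and G'' at integer positions for a given scale sigma.

    radius_factor (an assumption) truncates the kernels at radius_factor * sigma.
    """
    radius = int(math.ceil(radius_factor * sigma))
    xs = range(-radius, radius + 1)
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    # Zeroth order: the Gaussian itself.
    g = [norm * math.exp(-(x * x) / (2.0 * sigma ** 2)) for x in xs]
    # First derivative: -x / sigma^2 * G(x).
    g1 = [-(x / sigma ** 2) * v for x, v in zip(xs, g)]
    # Second derivative: (x^2 / sigma^4 - 1 / sigma^2) * G(x).
    g2 = [((x * x) / sigma ** 4 - 1.0 / sigma ** 2) * v for x, v in zip(xs, g)]
    return g, g1, g2

# The same three operators, rescaled only by changing sigma.
for sigma in (1, 2, 4, 8, 16):
    g, g1, g2 = gaussian_kernels(sigma)
    print(sigma, len(g), round(sum(g), 4))
```

Only sigma changes between the five kernel sets; the kernel support grows with it, which is why large scales are more expensive to compute directly.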
Scale of image features • It would be useful to know which scales of an object are interesting. • Interesting scales: scales at which an important structure is visible. • In the example: the interesting scale (if we are interested in the people) is somewhere between 2.0 and 4.0. • Scale space theory answers this question and provides a method to describe local features invariant to scale.
Scale of image features • Do these image features correspond?
Scale space representation • L is the scale space representation of I(x,y). • L is obtained by smoothing I with Gaussian kernel G(s). • G(s) is the natural choice for building up a scale space.
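A minimal sketch of this construction, reduced to a 1D signal to keep it short: L(x,s) is obtained by convolving the input with G(s) for each scale s. The helper names, the 4-sigma truncation, and the border replication are assumptions of this sketch.

```python
import math

def gaussian_kernel(sigma):
    """Sampled Gaussian, truncated at 4 * sigma and normalized to sum to 1."""
    radius = int(math.ceil(4 * sigma))
    weights = [math.exp(-(x * x) / (2.0 * sigma ** 2))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(signal, sigma):
    """Convolve a 1D signal with G(sigma), replicating the border samples."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # clamp index at the borders
            acc += w * signal[j]
        out.append(acc)
    return out

def scale_space(signal, scales):
    """Return a dict mapping each scale s to the smoothed signal L(., s)."""
    return {s: smooth(signal, s) for s in scales}

# A narrow bump: its peak flattens as the scale grows.
signal = [0.0] * 30 + [1.0] * 4 + [0.0] * 30
L = scale_space(signal, scales=[1, 2, 4, 8])
for s in sorted(L):
    print(s, round(max(L[s]), 3))
```

For an image I(x,y) the same idea applies with a 2D Gaussian, which is separable and can be applied as two 1D passes.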
Scale Space • Scale space representation [Figure: image I(x,y) and its scale space representation L(x,y,s), with axes x, y and scale s]
Intrinsic feature scale [Lindeberg98] • Measurement procedure: • At a given spatial point (x,y) we sample the scale direction by applying a normalized derivative at different scales (scale signature f). • Different operators can be used for f: • Laplacian • Gradient norm
Intrinsic feature scale [Lindeberg98] • Intrinsic scale: • The scale at which this normalized derivative assumes a maximum marks a feature containing interesting structure.
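The measurement procedure can be sketched in 1D as follows: at one position, apply a scale-normalized second derivative (the 1D counterpart of the Laplacian) at several scales and keep the scale with the strongest response. The function names, the geometric scale sampling, and the test signal are assumptions of this sketch.

```python
import math

def second_derivative_kernel(sigma):
    """Sampled second derivative of a Gaussian, truncated at 4 * sigma."""
    radius = int(math.ceil(4 * sigma))
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return [
        ((x * x) / sigma ** 4 - 1.0 / sigma ** 2)
        * norm * math.exp(-(x * x) / (2.0 * sigma ** 2))
        for x in range(-radius, radius + 1)
    ]

def scale_signature(signal, position, scales):
    """Scale signature f: |sigma^2 * L_xx| at one position, for each scale."""
    signature = []
    for sigma in scales:
        kernel = second_derivative_kernel(sigma)
        radius = len(kernel) // 2
        response = 0.0
        for k, w in enumerate(kernel):
            j = position + k - radius
            if 0 <= j < len(signal):  # samples outside the signal count as zero
                response += w * signal[j]
        signature.append(abs(sigma ** 2 * response))
    return signature

def intrinsic_scale(signal, position, scales):
    """Return the sampled scale at which the signature is maximal."""
    signature = scale_signature(signal, position, scales)
    best = max(range(len(scales)), key=lambda i: signature[i])
    return scales[best]

# Test signal: a Gaussian blob of width 4 centered at sample 100.
blob_width = 4.0
center = 100
signal = [math.exp(-((x - center) ** 2) / (2.0 * blob_width ** 2))
          for x in range(201)]
scales = [2 ** (k / 2.0) for k in range(9)]  # 1, 1.41, 2, ..., 16
print(intrinsic_scale(signal, center, scales))
```

For a 1D Gaussian blob the normalized response peaks at sqrt(2) times the blob width, so the selected scale follows the size of the structure. Without the sigma^2 normalization the response would simply decay with scale and no maximum would mark the feature.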
Scale invariant feature description • Given a feature with intrinsic scale • The description components are normalized for
Scale invariant feature description • Compute intrinsic feature scale for each imagette. • Project imagette on scale normalised Gaussian derivatives: S(x,y,s_k) = a_0 G(i,j,s_k) + a_1 G'(i,j,s_k) + a_2 G''(i,j,s_k) + ... • Extract scale invariant signature (coefficients and intrinsic scale s_k): v(S(x,y,s_k)) = (a_0, a_1, a_2, ..., s_k) • Store all signatures with image id and position.
Scale invariant feature description • Do these image features correspond? Yes, because the signatures are close in feature space. • dist(v(S(x,y,σ)), v(S(sx,sy,sσ))) < eps [Figure: corresponding features S(x,y,σ) and S(sx,sy,sσ)]
Object recognition • Training data: a set of images containing objects. • Test data: same objects in different configurations. • Goal: given a small number of measurements, recognize which objects are in the image.
Recognition systems • Learning phase: • sensing the world • describing the measurements • storage of measurement representation in appropriate data structure • Recognition phase: • sensing the world • describing the measurements • interpreting the measurements • compare with learned measurements • decide
My recognition system • Learning phase: • sensing the world with a digital color camera • discriminant feature description (texture & color) • storage in indexing structure • Recognition phase: • sensing the world • describing the measurements • interpreting the measurements • search possible matching candidates in indexing structure • recognition by vote or hypothesis verification
Choice of the feature space • The feature space should have few dimensions. • The description vector (set of Gaussian derivatives) should be chosen such that it captures the information necessary for recognition. • Feature vector (v,θ,σ) • Global information (id of source image, position within source image) • The description vector should be invariant to the transformations of the input data. • In case of viewpoint independent recognition, we require invariance to position, orientation, size, and affine transformations. • The choice is scale normalised Gaussian receptive fields oriented to the dominant direction of the imagette.
Feature description vector • describes local texture up to order 2 and local chrominance up to order 1 • description invariant to position, scale and orientation • similar features lie close together in feature description space
Building the model • Select representative training images • Scale invariant feature description • Storage in appropriate indexing structure for efficient retrieval
Recognition phase • Sensing the world • Describing the measurements • Interpreting the measurements • search possible matching candidates in the learned data • recognition by vote or hypothesis verification
Search matching candidates • Matching candidates are found by evaluating the distance in feature space. • All elements within a sphere of radius ε around the query vector are matching candidates. • Tree structures allow efficient nearest neighbor search.
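The sphere search can be sketched with a plain linear scan, assuming Euclidean distance and a flat list of (vector, object id) pairs. A tree-based index would replace the loop for efficiency; the function names and the sample vectors below are hypothetical.

```python
import math

def distance(u, v):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def matching_candidates(query, stored, radius):
    """Return (distance, object_id) for every stored vector inside the sphere.

    stored is a list of (vector, object_id) pairs (an assumed layout).
    """
    hits = []
    for vector, object_id in stored:
        d = distance(query, vector)
        if d <= radius:
            hits.append((d, object_id))
    return sorted(hits)  # nearest candidate first

# Hypothetical learned signatures with the object they came from.
stored = [
    ((0.10, 0.80, 0.30), "cup"),
    ((0.12, 0.79, 0.33), "cup"),
    ((0.90, 0.10, 0.50), "box"),
]
candidates = matching_candidates((0.11, 0.80, 0.31), stored, radius=0.05)
print(candidates)
```

The linear scan costs one distance evaluation per stored signature, which is the cost a tree structure is meant to avoid.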
Recognize by vote • Distance evaluation produces a list of matching candidates. For each measurement we know which object it came from (information stored during learning). • Every matching candidate votes for the object it came from. • The object with the highest number of votes is the winner (recognition result).
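A minimal sketch of the voting step, assuming each measurement has already produced a list of object ids from its matching candidates. The function name and the sample data are hypothetical.

```python
from collections import Counter

def recognize_by_vote(candidate_lists):
    """Count one vote per matching candidate and return (winner, votes).

    candidate_lists holds, for each measurement, the object ids of its
    matching candidates. Returns (None, votes) when nobody voted.
    """
    votes = Counter()
    for candidates in candidate_lists:
        for object_id in candidates:
            votes[object_id] += 1
    if not votes:
        return None, votes
    winner, _ = votes.most_common(1)[0]
    return winner, votes

# Four measurements; the third one found no matching candidate.
candidate_lists = [["cup", "cup", "box"], ["cup"], [], ["box", "cup"]]
winner, votes = recognize_by_vote(candidate_lists)
print(winner, dict(votes))
```

Ties are not handled specially here; a real system could fall back to hypothesis verification in that case.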
Experiment • Image indexing: 8 objects on uniform background. • Voting: • Select image features • Search matching candidates • The image for which the majority of the matching candidates vote gets a vote • Result: 99% recognition by votes after 9 different measurements
Open problems • Changes in viewpoint • Recognition under natural conditions (changes in lighting, viewing angle, …) • Capacity of tree structure is limited
Detecting Logos in Video • The video producer “sells” air-time of corporate publicity images (logos). • How can he measure the value of his product?
Detecting logos in video • Video publicity statistics • How many times does logo appear? • For each appearance • What was the duration? • How big was it? • How visible was it? • How near is it to the focus of attention? • Current estimates are obtained off-line by hand • This process is slow, costly and inaccurate.
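Some of these statistics follow directly from per-frame detections. The sketch below assumes a hypothetical input format: one logo area in pixels per frame, with 0 meaning no detection. It groups consecutive detections into appearances and reports duration and mean size for each.

```python
def logo_appearances(areas, fps):
    """Group consecutive frames with a detected logo into appearances.

    areas[i] is the logo area in pixels in frame i, or 0 when no logo was
    detected (an assumed input format). fps is the video frame rate.
    """
    appearances = []
    run = []
    for area in list(areas) + [0]:  # the trailing 0 closes a run at the end
        if area > 0:
            run.append(area)
        elif run:
            appearances.append({
                "duration_s": len(run) / fps,
                "mean_area": sum(run) / len(run),
            })
            run = []
    return appearances

# Nine frames containing two separate appearances of the logo.
areas = [0, 120, 130, 125, 0, 0, 400, 420, 0]
stats = logo_appearances(areas, fps=25)
print(len(stats), stats)
```

Visibility and distance to the focus of attention would need additional per-frame measurements that this sketch does not model.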
The challenge of logo detection • Challenge: • Natural outdoor scenes • Unconstrained view angles • Rapid pan-tilt-zoom of cameras • Occlusions • Similar color/texture • Unknown target position, deformations • Goal: real time collection of logo statistics from sports videos
Model acquisition • Learns model data for detection and identification • Data acquisition: • Hand label example frames • Semi automatic labeling (label first frame, track following frames) • Learning phase: • color histograms for detection and tracking • Gaussian derivative histograms for identification • Problem: training data must capture operating conditions
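A color histogram model can be sketched as follows. The joint RGB binning and the use of histogram intersection as the comparison measure are assumptions of this sketch; the slides do not state which histogram layout or similarity measure the system uses.

```python
def color_histogram(pixels, bins=4):
    """Normalized joint RGB histogram; pixels are (r, g, b) tuples in 0..255."""
    hist = [0.0] * (bins ** 3)
    for r, g, b in pixels:
        # Map each channel value 0..255 to a bin index 0..bins-1.
        ri, gi, bi = r * bins // 256, g * bins // 256, b * bins // 256
        hist[ri * bins * bins + gi * bins + bi] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist

def histogram_intersection(h1, h2):
    """Similarity in [0, 1] for normalized histograms (1 means identical)."""
    return sum(min(a, b) for a, b in zip(h1, h2))

# Hypothetical labeled logo pixels and two candidate image patches.
logo_pixels = [(250, 10, 10)] * 80 + [(255, 255, 255)] * 20
red_patch = [(240, 20, 5)] * 70 + [(250, 250, 250)] * 30
green_patch = [(10, 200, 10)] * 100

model = color_histogram(logo_pixels)
print(histogram_intersection(model, color_histogram(red_patch)))
print(histogram_intersection(model, color_histogram(green_patch)))
```

A patch whose colors resemble the labeled logo scores high and an unrelated patch scores low. Because the model is built from labeled frames only, it inherits the stated problem: colors not seen under the training conditions are not represented.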
Logo identification • Typical Example
Exercise • How can you count the number of flowers in the image and determine their scale?