Structural Learning from Iconic Representations
Herman M. Gomes* hmg@dsc.ufpb.br
Robert B. Fisher rbf@dai.ed.ac.uk
Institute of Perception, Action and Behaviour, Division of Informatics, Edinburgh University
* Supported by CNPq Brazil and DSC/COPIN/UFPB.

Overview
• Introduction
• Learning primitive object models
• Learning model relationships
• Case study
• Conclusions and future work
SBIA/IBERAMIA 2000, Atibaia SP

Introduction
• Traditional object recognition research
  • Geometric, symbolic or structure-based recognition: CAD-based 2D and 3D vision and 3D object recognition
  • Property, vector or feature-based recognition: specific feature vectors, multiple filtering, global descriptors for shape, texture and colour, amongst others
  • Iconic or image-based recognition: direct use of images, either complying with the traditional sensor architecture or using alternative representations
• This work fits in the intersection of the above areas

Introduction
• Question: Would it be possible to learn rigid geometric models from 2D image evidence (iconic object models) acquired from a sequence of scenes?

Introduction
• Mechanisms from biology
  • Foveated vision: a retina-like (log-polar) image representation has useful properties
  • Visual attention: fixation gives insight into where object features (or components) are likely to be found
  • Primal sketch: provides more compact representations of image data and cues for an attention mechanism

Introduction
• System's architecture
[Diagram with blocks labelled: Generic Scenes, Image, Foveate, Extract primal sketch planes, Feature planes, Cluster objects, Model base, Primitive models, Model relationships, Update attention, Attention Map]

Introduction
• Image representation
  • Gaussian receptive field function
  • Local contrast normalisation for estimating original reflectance information
  • Primal sketch features (edges, bars, blobs and ends) learned and extracted using a neural network approach
  • Log-polar

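To make the log-polar bullet concrete, here is a minimal sketch of log-polar resampling around a fixation point. It is not the authors' implementation: the function name `log_polar_sample`, its parameters, and the nearest-neighbour lookup are all assumptions made for illustration, and the Gaussian receptive fields and contrast normalisation listed above are left out.

```python
import math

def log_polar_sample(image, center, n_rings=32, n_sectors=64, r_min=1.0, r_max=None):
    """Nearest-neighbour log-polar resampling of a 2D grey-level image.

    `image` is a list of rows; `center` is the fixation point (row, col).
    Requires n_rings >= 2. All names and defaults are illustrative.
    """
    h, w = len(image), len(image[0])
    cy, cx = center
    if r_max is None:
        # Largest radius that stays inside the image.
        r_max = float(min(cy, cx, h - 1 - cy, w - 1 - cx))
    out = []
    for i in range(n_rings):
        # Ring radii grow geometrically from r_min to r_max.
        radius = r_min * (r_max / r_min) ** (i / (n_rings - 1))
        ring = []
        for j in range(n_sectors):
            angle = 2.0 * math.pi * j / n_sectors
            y = min(max(int(round(cy + radius * math.sin(angle))), 0), h - 1)
            x = min(max(int(round(cx + radius * math.cos(angle))), 0), w - 1)
            ring.append(image[y][x])
        out.append(ring)
    return out
```

In this layout a rotation of the scene about the fixation point becomes a cyclic shift along the sector axis, and a change of scale becomes a shift along the ring axis, which is one reason such a representation is convenient when comparing scaled and rotated features.
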
Introduction
• Retinal and log-polar images

Introduction
• Some of the extracted features

Learning primitive object models
• Definition
  Primitive iconic model: a set of regions or object instances that are similar to each other, organised into a geometric model and specified by means of the relative scales, orientations, positions and similarity scores for each pair of image regions

Learning primitive object models
[Figure: scene with image regions numbered 1 to 13]
Model X: {1, 2, 3, 4, 5, 6, 9, 10, 11}
Model Y: {7, 8, 12, 13}
Relative scales and orientations for model Y, as (scale, orientation):
Y    7         8           12          13
7    (1.0, 0)  (1.0, 180)  (0.8, 270)  (0.8, 90)
8              (1.0, 0)    (0.8, 90)   (0.8, 270)
12                         (1.0, 0)    (1.0, 180)
13                                     (1.0, 0)

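The entries for model Y appear to be mutually consistent: the relation between two regions can be derived from their relations to a third. The sketch below checks this for the values on the slide. The helper `implied_relation` is hypothetical, and it assumes that scales combine by division and that orientations are in degrees and combine by subtraction modulo 360; the slides do not state this rule.

```python
def implied_relation(rel_ab, rel_ac):
    """Relation b -> c implied by a -> b and a -> c.

    Each relation is (relative scale, relative orientation in degrees).
    Assumed rule: divide the scales, subtract the orientations modulo 360.
    """
    (s_ab, o_ab), (s_ac, o_ac) = rel_ab, rel_ac
    return round(s_ac / s_ab, 6), (o_ac - o_ab) % 360

# Row 7 of the model Y table on the slide.
rel_7_8, rel_7_12, rel_7_13 = (1.0, 180), (0.8, 270), (0.8, 90)

# Derived relations, to be compared with rows 8 and 12 of the table.
rel_8_12 = implied_relation(rel_7_8, rel_7_12)
rel_8_13 = implied_relation(rel_7_8, rel_7_13)
rel_12_13 = implied_relation(rel_7_12, rel_7_13)
```

Under these assumptions the three derived relations reproduce the table entries for the pairs (8, 12), (8, 13) and (12, 13).
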
Algorithm
for each scene image S_i do
    for each foveation point P_ij on the scene do
        obtain the object feature f_ij at the position P_ij
        if the model base is empty then
            create a new model and store f_ij in it
        else
            generate a set of scaled and rotated variations of f_ij
            find the model F_t that gives the highest similarity score C_max between its internal object features and one of the variations
            if C_max > threshold then
                store f_ij in F_t
                for all f_kl in F_t, store the similarity scores Sm(f_ij, f_kl), the relative scales rS(f_ij, f_kl) and the relative orientations rO(f_ij, f_kl)
            else
                create a new model and store f_ij in it

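The loop above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the similarity function and the generator of scaled and rotated variations are passed in as parameters, the dictionary layout of a model is invented here, and the recorded relative scale and orientation are taken to be those of the best-matching variation of the new feature. The toy feature type at the bottom, a `(kind, size, angle)` tuple, exists only to make the sketch runnable.

```python
def best_match(feature, stored, similarity, variations):
    """Best (score, rel_scale, rel_orientation) over all variations of `feature`."""
    return max(
        ((similarity(variant, stored), rs, ro) for rs, ro, variant in variations(feature)),
        key=lambda t: t[0],
    )

def learn_primitive_models(scenes, similarity, variations, threshold):
    """Cluster features into models; `scenes` is a list of feature lists."""
    models = []  # each model: {"features": [...], "relations": {(i, k): (Sm, rS, rO)}}
    for scene in scenes:
        for feature in scene:
            best_score, best_model = None, None
            for model in models:
                # Highest similarity between any stored feature and any variation.
                score = max(best_match(feature, s, similarity, variations)[0]
                            for s in model["features"])
                if best_score is None or score > best_score:
                    best_score, best_model = score, model
            if best_model is not None and best_score > threshold:
                new_index = len(best_model["features"])
                # Record Sm, rS and rO against every feature already in the model.
                for k, stored in enumerate(best_model["features"]):
                    best_model["relations"][(new_index, k)] = best_match(
                        feature, stored, similarity, variations)
                best_model["features"].append(feature)
            else:
                # Empty model base, or no model is similar enough.
                models.append({"features": [feature], "relations": {}})
    return models

# Toy stand-ins (hypothetical): a feature is (kind, size, angle in degrees).
def toy_variations(feature):
    kind, size, angle = feature
    for rs in (0.8, 1.0, 1.25):
        for ro in (0, 90, 180, 270):
            yield rs, ro, (kind, size * rs, (angle + ro) % 360)

def toy_similarity(a, b):
    same = a[0] == b[0] and abs(a[1] - b[1]) < 1e-9 and a[2] == b[2]
    return 1.0 if same else 0.0
```

A call such as `learn_primitive_models(scenes, toy_similarity, toy_variations, threshold=0.5)` returns one model per group of mutually similar features, each carrying the pairwise relations that the later graph stage relies on.
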
Learning model relationships
• Initial assumptions
  • The recognition of consistent geometric relations allows the inference of larger structured object models
  • At the moment we consider only 2D rigid body transformations

Learning model relationships
• Overview of the solution
  • A graph is built from the output of the previous algorithm
  • Vertices represent instances of an image neighbourhood found in the scenes
  • Edges represent a relationship between two neighbourhoods
  • Intra- and inter-model relationships are inferred by means of the cliques found in the graph

Learning model relationships
• Vertices
  • How to distinguish between object feature instances of the same type that are structurally linked together and those that are not?
  • How to establish the correspondence between sets of feature instances that appear at the same position in all objects?
  • We build hypotheses in the vertices by taking the full combinatorial set of object feature instances and applying an evaluation function that tells how good a hypothesis is

Learning model relationships
• Vertices
  • Defined as the Cartesian product of the sets of object feature instances of the same model class found in each of the images in the sequence [equations not recovered]
  • Missing features: a wild-card '*' is added to f_i before computing the Cartesian product

Learning model relationships
• Vertices
  • Ranking: function of the similarity between its internal object instances
  • Pruning: allow only K *'s per vertex during the vertex creation process (K << N)
  • We define: Sm(*, *) = Sm(*, f_j) = Sm(f_i, *) = 1

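Vertex creation as described on the vertex slides can be sketched with `itertools.product`. The names `build_vertices`, `pair_similarity` and `rank_vertex` are hypothetical. Two simplifications are assumed: vertices with too many wild-cards are filtered after the product is formed rather than during creation, and the ranking is taken to be the mean pairwise similarity, since the slides only say that it is a function of the similarity between the internal instances.

```python
from itertools import combinations, product

WILDCARD = "*"

def build_vertices(instances_per_image, max_wildcards):
    """Cartesian product of the per-image instance sets, each extended with '*'.

    Keeps only vertices with at most `max_wildcards` wild-cards (K on the slide).
    """
    extended = [list(instances) + [WILDCARD] for instances in instances_per_image]
    return [v for v in product(*extended) if v.count(WILDCARD) <= max_wildcards]

def pair_similarity(a, b, similarity):
    """Sm(*, *) = Sm(*, f) = Sm(f, *) = 1, as defined on the slide."""
    if a == WILDCARD or b == WILDCARD:
        return 1.0
    return similarity(a, b)

def rank_vertex(vertex, similarity):
    """Assumed ranking: mean pairwise similarity (needs at least two images)."""
    pairs = list(combinations(vertex, 2))
    return sum(pair_similarity(a, b, similarity) for a, b in pairs) / len(pairs)
```

With N images and up to m instances per image the unpruned product has up to (m + 1)^N entries, which is the exponential growth noted among the open issues at the end of the talk.
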
Learning model relationships
• Edges
  • An edge e(a, b) relates two compatible vertices
  • Two vertices a and b are defined as compatible if:
    1. For each pair of feature instances (a_i, a_j) in a, which are related by a given scale and orientation R = (rS(a_i, a_j), rO(a_i, a_j)), there exists another pair (b_i, b_j) in b whose components are related through a similar relative scale and orientation; and
    2. Each pair of feature instance co-ordinates (P_ai, P_bi) and (P_aj, P_bj), taken from the same vertex positions, roughly defines the same vector angle A and length D, Q = (A, D), when taking into account the features' relative scales and orientations.

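The two compatibility conditions can be sketched as a predicate. Everything concrete here is an assumption: each instance is a dictionary with an absolute `pos`, `scale` and `orient` (degrees), relative scale is taken as a ratio and relative orientation as a difference, the tolerances are arbitrary, and wild-cards are not handled. In the talk the relative scales and orientations come from the model-learning stage rather than from absolute values.

```python
import math

def angle_diff(a, b):
    """Smallest absolute difference between two angles in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def relation(p, q):
    """Assumed (rS, rO) between two instances of the same vertex."""
    return q["scale"] / p["scale"], (q["orient"] - p["orient"]) % 360.0

def vector(p, q):
    """(A, D): angle in degrees and length of the vector from p to q."""
    dx, dy = q["pos"][0] - p["pos"][0], q["pos"][1] - p["pos"][1]
    return math.degrees(math.atan2(dy, dx)) % 360.0, math.hypot(dx, dy)

def compatible(a, b, scale_tol=0.1, angle_tol=10.0, length_tol=0.1):
    """Check conditions 1 and 2 for vertices a and b (lists of instances)."""
    for i in range(len(a)):
        for j in range(i + 1, len(a)):
            rs_a, ro_a = relation(a[i], a[j])
            rs_b, ro_b = relation(b[i], b[j])
            # Condition 1: similar relative scale and orientation in a and b.
            if abs(rs_a - rs_b) > scale_tol * rs_a or angle_diff(ro_a, ro_b) > angle_tol:
                return False
            # Condition 2: the a-to-b vector in scene i, rotated and scaled by
            # the relation between scenes i and j, should match the one in scene j.
            ang_i, len_i = vector(a[i], b[i])
            ang_j, len_j = vector(a[j], b[j])
            if angle_diff(ang_i + ro_a, ang_j) > angle_tol:
                return False
            if abs(len_i * rs_a - len_j) > length_tol * len_j:
                return False
    return True
```

For example, if the second scene is the first one rotated by 90 degrees and doubled in size, two vertices that follow the same pair of features through both scenes pass both conditions, while a vertex whose second instance is displaced in the wrong direction fails condition 2.
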
Learning model relationships
• Understanding edge creation
  a = (a_1, a_2, ..., a_N)
  b = (b_1, b_2, ..., b_N)
  R_ij^a = (rS(a_i, a_j), rO(a_i, a_j))
  R_ij^b = (rS(b_i, b_j), rO(b_i, b_j))
  Q_i^ab = (A(P_ai, P_bi), D(P_ai, P_bi))
  Q_j^ab = (A(P_aj, P_bj), D(P_aj, P_bj))
[Figure: instances a_i, a_j, ..., a_N and b_i, b_j, ..., b_N shown in Scene i, Scene j, ..., Scene N, annotated with R_ij^a, R_ij^b, Q_i^ab and Q_j^ab]

Learning model relationships
• Edge ranking
[Figure: instances a_i, a_j, ..., a_N and b_i, b_j, ..., b_N in Scene i, Scene j, ..., Scene N, annotated with rS, rO and A, D]

Learning model relationships
• Comparing relative scales
[Figure: rS(a_i, a_j) and rS(b_i, b_j) marked between the instances in Scene i, Scene j, ..., Scene N]

Learning model relationships
• Comparing relative orientations
[Figure: rO(a_i, a_j) and rO(b_i, b_j) marked between the instances in Scene i, Scene j, ..., Scene N]

Learning model relationships
• Comparing vector angles
[Figure: A(a_i, b_i), D(a_i, b_i) and A(a_j, b_j), D(a_j, b_j) marked between the instances in Scene i, Scene j, ..., Scene N]

Learning model relationships
• Comparing vector lengths

Learning model relationships
• Edge pruning
  • Eliminate edges that link pairs of vertices with at least one common instance at the same position in the vertex list
  • Discard edges below a given threshold

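Both pruning rules fit in a few lines. The function `prune_edges` and the dictionary layout for scored edges are hypothetical, and the sketch assumes that a shared wild-card does not count as a common instance, which the slides leave open.

```python
def prune_edges(edges, threshold, wildcard="*"):
    """Apply the two pruning rules to `edges`, a dict {(vertex_a, vertex_b): score}.

    Vertices are tuples of instance identifiers, one position per scene.
    """
    kept = {}
    for (a, b), score in edges.items():
        # Rule 1: drop edges whose vertices share an instance at the same position.
        shares_instance = any(x == y and x != wildcard for x, y in zip(a, b))
        # Rule 2: drop edges whose score is below the threshold.
        if not shares_instance and score >= threshold:
            kept[(a, b)] = score
    return kept
```
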
Learning model relationships
• Cliques
  • Sets of vertices that are maximally connected
  • A standard algorithm is used to find cliques in the graph G = (V, E)
  • Ranking
  • Thresholding

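The slides do not name the clique-finding algorithm. As one standard choice, the sketch below uses the basic Bron-Kerbosch recursion without pivoting; the adjacency-dictionary input format and the function name `maximal_cliques` are assumptions.

```python
def maximal_cliques(adjacency):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion.

    `adjacency` maps each vertex to the set of its neighbours in an
    undirected graph. Returns a list of frozensets.
    """
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates to extend it, x: already explored.
        if not p and not x:
            cliques.append(r)
            return
        for v in list(p):
            expand(r | {v}, p & adjacency[v], x & adjacency[v])
            p = p - {v}
            x = x | {v}

    expand(frozenset(), set(adjacency), set())
    return cliques
```

The number of maximal cliques can grow exponentially with the number of vertices, so the vertex and edge pruning steps described earlier matter for keeping this stage tractable.
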
Case study
• Initial setup
  • Scenes created from synthetically generated variations on the same set of real input images
  • Interest points selected manually
  • Similarity function defined as the cross-correlation of grey level and primal sketch planes

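A similarity function of the kind named in the last bullet might look like the sketch below. The slides do not say how the planes are normalised or combined, so two assumptions are made: each plane is compared with zero-mean normalised cross-correlation, and the per-plane scores are averaged. The names `ncc` and `plane_similarity` are hypothetical.

```python
import math

def ncc(p, q):
    """Zero-mean normalised cross-correlation of two equally sized flat lists."""
    n = len(p)
    mean_p, mean_q = sum(p) / n, sum(q) / n
    num = sum((a - mean_p) * (b - mean_q) for a, b in zip(p, q))
    den = math.sqrt(sum((a - mean_p) ** 2 for a in p)
                    * sum((b - mean_q) ** 2 for b in q))
    # A constant plane has zero variance; treat it as uncorrelated.
    return num / den if den > 0 else 0.0

def plane_similarity(planes_a, planes_b):
    """Assumed combination: mean NCC over corresponding planes.

    Each argument is a list of planes (grey level first, then primal
    sketch planes), every plane flattened to a list of numbers.
    """
    scores = [ncc(pa, pb) for pa, pb in zip(planes_a, planes_b)]
    return sum(scores) / len(scores)
```
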
Primitive object feature models found

Cliques found

Conclusions and future work
• We provided some evidence that structured models can be learned in the context of an iconic vision system by using a graph-based representation and algorithm
• The relative positions of object features are recorded and, by using the features' relative scales and orientations, possible relationships can be inferred

Conclusions and future work
• There are still some issues that require further research:
  • Deciding on the best ranking functions and how they affect the final results
  • Learning primitive object models is computationally expensive
  • Vertex creation is exponential in the number of images
• More complex case studies are currently under development