SHAPE AND SHAPE MATCHING METRICS (May-05)
Peter Stiller, Ph.D.; Greg Arnold, Ph.D.
Sensors Directorate, Air Force Research Laboratory
AFRL/SNAT, Bldg 620; 2241 Avionics Circle; WPAFB OH 45433-7302; (937) 255-1115 x4388
Overview
• Motivation / Problem space
• Invariants & Shape
• Object-Image Relations
• Object-Image Metrics
• Frontiers & Summary
The Combat IDentification Problem
“Combat Identification is the process of attaining an accurate characterization of detected objects in the joint battlespace to the extent that high confidence, timely application of military options and weapons resources can occur.”
• Lots of objects to detect & identify
• Friends look like foes
• No two targets look alike
  • CC&D, damage, aging, clutter
  • Degrades CID performance
• Tough to extrapolate results beyond tested data
Hummel @ ATR Theory MURI WS
• ATR: the problems have changed
  • Probably want to use a different name; recognition theory?
  • Recognition problems remain
  • Intent of humans is paramount
• Distinguish sensor development from exploitation
  • Exploitation is the interpretation of sensor data
  • For favorable sensors, the exploitation is easier
  • But exploitation requires “analysis” of multi-sensor data
• Far-side ideas welcome
  • Too much ATR work is on the fringes
Same problem via O-I Duality: Example Retrieval Problems
• Objective 1: rapidly query a large database of 3-D model objects to identify likely candidates for an object appearing in an image.
• Objective 2: rapidly query a large database of 2-D images to find images likely to contain a particular 3-D object.
• Applications: Combat Identification; Targets in Urban Areas; Aircraft Identification
• Need: metrics for efficient searching
The Promise of Object-Image Metrics
• Today: exhaustive search of measured or synthetic data exemplars => prohibitive system & computational costs; poor unknown rejection
• Tomorrow: metrics enable highly efficient database search via the triangle inequality:
  • Faster search
  • Smaller databases (optimally quantized)
  • Metrics lead to an ‘absolute’ noise theory => reliable & predictable rejection of unknowns
ATR Operating Conditions (OCs)
OCs: everything that changes the sensor response. Most OCs are continuous.
• Target: type, articulation, stores, configuration, CC&D
• Sensor: resolution, looks, image quality
• Environment: weather, atmosphere, illumination, obscurants
• Backgrounds: urban, rural, desert; terrain slopes; cultural features; man-made confusers
• Sensor / Target: depression angle, aspect angle, squint angle
• Target / Background: adjacency, foreground / background, obscuration
EOCs impose Invariants

Parameter                  Quantized Bins
Object type                20
Object aspect              72
Depression angle           5
Articulation (1 DoF)       36
Configuration (4 binary)   16
Obscuration                400
Correspondence             20
Netting                    5
Total Online Hypotheses    1.6 x 10^11

Exhaustive enumeration is not tractable => Invariants
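The hypothesis count is simply the product of the quantized bins. A quick sketch (bin sizes taken from the table above) confirms the 1.6 x 10^11 figure:

```python
import math

# Quantized bins from the table above
bins = {
    "object type": 20,
    "object aspect": 72,
    "depression angle": 5,
    "articulation (1 DoF)": 36,
    "configuration (4 binary)": 16,
    "obscuration": 400,
    "correspondence": 20,
    "netting": 5,
}

total = math.prod(bins.values())
print(total)  # 165888000000, i.e. ~1.6 x 10^11 online hypotheses
```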
Problem Simplification
• Having said all that, let’s examine a problem for which we have some intuition:
  • 4 or 5 points undergoing rotation, translation, and maybe scale and skew
  • 1-D, 2-D, and 3-D
  • Understand the projection from world to sensor
• Advertisement: the Institute of Mathematics & its Applications (IMA) has a “Thematic Year of Imaging”
What is Shape?
• A pose- and scale-invariant, coordinate-independent characterization of an arrangement of features.
• The residual geometric relationships that remain between features after “mod-ing out” the transformation group action.
• Captured by a “shape space” where each distinct configuration of features (up to transformation) is represented by a single point.
Beyond Invariants
Invariants + Projection => Object-Image Relations
Generalized Weak Perspective
• Projection model applicable to optical images (pinhole camera)
• Approximates full perspective for objects in the ‘far field’
• Affine transformations on 3-space, and in the image plane (2-space)
• Denoted GWP
Affine Transformations
In 3D, an affine transformation (rotate, scale, skew, plus translate) acts on a 3-D point x as
  x' = A x + t
where A is an invertible 3x3 matrix and t is a translation 3-vector.
GWP Projection: 3D to 2D
Under generalized weak perspective, an object point p (3-D) maps to an image point q (2-D) by an affine projection:
  q = B p + c
where B is a 2x3 matrix of rank 2 and c is a 2-vector.
Object-Image Relation Motivation
Objects 1 and 2; Images 1 and 2:
• Image 1 is not equivalent to Image 2 (in 2-D)
• Object 1 is not equivalent to Object 2 (in 3-D)
Object-Image Relations Concept
“The relation between objects and images expressed independent of the camera parameters and transformation group”
(1) Write out the camera equations (geo or photo)
(2) Eliminate the group & camera parameters
(3) Recognize the result as a relation between the object and image invariants
But pure elimination is VERY difficult, even for polynomials.
Weak Perspective Object-Image Relations
WEAK PERSPECTIVE (Generalized)
• Parallel things remain parallel
• Valid when the object size is small relative to the distance from the camera (roughly 1/10 or less)
• (Standard Position Method)
Weak Perspective Camera
• 3-D Model: Pi = {xi, yi, zi}; N points (3N DOF); rotate, translate, scale, shear (12 constraints) => 3N-12 absolute invariants
• 2-D Image: qi = {ui, vi}; N points (2N DOF); rotate, translate, scale, shear (6 constraints) => 2N-6 absolute invariants
• Camera Model: N points (2N DOF); union of 2-D & 3-D (8 constraints) => 2N-8 relations
• Need 5 corresponded points (minimum)
3-D Invariants
• 3-D Model: Pi = {xi, yi, zi, 1}, 5 points; GL3 + translation (12 constraints) => 3N-12 absolute invariants
• An invariant is a function of ratios of 4x4 determinants [ijkl] = det(Pi Pj Pk Pl) of the homogeneous point coordinates
• A useful standard position maps the first four points to a fixed tetrahedron, e.g. (0,0,0), (1,0,0), (0,1,0), (0,0,1)
2-D Invariants
• 2-D Image: qi = {ui, vi, 1}, 5 points; GL2 + translation (6 constraints) => 2N-6 absolute invariants
• An invariant is a function of ratios of 3x3 determinants [ijk] = det(qi qj qk) of the homogeneous image coordinates
• A useful standard position maps the first three points to a fixed triangle, e.g. (0,0), (1,0), (0,1)
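As a sketch of the determinant-ratio idea, the snippet below (hypothetical point coordinates and an arbitrary choice of brackets) checks that a ratio with equally many brackets in numerator and denominator is unchanged by a 2-D affine map, since every bracket picks up the same determinant factor:

```python
import numpy as np

def bracket(Qh, i, j, k):
    # [ijk]: 3x3 determinant of homogeneous image points i, j, k
    return np.linalg.det(Qh[:, [i, j, k]])

def affine_invariant(q):
    # q: 2x5 image points; under q -> A q + t each bracket scales by
    # det of the homogeneous transform, so a 2-over-2 ratio is invariant
    Qh = np.vstack([q, np.ones(q.shape[1])])
    return (bracket(Qh, 0, 1, 4) * bracket(Qh, 0, 2, 3)
            / (bracket(Qh, 0, 1, 3) * bracket(Qh, 0, 2, 4)))

q = np.array([[0.0, 1, 0, 2, 1],
              [0.0, 0, 1, 1, 2]])       # hypothetical 5-point image
A = np.array([[2.0, 1.0], [0.0, 1.0]])  # arbitrary invertible 2x2
t = np.array([[3.0], [-1.0]])           # arbitrary translation
print(np.isclose(affine_invariant(q), affine_invariant(A @ q + t)))  # True
```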
Object-Image Relation: Generalized Weak Perspective Camera
(2-D standard position) = (camera transform) (3-D standard position)
Eliminate the camera transform parameters: the camera transform is pinned down by sending the first 4 object points to their image points; the remaining points satisfy the object-image relation iff the resulting consistency conditions (bilinear in the object and image brackets) vanish.
Object-Image Relation Abstraction
The object-image relations link the set of all objects that could have produced the image with the set of all images of the object.
GWP Shape Spaces
• The shape spaces in the GWP case are Grassmann manifolds
• In 3D: Gr(n-4, H), or dually the Schubert cycle of 4-planes in Gr(4, n) which contain (1,…,1); the manifold has dimension 3n-12
• In 2D: Gr(n-3, H), or dually the Schubert cycle of 3-planes in Gr(3, n) which contain (1,…,1); the manifold has dimension 2n-6
• H is the subspace of n-space orthogonal to the vector (1,…,1)
Why Grassmannians?
• We associate to our object data, viewed as a linear transformation from n-space to 4-space, its null space K of dimension n-4.
• Likewise, to our image data in 2D we associate the null space L of dimension n-3.
Global Shape Coordinates
• Better than local invariants
• Come from an isometric embedding of the shape space in either Euclidean space or projective space
• Matching expressed in these coordinates degrades gracefully
Example in GWP
• 3D, n = 5 feature points
• Global shape coordinates are the Plücker coordinates (or dual Plücker coordinates) of the 4xn object data matrix or the 3xn image data matrix
Global Object-Image Relations
• General
  • If-and-only-if conditions
  • Overdetermined set of equations
• GWP
  • To match, K must be contained in L (iff)
  • This incidence condition can be expressed in terms of the global shape coordinates
  • For n = 5, there are 10 (non-independent) relations that look like:
    [1234][125] - [1235][124] + [1245][123] = 0
  • Locally only 2 of the 10 are independent, because the locus V of matching pairs (object shape, image shape) in the 7-dimensional product space X x Y has dimension 5, codimension 2
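A numerical sketch of one such relation (hypothetical object points, an arbitrary affine camera, and 0-based indices in the code) shows the bilinear combination of object 4-brackets and image 3-brackets vanishing for a consistent object-image pair:

```python
import numpy as np

# Hypothetical 5-point object (columns) and an arbitrary GWP camera
P = np.array([[0.0, 1, 0, 0, 0.3],
              [0.0, 0, 1, 0, 1.2],
              [0.0, 0, 0, 1, -0.7]])
B = np.array([[1.0, 0.2, 0.5],
              [-0.3, 1.1, 0.4]])   # 2x3 affine part of the camera
c = np.array([[2.0], [-1.0]])      # image-plane translation
Q = B @ P + c                      # projected image points

Ph = np.vstack([P, np.ones(5)])    # 4x5 homogeneous object data
Qh = np.vstack([Q, np.ones(5)])    # 3x5 homogeneous image data

def ob(i, j, k, l):  # object bracket [ijkl]
    return np.linalg.det(Ph[:, [i, j, k, l]])

def im(i, j, k):     # image bracket [ijk]
    return np.linalg.det(Qh[:, [i, j, k]])

# [1234][125] - [1235][124] + [1245][123], with 0-based indices
r = ob(0,1,2,3)*im(0,1,4) - ob(0,1,2,4)*im(0,1,3) + ob(0,1,3,4)*im(0,1,2)
print(abs(r) < 1e-9)  # True: this object could have produced this image
```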
Beyond Object-Image Relations
Object-Image Relations + Matching => Object-Image Metrics
Why Metrics?
• We intuitively know that if we want to measure something we need a metric; ATR is no different. How far apart are these points?
• The triangle inequality provides efficient match searching
• Reliable & predictable rejection of unknowns
• Theoretical performance prediction
The Triangle Inequality Advantage
(u: image; x*: prototype object; xk: object from group; shown in shape space)
• Measure the distance from the image to each shape object? Better: measure the distance from the image to each shape prototype!
• Search a group iff the distance to its prototype is less than the sum of the max intragroup distance and the noise threshold.
Using the Triangle Inequality: Equivalent Grouping Decision
(u: image; x*: prototype object; xk: object from group; shown in shape space)
• Search the group iff the distance to the prototype is less than the sum of the max intragroup distance and the noise threshold.
• Reject anything beyond the noise threshold.
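A minimal sketch of this pruning rule, using Euclidean distance on hypothetical shape coordinates (in practice the distance would be the object-image metric):

```python
import numpy as np

def pruned_search(u, groups, noise_threshold):
    """Nearest-neighbor search that skips whole groups via the triangle
    inequality: if d(u, prototype) > max intragroup distance + threshold,
    then no member of the group can be within the threshold of u."""
    best_match, best_dist = None, np.inf
    for prototype, members in groups:
        d_proto = np.linalg.norm(u - prototype)
        radius = max(np.linalg.norm(m - prototype) for m in members)
        if d_proto > radius + noise_threshold:
            continue  # whole group rejected without measuring its members
        for m in members:
            d = np.linalg.norm(u - m)
            if d < best_dist:
                best_match, best_dist = m, d
    return best_match, best_dist
```

If every group is pruned, the search returns no match, which is exactly the reliable "unknown" rejection the slides describe; only surviving groups are searched exhaustively.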
What are Shape and Distance?
• Shape: what is left after translation & rotation are removed (more generally, the group)
• This is the (Partial) Procrustes definition of distance:
  d(X, Y) = min over R, T of || X - (Y R + T) ||
  where R represents rotation and T represents translation
• Full Procrustes additionally normalizes the size of the objects
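The partial Procrustes distance above (translation and rotation removed, no scaling) can be sketched with the standard SVD alignment; this is a generic illustration, not the authors' code:

```python
import numpy as np

def partial_procrustes_distance(X, Y):
    """d(X, Y) = min over rotation R, translation T of ||X - (Y R + T)||.
    X, Y: n x d arrays of corresponded points."""
    Xc = X - X.mean(axis=0)        # the optimal translation centers both sets
    Yc = Y - Y.mean(axis=0)
    U, _, Vt = np.linalg.svd(Yc.T @ Xc)
    if np.linalg.det(U @ Vt) < 0:  # keep R a proper rotation (no reflection)
        U[:, -1] *= -1
    R = U @ Vt                     # optimal rotation aligning Yc to Xc
    return np.linalg.norm(Xc - Yc @ R)
```

Two configurations differing only by a rotation and a translation then have distance zero, i.e. the same shape.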
New Metrics? Any ol’ metric just won’t do…
• Invariant to translation & rotation of the 3-D object (+ more)
• Invariant to the camera projection (+ discretization)
• This leads to the concept of Object-Image Relations (O-IRs)
  • Incomplete: O-IRs are only surrogate metrics
  • = 0 iff the object and image features are consistent
• Object-Image Metrics satisfy all the metric properties
• Shape space is NOT Euclidean!
• There is some evidence that human similarity perception is not always metric
Metrics on the Shape Spaces
• How to compare objects to images!
• We want a natural shape-matching metric
  • Invariant to transformations of the 3D or 2D data, e.g. rotations, translations, or scale of the object or image
  • Generalized Weak Perspective
• We use the natural Riemannian metric on the Grassmannian to measure distances between object shapes and image shapes
• This involves the so-called principal angles between subspaces and is easily computed from the original data matrices via QR decomposition and SVD
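The QR + SVD recipe can be sketched as follows; this is the generic principal-angles computation for subspaces spanned by data-matrix columns, under the assumption that the geodesic Grassmannian distance is the 2-norm of the angle vector:

```python
import numpy as np

def principal_angles(A, B):
    """Principal angles between the column spans of A (n x p) and B (n x q)."""
    Qa, _ = np.linalg.qr(A)   # orthonormal basis for span(A)
    Qb, _ = np.linalg.qr(B)   # orthonormal basis for span(B)
    # Singular values of Qa^T Qb are the cosines of the principal angles
    cos = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return np.arccos(np.clip(cos, -1.0, 1.0))

def grassmann_distance(A, B):
    # Natural Riemannian (geodesic) distance on the Grassmannian
    return np.linalg.norm(principal_angles(A, B))
```

Two matrices with the same column span are at distance zero, so the distance depends only on the subspaces, exactly what a shape metric modulo the group action requires.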
Object-Image Metrics
Two ways to compute an “object to image” distance:
1. Object space: compute the minimum distance in object space from the given object to the set of all objects capable of producing the given image.
2. Image space: compute the minimum distance from the given image to the set of all images produced by that object.
Object-Image Metrics & Duality
• Object shape space: all objects that could have produced the image.
• Image shape space: all images of the object.
• The object-image relations connect the two.
Duality Theorem: matching can (in principle) be performed in either object or image space without loss of performance!
Duality
• Theorem: with suitable normalization, these metrics are the same!
• In the GWP case this distance turns out to be the distance between two subspaces of different dimension, defined again using principal angles.
Image Geodesics
• 2 random images, and the geodesic between them
• Not linear
• Not the projection of a line
• Not even coplanar
Orthographic Shape Space: 3 Points in 1-D & 2-D
• 3 points modulo translation, rotation, reflection yields…
  • 1-D: surface of a 30° cone with axis along {1,1,1}
  • 2-D: interior of the cone
• {0,0,0} object @ origin
• {a,a,b} objects partition the cone
• Scale: lines through the origin
• Geodesics on this cone have the same length as the calculated image distance!
• Rotation on the ‘wrong’ side about the centroid rotates the cone (isotropy condition)
Object-Image Relations (1)
• Fix an object; the set of images it can produce…
• Always circumscribes the cone
• Not conic sections!
• An equilateral triangle produces a slightly smaller circle
• An ‘image’ produces a line to the origin
Object-Image Relations (2)
• Fix an image; the set of objects that could produce the given image…
• A “bent-over cone”
• Touches along the line through the origin and the image
• Eventually converges to the cone surface
• Large objects must be nearly collinear to produce the image
The Frontier
• Point correspondence: matching unlabeled (unordered) point sets
• From points to surfaces (discrete to continuous)
  • Points as surface samples
  • No 2 samples are ever exactly the same
  • Non-uniform sampling of the surface
• and…
Intrinsic Separability
• How many different shapes can I hope to identify?
  • Shape space as a unit volume
  • Epsilon balls defined by the metric
  • Noise balls generated by noise
• 5 points in 3D (Generalized Weak Perspective)
  • Epsilon in [0, 0.73] (max radius of ball)
  • P(random shape in epsilon ball) = 1.37 * epsilon
• Example: epsilon = 0.01
  • P(random shape in epsilon ball) = 0.014
  • Requires at least 73 balls to cover the shape space
  • There could be more or less efficient coverings
• Separability on a gross level
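The back-of-envelope step from ball probability to ball count can be sketched directly (constants taken from the slide):

```python
import math

eps = 0.01
p_ball = 1.37 * eps                  # P(random shape lands in one epsilon ball)
min_balls = math.ceil(1.0 / p_ball)  # any covering needs at least this many balls
print(round(p_ball, 3), min_balls)   # 0.014 73
```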
Epsilon Balls & Noise Analysis
• The set of all shapes at distance 1 from the given shape (image)
• The image + Gaussian IID noise added to each image point location (std. dev. 0.5)
• Still working on the analytic model of noise in the shape space
Unknowns Rejection
• Noise analysis is the basis for thresholds
• Close-confuser results are transferable
Summary
• Object Recognition
  • EOCs imply the need for invariants
  • Projection implies the need for Object-Image Relations
  • Matching implies the need for an Object-Image Metric
• Results
  • Understanding the geometry
  • Object-Image Metrics
    • Orthographic, weak, and full perspective
    • 3-D world to 1-D, 2-D, and 3-D projection
    • Metric duality
  • Robust, reliable, and predictable performance