ROBOT VISION Lesson 10: Object Tracking and Visual Servoing Matthias Rüther
Contents • Object Tracking • Appearance based tracking • Kalman filtering • Condensation algorithm • Model based tracking • Model fitting and tracking • Visual Servoing • Principle • Servoing Types
Tracking
Definition of Tracking • Tracking: • Given a sequence of images, draw conclusions about the motion of the scene, the objects, or the camera. • Knowing this motion, predict where things will project in the next image, so that we need not search the whole image for them.
Tracking a Silhouette by Measuring Edge Positions • Observations are positions of edges along normals to tracked contour
Why not Wait and Process the Set of Images as a Batch? • In a car system, for example, pedestrians must be detected and tracked in real time. • Recursive methods require less computation.
Implicit Assumptions of Tracking • Physical cameras do not move instantly from one viewpoint to another. • Objects do not teleport around the scene. • The relative position between camera and scene changes incrementally. • We can model the motion.
Related Fields • Signal Detection and Estimation • Radar technology
The Problem: Signal Estimation • We have a system with parameters • Scene structure, camera motion, automatic zoom • System state is unknown (“hidden”) • We have measurements • Components of stable “feature points” in the images. • “Observations”, projections of the state. • We want to recover the state components from the observations
Recursive Least Squares Estimation • We don’t want to wait until all data have been collected to get an estimate of the depth. • We don’t want to reprocess old data when we make a new measurement. • Recursive method: data at step i are obtained from data at step i-1.
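A minimal numeric sketch of this idea (illustrative, not from the slides): recursive least-squares estimation of a single constant depth value, where each new measurement refines the running estimate without reprocessing old data.

```python
import numpy as np

def rls_update(x_hat, P, z, R=1.0):
    """One recursive least-squares step for a scalar constant state.
    x_hat: current estimate, P: estimate variance,
    z: new measurement, R: measurement noise variance."""
    K = P / (P + R)                  # gain: how much to trust the new measurement
    x_hat = x_hat + K * (z - x_hat)  # blend old estimate with the innovation
    P = (1.0 - K) * P                # uncertainty shrinks with every measurement
    return x_hat, P

# estimate a constant depth of 5.0 from noisy measurements, one at a time
rng = np.random.default_rng(0)
x_hat, P = 0.0, 100.0                # vague prior
for _ in range(200):
    z = 5.0 + rng.normal(0.0, 1.0)
    x_hat, P = rls_update(x_hat, P, z)
```

Each step costs O(1) regardless of how many measurements came before, which is exactly the point of the recursive formulation.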
Least Squares Estimation of the State Vector of a Static System
Least Squares Estimation of the State Vector of a Static System (cont.)
Recursive Least Squares Estimation for a Dynamic System (Kalman Filter)
Estimation when System Model is Nonlinear (Extended Kalman Filter)
Recursive Least Squares Estimation for a Dynamic System (Kalman Filter)
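The Kalman filter slides above can be summarized in a small sketch (assumed toy setup: a 1-D point moving at constant velocity, observed through noisy position measurements; the noise covariances are illustrative choices):

```python
import numpy as np

# Constant-velocity model: state a = [position, velocity]
F = np.array([[1.0, 1.0],        # state transition (time step = 1)
              [0.0, 1.0]])
H = np.array([[1.0, 0.0]])       # we observe position only
Q = 0.01 * np.eye(2)             # process noise covariance
R = np.array([[1.0]])            # measurement noise covariance

def kalman_step(a, P, z):
    # predict state and covariance forward through the dynamics
    a_pred = F @ a
    P_pred = F @ P @ F.T + Q
    # update with the new measurement z
    S = H @ P_pred @ H.T + R             # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)  # Kalman gain
    a_new = a_pred + K @ (z - H @ a_pred)
    P_new = (np.eye(2) - K @ H) @ P_pred
    return a_new, P_new

# track a point moving at velocity 2.0 with noisy position readings
rng = np.random.default_rng(1)
a, P = np.zeros(2), 10.0 * np.eye(2)
for t in range(1, 100):
    z = np.array([2.0 * t + rng.normal(0.0, 1.0)])
    a, P = kalman_step(a, P, z)
```

Note that the filter recovers the velocity even though only position is measured; this is the prediction step exploiting the motion model.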
Tracking as a Probabilistic Inference Problem • Find distributions for the state vector a_i and for the measurement vector x_i. Then we are able to compute the expectations â_i and x̂_i. • Simplifying assumptions (same as for HMMs)
MODEL-BASED 3-D TRACKING IDEA: if the motion is caused by a known 3-D object, we can track the 3-D motion parameters, not just individual features! ADVANTAGES: - low dimensionality (3 rotations, 3 translations, independent of the number of features tracked) - mutually constrained motion instead of independently moving points LIMITATIONS: - 6 parameters only for rigid objects! Not articulated, not deformable. - assumes the 3-D model is known a priori
Example Algorithm [Wunsch, Hirzinger, IEEE Trans. RA 1997] SKETCH OF ALGORITHM: 0. Initialize the 3-D pose R_0, t_0 (rotation, translation) 1. Extract features from image I_t 2. Match image features with features of the 3-D model positioned at R_t-1, t_t-1 3. Evaluate a global error metric in 3-D space (note: not in image space) 4. Estimate R_t, t_t by aligning image and model features 5. Advance to the next frame and go to 1.
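The loop above can be sketched in code. This is a toy version under strong assumptions (point features instead of edges, noise-free measurements, and a closed-form Kabsch alignment standing in for the paper's linearized minimization); the function names are illustrative.

```python
import numpy as np

def kabsch(P, Q):
    """Closed-form rigid alignment: R, t minimizing sum ||R p_i + t - q_i||^2."""
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cP).T @ (Q - cQ)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cQ - R @ cP

def track_pose(model_pts, frames, R, t):
    """Per frame: place the model at the previous pose, match by minimum
    distance, re-estimate the 6 pose parameters (steps 1-5 of the sketch)."""
    for obs in frames:
        pred = model_pts @ R.T + t           # model at pose from previous frame
        # correspondences by minimum distance (valid for small displacements)
        idx = ((obs[:, None, :] - pred[None, :, :]) ** 2).sum(-1).argmin(axis=1)
        R, t = kabsch(model_pts[idx], obs)   # closed-form pose re-estimation
    return R, t

# synthetic check: a cube rotating 0.05 rad/frame about z, shifting along x
def Rz(a):
    return np.array([[np.cos(a), -np.sin(a), 0.0],
                     [np.sin(a),  np.cos(a), 0.0],
                     [0.0,        0.0,       1.0]])

model = np.array([[x, y, z] for x in (-1.0, 1.0)
                            for y in (-1.0, 1.0)
                            for z in (-1.0, 1.0)])
frames = [model @ Rz(0.05 * k).T + np.array([0.02 * k, 0.0, 0.0])
          for k in range(1, 6)]
R_fin, t_fin = track_pose(model, frames, np.eye(3), np.zeros(3))
```

Because the pose from the previous frame seeds the matching, the per-frame displacement stays small and nearest-neighbour correspondences remain valid, which is exactly the assumption listed on the later slide.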
Some Details FEATURES: for instance image edges. For an image edge with orientation θ and offset d (and s_x, s_y the camera scale factors), n ∝ (s_x sin θ, −s_y cos θ, −d)ᵀ is the normal of the 3-D plane through the image edge and the projection centre. (Figure: the 3-D plane through the image edge, with p, q on the corresponding model edge.) ERROR METRIC: evaluated in 3-D space for efficiency (no back-projection): orthogonality of n and the model edge, i.e. the residuals nᵀ(R p + t) for points p on the corresponding model edge.
Some Details MINIMISATION: using, say, 3 types of features, the total error is a sum of one term per feature type. Trick 1: approximating R with a differential rotation, R ≈ I + [ω]×, all E terms can be linearized, a linear system is obtained from the quadratic minimization, and a solution computed in closed form: e.g., for edges, E_edge = Σ (nᵀ((I + [ω]×) p + t))².
Some Details ... where the resulting linear system A x = b, with x = (ωᵀ, tᵀ)ᵀ, is (trick 2) solved iteratively at each time instant to reduce the linearization error; a few iterations suffice for small frame-to-frame displacements. NOTICE THE ASSUMPTIONS MADE: - rigid object - model known a priori - small frame-to-frame displacements - image-model feature correspondences known (if displacements are small, by minimum distance)
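The two tricks can be demonstrated concretely. A minimal sketch, assuming matched 3-D point features (rather than the edge residuals above): the differential-rotation approximation turns the pose error into a linear system in (ω, t), and iterating the linear solve recovers the full rotation.

```python
import numpy as np

def skew(w):
    """Cross-product matrix [w]x, so that skew(w) @ p == np.cross(w, p)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def rodrigues(w):
    """Rotation matrix for the axis-angle vector w (Rodrigues' formula)."""
    th = np.linalg.norm(w)
    if th < 1e-12:
        return np.eye(3)
    K = skew(w / th)
    return np.eye(3) + np.sin(th) * K + (1.0 - np.cos(th)) * (K @ K)

def refine_pose(p, q, iters=10):
    """Trick 1: R ~ I + [w]x linearizes the error; trick 2: iterate.
    p, q are matched 3-D point features (model and measurement)."""
    R, t = np.eye(3), np.zeros(3)
    for _ in range(iters):
        pc = p @ R.T + t                    # model at the current pose estimate
        # linear system A x = b, x = (w, dt), from pc + w x pc + dt ~ q
        A = np.zeros((3 * len(p), 6))
        for i, pi in enumerate(pc):
            A[3*i:3*i+3, :3] = -skew(pi)    # d(w x p)/dw = -[p]x
            A[3*i:3*i+3, 3:] = np.eye(3)    # derivative w.r.t. dt
        b = (q - pc).reshape(-1)
        x = np.linalg.lstsq(A, b, rcond=None)[0]
        dR = rodrigues(x[:3])               # lift the differential rotation back to SO(3)
        R, t = dR @ R, dR @ t + x[3:]
    return R, t

# recover a moderate rigid motion from exact correspondences
rng = np.random.default_rng(0)
p = rng.normal(size=(15, 3))
R_true = rodrigues(np.array([0.2, -0.1, 0.3]))
t_true = np.array([0.5, -0.2, 0.1])
q = p @ R_true.T + t_true
R_est, t_est = refine_pose(p, q)
```

Mapping the solved ω back through Rodrigues' formula (instead of using I + [ω]× directly) keeps the estimate on SO(3), so a handful of iterations converges even for rotations that are not truly small.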
Problems with Tracking • Initial detection • If it is too slow, we will never catch up. • If it is fast, why not run detection on every frame? • Even if raw detection can be done in real time, tracking saves processing cycles compared to detection in every frame. • The CPU has other things to do. • Detection is needed again whenever tracking is lost. • Most vision tracking prototypes rely on initial detection done by hand.
Visual Servoing • The vision system operates in a closed control loop. • Better accuracy than "look and move" systems. Figures from S. Hutchinson: A Tutorial on Visual Servo Control
Visual Servoing • Example: maintaining a relative object position Figures from P. Wunsch and G. Hirzinger: Real-Time Visual Tracking of 3-D Objects with Dynamic Handling of Occlusion
Visual Servoing • Camera configurations: end-effector mounted vs. fixed Figures from S. Hutchinson: A Tutorial on Visual Servo Control
Visual Servoing • Servoing Architectures Figures from S.Hutchinson: A Tutorial on Visual Servo Control
Visual Servoing • Position-based and image-based control • Position-based: • Alignment in the target coordinate system • The 3-D structure of the target is reconstructed • The end-effector is tracked • Sensitive to calibration errors • Sensitive to reconstruction errors • Image-based: • Alignment in image coordinates • No explicit reconstruction necessary • Insensitive to calibration errors • Only certain problems are solvable • Depends on the initial pose • Depends on the selected features (Figure: end-effector and target, and their images.)
Visual Servoing • EOL and ECL control • EOL: endpoint open-loop; only the target is observed by the camera • ECL: endpoint closed-loop; both the target and the end-effector are observed by the camera (Figure: EOL vs. ECL configurations.)
Visual Servoing • Position-based algorithm: • Estimate the relative pose • Compute the error between the current pose and the target pose • Move the robot • Example: point alignment (Figure: points p1, p2.)
Visual Servoing • Position-based point alignment • Goal: bring e to 0 by moving p1: e = |p2m − p1m|, u = k · (p2m − p1m) • pxm is subject to the following measurement errors: sensor position, sensor calibration, sensor measurement error • pxm is independent of the following errors: end-effector position, target position (Figure: measured points p1m, p2m at distance d.)
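The proportional control law on this slide can be simulated directly. A minimal sketch (idealized actuation, no measurement noise; the function name and loop structure are illustrative):

```python
import numpy as np

def servo_position_based(p1, p2, k=0.5, tol=1e-6, max_steps=100):
    """Proportional position-based servoing: move p1 toward the measured p2."""
    p1 = np.asarray(p1, float)
    p2 = np.asarray(p2, float)
    for _ in range(max_steps):
        e = np.linalg.norm(p2 - p1)   # e = |p2m - p1m|
        if e < tol:
            break
        u = k * (p2 - p1)             # control command u = k (p2m - p1m)
        p1 = p1 + u                   # ideal actuation of the end-effector
    return p1, e

# align p1 with a target point p2
p1_final, e_final = servo_position_based([0.0, 0.0, 0.0], [1.0, 2.0, 3.0])
```

With 0 < k < 1 and perfect actuation, the error shrinks by a factor (1 − k) per cycle; measurement errors in pxm would instead leave a residual offset, which is the sensitivity the slide points out.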
Visual Servoing • Image-based point alignment • Goal: bring e to 0 by moving p1: e = |u1m − v1m| + |u2m − v2m| • uxm, vxm are subject only to sensor measurement error • uxm, vxm are independent of the following measurement errors: sensor position, end-effector position, sensor calibration, target position (Figure: points p1, p2 with images u1, v1 and u2, v2 in cameras c1, c2.)
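A toy numerical version of image-based point alignment, under assumed conditions (two normalized pinhole cameras, a numerical image Jacobian standing in for the analytic interaction matrix, Gauss-Newton updates): the error is formed purely from image coordinates, and no 3-D reconstruction of the target is performed.

```python
import numpy as np

def project(p):
    """Normalized pinhole projection of a 3-D point in camera coordinates."""
    return np.array([p[0] / p[2], p[1] / p[2]])

def servo_image_based(p1, p2, cams, iters=10, h=1e-6):
    """Drive the image-space error to zero by Gauss-Newton on the stacked
    residuals; the target p2 enters only through its image coordinates."""
    p1 = np.asarray(p1, float)
    def residual(p):
        # stack (u_i - v_i) over all cameras, measured in the image
        return np.concatenate([project(R @ p + t) - project(R @ p2 + t)
                               for R, t in cams])
    for _ in range(iters):
        r = residual(p1)
        J = np.empty((len(r), 3))
        for i in range(3):               # numerical image Jacobian
            d = np.zeros(3)
            d[i] = h
            J[:, i] = (residual(p1 + d) - residual(p1 - d)) / (2.0 * h)
        p1 = p1 - np.linalg.lstsq(J, r, rcond=None)[0]
    return p1

# two cameras viewing the scene from different directions
a = 0.5
Ry = np.array([[np.cos(a), 0.0, np.sin(a)],
               [0.0, 1.0, 0.0],
               [-np.sin(a), 0.0, np.cos(a)]])
cams = [(np.eye(3), np.array([0.0, 0.0, 5.0])),
        (Ry, np.array([0.5, 0.0, 5.0]))]
p2 = np.array([0.3, -0.2, 0.4])
p1_final = servo_image_based([0.0, 0.0, 0.0], p2, cams)
```

Two distinct views are needed here: with a single camera, zero image error only constrains p1 to a ray, which is one of the "only certain problems are solvable" caveats of image-based control.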
Visual Servoing • Example: laparoscopy Figures from A. Krupa: Autonomous 3-D Positioning of Surgical Instruments in Robotized Laparoscopic Surgery Using Visual Servoing
Tracking using CONDENSATION CONditional DENSity propagATION M. Isard and A. Blake, CONDENSATION – conditional density propagation for visual tracking, Int. J. Computer Vision 29(1), 1998, pp. 5-28.
Goal • Model-based visual tracking in dense clutter at near video frame rates
Approach • Probabilistic framework for tracking objects such as curves in clutter using an iterative sampling algorithm. • Model motion and shape of target • Top-down approach • Simulation instead of analytic solution
Probabilistic Framework • Object dynamics form a temporal Markov chain • Observations, z_t, are independent (mutually and w.r.t. the process) • Use Bayes' rule
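Combining the two assumptions via Bayes' rule yields the standard recursive propagation of the state density (writing Z_t = (z_1, …, z_t) for the observation history):

```latex
p(x_t \mid Z_t) \;\propto\; p(z_t \mid x_t)\, p(x_t \mid Z_{t-1}),
\qquad
p(x_t \mid Z_{t-1}) \;=\; \int p(x_t \mid x_{t-1})\, p(x_{t-1} \mid Z_{t-1})\, dx_{t-1}
```

The Markov assumption gives the prediction integral; the observation-independence assumption gives the product form of the update.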
Notation • X: state vector, e.g., the curve's position and orientation • Z: measurement vector, e.g., image edge locations • p(X): prior probability of the state vector; summarizes prior domain knowledge, e.g., from independent measurements • p(Z): probability of measuring Z; fixed for any given image • p(Z | X): probability of measuring Z given that the state is X; compares the image to the expectation based on the state • p(X | Z): probability of X given that measurement Z has occurred; called the state posterior
Tracking as Estimation • Compute the state posterior, p(X|Z), and select the next state to be the one that maximizes it (Maximum a Posteriori (MAP) estimate) • Measurements are complex and noisy, so the posterior cannot be evaluated in closed form • Particle filter (iterative sampling) idea: • Stochastically approximate the state posterior with a set of N weighted particles (s, π), where s is a sample state and π is its weight • Use Bayes' rule to compute p(X|Z)
Factored Sampling • Generate a set of samples that approximates the posterior p(X|Z) • Sample set s = {s(1), …, s(N)} generated from the prior p(X); each sample s(n) is assigned a weight π(n) (“probability”) proportional to the observation likelihood p(Z | X = s(n))
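A compact sketch of the resulting algorithm: a generic particle-filter step consistent with the factored-sampling description (this is an illustrative implementation, not the authors' code), demonstrated on a hypothetical 1-D state with random-walk dynamics and a Gaussian observation likelihood.

```python
import numpy as np

def condensation_step(samples, weights, dynamics, likelihood, rng):
    """One CONDENSATION iteration: resample by weight, predict each sample
    through the stochastic dynamics, then reweight by the observation
    likelihood p(z | x = s)."""
    N = len(samples)
    idx = rng.choice(N, size=N, p=weights)   # factored (re)sampling
    samples = dynamics(samples[idx], rng)    # temporal Markov-chain prediction
    weights = likelihood(samples)
    return samples, weights / weights.sum()  # normalized weights

# toy 1-D demo: track a state drifting at 0.5 per step through noisy observations
rng = np.random.default_rng(3)
N = 2000
samples = rng.normal(0.0, 2.0, N)            # sampled from the prior p(X)
weights = np.full(N, 1.0 / N)
for t in range(1, 30):
    z = 0.5 * t + rng.normal(0.0, 0.2)       # noisy observation of the true state
    samples, weights = condensation_step(
        samples, weights,
        dynamics=lambda s, rng: s + rng.normal(0.0, 0.3, size=s.shape),
        likelihood=lambda s: np.exp(-0.5 * ((s - z) / 0.5) ** 2) + 1e-300,
        rng=rng)
estimate = float(np.sum(weights * samples))  # posterior mean
```

Because the posterior is represented by samples rather than a parametric form, the same machinery handles the multi-modal densities that arise in dense clutter, where a Kalman filter's single Gaussian would fail.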