810 likes | 1k Views
電腦視覺在微創手術之應用. 吳佳祥 義守大學 生物醫學工程學系. Computer Vision. What is Computer Vision?. Input: images or video Output: description of the world But also: measuring, classifying, interpreting visual information. Low-Level or “Early” Vision. Considers local properties of an image. “There’s an edge!”.
E N D
電腦視覺在微創手術之應用 吳佳祥 義守大學 生物醫學工程學系
What is Computer Vision? • Input: images or video • Output: description of the world • But also: measuring, classifying, interpreting visual information
Low-Level or “Early” Vision • Considers local properties of an image “There’s an edge!”
Mid-Level Vision • Grouping and segmentation “There’s an object and a background!”
High-Level Vision • Recognition “It’s a chair!”
Cognitive Psychology Signal Processing Artificial Intelligence Computer Graphics Pattern Analysis Metrology Vision and Other Fields Computer Vision
3D Reconstruction Tomasi+Kanade Forsyth et al. Phigin et al. Debevec,Taylor,Malik
Endoscopy system video equipment lens light source endoscope tip of endoscope
Minimally invasive surgery • Disadvantages • Limited visibility • Lack of physical feedback • Indirect hand-eye coordination • Poor depth perception. • Advantages • Reducing infection and scarring • Shortening hospital stay and recovery time • Less post-operative discomfort • Reduced costs
Minimally invasive surgery • Disadvantages • Limited visibility • Lack of physical feedback • Indirect hand-eye coordination • Poor depth perception. • Advantages • Reducing infection and scarring • Shortening hospital stay and recovery time • Less post-operative discomfort • Reduced costs Solution: 3D anatomical structure
Related works Lee et al. ’99
Related works Lee et al. ’99 Sinha et al. ’05 Hayashibe et al. ’02
Related works Lee et al. ’99 Sinha et al. ’05 Hayashibe et al. ’02 Mourgues et al. ’01
Related works Lee et al. ’99 Stoyanov et al. ’05 Sinha et al. ’05 Hayashibe et al. ’02 Mourgues et al. ’01
Related works Lee et al. Stoyanov et al. Burschka et al. Sinha et al. Hayashibe et al. Mourgues et al.
Problem formulation endoscopic video surgical instrument endoscope
3D positioning/tracking technology • 3D positioning/ tracking technology has been widely used in advanced minimally invasive surgical procedures Image-guided surgery Robotic surgery
Idea Endoscopic video 3D positioning technology ( Tracked surgical instrument) • Combine image and geometric constraints from positioning devices to reconstruct the structure
point i xi1 camera xi2 frame 1 xij frame j xin frame n Feature trajectories Feature trajectory for point i Feature trajectory for m feature points. (m=5) frame 2 frame 1 frame n
Generating feature trajectories Distortion correction Dealing with low-contrast images Dealing with un-even illumination Feature extraction Building correspondencesacross frames Outlier rejection
Generating feature trajectories • For low-contrast images • Color histogram equalization • For un-even illumination • Intensity normalization Distortion correction Dealing with low-contrast images Dealing with un-even illumination Feature extraction Building correspondencesacross frames Outlier rejection
Generating feature trajectories • Extracting corner feature points Distortion correction Dealing with low-contrast images Dealing with un-even illumination Feature extraction Building correspondencesacross frames Outlier rejection e1, e2: eigen values of Cw If min(e1, e2)> te, the point is a corner
Generating feature trajectories • KLT algorithm begins to track the corresponding point • Unknown displacement is determined by minimizing • In practice, we solve for Distortion correction Dealing with low-contrast images Dealing with un-even illumination Feature extraction Building correspondencesacross frames Outlier rejection
pk vij vij vjk pj pj vjk vjk pk vjk vij vij pi pi Generating feature trajectories • The typical movement of an endoscope is smooth and slow, so we assume the motion of feature points between two adjacent frames is smooth and small Distortion correction Dealing with low-contrast images Dealing with un-even illumination Feature extraction Building correspondencesacross frames Outlier rejection reliable unureliable
Performance evaluation • Two types of motion were simulated: linear and polyline • We set step size= 1 ( ) and 3 ( ) to represent small and large linear (or polyline) motion, yielding four kinds of synthetic videos: small linear, small polyline, large linear, and large polyline • Error measurements • number of tracked feature trajectories (NTFT) • the accuracy of tracking, defined as the root-mean-square (RMS) error between the actual and the computed feature trajectories
Computation time • Our method needs around 2.7 to 4.5 seconds of computation, depending on the number of detected features and the video size • Building correspondences requires the most time among all components; thus, further improvement could be focused on this step
Perspective projection where known intrinsic parameters
Paraperspective projection where • First-order approximation of perspective projection
Scaled measurement matrix W = MS feature trajectories object shape camera motion
Approaching perspective reconstruction in an iterative manner Pj perspective projection first iteration second iteration Paraperspective projection P0 optical axis center of projection last iteration Image plane • At the first iteration, the algorithm considers the true perspective projection of Pi. • At the last iteration, the image point positions were modified such that they fit the paraperspective projections.
Step 1. Create scaled measurement matrix W depends on • ij =0
Step 2. Singular value decomposition W = MS • Separate motion and shape • SVD of WW=UDVT • Enforcing rank constraint: • only consider the three largest singular values feature trajectories object shape camera motion The rank of M is three, if n 2 and the camera motion is neither pure translation nor pure rotation around the optical axis. In addition, the rank of S is also three, if the 3-D points are not coplanar and the number of points m 3. Therefore, the rank of W, which is the product of two rank-3 matrices, is three.
Step 3. Perturbation-based Euclidean reconstruction • The approximated motion and shape are not unique • Solution: find the best H Identity matrix
Perturbation-based Euclidean reconstruction • Euclidean reconstruction computes the optimal H by first finding a linear solution of Q = HHT • Cholesky decomposition ( or Jacobi decomposition) or over Q. • These decompositions only work for a symmetric positive definite matrix
Perturbation-based Euclidean reconstruction • Conventional approach: • Gradient-based, non-linear optimization is prone to be trapped in local minima when a poor initial estimate is employed. • We perturb Q by using a 33 diagonal matrixE' such thatQ+E'is positive definite, where E' is nonnegative, bounded, and should be zero in case of an already positive definite matrix • Perturbation approach: • No initial estimate of the solution • Quickly compute the desired decomposition.
Perturbation-based Euclidean reconstruction • Based on modified Cholesky decomposition [R.B. Schnabel et al. SIAM J. Optim. ’99;SIAM J. Sci. Statist. Comput.’90] • First, standard Cholesky decomposition is performed to avoid modifying any already positive definite matrix. • The jth column of the decomposed lower triangular matrix H is determined by , , where for i = j+1, ..., 3. At the (j+1)th iteration, Qj+1 is obtained by
Perturbation-based Euclidean reconstruction • To ensureHjjis the root of a positive value and to avoid over-perturbation, the jth iteration of the standard Cholesky decomposition is carried out only when • each diagonal element of is less than - • mini>jQii< -αj • αj, where 0 < 1 ( = 0.1 in our experiments), = (meps)1/3, = maxj , and meps is machine precision • If any of the conditions is not satisfied, we switch to the modified Cholesky decomposition. In the modified Cholesky decomposition, Hjj is determined by , and Qj+1 is computed by
Perturbation-based Euclidean reconstruction • The maximum diagonal element in Qj is pivoted to the top left position by interchanging its row and column with the pivot row and column, respectively. • The pivoting process is represented by a permutation matrix P, which is formed by reordering the columns of an identity matrix. • Pre- and post-multiplication of Q by PT and P respectively reorder the rows and columns of Q. • Therefore, PTQP+E = LLTwhere L is a lower triangular matrix, and E is given permutated.
Perturbation-based Euclidean reconstruction • With E' denoting the un-permutated version of E, we haveE = PTE'P. • Substituting this equation to the left-hand side of, we have PT(Q+E')P = LLT. • Because a permutation matrix is orthogonal, Q+E' = PLLTPT. • In this way, Q+E' =PL(PL)Twhich implies H = PL
Step 4. Input in-situ constraints • In-situ constraints • real 3D coordinates as detected by the (in our implemented example) tracked surgical instrument. • applied only to detected feature points.