Lucas–Kanade method

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.
Computer vision technique for optical flow estimation

In computer vision, the Lucas–Kanade method is a widely used differential method for optical flow estimation developed by Bruce D. Lucas and Takeo Kanade. It assumes that the flow is essentially constant in a local neighbourhood of the pixel under consideration, and solves the basic optical flow equations for all the pixels in that neighbourhood, by the least squares criterion.

By combining information from several nearby pixels, the Lucas–Kanade method can often resolve the inherent ambiguity of the optical flow equation. It is also less sensitive to image noise than point-wise methods. On the other hand, since it is a purely local method, it cannot provide flow information in the interior of uniform regions of the image.

Concept

The Lucas–Kanade method assumes that the displacement of the image contents between two nearby instants (frames) is small and approximately constant within a neighborhood of the point $p$ under consideration. Thus the optical flow equation can be assumed to hold for all pixels within a window centered at $p$. Namely, the local image flow (velocity) vector $(V_x, V_y)$ must satisfy

$$\begin{aligned}
I_x(q_1) V_x + I_y(q_1) V_y &= -I_t(q_1) \\
I_x(q_2) V_x + I_y(q_2) V_y &= -I_t(q_2) \\
&\;\ \vdots \\
I_x(q_n) V_x + I_y(q_n) V_y &= -I_t(q_n)
\end{aligned}$$

where $q_1, q_2, \dots, q_n$ are the pixels inside the window, and $I_x(q_i), I_y(q_i), I_t(q_i)$ are the partial derivatives of the image $I$ with respect to position $x, y$ and time $t$, evaluated at the point $q_i$ and at the current time.

These equations can be written in matrix form $Av = b$, where

$$A = \begin{bmatrix} I_x(q_1) & I_y(q_1) \\ I_x(q_2) & I_y(q_2) \\ \vdots & \vdots \\ I_x(q_n) & I_y(q_n) \end{bmatrix}, \quad v = \begin{bmatrix} V_x \\ V_y \end{bmatrix}, \quad b = \begin{bmatrix} -I_t(q_1) \\ -I_t(q_2) \\ \vdots \\ -I_t(q_n) \end{bmatrix}.$$

This system has more equations than unknowns and is therefore usually over-determined. The Lucas–Kanade method obtains a compromise solution by the least squares principle. Namely, it solves the $2\times 2$ system

$$A^T A v = A^T b \quad\text{or}\quad v = (A^T A)^{-1} A^T b,$$

where $A^T$ is the transpose of matrix $A$. That is, it computes

$$\begin{bmatrix} V_x \\ V_y \end{bmatrix} = \begin{bmatrix} \sum_i I_x(q_i)^2 & \sum_i I_x(q_i) I_y(q_i) \\ \sum_i I_y(q_i) I_x(q_i) & \sum_i I_y(q_i)^2 \end{bmatrix}^{-1} \begin{bmatrix} -\sum_i I_x(q_i) I_t(q_i) \\ -\sum_i I_y(q_i) I_t(q_i) \end{bmatrix},$$

where the central matrix is the inverse of $A^T A$ and the sums run from $i=1$ to $n$.

The matrix $A^T A$ is often called the structure tensor of the image at the point $p$.
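The least-squares solve for a single window can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function name `lucas_kanade_flow` is hypothetical, the derivatives are passed in as flat lists (one entry per window pixel), and the sample values are made up so that the true flow is known in advance.

```python
def lucas_kanade_flow(ix, iy, it):
    """Least-squares flow (Vx, Vy) for one window.

    ix, iy, it: equal-length sequences holding I_x(q_i), I_y(q_i), I_t(q_i).
    """
    # Entries of the symmetric 2x2 matrix A^T A and of A^T b, with b_i = -I_t(q_i).
    sxx = sum(gx * gx for gx in ix)
    sxy = sum(gx * gy for gx, gy in zip(ix, iy))
    syy = sum(gy * gy for gy in iy)
    bx = -sum(gx * gt for gx, gt in zip(ix, it))
    by = -sum(gy * gt for gy, gt in zip(iy, it))

    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("A^T A is singular: the flow is not determined in this window")

    # Closed-form inverse of the 2x2 matrix applied to A^T b.
    vx = (syy * bx - sxy * by) / det
    vy = (sxx * by - sxy * bx) / det
    return vx, vy


# Synthetic window (made-up values): spatial derivatives chosen by hand, temporal
# derivative generated from a known flow via I_t = -(I_x V_x + I_y V_y).
ix = [1.0, 0.0, 2.0, -1.0]
iy = [0.0, 1.0, 1.0, 3.0]
true_v = (0.5, -0.25)
it = [-(gx * true_v[0] + gy * true_v[1]) for gx, gy in zip(ix, iy)]

vx, vy = lucas_kanade_flow(ix, iy, it)
print(vx, vy)
```

Because the synthetic data satisfy every pixel equation exactly, the solve returns the flow that generated them; with real image derivatives the result is a least-squares compromise.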

Weighted window

The plain least squares solution above gives the same importance to all $n$ pixels $q_i$ in the window. In practice it is usually better to give more weight to the pixels that are closer to the central pixel $p$. For that, one uses the weighted version of the least squares equation,

$$A^T W A v = A^T W b \quad\text{or}\quad v = (A^T W A)^{-1} A^T W b,$$

where $W$ is an $n\times n$ diagonal matrix containing the weights $W_{ii} = w_i$ to be assigned to the equation of pixel $q_i$. That is, it computes

$$\begin{bmatrix} V_x \\ V_y \end{bmatrix} = \begin{bmatrix} \sum_i w_i I_x(q_i)^2 & \sum_i w_i I_x(q_i) I_y(q_i) \\ \sum_i w_i I_x(q_i) I_y(q_i) & \sum_i w_i I_y(q_i)^2 \end{bmatrix}^{-1} \begin{bmatrix} -\sum_i w_i I_x(q_i) I_t(q_i) \\ -\sum_i w_i I_y(q_i) I_t(q_i) \end{bmatrix}.$$

The weight $w_i$ is usually set to a Gaussian function of the distance between $q_i$ and $p$.
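A minimal sketch of the weighted solve follows, assuming a square window stored in row-major order and an unnormalised Gaussian weight. The names `gaussian_weights` and `weighted_flow`, the value of `sigma`, and the sample derivatives are all illustrative assumptions.

```python
import math


def gaussian_weights(half, sigma):
    """Weights for a (2*half+1) x (2*half+1) window, row-major, centred on p."""
    return [
        math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
        for dy in range(-half, half + 1)
        for dx in range(-half, half + 1)
    ]


def weighted_flow(ix, iy, it, w):
    """Solve (A^T W A) v = A^T W b for one window, with W = diag(w)."""
    sxx = sum(wi * gx * gx for wi, gx in zip(w, ix))
    sxy = sum(wi * gx * gy for wi, gx, gy in zip(w, ix, iy))
    syy = sum(wi * gy * gy for wi, gy in zip(w, iy))
    bx = -sum(wi * gx * gt for wi, gx, gt in zip(w, ix, it))
    by = -sum(wi * gy * gt for wi, gy, gt in zip(w, iy, it))
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("A^T W A is singular")
    return (syy * bx - sxy * by) / det, (sxx * by - sxy * bx) / det


# 3x3 window with made-up derivatives and a known flow of (0.5, -0.25).
ix = [1.0, 2.0, 0.0, -1.0, 1.0, 3.0, 0.0, 2.0, -2.0]
iy = [0.0, 1.0, 2.0, 1.0, -1.0, 0.0, 3.0, 1.0, 1.0]
it = [-(0.5 * gx - 0.25 * gy) for gx, gy in zip(ix, iy)]

w = gaussian_weights(half=1, sigma=1.0)
vx, vy = weighted_flow(ix, iy, it, w)
print(vx, vy)
```

The central pixel gets weight 1 and the corner pixels the smallest weight. Since the sample data are noise-free, the weighted and unweighted solutions coincide here; the weighting only changes the answer when the pixel equations disagree.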

Use conditions and techniques

For the equation $A^T A v = A^T b$ to be solvable, $A^T A$ must be invertible; equivalently, its eigenvalues must satisfy $\lambda_1 \geq \lambda_2 > 0$. To limit sensitivity to noise, $\lambda_2$ is usually also required not to be too small. In addition, if $\lambda_1/\lambda_2$ is too large, the point $p$ lies on an edge, and the method suffers from the aperture problem. So for the method to work properly, $\lambda_1$ and $\lambda_2$ must both be large enough and of similar magnitude. This is the same condition used for corner detection. It follows that one can tell which pixels are suitable for the Lucas–Kanade method by inspecting a single image.
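The eigenvalue test can be sketched using the closed-form eigenvalues of a symmetric 2×2 matrix. The function names and the thresholds `min_eig` and `max_ratio` below are hypothetical; suitable values depend on image scale, window size and noise level.

```python
import math


def structure_tensor_eigenvalues(sxx, sxy, syy):
    """Eigenvalues (lam1 >= lam2) of the symmetric matrix [[sxx, sxy], [sxy, syy]]."""
    mean = 0.5 * (sxx + syy)
    radius = math.hypot(0.5 * (sxx - syy), sxy)
    return mean + radius, mean - radius


def is_trackable(sxx, sxy, syy, min_eig=1.0, max_ratio=10.0):
    """True if both eigenvalues are large enough and of similar magnitude.

    min_eig and max_ratio are illustrative thresholds, not standard values.
    """
    lam1, lam2 = structure_tensor_eigenvalues(sxx, sxy, syy)
    return lam2 >= min_eig and lam1 <= max_ratio * lam2


corner = is_trackable(10.0, 0.0, 8.0)   # gradients in both directions
edge = is_trackable(100.0, 0.0, 2.0)    # one dominant gradient direction
flat = is_trackable(0.0, 0.0, 0.0)      # uniform region, no gradient
print(corner, edge, flat)
```

Only the first window passes: the edge-like window fails the ratio test and the flat window fails the minimum-eigenvalue test.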

One main assumption for this method is that the motion is small (less than 1 pixel between two images for example). If the motion is large and violates this assumption, one technique is to reduce the resolution of images first and then apply the Lucas–Kanade method.
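As a rough sketch of the resolution-reduction idea: halving the image resolution also halves every displacement measured in pixels, so a motion of several pixels can be brought below the one-pixel limit by repeated halving. The code below assumes a simple 2×2 box average for downsampling (practical implementations typically smooth with a low-pass filter before subsampling); the function names are hypothetical.

```python
def downsample(img):
    """Halve the resolution by averaging each 2x2 block (img: list of equal-length rows)."""
    rows, cols = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * r][2 * c] + img[2 * r][2 * c + 1]
             + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def levels_needed(max_disp, limit=1.0):
    """Smallest number of halvings after which max_disp (in pixels) falls below limit."""
    levels = 0
    while max_disp >= limit:
        max_disp /= 2.0
        levels += 1
    return levels


img = [
    [0, 2, 4, 6],
    [2, 4, 6, 8],
    [8, 8, 0, 0],
    [8, 8, 0, 0],
]
small = downsample(img)
print(small)             # 2x2 image of block averages
print(levels_needed(6))  # halvings needed for a 6-pixel motion
```

A flow estimated at the reduced resolution has to be multiplied by the same factor of two per level when it is carried back to the original image.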

In order to achieve motion tracking with this method, the flow vector can be iteratively applied and recalculated, until some threshold near zero is reached, at which point it can be assumed that the image windows are very close in similarity. By doing this to each successive tracking window, the point can be tracked throughout several images in a sequence, until it is either obscured or goes out of frame.
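The iterative scheme can be illustrated with a small self-contained sketch. To avoid pixel interpolation, it assumes the two frames are given as continuous functions `image1` and `image2` (hypothetical test patterns, the second being the first shifted by a known amount); a real tracker would instead interpolate sampled pixel values. Spatial derivatives are approximated by central differences, and the names, window size and tolerances are illustrative.

```python
import math

TRUE_SHIFT = (0.6, -0.4)  # displacement used to build the second test frame


def image1(x, y):
    # Hypothetical smooth test pattern, defined for real-valued coordinates.
    return math.sin(0.5 * x) + math.cos(0.4 * y) + 0.5 * math.sin(0.3 * x + 0.2 * y)


def image2(x, y):
    # Second frame: the same pattern displaced by TRUE_SHIFT.
    return image1(x - TRUE_SHIFT[0], y - TRUE_SHIFT[1])


def track_point(px, py, half=3, max_iters=20, tol=1e-6, h=1e-4):
    """Iteratively refine the flow at (px, py) until the update is below tol."""
    vx = vy = 0.0
    for _ in range(max_iters):
        sxx = sxy = syy = bx = by = 0.0
        for dy in range(-half, half + 1):
            for dx in range(-half, half + 1):
                x, y = px + dx, py + dy
                xw, yw = x + vx, y + vy  # window position shifted by current flow
                gx = (image2(xw + h, yw) - image2(xw - h, yw)) / (2 * h)
                gy = (image2(xw, yw + h) - image2(xw, yw - h)) / (2 * h)
                gt = image2(xw, yw) - image1(x, y)  # remaining frame difference
                sxx += gx * gx
                sxy += gx * gy
                syy += gy * gy
                bx -= gx * gt
                by -= gy * gt
        det = sxx * syy - sxy * sxy
        if abs(det) < 1e-12:
            break  # structure tensor is singular; give up on this point
        dvx = (syy * bx - sxy * by) / det
        dvy = (sxx * by - sxy * bx) / det
        vx += dvx
        vy += dvy
        if math.hypot(dvx, dvy) < tol:
            break  # update is near zero: the windows now match closely
    return vx, vy


vx, vy = track_point(10.0, 10.0)
print(vx, vy)
```

Each pass solves the same 2×2 system as the basic method, but on the residual difference left after applying the current estimate, so the corrections shrink as the shifted window comes into alignment.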

Improvements and extensions

The least-squares approach implicitly assumes that the errors in the image data have a Gaussian distribution with zero mean. If one expects the window to contain a certain percentage of "outliers" (grossly wrong data values, that do not follow the "ordinary" Gaussian error distribution), one may use statistical analysis to detect them, and reduce their weight accordingly.
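One simple way to act on this, shown here only as a sketch, is a two-pass fit: solve once with equal weights, estimate the typical residual size from the median absolute residual, then refit with zero weight on pixels whose residual exceeds a cutoff. The cutoff of 2.5 and the function names are assumptions; the factor 1.4826 is the usual constant relating a median absolute deviation to a Gaussian standard deviation. The sample data are made up, with one deliberately corrupted pixel.

```python
from statistics import median


def weighted_flow(ix, iy, it, w):
    """Solve (A^T W A) v = A^T W b for one window, with W = diag(w)."""
    sxx = sum(wi * gx * gx for wi, gx in zip(w, ix))
    sxy = sum(wi * gx * gy for wi, gx, gy in zip(w, ix, iy))
    syy = sum(wi * gy * gy for wi, gy in zip(w, iy))
    bx = -sum(wi * gx * gt for wi, gx, gt in zip(w, ix, it))
    by = -sum(wi * gy * gt for wi, gy, gt in zip(w, iy, it))
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("A^T W A is singular")
    return (syy * bx - sxy * by) / det, (sxx * by - sxy * bx) / det


def robust_flow(ix, iy, it, cutoff=2.5):
    """Plain fit, then one refit with outlying pixel equations given zero weight."""
    ones = [1.0] * len(ix)
    vx, vy = weighted_flow(ix, iy, it, ones)
    # Residual of each pixel equation I_x V_x + I_y V_y + I_t = 0.
    residuals = [gx * vx + gy * vy + gt for gx, gy, gt in zip(ix, iy, it)]
    scale = 1.4826 * median([abs(r) for r in residuals])
    if scale < 1e-12:
        return (vx, vy), ones  # data already fit exactly; nothing to reject
    weights = [0.0 if abs(r) > cutoff * scale else 1.0 for r in residuals]
    return weighted_flow(ix, iy, it, weights), weights


# Nine made-up pixel equations consistent with a flow of (0.5, -0.25) ...
ix = [1.0, 2.0, 0.0, -1.0, 1.0, 3.0, 0.0, 2.0, -2.0]
iy = [0.0, 1.0, 2.0, 1.0, -1.0, 0.0, 3.0, 1.0, 1.0]
it = [-(0.5 * gx - 0.25 * gy) for gx, gy in zip(ix, iy)]
it[4] += 5.0  # ... except one grossly wrong temporal derivative

plain = weighted_flow(ix, iy, it, [1.0] * 9)
(vx, vy), weights = robust_flow(ix, iy, it)
print(plain, (vx, vy), weights)
```

In this example the plain fit is pulled away from the true flow by the corrupted pixel, while the refit discards that pixel and recovers the flow from the remaining eight. Smoother down-weighting schemes follow the same pattern with a different weight function.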

The Lucas–Kanade method per se can be used only when the image flow vector $(V_x, V_y)$ between the two frames is small enough for the differential equation of the optical flow to hold, which often means a displacement smaller than the pixel spacing. When the flow vector may exceed this limit, such as in stereo matching or warped document registration, the Lucas–Kanade method may still be used to refine a coarse estimate of the flow obtained by other means; for example, by extrapolating the flow vectors computed for previous frames, or by running the Lucas–Kanade algorithm on reduced-scale versions of the images. Indeed, the latter method is the basis of the popular Kanade–Lucas–Tomasi (KLT) feature matching algorithm.

A similar technique can be used to compute differential affine deformations of the image contents.
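As a sketch of what this could look like (the parametrisation and symbol names below are assumptions, not taken from the text above): instead of a constant flow, let the flow at a window pixel $q_i$ vary affinely with its offset $(\Delta x_i, \Delta y_i)$ from $p$. Each pixel then contributes one linear equation in six unknowns,

$$I_x(q_i)\,\bigl(V_x + d_{xx}\,\Delta x_i + d_{xy}\,\Delta y_i\bigr) + I_y(q_i)\,\bigl(V_y + d_{yx}\,\Delta x_i + d_{yy}\,\Delta y_i\bigr) = -I_t(q_i),$$

and the resulting over-determined system can be solved by the same least squares procedure, now with a $6\times 6$ normal matrix. Here $d_{xx}, d_{xy}, d_{yx}, d_{yy}$ are hypothetical names for the entries of the local deformation matrix; setting them to zero recovers the constant-flow equations.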

References

  1. B. D. Lucas and T. Kanade (1981). An iterative image registration technique with an application to stereo vision. Proceedings of Image Understanding Workshop, pages 121–130.
  2. B. D. Lucas (1984). Generalized Image Matching by the Method of Differences (doctoral dissertation).
  3. J. Y. Bouguet (2001). Pyramidal implementation of the affine Lucas Kanade feature tracker: description of the algorithm. Intel Corporation, 5.
