In the field of image processing, various methods have been devised to detect the motion of objects and/or to identify those objects based upon an analysis of the detected motion. The methods analyze images of the field of view under consideration taken at different points in time. The term image is used herein in the broadest sense as a two-dimensional intensity field. An image may be a set of data representative of the intensity at pixels in the image.
Motion detection is important in the area of object recognition, for example, for security systems such as parking lot monitoring. Monitoring problems can involve both motion detection and object recognition. Motion detection of objects in a field of view relates to distinguishing moving objects, such as vehicles, from surrounding objects. For example, an automobile may be moving through a crowded parking lot, making visual detection from a single image very difficult. In order to detect the presence of the moving automobile, several images of the area, or field of view, could be taken from a sensor mounted on a pole, on a helicopter, or the like. After analyzing several images taken of the same area at successive points in time, motion can often be detected.
Object recognition relates not only to detecting the motion of a potential object of interest, but also to discriminating between the moving object, for example a car, and some other type of vehicle or a person that may also be in the area. The nature of the motion detected may be indicative of the type of object moving in the area of concern. Typically, images need to be taken at closer range to recognize an object than to merely detect a moving object.
Motion detection in which the relative motion between a sensor and the objects in a field of view is determined can also be used to determine the three-dimensional structure of an environment through various published techniques. Passive navigation and obstacle avoidance can also utilize motion detection techniques. Relative motion detection can also be useful in adaptive cruise control, for example. The analysis of the relative motion between a moving vehicle, such as an automobile, and its surroundings can be used to alter and/or correct the speed and direction of the vehicle.
One common method of attempting to detect motion involves "subtracting" images of a particular area of view taken at different points in time. The data representation of one image is subtracted from the data representation of the other image to detect whether or not any motion has taken place during the time between the two images. Pixels of the combined images where no motion has taken place during the time interval will tend to have zero values. At areas in the field of view where motion has taken place, and the image intensity has therefore changed between the images, non-zero values will result, indicative of the change. A graphic plot of the non-zero data points at the pixel level after the subtraction has taken place will illustrate the areas in the field of view where motion has been detected.
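The subtraction described above can be illustrated with a minimal sketch. It assumes two equally sized grayscale frames held as NumPy arrays; the function name `difference_mask` and the `threshold` parameter are hypothetical and introduced here only for illustration.

```python
import numpy as np

def difference_mask(frame_a, frame_b, threshold=0.0):
    """Return a boolean mask of pixels whose intensity changed between frames."""
    a = np.asarray(frame_a, dtype=np.float64)
    b = np.asarray(frame_b, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError("frames must have the same shape")
    # Subtract in floating point so unsigned integer images do not wrap around.
    diff = b - a
    # Pixels with no change give zero; changed pixels give non-zero values.
    return np.abs(diff) > threshold

# Synthetic example: a bright 2x2 block moves one pixel to the right.
frame_a = np.zeros((6, 6), dtype=np.uint8)
frame_b = np.zeros((6, 6), dtype=np.uint8)
frame_a[2:4, 1:3] = 200
frame_b[2:4, 2:4] = 200
mask = difference_mask(frame_a, frame_b)
changed = np.argwhere(mask)  # (row, column) positions where change was detected
```

In this example only the trailing and leading edges of the block are flagged; the column where the block overlaps itself in both frames subtracts to zero.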
One major problem associated with using the "subtraction" approach to motion detection is that the method is sensitive to sensor motion. For example, in outdoor applications during the time elapsing between two images practically all surfaces in the field of view will have moved relative to the sensor, thereby producing a change in intensity when the two images are "subtracted". The result is that motion can be detected, but it is extremely difficult to discern whether the detected motion is attributable to moving objects or to the moving sensor. As a result, the "subtraction" approach is very limited. Another limitation is due to the effect of electrical noise on the data output of the sensor which tends to distort the images.
In addition to simply determining the presence or absence of motion, it is possible to quantify the motion as a part of motion detection as well as to provide a basis for object recognition. One description of motion that can be quantified is known as "optical flow". As used in this description and in the appended claims, the term "optical flow" is defined as the two-dimensional velocity field in the image plane induced by the relative motion of a sensor and/or the object(s) in the field of view. The underlying theory of the optical flow at the pixel level is briefly described below.
Let $E(x, y, t)$ be the intensity at a pixel whose coordinates are $(x, y)$ in the image at time $t$.
One of the earliest assumptions about the spatial and temporal variations of the image brightness in a field of view, now known as the brightness constancy assumption, states that: $$\frac{dE}{dt} = 0 \qquad (1)$$ This equation (1) can be expanded to give: $$E_x u + E_y v + E_t = 0, \qquad (2)$$
where $E_x$, $E_y$ and $E_t$ are the partial derivatives of $E$ with respect to $x$, $y$ and $t$ respectively, and $(u, v)$ are the optical flow components of pixel $(x, y)$ at time $t$. The foregoing equation is a single equation in the two unknowns $(u, v)$ and is therefore underconstrained; it cannot be solved without imposing additional constraints.
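The expansion leading to equation (2) can be sketched as follows, assuming that equation (1) states that the total time derivative of the intensity along the path of a moving image point is zero, that E is differentiable, and that the flow components are the velocities u = dx/dt and v = dy/dt:

```latex
\frac{dE}{dt}
  = \frac{\partial E}{\partial x}\frac{dx}{dt}
  + \frac{\partial E}{\partial y}\frac{dy}{dt}
  + \frac{\partial E}{\partial t}
  = E_x u + E_y v + E_t = 0
```

Because this is one linear equation in two unknowns, it fixes only the component of the flow along the intensity gradient, which is $-E_t / \sqrt{E_x^2 + E_y^2}$ wherever the gradient is non-zero. The component perpendicular to the gradient is left undetermined, which is why additional constraints are needed.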
Certain early researchers provided mathematical techniques for solving equation (2), although the term optical flow was not used: A. N. Netravali, J. D. Robbins, "Motion-Compensated Television Coding: Part I", The Bell System Technical Journal, Vol. 58, No. 3, March 1979; and J. O. Limb, J. A. Murphy, "Estimating the Velocity of Moving Images from Television Signals", Computer Graphics and Image Processing, 4, 1975.
Horn and Schunck (B. K. P. Horn, B. G. Schunck, "Determining Optical Flow", Computer Vision, J. M. Brady, ed., North-Holland Publishing, 1981.) proposed an iterative technique for computing optical flow. They proposed a solution to the brightness constancy equation (2) by imposing a smoothness constraint on the optical flow field and minimizing an error functional in terms of accuracy and smoothness. This technique is considered the standard for pixel level optical flow computation. One problem with this technique is that, being a global optimization process, it smooths across the motion boundaries of objects. This tends to smear out any motion discontinuities along occluding objects or along the figure-ground boundaries in the field of view.
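An iteration of this general kind can be sketched as below. This is a simplified illustration and not the exact scheme of the cited reference: the derivative estimates (`np.gradient` on the mean of the two frames), the four-neighbour average, and the names `horn_schunck`, `neighbor_mean`, `alpha` and `n_iter` are assumptions made here. Each pass replaces the flow at a pixel by its neighbourhood average, corrected along the intensity gradient so that equation (2) is better satisfied.

```python
import numpy as np

def neighbor_mean(f):
    """Average of the four nearest neighbours, replicating values at the edges."""
    p = np.pad(f, 1, mode="edge")
    return 0.25 * (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:])

def horn_schunck(frame1, frame2, alpha=1.0, n_iter=200):
    """Smoothness-regularized flow estimate; alpha weights the smoothness term."""
    f1 = np.asarray(frame1, dtype=np.float64)
    f2 = np.asarray(frame2, dtype=np.float64)
    mean = 0.5 * (f1 + f2)
    Ey, Ex = np.gradient(mean)  # axis 0 is y (rows), axis 1 is x (columns)
    Et = f2 - f1                # temporal derivative for a unit time step
    u = np.zeros_like(f1)
    v = np.zeros_like(f1)
    denom = alpha**2 + Ex**2 + Ey**2
    for _ in range(n_iter):
        u_bar = neighbor_mean(u)
        v_bar = neighbor_mean(v)
        # Residual of the brightness constancy equation at the averaged flow.
        residual = Ex * u_bar + Ey * v_bar + Et
        u = u_bar - Ex * residual / denom
        v = v_bar - Ey * residual / denom
    return u, v

# Synthetic pattern that varies only along x and shifts right by half a pixel.
x = np.arange(64, dtype=np.float64)
frame1 = np.tile(10.0 * np.sin(0.3 * x), (32, 1))
frame2 = np.tile(10.0 * np.sin(0.3 * (x - 0.5)), (32, 1))
u, v = horn_schunck(frame1, frame2)
```

Because the averaging step is applied everywhere, a sketch like this also exhibits the drawback noted above: flow values diffuse across any motion boundary present in the scene.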
Other techniques have attempted to solve the brightness constancy equation (2) by proposing various constraints. Kearney (J. K. Kearney, "The Estimation of Optical Flow", Ph.D. Dissertation, Department of Computer Science, University of Minnesota, 1983.) proposed a least squares approach in which the optical flow is assumed to be constant within a surrounding neighborhood of a given pixel. A constraint is obtained from each pixel of this neighborhood, leading to an overconstrained system of linear equations in terms of the optical flow components. These equations are then solved by means of a least squares approach. Typically this technique produces flow fields that look "blocky" and is generally less accurate than other techniques (J. J. Little, A. Verri, "Analysis of Differential and Matching Methods for Optical Flow", Proc. Workshop on Visual Motion, 1989.).
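The neighborhood least squares idea can be sketched as follows. This is a minimal illustration under the stated constant-flow assumption, not the cited author's implementation; the function name `local_least_squares_flow`, the square window and the `half_width` parameter are hypothetical. Each pixel in the window contributes one row of equation (2) to an overconstrained linear system.

```python
import numpy as np

def local_least_squares_flow(Ex, Ey, Et, row, col, half_width=2):
    """Estimate (u, v) at one pixel, assuming constant flow in a square window."""
    r0, r1 = max(row - half_width, 0), row + half_width + 1
    c0, c1 = max(col - half_width, 0), col + half_width + 1
    # One constraint Ex*u + Ey*v = -Et per pixel in the window.
    A = np.column_stack([Ex[r0:r1, c0:c1].ravel(), Ey[r0:r1, c0:c1].ravel()])
    b = -Et[r0:r1, c0:c1].ravel()
    solution, _, _, _ = np.linalg.lstsq(A, b, rcond=None)
    return solution[0], solution[1]

# Synthetic derivatives that satisfy equation (2) exactly for a known flow.
rng = np.random.default_rng(0)
Ex = rng.normal(size=(9, 9))
Ey = rng.normal(size=(9, 9))
true_u, true_v = 0.3, -0.2
Et = -(Ex * true_u + Ey * true_v)
est_u, est_v = local_least_squares_flow(Ex, Ey, Et, row=4, col=4)
```

With noise-free synthetic derivatives the estimate reproduces the known flow; on real images the single value assigned to a whole window is one source of the "blocky" appearance mentioned above.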
Schunck has suggested another approach to optical flow determination (B. G. Schunck, "Image Flow: Fundamentals and Algorithms", Motion Understanding-Robot and Human Vision, W. N. Martin and J. K. Aggarwal, eds., Kluwer Academic Publishers, 1988.). This approach transforms the brightness constancy equation (2) into a polar form and is convenient for representing image flows with discontinuities, as the polar equation will not have delta-functions at discontinuities. Every constraint equation defines a line in the two-dimensional velocity space of optical flow components. If pixels in a region have similar velocity vectors, their corresponding lines will intersect in a tight cluster. The technique basically searches for the tightest cluster and assigns the central velocity of that cluster to the pixel. This technique is known as the "constraint line clustering" technique. The output has to be smoothed to improve the overall coherence of the velocity field. Schunck suggests use of an edge detection step on the velocity field components before smoothing, so that smoothing can be carried out only within closed regions and not across motion boundaries.
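One way to see why each constraint equation defines a line in velocity space is to rewrite equation (2) in polar coordinates. The following is a sketch under assumed notation; the symbols $\rho$, $\beta$, $\alpha$ and $d$ are introduced here for illustration and are not taken from the cited reference. The intensity gradient is written with magnitude $\lVert \nabla E \rVert$ and direction $\alpha$, and the flow with speed $\rho$ and direction $\beta$:

```latex
(E_x, E_y) = \lVert \nabla E \rVert \,(\cos\alpha, \sin\alpha), \qquad
(u, v) = \rho \,(\cos\beta, \sin\beta)

E_x u + E_y v + E_t = 0
\;\Longrightarrow\;
\rho \cos(\alpha - \beta) = d, \qquad
d = -\frac{E_t}{\lVert \nabla E \rVert}
```

Wherever the gradient is non-zero, this is the equation of a straight line in the $(u, v)$ plane whose normal points in direction $\alpha$ and whose perpendicular distance from the origin is $|d|$. Pixels that share a velocity produce lines that all pass through that velocity, which is the cluster the technique searches for.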
Koch and others have addressed the problem of smoothing across discontinuities by using the concept of binary line processes which explicitly code for the presence of discontinuities (Christof Koch, "Computing Optical Flow in Resistive Networks and in the Primate Visual System", Proc. Workshop on Visual Motion, IEEE, 1989). Binary horizontal and vertical line processes are used. If the spatial gradient of optical flow between two neighboring pixels is greater than a predetermined threshold, the optical flow field is considered "broken" and the appropriate motion discontinuity at that location is switched on. If little spatial variation exists, the discontinuity is switched off. The line process terms are encoded as a modification of the Horn and Schunck minimization terms for smoothness. An additional "line potential" term encourages or discourages specific configurations of line processes.
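The thresholding step alone can be sketched as follows. This illustrates only how binary discontinuity flags might be derived from an existing flow field; it does not implement the modified minimization or the "line potential" term. The function name `line_processes`, the use of the flow difference magnitude as the spatial gradient, and the output names are assumptions made here.

```python
import numpy as np

def line_processes(u, v, threshold):
    """Binary discontinuity flags between adjacent pixels of a flow field."""
    # Magnitude of the flow difference between each pixel and its right neighbour.
    across_columns = np.hypot(np.diff(u, axis=1), np.diff(v, axis=1))
    # Magnitude of the flow difference between each pixel and its lower neighbour.
    across_rows = np.hypot(np.diff(u, axis=0), np.diff(v, axis=0))
    # A flag is switched on where the flow changes by more than the threshold.
    return across_columns > threshold, across_rows > threshold

# Flow field with a motion boundary between columns 2 and 3.
u = np.zeros((4, 6))
v = np.zeros((4, 6))
u[:, 3:] = 1.0
breaks_between_columns, breaks_between_rows = line_processes(u, v, threshold=0.5)
```

In a smoothing step, pairs of neighbours separated by a switched-on flag would be excluded from the averaging, so that flow values are not mixed across the boundary.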
While the foregoing brightness constancy optical flow equation (2) has been used as a basis of many optical flow techniques, it should be noted that the imposition of the various constraints discussed above results in optical flow values which are heuristic approximations rather than analytically exact solutions of the brightness constancy equation. Additionally, it is not realistic in many cases to assume that the brightness corresponding to a physical point in the three-dimensional world remains unchanged with time. The assumption fails, for example, when points on an object are obscured or revealed in successive image frames, or when an object moves such that the incident light hits a given point on the object from a different angle between image frames, which causes the shading to vary.
Tretiak and Pastor (O. Tretiak, L. Pastor, "Velocity Estimation from Image Sequences with Second Order Differential Operators", Proc. Int. Conf. on Patt. Rec., 1982) suggested that a constrained set of equations can be obtained by employing second order differentials to obtain the following: $$E_{xx} u + E_{xy} v + E_{xt} = 0 \qquad (3)$$ $$E_{xy} u + E_{yy} v + E_{yt} = 0 \qquad (4)$$
where $E_{xx}$, $E_{xy}$ and $E_{yy}$ are the second order partial derivatives of $E$ with respect to $xx$, $xy$ and $yy$ respectively, and $E_{xt}$ and $E_{yt}$ are the second order partial derivatives of $E$ with respect to $xt$ and $yt$ respectively. These equations (3, 4) provide a constrained set of equations which can be solved for optical flow vectors $(u, v)$. Together with the brightness constancy equation (2), the foregoing equations (3, 4) provide an overconstrained set of equations which can be solved for optical flow vectors $(u, v)$ using least squares methods. Equations (3, 4) are referred to as the gradient constancy optical flow equations, since they assume constancy of the spatial gradient in the image sequence, as opposed to equation (2), which assumes brightness constancy in the image sequence. An optical flow technique based upon gradient constancy would not be influenced by intensity changes which do not change the spatial gradient. The accuracy of gradient constancy based techniques, however, can also be affected by the shading changes mentioned previously.
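Solving these equations at a single pixel can be sketched as below, assuming the derivative values have already been estimated. The function names `gradient_constancy_flow` and `combined_flow` are hypothetical. The first solves the two gradient constancy equations (3, 4) directly; the second stacks them with the brightness constancy equation (2) and solves the three equations by least squares.

```python
import numpy as np

def gradient_constancy_flow(Exx, Exy, Eyy, Ext, Eyt):
    """Solve equations (3) and (4) for (u, v) at one pixel."""
    H = np.array([[Exx, Exy], [Exy, Eyy]], dtype=np.float64)
    rhs = -np.array([Ext, Eyt], dtype=np.float64)
    # np.linalg.solve raises LinAlgError if the 2x2 matrix is singular.
    return np.linalg.solve(H, rhs)

def combined_flow(Ex, Ey, Et, Exx, Exy, Eyy, Ext, Eyt):
    """Least squares solution of equations (2), (3) and (4) at one pixel."""
    A = np.array([[Ex, Ey], [Exx, Exy], [Exy, Eyy]], dtype=np.float64)
    b = -np.array([Et, Ext, Eyt], dtype=np.float64)
    solution, _, _, _ = np.linalg.lstsq(A, b, rcond=None)
    return solution

# Synthetic derivative values consistent with a known flow.
true_flow = np.array([0.4, -0.1])
Ex, Ey = 1.0, 3.0
Exx, Exy, Eyy = 2.0, 0.5, 1.0
Et = -(Ex * true_flow[0] + Ey * true_flow[1])
Ext = -(Exx * true_flow[0] + Exy * true_flow[1])
Eyt = -(Exy * true_flow[0] + Eyy * true_flow[1])
flow_gc = gradient_constancy_flow(Exx, Exy, Eyy, Ext, Eyt)
flow_all = combined_flow(Ex, Ey, Et, Exx, Exy, Eyy, Ext, Eyt)
```

The direct solve requires the matrix of second derivatives to be non-singular, which fails in regions where the intensity varies linearly or not at all; the stacked least squares form can still return a minimum-norm estimate in those cases.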