The present invention relates to methods for identifying objects of varying shapes, sizes and orientations within complex images.
Although the principles of this invention are equally applicable in other contexts, the invention will be fully understood from the following explanation of its use in the context of locating heads and faces within still or moving pictures.
Various applications necessitate the design of a method for locating objects, such as heads and faces, within complex images. These applications include, for example, tracking people for surveillance purposes, model-based image compression for video telephony, intelligent computer-user interfaces, and other operations.
A typical surveillance tracking method may involve the use of a camera installed in a fixed location such as a doorway. The camera conveys its collected images to a modular control system, which locates and identifies heads and facial features of people entering the doorway. Instances of identification may then be communicated to an appropriate source such as an alarm system. Still other applications involve cameras installed on a computer workstation for tracking heads and facial features of persons seated in front of the workstation. Such tracking information may, in turn, be used for workstation access by identifying persons authorized to use the terminal. The foregoing applications are exemplary in nature, as numerous additional applications may be contemplated by those skilled in the art.
Any proposed tracking method should be capable of performing effectively in a practical setting. Tolerance to variations in environmental parameters is highly desirable. For instance, a useful tracking method should function competently over a large range of lighting conditions. A tracking method should likewise be tolerant of variations in camera and lens characteristics and other scene parameters.
Algorithms for identifying faces in images have been proposed in the literature. While these algorithms may suffice for use in environments involving a limited range of conditions, they routinely fail when deployed in a practical setting. Such prior algorithms include simple color segmentation, which relies on skin color distinctions to track faces in images. Color segmentation algorithms analyze only the single parameter of color and are consequently very fast. Color segmentation provides accurate tracking results where sufficient contrast exists between skin colors and the background colors of the collected images. Where the skin colors are similar to the background colors, however, these algorithms are typically unreliable.
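By way of illustration only, simple color segmentation of the kind described above may be sketched as follows. The sketch is not drawn from any particular prior algorithm; the function names `skin_mask` and `bounding_box` and the RGB threshold values are hypothetical assumptions chosen solely for the example.

```python
import numpy as np

def skin_mask(rgb, r_min=95, g_min=40, b_min=20, min_rg_gap=15):
    """Mark pixels whose color falls inside an ASSUMED skin-color range.

    The thresholds are illustrative placeholders, not values from the source.
    """
    # Widen the type first so that r - g cannot wrap around on uint8 input.
    rgb = np.asarray(rgb, dtype=np.int32)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return (
        (r > r_min) & (g > g_min) & (b > b_min)
        & (r > g) & (r > b) & ((r - g) > min_rg_gap)
    )

def bounding_box(mask):
    """Return (top, left, bottom, right) of the marked pixels, or None."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        return None
    return int(rows[0]), int(cols[0]), int(rows[-1]), int(cols[-1])

# A 6x6 test image: blue background with a skin-colored patch.
image = np.empty((6, 6, 3), dtype=np.uint8)
image[:] = (20, 40, 200)
image[2:4, 1:5] = (200, 140, 120)
print(bounding_box(skin_mask(image)))  # (2, 1, 3, 4)

# The same patch on a background of similar color: the whole frame is marked.
similar = np.empty((6, 6, 3), dtype=np.uint8)
similar[:] = (210, 150, 130)
similar[2:4, 1:5] = (200, 140, 120)
print(bounding_box(skin_mask(similar)))  # (0, 0, 5, 5)
```

Because each pixel is tested against a fixed color range in a single pass, the computation is fast. The second case shows the weakness noted above: where the background falls within the same color range, the mask no longer isolates the face.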
Most practical applications further require that the tracking method be non-intrusive to the people being observed. For example, a surveillance system at a bank would be unduly intrusive if individuals in the bank were restricted in their movements. A proposed tracking method should therefore permit the free, unobstructed motion of persons under observation. Simple color segmentation, however, is inadequate where quick or complex movement occurs in the collected images, because color segmentation algorithms often cannot sufficiently evaluate rapidly changing images. Thus, where the persons portrayed are in constant motion, accurate tracking using this method is extremely difficult. The problem escalates where the background colors in subsequent frames become similar to skin colors. In short, simple color segmentation fails to address the tracking problems encountered in environments having constantly varying parameters such as lighting and motion.
Additionally, simple color segmentation relies on the evaluation of information from a single parameter to produce its results. Because color is the only parameter considered, the tracked results are often imprecise.
Other recognition systems described in the literature use multiple algorithms or classifiers. These classifiers typically rely on additional parameters, such as shape, motion, or other variables, to track the desired objects. Using additional parameters increases the accuracy of the tracked output. In these systems, several different classifiers evaluate an object independently, and their results are then combined in a final step. This combinational step may be accomplished, for example, by a voting procedure. Other techniques combine the results of various classifiers using a weighted process that accounts for the error rate of each classifier. Generally, the use of additional parameters enables the tracking system to extract enhanced detail from the collected images. The accuracy of these algorithms and the robustness of the tracked output are therefore improved over those of simple color segmentation.
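By way of illustration only, the two combinational steps mentioned above, a voting procedure and a weighted process that accounts for classifier error rates, may be sketched as follows. The function names are hypothetical, and the log-odds weighting log((1 - e) / e) is one common choice assumed for the example; the systems referred to above may weight their classifiers differently.

```python
import math
from collections import Counter

def majority_vote(labels):
    """Return the label reported by the most classifiers.

    Ties resolve in favor of the label that appears first in the input.
    """
    return Counter(labels).most_common(1)[0][0]

def weighted_vote(labels, error_rates):
    """Combine labels, trusting classifiers with lower error rates more.

    ASSUMPTION: each classifier's weight is log((1 - e) / e), where e is
    its error rate. Rates are clamped away from 0 and 1 so that the
    logarithm stays finite.
    """
    scores = {}
    for label, err in zip(labels, error_rates):
        err = min(max(err, 1e-6), 1.0 - 1e-6)
        weight = math.log((1.0 - err) / err)
        scores[label] = scores.get(label, 0.0) + weight
    return max(scores, key=scores.get)

# Three classifiers (for example color, shape, and motion) report on one region.
labels = ["face", "background", "background"]
error_rates = [0.05, 0.40, 0.40]

print(majority_vote(labels))               # background
print(weighted_vote(labels, error_rates))  # face
```

In this example the two unreliable classifiers outvote the reliable one under simple voting, whereas the weighted process lets the single low-error classifier prevail. In both cases the classifier outputs meet only at this final step.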
In addition, combining and integrating the final results provides information to the tracking system which may be used to train the system for subsequent tracking. Such training processes further increase the accuracy of algorithms based on more than one classifier.
One major disadvantage of existing multi-classifier algorithms is their substantially decreased tracking speed. Running a plurality of channels simultaneously requires complex and time-consuming computations. Thus the use of additional classifiers results in a much longer computation time. Moreover, analyses of shape parameters are usually slower than analyses of color parameters for a particular image. These differences in processing speed are attributable to the complexity of shapes within the collected images and the large number of computations required to identify combinations of shapes. For these reasons, the speed advantages inherent in simple color segmentation are largely lost in algorithms involving combinations of classifiers.
Another problem with existing algorithms based on multiple classifiers is that each classifier typically operates independently of the others. No intermediate steps exist for comparing classifier results. The results are combined only as part of a final step in the process. As such, no single classifier may confirm the accuracy of its data, or compare its data with that of other channels, until the end of the analysis. This problem derives from the inherent characteristics of existing recognition systems. No relationship exists between the data gathered by one classifier and the data gathered by another. For example, one channel in the system may analyze and collect data based on the positioning of pixels on a screen, while another channel may generate tables of data based on an unrelated statistical extraction program. Results between such channels cannot be meaningfully compared until the end of the analysis, where complex algorithms are employed to combine and integrate the final data.
The problem is exacerbated where a channel has gathered inaccurate information for a large number of iterations. In such a case, the final result may be imprecise. Further, because the classifiers track their respective parameters independently, one classifier cannot calibrate another classifier before the latter initiates its analysis. These disadvantages result in more complicated algorithms and greater computation times.
The following needs persist in the art with respect to the development of algorithms for tracking objects in collected images: (1) the need for a tracking method which provides a more robust and accurate output; (2) the need for a tracking method which is considerably faster than existing algorithms based on multiple classifiers; (3) the need for an efficient tracking method based on multiple channels to enhance the accuracy of the output; (4) the need for a multi-channel tracking method where the accuracy of each channel is confirmed by results obtained from other channels; and (5) the need for a tracking method capable of simultaneously maximizing tracking speed and output precision.
It is therefore an object of the present invention to provide a tracking method which provides a more accurate and robust tracked output than existing algorithms.
Another object of the invention is to establish a tracking method which is faster than existing multi-classifier systems, and which achieves a maximum level of accuracy of the tracked result.
Another object of the invention is to provide a more efficient tracking method.
Another object of the invention is to set forth a multi-channel tracking method having the ability to confirm the accuracy of each channel's output by comparing results from other channels at various stages during the tracking process.
Another object of the invention is to provide a tracking method which produces an intermediate image for early evaluation and for optimal subsequent channel selection by the system.
Additional objects of the invention will be contemplated by those skilled in the art after perusal of the instant specification, claims, and drawings.