1. Field of the Invention
The present invention relates in general to object detection and tracking, and in particular to a system and method for performing sparse transformed template matching using three-dimensional (3D) rasterization by statistically comparing and matching plural sets of digital data.
2. Related Art
Applications for automatic digital object detection and tracking, image registration, pattern recognition and computer vision analysis are becoming increasingly important for providing new classes of services to users based on assessments of an object's presence, position, trajectory, etc. These assessments allow advanced and accurate digital analysis (such as pattern recognition, motion analysis, etc.) of the objects in a scene, for example, objects in a sequence of images of a video scene. Plural objects define each image and are typically nebulous collections of pixels that satisfy some property. These pixels could be the result of some pre-processing operation, such as filtering, equalization, or edge or feature detection, applied to raw input images. Each object can occupy a region or regions within each image and can change its relative location throughout subsequent images of the video scene. These objects are considered moving objects, which form motion within a video scene and can be automatically detected and tracked with various techniques, one being template matching.
Template matching is a class of computer algorithms that is used in many digital computer applications, such as image registration, pattern recognition and computer vision applications. A template matching algorithm defines a function (for example, a metric) that estimates the similarity between sets of digital data. In this case, one set of digital data is commonly referred to as a template and another set of digital data is referred to as an image, wherein the template is typically smaller than the image (for instance, the template can be a small portion of the image). In computer vision applications, the template usually represents an object of the image that is being tracked and detected (located) within the image. The object can be located by computing the metric at various locations (u, v) in the image and determining where the metric is maximized.
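By way of illustration only, the search described above can be sketched as follows. This is a minimal sketch, not the method of the present invention: it assumes grayscale data held in nested lists, restricts itself to translations, and uses the negative sum of squared differences as the metric. The function names `similarity` and `best_match` are hypothetical.

```python
def similarity(image, template, u, v):
    """Assumed metric: negative sum of squared differences between the
    template and the image window whose top-left corner is at column u,
    row v. Larger values mean a closer match; 0 is a perfect match."""
    total = 0
    for y, row in enumerate(template):
        for x, t in enumerate(row):
            d = image[v + y][u + x] - t
            total += d * d
    return -total


def best_match(image, template):
    """Evaluate the metric at every valid (u, v) and return the location
    where it is maximized."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    candidates = [(u, v)
                  for v in range(ih - th + 1)
                  for u in range(iw - tw + 1)]
    return max(candidates, key=lambda uv: similarity(image, template, *uv))


# Illustrative data: the template appears in the image at column 2, row 1.
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 8, 0],
    [0, 0, 7, 6, 0],
    [0, 0, 0, 0, 0],
]
template = [[9, 8], [7, 6]]
print(best_match(image, template))  # (2, 1)
```

An exhaustive search of this kind evaluates the metric once per candidate location, and it considers translations of the template only.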
However, many systems that use template matching are not robust or flexible enough for advanced image registration, pattern recognition and computer vision applications due to unfavorable tradeoffs of functionality for performance (for example, restricting themselves to translations of the template). Therefore, what is needed is a system and method for comparing and matching multiple sets of data by transforming one set of data and performing statistical analyses on the multiple sets of data. Whatever the merits of the above mentioned systems and methods, they do not achieve the benefits of the present invention.
To overcome the limitations in the prior art described above, and to overcome other limitations that will become apparent upon reading and understanding the present specification, the present invention is embodied in a system and method for performing sparse transformed template matching using three-dimensional rasterization by comparing and matching a first set of digital data to at least a second set of digital data.
In general, the system and method includes raster transforming at least one of the first set of digital data and the second set of digital data, placing multiple images of the digital data in texture memory as multiple textures, gathering statistics between the textures and statistically comparing and matching the raster transformed sets of digital data to appropriately corresponding portions of each other. The first or the second set of digital data can be transformed during statistical analysis to enhance that analysis. As such, the present invention can automatically track and detect digital data in a digital scene. This is accomplished by transforming data representing either the elements to be tracked, such as objects, or the elements to be matched to the objects and simultaneously comparing and matching the objects to the scene.
In one working example embodiment of the present invention, the system includes a host processor executing software that implements an address generator, an acceptance tester and a statistical comparison processor. The host processor controls the entire process and initially renders or rasterizes the sets of data. The address generator generates addresses, which can reflect a transformation, for the first set of data and the second set of data to be compared. The addresses are used by filtering functions to generate per-pixel values, such as color values. The acceptance tester receives the per-pixel values and determines the pixels that are to be used to contribute to statistical analysis. The statistical comparison processor statistically analyzes the pixels between the first data set and the second data set for comparison purposes. The host processor then examines the statistical comparisons computed by the statistical comparison processor and makes further processing decisions. The process repeats until a desired result is computed, such as a match or non-match between the data sets.
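The division of labor among these components can be sketched in software as follows. This is a hedged illustration under stated assumptions, not the implementation of the working example: the transformation is assumed to be a simple translation, the filtering function is assumed to be nearest-neighbor sampling, the acceptance test is assumed to reject pixel pairs that fall outside either data set, and the statistic is assumed to be a sum of squared differences. All function names are hypothetical.

```python
def generate_addresses(width, height, dx, dy):
    """Address generator: pair each address in the first data set with a
    transformed (here, translated) address in the second data set."""
    for y in range(height):
        for x in range(width):
            yield (x, y), (x + dx, y + dy)


def sample(data, addr):
    """Filtering function (nearest neighbor): return the per-pixel value
    at addr, or None when the address lies outside the data."""
    x, y = addr
    if 0 <= y < len(data) and 0 <= x < len(data[0]):
        return data[y][x]
    return None


def accept(a, b):
    """Acceptance tester (assumed rule): a pixel pair contributes to the
    statistics only when both samples exist."""
    return a is not None and b is not None


def compare(first, second, dx, dy):
    """Statistical comparison: accumulate the number of accepted pixel
    pairs and their sum of squared differences."""
    count, ssd = 0, 0
    for addr_a, addr_b in generate_addresses(len(first[0]), len(first), dx, dy):
        a, b = sample(first, addr_a), sample(second, addr_b)
        if accept(a, b):
            count += 1
            ssd += (a - b) ** 2
    return count, ssd


def host_search(first, second, offsets):
    """Host loop: examine the statistics for each candidate transformation
    and keep the one with the lowest mean squared difference."""
    best = None
    for dx, dy in offsets:
        count, ssd = compare(first, second, dx, dy)
        if count == 0:
            continue  # no accepted pixels, so nothing to compare
        score = ssd / count
        if best is None or score < best[0]:
            best = (score, (dx, dy))
    return best


first = [[1, 2], [3, 4]]
second = [[0, 0, 0], [0, 1, 2], [0, 3, 4]]
offsets = [(dx, dy) for dy in range(2) for dx in range(2)]
print(host_search(first, second, offsets))  # (0.0, (1, 1))
```

Keeping the acceptance test separate from the accumulation step means that the rule for which pixels contribute can change without touching the statistics themselves.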
Alternatively, the system can be implemented in a three-dimensional (3D) graphics rasterizer. In this embodiment, the system includes a frame buffer (a block of graphics memory that represents the display screen) and texture memory (a block of graphics memory that can contain portions of the display screen), in addition to the components discussed above. The first set of digital data can be stored in the frame buffer while the second set of data can be stored in the texture memory. Also, statistical generation can be performed by the rasterizer, with or without actually rendering or writing a 3D digital scene comprised of the digital data to the frame buffer.
In another working example, the system tracks the digital objects as templates within a digital image scene with a robust and flexible processing scheme. For example, in this embodiment, the system includes a rasterization processor that resamples either the templates or digital data of the scene to be matched to the templates using a perspective transformation. In one specific embodiment of this working example, multiple images can be placed in texture memory as multiple textures. Certain statistics can be gathered between textures for normalized correlation or other statistics can be recorded for variations and subsequent forwarding to a host processor.
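As an illustration of the statistics gathered for normalized correlation, the sketch below accumulates running sums over two equally sized textures and derives the correlation from those sums alone. It is a minimal sketch under assumptions: the textures are nested lists of scalar values, the standard zero-mean normalized correlation formula is used, and the names `gather_statistics` and `normalized_correlation` are hypothetical.

```python
import math


def gather_statistics(tex_a, tex_b):
    """Accumulate the sums needed for normalized correlation in one pass:
    pixel count, sums of each texture, sums of squares, and sum of products."""
    n = sa = sb = saa = sbb = sab = 0
    for row_a, row_b in zip(tex_a, tex_b):
        for a, b in zip(row_a, row_b):
            n += 1
            sa += a
            sb += b
            saa += a * a
            sbb += b * b
            sab += a * b
    return n, sa, sb, saa, sbb, sab


def normalized_correlation(stats):
    """Derive the zero-mean normalized correlation from the accumulated
    sums. Returns 0.0 when either texture has no variation."""
    n, sa, sb, saa, sbb, sab = stats
    var_a = n * saa - sa * sa
    var_b = n * sbb - sb * sb
    if var_a <= 0 or var_b <= 0:
        return 0.0
    return (n * sab - sa * sb) / math.sqrt(var_a * var_b)


tex_a = [[1, 2], [3, 4]]
tex_b = [[12, 14], [16, 18]]   # tex_a scaled by 2 and offset by 10
tex_c = [[4, 3], [2, 1]]       # tex_a with intensities reversed

print(normalized_correlation(gather_statistics(tex_a, tex_b)))  # 1.0
print(normalized_correlation(gather_statistics(tex_a, tex_c)))  # -1.0
```

Because only six running sums are needed, the per-pixel accumulation and the final arithmetic can be carried out in separate steps, so that only the sums need to be forwarded for the final computation.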
This embodiment allows the images participating in the comparison to benefit from the increased flexibility of texture coordinates and the increased efficiency of mipmapping and other optimizations for texture filtering. In addition, the embodiment is scalable, so that additional images can participate in a single statistics gathering operation simply by adding more texture stages. In another specific embodiment of this example, a hardware processor can be used with additional core logic to compute the statistics and a feedback mechanism can be used to forward the results back to the host upon request.
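For readers unfamiliar with mipmapping, the sketch below builds a chain of progressively smaller copies of a texture by averaging 2x2 blocks. It is a generic illustration of the technique, not a description of any particular graphics hardware: it assumes a square texture whose side is a power of two, stored as nested lists, and the names `downsample` and `build_mipmaps` are hypothetical.

```python
def downsample(level):
    """Halve a 2D grid in each dimension by averaging every 2x2 block
    (a simple box filter)."""
    h, w = len(level) // 2, len(level[0]) // 2
    return [
        [
            (level[2 * y][2 * x] + level[2 * y][2 * x + 1]
             + level[2 * y + 1][2 * x] + level[2 * y + 1][2 * x + 1]) / 4
            for x in range(w)
        ]
        for y in range(h)
    ]


def build_mipmaps(texture):
    """Return the list [level 0, level 1, ...] ending at a 1x1 level.
    Assumes a square texture with a power-of-two side length."""
    levels = [texture]
    while len(levels[-1]) > 1:
        levels.append(downsample(levels[-1]))
    return levels


texture = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
levels = build_mipmaps(texture)
print(levels[1])  # [[2.5, 4.5], [10.5, 12.5]]
print(levels[2])  # [[7.5]]
```

Each successive level holds one quarter as many values as the level before it, so sampling from a coarser level touches far less data.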
In all embodiments, rasterization and rendering techniques and advanced statistical generation and comparison of the present invention can be integrated to form a novel video graphics device or hardware video card for computer systems.
The present invention as well as a more complete understanding thereof will be made apparent from a study of the following detailed description of the invention in connection with the accompanying drawings and appended claims.