1. Field of the Invention
The present invention relates to pattern identification apparatuses and control methods and programs thereof, and particularly relates to techniques for recognizing images or analyzing information using accumulated information.
2. Description of the Related Art
Multidimensional array information is often used in the field of information processing. Here, the sum value of elements within the range of a specific region is often found and used in some processes related to image processing, image recognition, image composition, and so on, and in statistical processes and the like.
In the field of computer graphics, a concept regarding accumulated information for original input image information, called a rectangular “summed-area table”, has been proposed by F. C. Crow (F. C. Crow, “Summed-Area Tables For Texture Mapping”, Computer Graphics, 1984; referred to as “Crow” hereinafter). The “summed-area table” discussed in this document is a two-dimensional array that is the same size (that is, has the same number of elements) as the input image. Assuming that the pixel value of coordinates (x,y) in the input image is I(x,y), a component C(x,y) at a position (x,y) in the summed-area table is defined as indicated in the following Formula (1).
$$C(x, y) = \sum_{x' \le x,\; y' \le y} I(x', y') \qquad (1)$$
In other words, as shown in FIG. 7, the sum value of pixels within a rectangle in the original input image (7a), whose origin is a position (0,0) and whose opposing corner is a position (x,y), is a value C(x,y) of a position (x,y) in the summed-area table (7b). Note that although Crow describes the position of the origin of the original summed-area table as the lower-left of the image, this specification uses the upper-left as the origin in order to adhere to the descriptions given hereinafter.
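As an illustration of Formula (1), the following is a minimal sketch that builds a summed-area table with the origin at the upper-left. The function name `summed_area_table`, the list-of-rows layout indexed as `image[y][x]`, and the running row sum used to build the table in a single pass are assumptions made for this example and are not taken from Crow.

```python
def summed_area_table(image):
    """Return a table where table[y][x] is the sum of image[y'][x']
    over all x' <= x and y' <= y (origin at the upper-left)."""
    height = len(image)
    width = len(image[0]) if height else 0
    table = [[0] * width for _ in range(height)]
    for y in range(height):
        row_sum = 0  # sum of the current row from column 0 through column x
        for x in range(width):
            row_sum += image[y][x]
            # Accumulated value directly above; zero on the first row.
            above = table[y - 1][x] if y > 0 else 0
            table[y][x] = row_sum + above
    return table


# Hypothetical 3x3 input image used only for illustration.
image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
table = summed_area_table(image)
```

In this sketch the table has the same number of elements as the input image, and the entry at the lower-right corner holds the sum of all pixel values.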
According to this definition, the sum of I(x,y) within a given rectangular region whose sides run horizontally and vertically in the input image can be found simply by referring to four points in the summed-area table. For example, as shown in FIG. 8, to find the sum C(x0,y0;x1,y1) of pixel values within a rectangular region that takes (x0,y0) and (x1,y1) as its opposing corners, Formula (2) may be computed as follows.
$$C(x_0, y_0; x_1, y_1) = C(x_0 - 1, y_0 - 1) - C(x_0 - 1, y_1) - C(x_1, y_0 - 1) + C(x_1, y_1) \qquad (2)$$
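The four-point lookup of Formula (2) can be sketched as follows. The helper names `table_at` and `rect_sum` are hypothetical, and treating a table entry as zero when an index is −1 (that is, when the rectangle touches the top or left edge of the image) is an assumption of this example that the text above does not state.

```python
def summed_area_table(image):
    """Build the table of Formula (1); table[y][x] covers (0,0)..(x,y)."""
    table = []
    for y, row in enumerate(image):
        row_sum = 0
        out = []
        for x, value in enumerate(row):
            row_sum += value
            out.append(row_sum + (table[y - 1][x] if y > 0 else 0))
        table.append(out)
    return table


def table_at(table, x, y):
    """Return C(x, y); assumed to be 0 when x or y lies outside the image."""
    if x < 0 or y < 0:
        return 0
    return table[y][x]


def rect_sum(table, x0, y0, x1, y1):
    """Sum of pixels in the rectangle with opposing corners (x0,y0), (x1,y1),
    using the four table references of Formula (2)."""
    return (table_at(table, x0 - 1, y0 - 1)
            - table_at(table, x0 - 1, y1)
            - table_at(table, x1, y0 - 1)
            + table_at(table, x1, y1))


# Hypothetical 3x3 input image used only for illustration.
image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
table = summed_area_table(image)
lower_right_block = rect_sum(table, 1, 1, 2, 2)  # pixels 5, 6, 8, 9
```

Regardless of the size of the rectangle, `rect_sum` reads at most four table entries, which is why the readout of those four vertices dominates the cost discussed below.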
Through this, the sum of the values within a given rectangular region in an image can be found quickly. Japanese Patent Laid-Open No. 2008-299627 discloses a configuration in which accumulated information is computed based on Formula (2).
In the field of image recognition, Viola and Jones refer to the same type of accumulated information as the aforementioned summed-area table as an “integral image”. Pattern identification is carried out by calculating feature amounts in multiple local areas using this integral image (P. Viola, M. Jones, “Rapid Object Detection using a Boosted Cascade of Simple Features”, Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Vol. 1, pp. 511-518, December 2001). The “local area” refers to a partial region in an image region that has been cut out from an input image.
In this pattern identification, feature amounts are calculated for multiple local areas, and the parameters used to calculate those feature amounts are found in advance through learning. The parameters include information such as the position, size, and so on of the local areas for which the feature amounts are to be calculated.
In pattern identification, an extremely high number of local areas are referred to, and accumulated information is read out frequently in order to calculate the feature amounts. For example, in the case where accumulated information has been stored in a single-port memory, from which only a single piece of data can be read out at a time, the readout of four vertices is processed in series through four memory accesses. Assuming that a single memory access takes a single cycle, a minimum of four cycles are required in order to find the sum for a single rectangular region.
There are cases where a bottleneck occurs with memory accesses, depending on the detection conditions (that is, the framerate, image size, number of objects to be detected, and so on). There is demand, in such a case, for increasing the speed by making it possible to process some or all of the accesses in parallel, rather than carrying out the four readouts in series.
One way of reducing the readout time is to store the accumulated information in a dual-port memory, which is capable of two readouts at a time. Another way is to write the same accumulated information into four single-port memories and read out the four vertices from the respective memories at the same time.
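The effect of these configurations on readout time can be illustrated with a simple cycle count. This is a rough sketch under the assumption stated above that one memory access takes one cycle; the function name `readout_cycles` and the parameter `reads_per_cycle` are hypothetical, and the model ignores any other latency.

```python
import math

# Formula (2) refers to four entries of the accumulated information.
VERTICES_PER_RECTANGLE = 4


def readout_cycles(reads_per_cycle):
    """Minimum cycles to read the four vertices of one rectangular region,
    assuming each access takes one cycle."""
    return math.ceil(VERTICES_PER_RECTANGLE / reads_per_cycle)


single_port = readout_cycles(1)   # one readout at a time
dual_port = readout_cycles(2)     # two readouts at a time
four_copies = readout_cycles(4)   # four single-port memories read together
```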
Furthermore, Japanese Patent Laid-Open No. 2008-102792 discloses dividing an image into multiple integrated areas, generating integral images for each integrated area using the pixel positioned at any one of the corners as the origin, and calculating the sum of cutout regions of the image based on the integral images. According to this configuration, the number of readouts is reduced by aligning one of the corners or sides in the readout region with the pixel that is to serve as the base point for the generation of the integral image.
However, a method that increases the number of ports, such as one that uses a dual-port memory, will also increase the scale of the circuit by the number of ports that have been added. Although it is necessary to increase the number of ports to four in order to simultaneously read out four vertices, this also increases the mounting restrictions and is thus difficult to implement. Meanwhile, the method that stores the same values in four memories results in a circuit scale that is four times as large, and is thus difficult to implement when large memory sizes cannot be secured.
The method disclosed in Japanese Patent Laid-Open No. 2008-299627 reduces the scale of the circuit by reducing the size of a buffer, rather than by storing accumulated information in a memory. Accordingly, that method does not accelerate the reading/writing from/to the memory.
Furthermore, the method disclosed in Japanese Patent Laid-Open No. 2008-102792 is problematic because it is necessary, depending on the readout position, to perform four readouts, as in the past. In other words, that method is useful only in ranges where one of the corners or sides in the readout region can be aligned with the stated base point pixel, but the aforementioned problems with the past techniques remain for readouts of four random points in the integrated area.