1. Field of the Invention
The present invention relates to pattern identifying apparatuses used in image recognition, information analysis, and the like.
2. Description of the Related Art
In the field of information processing, information in the form of a multi-dimensional array is frequently handled. In particular, in statistical processing and in some processing related to image processing, image recognition, image synthesis, and the like, the sum of the elements within a specific region is often obtained and used.
In the field of computer graphics, F. C. Crow proposed a form of accumulation information over rectangular regions of the original input image information, called a summed-area table (F. C. Crow, “Summed-Area Tables For Texture Mapping”, Computer Graphics, 1984; hereinafter, “Document 1”). In Document 1, the summed-area table is a two-dimensional array of the same size as the input image, and assuming that the pixel value at coordinates (x, y) of the input image is I(x, y), the component C(x, y) at the same position (x, y) in the summed-area table is defined by expression (1) below.
$$C(x, y) = \sum_{\substack{x' \le x \\ y' \le y}} I(x', y') \qquad (1)$$
In other words, the value C(x, y) at the position (x, y) in the summed-area table shown in FIG. 7B is the sum of the pixel values within the rectangle of the original input image shown in FIG. 7A whose diagonal is formed by the origin position (0, 0) and the position (x, y). Note that although the explanation of the original summed-area table in Document 1 assumes that the origin is the lower left corner of the image, the present specification assumes that the origin is the upper left corner, for consistency with the later description.
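As a minimal sketch of expression (1), the following Python code builds a summed-area table with the origin at the upper left corner. The function name `summed_area_table` and the representation of an image as a list of rows are assumptions made for illustration; they do not appear in Document 1.

```python
def summed_area_table(image):
    """Return C, where C[y][x] is the sum of image[y'][x'] over all x' <= x and y' <= y.

    `image` is assumed to be a list of equal-length rows; image[y][x] plays the role of I(x, y).
    """
    height = len(image)
    width = len(image[0]) if height else 0
    table = [[0] * width for _ in range(height)]
    for y in range(height):
        row_sum = 0  # running sum of the current row up to and including column x
        for x in range(width):
            row_sum += image[y][x]
            above = table[y - 1][x] if y > 0 else 0  # C(x, y - 1), or 0 on the first row
            table[y][x] = row_sum + above
    return table


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
table = summed_area_table(image)
print(table)  # [[1, 3, 6], [5, 12, 21], [12, 27, 45]]
```

Each entry is computed from the running row sum and the entry directly above, so the whole table is built in one pass over the image.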
According to the above definition, the sum of I(x, y) within an arbitrary rectangular region of the input image whose sides are horizontal and vertical can be obtained by referring to only four points in the summed-area table. For example, as shown in FIG. 8, the sum C(x0, y0; x1, y1) of the pixel values within a rectangular region whose diagonal is formed by (x0, y0) and (x1, y1) (x0<x1, y0<y1) can be obtained by calculating expression (2) below.
C(x0, y0; x1, y1) = C(x0−1, y0−1) − C(x0−1, y1) − C(x1, y0−1) + C(x1, y1)  (2)
It is thereby possible to obtain the sum of the values within an arbitrary rectangular region of an image at a high speed. Note that the first argument of C in expression (2) is −1 in some cases, in which case C(−1, *) returns 0. Similarly, the second argument of C is −1 in some cases, in which case C(*, −1) returns 0. Here, “*” means “don't care” and may be any value. Japanese Patent Laid-Open No. 2008-299627 (hereinafter, “Document 2”) describes one method for implementing accumulation information.
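The four-point lookup of expression (2), including the convention that C(−1, *) and C(*, −1) return 0, can be sketched in Python as follows. The helper names `lookup` and `rect_sum` are hypothetical, and the table literal is the summed-area table of a 3×3 image whose pixel values are 1 to 9 in raster order, chosen only as an example.

```python
def lookup(table, x, y):
    """Return C(x, y), treating any access with x == -1 or y == -1 as 0."""
    if x < 0 or y < 0:
        return 0
    return table[y][x]


def rect_sum(table, x0, y0, x1, y1):
    """Sum of the pixels in the rectangle with corners (x0, y0) and (x1, y1), inclusive.

    Implements expression (2) with exactly four table reads.
    """
    return (lookup(table, x0 - 1, y0 - 1)
            - lookup(table, x0 - 1, y1)
            - lookup(table, x1, y0 - 1)
            + lookup(table, x1, y1))


# Summed-area table of the image [[1, 2, 3], [4, 5, 6], [7, 8, 9]] (example data).
table = [
    [1, 3, 6],
    [5, 12, 21],
    [12, 27, 45],
]
print(rect_sum(table, 1, 1, 2, 2))  # 28, i.e. 5 + 6 + 8 + 9
print(rect_sum(table, 0, 0, 2, 2))  # 45, the sum of the whole image
```

The cost is four reads regardless of the size of the rectangle, which is why the read pattern of these four points dominates the discussion that follows.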
In P. Viola, M. Jones, “Rapid Object Detection using a Boosted Cascade of Simple Features”, Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Vol. 1, pp. 511-518, December 2001 (hereinafter, “Document 3”), accumulation information that is equivalent to the aforementioned summed-area table is called an “integral image”. In Document 3, a feature amount is calculated in a plurality of local regions using the integral image to perform pattern identification. A “local region” indicates a partial region of an image region that is cut out of an input image. To calculate a feature amount used in pattern identification, a parameter that is obtained in advance by means of learning is used. The parameter includes information of the position, size, and the like of a local region whose feature amount is to be calculated. Usually, the position of each local region that is referred to in pattern identification is random, and accumulation information needs to be read at random.
For example, in the case where accumulation information is stored in a single-port memory from which one set of data can be read at a time, reading of the four vertices (points A, B, C, and D in FIG. 8; hereinafter, “the four vertices”) for obtaining the sum of the values within a local region is serialized, and is processed by performing memory access four times. Assuming that one memory access takes one cycle, four cycles are necessary for reading the four vertices.
With this method, if high detection performance (which depends on the frame rate, the image size, the number of detection targets, etc.) is required, memory access can become a bottleneck. To achieve speedup, some or all of the four serialized reads need to be performed simultaneously.
Methods for reducing the read time include a method of using a dual-port memory from which two sets of data can be read simultaneously. Two of the four vertices can be read at a time when the dual-port memory is used, and it is thereby possible to reduce the read time from four cycles to two cycles. However, the dual-port memory has a problem in that its circuit scale is larger than that of the single-port memory.
As another method, it is also conceivable to write the same accumulation information in four single-port memories and read the four vertices from the respective memories in parallel. However, this method requires four times the memory size used in the method of using one memory.
Still another method is that disclosed in Japanese Patent Laid-Open No. 2008-102792 (hereinafter, “Document 4”). Document 4 describes a method of dividing an image into a plurality of images and creating accumulation information for each of the divided images, thereby reducing the number of reads when obtaining a sum within a rectangle that is in contact with a boundary between divided images.
Still another method is that disclosed in Japanese Patent Laid-Open No. 2012-48402 (hereinafter, “Document 5”). Document 5 describes a method of writing an integral image in a plurality of storage devices in accordance with a predetermined rule, thereby enabling parallel reading at the time of reading the integral image. Also, in Document 5, four vertices of the integral image can be read in parallel at the time of execution by imposing a restriction on the shape of each local region at the time of learning. Thus, the bottleneck of memory access is resolved, and an apparatus capable of high-speed reading is realized.
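To illustrate how a storage rule and a shape restriction can interact, the following Python sketch uses one simple rule: each entry of the integral image is assigned to one of four memories according to the parity of its x and y coordinates. This rule, and the names `bank_of`, `vertex_banks`, and `readable_in_one_cycle`, are assumptions made purely for illustration; they are not taken from Document 5, whose actual rule may differ. The sketch also ignores the border case in which a vertex coordinate is −1 and no memory read is needed.

```python
def bank_of(x, y):
    """Hypothetical storage rule: choose one of four memories from the parity of x and y."""
    return (x % 2) + 2 * (y % 2)


def vertex_banks(x0, y0, x1, y1):
    """Memories holding the four table entries read by expression (2)."""
    corners = [(x0 - 1, y0 - 1), (x1, y0 - 1), (x0 - 1, y1), (x1, y1)]
    return [bank_of(x, y) for x, y in corners]


def readable_in_one_cycle(x0, y0, x1, y1):
    """True if the four vertices lie in four different memories (no access conflict)."""
    return len(set(vertex_banks(x0, y0, x1, y1))) == 4


# A region 3 pixels wide and 3 pixels high: all four vertices fall in different memories.
print(readable_in_one_cycle(1, 1, 3, 3))  # True
# A region 4 pixels wide: two pairs of vertices collide in the same memory.
print(readable_in_one_cycle(1, 1, 4, 3))  # False
```

Under this assumed rule, the four vertices can be read in parallel only when both the width and the height of the local region are odd, which is an example of the kind of shape restriction that would have to be imposed at the time of learning.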
Furthermore, S. Yan, S. Shan, X. Chen, and W. Gao. “Locally assembled binary (lab) feature with feature-centric cascade for fast and accurate face detection”, 26th IEEE Conference on Computer Vision and Pattern Recognition, CVPR, 2008 (hereinafter, “Document 6”) proposes that a region to be referred to for calculation of a feature amount is read in the form shown in FIG. 15 in order to improve recognition accuracy. The reference regions shown in FIG. 15 are blocks (1550 to 1558) having the same width and height that are arranged in a tile-like form, and the sum of the pixels within each block is calculated by reading 16 points, namely the vertices 1501 to 1516 of the blocks. As described above, various techniques have been devised to read an integral image at a high speed.
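The 16-point read for nine tiled blocks can be sketched in Python as follows. The sketch assumes a 3×3 arrangement of blocks whose shared vertices form a 4×4 grid, and it places the grid according to the convention of expression (2), so that the first grid line lies one pixel above and to the left of the first block. The names `build_table`, `lookup`, and `block_sums_3x3` are hypothetical and do not appear in Document 6.

```python
def build_table(image):
    """Summed-area table of `image` (a list of rows), origin at the upper left corner."""
    table = []
    for y, row in enumerate(image):
        row_sum, out = 0, []
        for x, value in enumerate(row):
            row_sum += value
            out.append(row_sum + (table[y - 1][x] if y > 0 else 0))
        table.append(out)
    return table


def lookup(table, x, y):
    """Return C(x, y), with C(-1, *) and C(*, -1) defined as 0."""
    return 0 if x < 0 or y < 0 else table[y][x]


def block_sums_3x3(table, x, y, w, h):
    """Sums of nine w-by-h blocks tiled 3x3 with upper left pixel (x, y), from 16 reads."""
    xs = [x - 1 + i * w for i in range(4)]  # four vertical grid lines
    ys = [y - 1 + j * h for j in range(4)]  # four horizontal grid lines
    points = [[lookup(table, px, py) for px in xs] for py in ys]  # the 16 vertices
    return [[points[j + 1][i + 1] - points[j][i + 1] - points[j + 1][i] + points[j][i]
             for i in range(3)]
            for j in range(3)]


# Example data: a 6x6 image whose pixel at (x, y) has the value x + 6 * y.
image = [[x + 6 * y for x in range(6)] for y in range(6)]
sums = block_sums_3x3(build_table(image), 0, 0, 2, 2)
print(sums[0])  # [14, 22, 30]
```

Because adjacent blocks share vertices, nine block sums need only 16 reads rather than the 36 that nine independent four-point lookups would require.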
However, feature amounts other than those based on the integral image are also used in recognition processing. For example, there is a difference feature that is calculated as the difference between arbitrary pixels within an image. In a case of designing an apparatus for recognition processing that can handle various feature amounts, including not only those based on an integral image but also a difference feature and the like, and that is capable of high-speed processing, it is necessary to speed up the various memory reading patterns corresponding to the respective feature amounts.
The above-described methods, however, are directed to speeding up the calculation of specific feature amounts, and cannot be applied to various other feature amounts. Moreover, if a restriction is imposed on the shape of each local region for the purpose of speedup, it affects recognition accuracy in some cases, and it is therefore important to eliminate the restriction.