1. Field of the Invention
The present invention relates to an image processing apparatus and an image processing method that perform a filter operation process on pixel data of an image stored in an image memory by scanning the pixel data using a filter kernel, and to a program for causing a computer to execute the image processing method.
2. Description of the Related Art
Conventionally, pixel arrays are two-dimensional, particularly in the field of image processing, and therefore two-dimensional data operations, such as two-dimensional filter processes, are performed frequently. With respect to such computations, in Japanese Patent Laid-Open No. 2004-13873 (Patent Document 1), multiple operators, such as product-sum operators, are prepared, and pixel data of an operation target image (the image on which an operation is to be performed) is supplied to the operators, shared among them, and processed concurrently. This is done in an attempt to achieve high-speed operation processing and efficient use of the image data of the operation target image.
Patent Document 1 uses an operation apparatus (image processing apparatus) configured as shown in, for example, FIG. 8.
FIG. 8 is a block diagram illustrating an example of the configuration of an operation apparatus (image processing apparatus) that performs a two-dimensional filter operation on image data, as indicated in the conventional example. The operation apparatus shown in FIG. 8 can concurrently perform filter operation processes on multiple pixels (in FIG. 8, 4 pixels) of an output image.
The operation apparatus (image processing apparatus) shown in FIG. 8 is configured of a memory 500 that holds pixel data of an operation target image, a shift register 501 capable of parallel input/parallel output, and product-sum operators 560 to 563.
The product-sum operators 560 to 563 are configured of multipliers 530 to 533, adders 540 to 543, and registers 550 to 553. The registers 550 to 553 respectively store the results of the product-sum operations produced using the multipliers 530 to 533 and adders 540 to 543.
The shift register 501 is configured of selectors 510 to 515 that select whether to load data in parallel or shift the data, and registers 520 to 525.
FIG. 9 is a schematic diagram illustrating an input image (operation target image), a filter kernel, and an operation output image (image resulting from the operation) in the case where filter operation processing is performed using the operation apparatus shown in FIG. 8. To be more specific, 901 indicates the input image (operation target image), 902 indicates the filter kernel, and 903 indicates the operation output image (image resulting from the operation). Meanwhile, FIG. 10 is a time chart for when the operation apparatus of FIG. 8 is run using the images and so on illustrated in FIG. 9.
Operations performed by the conventional image processing apparatus shall be described using FIGS. 8, 9, and 10.
First, at time t0 in FIG. 10, pixel data D00, D10, and so on up to D50, of the operation target image within the input image 901, are read out from the memory 500. At this time, the selectors 510 to 515 select the inputs from the memory 500 and output these inputs to the registers 520 to 525, respectively.
Next, at time t1 in FIG. 10, the pixel data D00, D10, and so on up to D50 outputted by the selectors 510 to 515 are loaded into the registers 520 to 525, respectively, and at the same time, the pixel data D00, D10, D20, and D30 are outputted to the product-sum operators 560 to 563. Simultaneously, a filter coefficient W00 in the filter kernel 902 is outputted to the product-sum operators 560 to 563, and the product-sum operation is performed thereby. At this time, the outputs of the selectors are switched so as to select the outputs of the previous registers.
Next, at time t2 in FIG. 10, the pixel data D10, and so on up to D50 outputted by the selectors 510 to 514 are shifted to the registers 520 to 524, respectively, and at the same time, the pixel data D10, D20, D30, and D40 are outputted to the product-sum operators 560 to 563. Simultaneously, a filter coefficient W10 in the filter kernel 902 is outputted to the product-sum operators 560 to 563, and the product-sum operation is performed thereby along with the results obtained thus far (the results held in the registers 550 to 553). At this time, the outputs of the selectors do not change, with the output of the previous registers being selected.
Next, at time t3 in FIG. 10, the pixel data D20 and so on up to D50 outputted by the selectors 510 to 513 are shifted to the registers 520 to 523, respectively, and at the same time, the pixel data D20, D30, D40, and D50 are outputted to the product-sum operators 560 to 563. Simultaneously, a filter coefficient W20 in the filter kernel 902 is outputted to the product-sum operators 560 to 563, and the product-sum operation is performed thereby along with the results obtained thus far (the results held in the registers 550 to 553). At this time, the outputs of the selectors are switched so as to select the input from the memory 500.
By repeating these operations, filter operation results R11 to R41 are stored in the registers 550 to 553 at time t10. Furthermore, by repeating these operations while causing the filter kernel to scan the operation target image, it is possible to perform the filter operation process across the entire surface of the input image.
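The sequence from t0 to t10 can be sketched in software as follows. This is a minimal illustrative model, not the circuit of Patent Document 1: the function name `conventional_block`, the `image[y][x]` and `kernel[ky][kx]` indexing, and the 3×3 kernel size used in the check are assumptions made here for illustration. The list `regs` stands in for the registers 520 to 525, and `acc` stands in for the registers 550 to 553.

```python
def conventional_block(image, kernel, x0, y0, lanes=4):
    """Model one block of `lanes` horizontally adjacent output pixels.

    image[y][x] and kernel[ky][kx] are plain nested lists (an assumption of
    this sketch). Returns the accumulated product-sum result of each lane.
    """
    kh, kw = len(kernel), len(kernel[0])
    acc = [0] * lanes  # accumulators, one per product-sum operator
    for ky in range(kh):
        # Parallel load: the selectors pick the memory inputs, so one row
        # segment of lanes + kw - 1 pixels enters the shift register.
        regs = list(image[y0 + ky][x0:x0 + lanes + kw - 1])
        for kx in range(kw):
            w = kernel[ky][kx]  # one coefficient is broadcast to every lane
            for i in range(lanes):
                acc[i] += w * regs[i]  # lane i sees register i
            # Shift: each register takes the value of its neighbour.
            regs = regs[1:] + [0]
    return acc


if __name__ == "__main__":
    img = [[x + 10 * y for x in range(6)] for y in range(6)]
    ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
    print(conventional_block(img, ones, 0, 0))
```

With a 3×3 kernel, each call performs nine broadcast product-sum steps, which corresponds to the cycles between the first load and the point at which the four results are held in the accumulators.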
However, in Patent Document 1, horizontally adjacent pixels of the operation output image are always computed and outputted concurrently, and thus there is a problem that a decimating filter operation process (sub-sampling filter operation process) cannot be carried out efficiently.
The “decimating filter operation process” (“sub-sampling filter operation process”) refers to an operation process that decimates the operation output image rather than decimating the operation target image and then performing a filter operation process. In normal filter operation processes, the operation output image is obtained by performing the filter operation process while shifting the filter kernel one pixel at a time with respect to the operation target image. As opposed to this, in the sub-sampling filter operation process, the operation output image is obtained by performing the filter operation process while shifting the filter kernel multiple pixels (for example, two pixels) at a time.
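The difference between the two processes can be illustrated with a short sketch. The function name `filter2d`, its `stride` parameter, and the absence of border padding are assumptions made here for illustration. The sketch shows that shifting the kernel by `stride` pixels at a time yields the same values as computing the normal output and then discarding pixels, while evaluating the kernel at fewer positions.

```python
def filter2d(image, kernel, stride=1):
    """Filter image[y][x] with kernel[ky][kx], moving the kernel `stride`
    pixels at a time in both directions (no border padding)."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    out = []
    for y in range(0, h - kh + 1, stride):
        row = []
        for x in range(0, w - kw + 1, stride):
            row.append(sum(kernel[ky][kx] * image[y + ky][x + kx]
                           for ky in range(kh) for kx in range(kw)))
        out.append(row)
    return out


if __name__ == "__main__":
    img = [[x + 10 * y for x in range(6)] for y in range(6)]
    k = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
    full = filter2d(img, k)            # normal filter operation process
    sub = filter2d(img, k, stride=2)   # two-pixel shift
    # Same values as decimating the normal output afterwards.
    print(sub == [r[::2] for r in full[::2]])
```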
For example, a sub-sampling filter operation process that shifts the filter kernel by two pixels at a time in both the horizontal and vertical directions (called a “two-to-one sub-sampling filter operation process” here) requires only ¼ of the computational amount of a normal filter operation process, making high-speed operations possible.
For example, when a two-to-one sub-sampling filter operation process is performed on the operation target image illustrated in the input image 901, the operation output image illustrated in FIG. 11 is obtained (the vertical/horizontal size of the operation output image is approximately half that of the operation target image).
However, if this sub-sampling filter operation process is performed using the apparatus of Patent Document 1, four horizontally adjacent pixels of the operation output image are outputted concurrently (for example, R11, R21, R31, and R41 are outputted simultaneously), and the output results must then be decimated after they have been computed. Therefore, the technique of Patent Document 1 cannot effectively reduce the computational amount in the horizontal direction (of course, the computational amount can be reduced in the vertical direction, which means that the computational amount is ½ that of a normal filter operation process). In this case, the results of the operations performed by the product-sum operators 561 and 563 (R21 and R41) are ultimately decimated, and are thus wasted.
In other words, even if the normal filter operation process is replaced with a sub-sampling filter operation process in order to reduce the computational amount of the filter operation process, the conventional technique achieves only part of the expected reduction. For example, even if a two-to-one sub-sampling filter operation process is performed with the goal of reducing the computational amount to ¼, the actual computational amount can only be reduced to ½.
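The ¼ and ½ figures can be checked by counting product-sum operations. The following sketch is illustrative only: the function name `mac_counts`, the 10×10 image size, and the assumption that the conventional apparatus always computes every horizontal position within each block of four lanes are introduced here and are not taken from Patent Document 1.

```python
def mac_counts(width, height, kw=3, kh=3, stride=2, lanes=4):
    """Return product-sum counts as (normal, ideal sub-sampled, conventional)."""
    out_w = width - kw + 1             # output positions per row, no padding
    out_h = height - kh + 1            # output rows, no padding
    per_output = kw * kh               # product-sum steps per output pixel
    sub_w = -(-out_w // stride)        # ceiling division
    sub_h = -(-out_h // stride)
    normal = out_w * out_h * per_output
    ideal = sub_w * sub_h * per_output
    # Assumed conventional behaviour: rows can be skipped, but all lanes in
    # each horizontal block are computed and decimated afterwards.
    blocks = -(-out_w // lanes)
    conventional = blocks * lanes * sub_h * per_output
    return normal, ideal, conventional


if __name__ == "__main__":
    normal, ideal, conventional = mac_counts(10, 10)
    print(ideal / normal, conventional / normal)
```

For a 10×10 image and a 3×3 kernel, the counts are 576, 144, and 288 product-sum operations, giving ratios of ¼ for the ideal two-to-one process and ½ for the conventional apparatus.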