During image processing, such as image recognition, image classification, and image searches, a process called “convolution” may be performed to extract features of the target image. Convolution is a sum-of-products operation in which a kernel corresponding to a feature to be extracted is superimposed on the target image and applied repeatedly while being shifted across the target image. Through convolution, a feature map is generated from the target image and the kernel being used. One element in the feature map is calculated by superimposing the kernel on a partial region that is centered on one element in the target image and summing the products of the corresponding elements in the partial region and the kernel.
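The fundamental computation method described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function name `convolve_direct` is hypothetical, the image and kernel are assumed to be lists of lists of numbers, the kernel is assumed to have odd dimensions, and elements of the partial region that fall outside the target image are assumed to be treated as zero (the text does not specify how the border is handled).

```python
def convolve_direct(image, kernel):
    """Sum-of-products feature map, one output element per image element.

    Assumptions (not from the source text): odd-sized kernel, and
    positions outside the image contribute zero.
    """
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ch, cw = kh // 2, kw // 2  # offset of the kernel centre

    feature_map = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            total = 0
            # Superimpose the kernel on the region centred on (r, c).
            for i in range(kh):
                for j in range(kw):
                    rr, cc = r + i - ch, c + j - cw
                    if 0 <= rr < h and 0 <= cc < w:
                        total += image[rr][cc] * kernel[i][j]
            feature_map[r][c] = total
    return feature_map
```

With a 3 x 3 kernel whose only non-zero element is a 1 at the centre, the feature map equals the target image, which is a convenient sanity check for the indexing.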
However, in place of the fundamental computation method described above, convolution may also be performed via an alternative computation method that uses a Fourier transform. In this alternative computation method, the target image and the kernel are both converted into frequency domain data by way of a Fourier transform, and corresponding elements in the frequency domain data of the target image and the frequency domain data of the kernel are then multiplied. After this, the multiplication results are converted into spatial domain data by an inverse Fourier transform. According to the convolution theorem, the spatial domain data outputted by this alternative computation method will match the feature map outputted by the fundamental computation method. The alternative computation method may be executed at higher speed than the fundamental computation method.
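The relationship stated by the convolution theorem can be checked numerically with the sketch below. Several points are assumptions made for illustration only: the function names are hypothetical, a naive discrete Fourier transform is used in place of a fast Fourier transform so that the example needs only the standard library, and the kernel is assumed to have been zero-padded to the size of the image beforehand. Note that element-wise multiplication in the frequency domain corresponds exactly to circular convolution, in which indices wrap around the image border and the kernel is index-reversed relative to a plain sliding sum-of-products, so `circular_convolve` is written in that form.

```python
import cmath


def dft2(x, inverse=False):
    """Naive 2-D discrete Fourier transform (illustrative, not fast)."""
    h, w = len(x), len(x[0])
    sign = 1 if inverse else -1
    out = [[0j] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            total = 0j
            for r in range(h):
                for c in range(w):
                    angle = sign * 2 * cmath.pi * (u * r / h + v * c / w)
                    total += x[r][c] * cmath.exp(1j * angle)
            out[u][v] = total / (h * w) if inverse else total
    return out


def circular_convolve(image, kernel):
    """Direct circular convolution; kernel has the same size as image."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            total = 0.0
            for i in range(h):
                for j in range(w):
                    total += kernel[i][j] * image[(r - i) % h][(c - j) % w]
            out[r][c] = total
    return out


def fourier_convolve(image, kernel):
    """Transform both inputs, multiply element-wise, transform back."""
    h, w = len(image), len(image[0])
    image_freq = dft2(image)
    kernel_freq = dft2(kernel)
    product = [[image_freq[u][v] * kernel_freq[u][v] for v in range(w)]
               for u in range(h)]
    spatial = dft2(product, inverse=True)
    # The imaginary parts are rounding noise for real-valued inputs.
    return [[z.real for z in row] for row in spatial]
```

For real-valued inputs, the two functions agree up to floating-point rounding. The speed advantage mentioned above comes from replacing the naive transform with a fast Fourier transform, which this sketch deliberately omits.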
An image searching apparatus that searches for a desired image from compressed image data has been proposed. The proposed image searching apparatus separates a direct current (or “DC”) component of each block from image data that was compressed using a discrete cosine transform (or “DCT”). Based on the separated DC component, the image searching apparatus extracts blocks that satisfy an inputted search condition and restores a bitmap image from the extracted blocks using an inverse discrete cosine transform (or “IDCT”). When the restored bitmap image satisfies the search condition, the image searching apparatus outputs the image as the search result.
A component shape recognizing method that recognizes the shapes of components from compressed image data has also been proposed. The proposed component shape recognizing method extracts partial images in which an edge or corner of a component appears, performs a discrete cosine transform on the extracted partial images, and generates compressed data from which coefficients of high frequency components have been excluded. The compressed data is inputted into a neural network, which has been trained in advance through supervised learning, and edge or corner shapes are recognized.
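The step of transforming a partial image and excluding high frequency coefficients can be illustrated with the sketch below. It is not taken from the cited publication: the function names `dct2` and `keep_low_frequencies` are hypothetical, the partial image is assumed to be a square list of lists, an orthonormal type-II DCT is assumed, and "excluding high frequency components" is interpreted here as keeping only the top-left `keep` x `keep` coefficients.

```python
import math


def dct2(block):
    """Orthonormal 2-D DCT-II of a square block (illustrative, O(n^4))."""
    n = len(block)

    def alpha(k):
        # Normalisation factor of the orthonormal DCT-II.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out


def keep_low_frequencies(coeffs, keep):
    """Keep the top-left keep x keep coefficients and drop the rest."""
    return [row[:keep] for row in coeffs[:keep]]
```

Under this normalisation, the coefficient at position (0, 0) is the DC component and equals the block size multiplied by the mean of the block, so a constant 4 x 4 block of value 3 yields a DC coefficient of 12 and all other coefficients are zero up to rounding. The reduced coefficient array is what would be passed to the neural network as input in the method described above.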
See, for example, Japanese Laid-open Patent Publication No. 09-44519 and Japanese Laid-open Patent Publication No. 10-187978.
During convolution using the Fourier transform described above, the process that generates a feature map from the target image and a kernel produces frequency domain data corresponding to the target image, frequency domain data corresponding to the kernel, and the result of multiplying the two. This raises the problem that the amount of data stored during the convolution operation increases, and for a neural network with multiple layers, memory consumption increases accordingly.
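The increase in stored data can be made concrete with a rough count. The sketch below is only an estimate under stated assumptions, none of which come from the text: the function names are hypothetical, the kernel is assumed to be zero-padded to the image size before the transform, every intermediate array is assumed to be held in memory at the same time, each complex value is counted as two real values, and no symmetry of the transform of real-valued data is exploited.

```python
def stored_values_direct(height, width, kernel_size):
    """Real values held by the fundamental method:
    target image, kernel, and feature map."""
    return height * width + kernel_size * kernel_size + height * width


def stored_values_fourier(height, width):
    """Real values held by the Fourier-based method under the
    assumptions listed in the accompanying text."""
    real_arrays = 3     # target image, padded kernel, feature map
    complex_arrays = 3  # image spectrum, kernel spectrum, their product
    return (real_arrays + 2 * complex_arrays) * height * width
```

For a 32 x 32 image and a 3 x 3 kernel, these counts come to 2,057 values for the fundamental method and 9,216 values for the Fourier-based method, and the gap recurs at every convolution layer of a multilayer network.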