Conventional image processor architectures operate frame by frame: frames are first captured, then converted to the digital domain, and finally processed. This approach benefits from the enormous computational power of digital processors in scaled-down technologies, but it is not the most efficient one in terms of either processing speed (time lag from inputs to actions) or energy consumption.
Scale- and rotation-invariant feature detectors are used in image processing tasks such as object detection and classification, image retrieval, and image registration or tracking. Their invariant nature yields repeatability, which makes it possible to deal with occlusion and with scenes acquired under different conditions, such as different illumination or view angles. Modern scale- and rotation-invariant feature detectors such as the Scale Invariant Feature Transform (SIFT) are complex image processing techniques with a high computational cost, which makes them difficult to implement using regular microprocessors and software. A key part of this algorithm is the extraction of Gaussian pyramids, which comprise a set of images of different resolutions called octaves. Every octave is the result of a ¼ downscaling of the previous octave. In turn, every octave is made up of a series of images called scales. Every scale is the result of applying a Gaussian filter of a given width (σ-level) to a previous scale.
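The octave and scale structure described above can be illustrated with a minimal sketch. It is not the SIFT implementation of any cited work. It assumes that a ¼ downscaling means halving both width and height, that the same σ is applied at every filtering step, and that each new octave is seeded from the last scale of the previous one. The function names, parameter names, and default values are hypothetical.

```python
import numpy as np


def gaussian_kernel(sigma):
    """1-D normalized Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(np.ceil(3.0 * sigma)))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-(x * x) / (2.0 * sigma * sigma))
    return k / k.sum()


def gaussian_blur(image, sigma):
    """Separable Gaussian filter with edge replication at the borders."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    padded = np.pad(image, r, mode="edge")
    # Filter along rows first, then along columns; "valid" mode removes
    # exactly the padding, so the output has the input's shape.
    rows = np.apply_along_axis(
        lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda v: np.convolve(v, k, mode="valid"), 0, rows)


def build_pyramid(image, num_octaves=3, scales_per_octave=4, sigma=1.6):
    """Return a list of octaves; each octave is a list of scales."""
    pyramid = []
    base = image.astype(np.float64)
    for _ in range(num_octaves):
        scales = [base]
        # Assumption: each scale is the previous scale filtered again
        # with the same sigma.
        for _ in range(scales_per_octave - 1):
            scales.append(gaussian_blur(scales[-1], sigma))
        pyramid.append(scales)
        # Assumption: keep every second row and column, which leaves
        # one quarter of the pixels for the next octave.
        base = scales[-1][::2, ::2]
    return pyramid
```

Under these assumptions, calling build_pyramid on a 64 x 48 image with the default parameters yields three octaves of four scales each, with octave sizes 64 x 48, 32 x 24, and 16 x 12. Every scale costs a full filtering pass over its octave, which gives a sense of why the pyramid dominates the computational load.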
A feature detector algorithm was proposed in Harris and Stephens (Proc. Alvey Vis. Conf., Manchester, pp. 147-152, 1988), whose main advantage is that the computations needed to obtain image features, called Harris corners, are not time-consuming. However, it gives poor results when dealing with changes in scale and rotation in images.
The Scale Invariant Feature Transform (SIFT) algorithm, presented in U.S. Pat. No. 6,711,293, is an image processing method to obtain scale- and rotation-invariant features from a digital image. Its main limitation lies in the computational requirements of the method, which make it difficult to use in applications requiring real-time operation (e.g., to achieve a frame rate of 24 frames per second or higher).
To deal with the limitations of the original SIFT algorithm, a Field Programmable Gate Array (FPGA) implementation was presented by Bonato et al. (IEEE Trans Circuits Syst., 18(12), pp. 1703-1712, 2008). However, the image acquisition is not integrated within the processing cores, which slows down the processing. Additionally, the high power consumption of FPGAs makes such an implementation hard to integrate into a low-power system for computer vision tasks.
Yao et al. (International Conference on Field-Programmable Technology, 2009. FPT 2009) introduced an FPGA implementation of the SIFT algorithm. As with Bonato et al., this disclosure does not take image acquisition into account in the development of the system.
Kiyoyama et al. (IEEE International Conference on 3D System Integration, 2009) studied a parallel signal processing circuit that includes a pixel circuit and a parallel analog-to-digital converter (ADC) with hierarchical correlated double sampling (CDS). This disclosure focuses on image acquisition but does not address how to create the processing core.