A Local Extreme Point (LEP) is present in a pixel position when an image data value of the pixel position is a maximum or minimum in relation to image data values of at least two pixel positions that are closest neighbors to said pixel position. There can be different reasons for wanting to identify a pixel as a LEP, i.e. to determine or decide that a pixel is a LEP. For example, an algorithm for Time-to-impact (TTI) estimation, see e.g. WO 2013/107525, is based on identification of LEPs. TTI estimation aims at estimating the time when a possible collision may occur between a camera and an object seen by the camera when these are moving relative to each other, towards or away from each other, the camera imaging the object by a sequence of images as the object relatively approaches or moves away from the camera. The solution underlying said patented TTI estimation algorithm is based on an algorithm that estimates the “inverse” of the motion, i.e. how long an image feature stays at the same pixel position. The algorithm is based on identifying LEPs for this purpose. Since operations could be performed independently on pixel positions and since the LEPs relate to very local data, computations could be made in parallel, and the TTI algorithm was therefore well suited for implementation on hardware architectures for parallel computing, for example Single Instruction Multiple Data (SIMD) type of processors. In particular, implementations were well suited for parallel architectures with processing capacity directly on or in close connection with image sensing circuitry, or even in close connection with single sensing elements. For example, the inventors could show that their LEP based approach with the TTI estimation algorithm drastically reduced computational load and also lent itself naturally to implementation using a Near-Sensor Image Processing (NSIP) architecture, e.g. on an NSIP type of processor, which enables very cost efficient implementation and low power consumption.
NSIP is a concept described for the first time about 30 years ago, in which an optical sensor array and a specific low-level processing unit were tightly integrated into a hybrid analog-digital device. Despite its low overall complexity, numerous image processing operations can still be performed at high speed, competing favorably with state-of-the-art solutions.
FIG. 1 is a schematic block diagram of an architecture of the first commercial implementation of the NSIP concept, the LAPP1100 chip. It comprises 128 processor slices, one per pixel. Besides the light sensing circuitry, each slice contains a tiny arithmetic unit and 14 bits of storage. Image data can be read out from a shift register but also tested for the occurrence of one or more set bits (Global-OR) or the total number of set bits (COUNT) within the 128-bit line image. There is no Analog to Digital (A/D) converter on board. Instead, if A/D conversion is part of an application based on the LAPP1100, it can be implemented in software using one of several different principles. One is based on utilizing the approximately linear discharge that each CMOS photo diode exhibits during exposure to light. A selected number of registers together with an arithmetic unit may then be used to implement parallel counters that, for each pixel, stop counting when the photo diode reaches a predefined level. However, A/D conversion is often not necessary. Many tasks, such as filtering for certain features or performing adaptive thresholding, may just as easily be done by utilizing a pixel readout circuit of the chip in combination with a small bit processor available at each pixel. Experiences related to the LAPP1100 have been summarized and published under the name of NSIP.
FIG. 2 schematically shows basic light sensing parts a-f of the LAPP1100 for providing image data of a pixel. The capacitor b represents an inherent capacitance of the photo diode c. When the switch a is on, the diode pre-charges to its full value. As the switch is turned off and the photo diode discharges due to photo-induced current, the voltage on the input of the comparator d decreases. At some level, this voltage passes a reference voltage e and an output f switches its logical value corresponding to image data of the pixel. The output, i.e. the image data that is a bit value, may then be processed in the bit-serial arithmetic-logical unit g. The light sensing parts a-f may be considered to correspond to a light sensing element or pixel readout circuit, and the bit-serial arithmetic-logical unit g may be considered to correspond to a computing element that also may be named a pixel processor or bit processor. Many tasks, such as filtering for certain features, histogramming or doing adaptive thresholding, can be performed by utilizing the pixel readout circuit in combination with the bit processor available for each pixel. The output from the pixel readout can be referred to as binarized image data when it represents information that the image intensity is above or below the threshold. However, the duration from pre-charge to output switching includes full, or at least more, information of the image intensity, which can be utilized by the processor for A/D conversion or other intensity-related operations. The concept naturally gives a high dynamic range as well as a very high frame rate.
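The relation between the binarized comparator output and the duration from pre-charge to output switching may be illustrated by a minimal simulation sketch. It is not the LAPP1100 implementation: the function name simulate_pixels, the integer voltage units, the ideal linear discharge and the fixed time step are all assumptions made for illustration only.

```python
def simulate_pixels(discharge_per_step, v_precharge=100, v_ref=50, steps=16):
    """Simulate pre-charge, discharge and comparator output for a row of pixels.

    discharge_per_step: assumed voltage drop per time step for each pixel,
    larger for brighter pixels (idealized linear discharge, integer units).
    Returns (counters, history), where counters[i] is the number of steps
    pixel i needed before its comparator switched, and history[t] is the
    binarized line image at step t.
    """
    counters = [0] * len(discharge_per_step)
    history = []
    for t in range(steps):
        # Voltage at the comparator input after t steps of discharge.
        voltages = [v_precharge - d * t for d in discharge_per_step]
        # Comparator output: 1 once the voltage has passed the reference.
        outputs = [1 if v <= v_ref else 0 for v in voltages]
        history.append(outputs)
        # Per-pixel counters keep counting until the comparator switches.
        counters = [c + (1 - o) for c, o in zip(counters, outputs)]
    return counters, history


if __name__ == "__main__":
    counters, history = simulate_pixels([25, 10, 5])
    print(counters)    # a brighter pixel gives a smaller count
    print(history[3])  # binarized line image at step 3
```

In this sketch, a single entry of history corresponds to binarized image data, whereas the counter values carry the intensity information that could be used for a software A/D conversion.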
When explaining the processor part of the NSIP architecture it may be convenient to view it as a single processor with a word length that is equal to the number of pixels in its sensor part. The main part of the processor is the register file containing register words of the size of said word length. A second register is the accumulator. Later implementations of NSIP also contain other and/or additional registers to enhance certain types of processing. A first class of simple operations is “point operations” such as AND, OR etc. They typically apply between a register and the accumulator, modifying the accumulator to hold the new result. A second class of typically very useful operations is the “local operations” by a Neighborhood Logical Unit (NLU) in which a 3-element template may be applied simultaneously over a register to form a low-level filtering operation. A 1-dimensional example of such an operation is an operation “(01x) R1” which compares the template (01x) against each position in the word and generates a logical 1 where the template fits and a logical 0 otherwise. This particular template checks that the bit position itself has the value 1 while its left neighbor is 0 and the right neighbor is allowed to be either 1 or 0, i.e. “don't care”. This local operator may e.g. be useful when it comes to finding edges in an intensity image and also for finding local extreme points.
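The template operation described above may be sketched in software as follows. The function name nlu, the list-of-bits register representation (leftmost bit first) and the treatment of positions outside the register as 0 are assumptions for illustration and are not taken from any NSIP instruction set.

```python
def nlu(template, register, border=0):
    """Apply a 3-element template (left, center, right) at every bit position.

    template: string of three characters from '0', '1', 'x' ('x' = don't care).
    register: list of bits, leftmost bit first.
    border:   assumed value of the neighbors outside the register.
    Returns a list with 1 where the template fits and 0 otherwise.
    """
    if len(template) != 3:
        raise ValueError("template must have exactly three elements")

    def fits(symbol, bit):
        return symbol == "x" or int(symbol) == bit

    padded = [border] + list(register) + [border]
    left, center, right = template
    return [
        1 if fits(left, padded[i])
        and fits(center, padded[i + 1])
        and fits(right, padded[i + 2])
        else 0
        for i in range(len(register))
    ]


if __name__ == "__main__":
    r1 = [0, 1, 1, 0, 0, 1, 0, 1]
    print(nlu("01x", r1))  # marks the leftmost bit of each run of 1's
```

With the template (01x), the result marks each position that holds a 1 and has a 0 to its left, i.e. the left edge of each run of 1's in the register.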
A third class of operations is “global operations”. These are used for many different purposes, such as to find the leftmost or rightmost 1 in a register, to zero all bits from a certain position, or to set a group of consecutive zero bits. The global operations are all derived from the mark operation, which uses two input registers as operands. Set bits in the first register are viewed as pointers to objects in the second register. Objects are connected sets of 1's. Objects that are pointed to will be kept and forwarded to the result.
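The mark operation may be sketched as below. The function name mark and the list-of-bits representation are illustrative assumptions, and it is further assumed that an object counts as pointed to when at least one set pointer bit coincides with any bit of the object.

```python
def mark(pointers, objects):
    """Keep the objects (connected runs of 1's) that are pointed to.

    pointers: list of bits; set bits act as pointers.
    objects:  list of bits of the same length; runs of 1's are objects.
    Assumption: an object is kept if any pointer bit falls inside it.
    """
    if len(pointers) != len(objects):
        raise ValueError("registers must have the same length")
    n = len(objects)
    result = [0] * n
    i = 0
    while i < n:
        if objects[i] == 1:
            # Find the end of the current run of 1's.
            j = i
            while j < n and objects[j] == 1:
                j += 1
            # Forward the whole run if any pointer hits it.
            if any(pointers[k] for k in range(i, j)):
                for k in range(i, j):
                    result[k] = 1
            i = j
        else:
            i += 1
    return result


if __name__ == "__main__":
    objects = [1, 1, 0, 1, 1, 1, 0, 1]
    pointers = [0, 0, 0, 0, 1, 0, 0, 0]
    print(mark(pointers, objects))  # only the middle object is forwarded
```

A pointer that coincides with a 0 in the second register does not select anything in this sketch.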
With the above-mentioned operations at hand, one can implement most typical low-level image processing tasks. Instructions are issued one at a time from an external or chip-internal sequencer or microprocessor over e.g. a 16 bit bus. Processed images can e.g. be read out over the same bus or a dedicated I/O channel. However, most often it is sufficient to compute some specific scalar value such as the position of an image feature, the highest intensity value, a first order moment etc. For this reason, an NSIP architecture often contains a count status, COUNT, which is configured to always reflect the number of set bits in the accumulator, as well as a global-OR which indicates if one or more bits in the accumulator are set. Thanks to such status information, applications based on NSIP often do not need to read out complete conventional images from the chip, thus speeding up the applications considerably. As an example, the sum of all values f(i), each e.g. represented by b bits in the processors, may be found using only b COUNT operations and appropriate scaling and summing of the COUNT results.
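The summation using b COUNT operations may be illustrated by the following sketch, in which each bit plane of the values f(i) is loaded into the accumulator, counted, and the count is scaled by the weight of the bit. The function names count and sum_via_count and the representation of the accumulator as a list of bits are assumptions for illustration.

```python
def count(accumulator):
    """Software stand-in for the COUNT status: number of set bits."""
    return sum(accumulator)


def sum_via_count(values, b):
    """Sum non-negative integers, each represented by b bits, using b COUNTs.

    For bit k, the accumulator holds bit k of every value (one bit per pixel).
    COUNT of that bit plane, scaled by 2**k, is the plane's contribution.
    """
    total = 0
    for k in range(b):
        bit_plane = [(v >> k) & 1 for v in values]
        total += count(bit_plane) << k
    return total


if __name__ == "__main__":
    f = [3, 5, 0, 7, 2]
    print(sum_via_count(f, b=3))
    print(sum(f))  # reference result for comparison
```

The number of COUNT operations depends only on the number of bits b and not on the number of pixels.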
When implementing embodiments herein on the NSIP architecture introduced above, LEPs are extracted from image data. One of the simplest operations to extract a LEP is to find local minima in a 3×1 neighborhood. This means that if a center pixel has a lower intensity compared to both its neighbors, then this pixel is a LEP. As recognized, finding such local minima can be accomplished using a basic NSIP NLU-operation but can also be done using other sequential operations. Also, thanks to the NSIP concept explained above, there will be a high dynamic range, which facilitates finding local minimum values in both bright and dark regions.
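As a plain sequential illustration of the 3×1 local minimum criterion, operating directly on intensity values rather than on NSIP registers, a minimal sketch could look as follows. The function name local_minima_3x1 is hypothetical, and it is assumed that the first and last pixels of a row, which lack one neighbor, are never reported as LEPs.

```python
def local_minima_3x1(row):
    """Return the indices of pixels that are strictly lower than both neighbors.

    row: list of intensity values along one image row.
    Assumption: border pixels, which have only one neighbor, are skipped.
    """
    return [
        i
        for i in range(1, len(row) - 1)
        if row[i] < row[i - 1] and row[i] < row[i + 1]
    ]


if __name__ == "__main__":
    row = [9, 4, 7, 7, 5, 8, 2, 6]
    print(local_minima_3x1(row))
```

The strict comparison means that a pixel in a flat region, with the same value as a neighbor, is not counted as a LEP in this sketch.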
The following disclosures are examples of some further implementations based on the NSIP concept.
Eklund J-E, Svensson C, and Aström A, “Implementation of a Focal Plane Processor. A realization of the Near-Sensor Image Processing Concept”, IEEE Trans. VLSI Systems, 4, (1996).
El Gamal A., “Trends in CMOS Image Sensor Technology and Design”, International Electron Devices Meeting Digest of Technical Papers, pp. 805-808 (2002).
Guilvard A., et al., “A Digital High Dynamic Range CMOS Image Sensor with Multi-Integration and Pixel Readout Request”, in Proc. of SPIE-IS&T Electronic Imaging, 6501, (2007).
FIG. 3 is a diagram from a simulation, provided just to illustrate LEPs in an image and for better understanding of LEPs. A row from a standard image has been taken and the LEPs have been marked. The LEPs have been identified in a local 3×1 neighborhood and correspond to local minima in this case. An NSIP operation to find the LEPs may be defined as (101), which means that if a center pixel has not passed its threshold but its two closest, i.e. nearest, neighbors have both passed the threshold, then the center pixel is a LEP that corresponds to a local minimum point. In the figure, part of the image has been magnified to better illustrate the LEPs, indicated as black dots. Each row from the image used in the shown figure consisted of 512 pixels, and in the shown particular case there are about 70 LEPs along a selected row.
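How the (101) operation may find local minima as pixels pass the threshold over time can be illustrated by the sketch below. It is not the simulation behind FIG. 3. The function name leps_from_threshold_times, the integer time steps and the input representation, one time-to-threshold value per pixel with darker pixels having larger values, are assumptions for illustration, as is the choice never to report border pixels as LEPs.

```python
def leps_from_threshold_times(times):
    """Flag pixels matched by the (101) template at any time step.

    times: per-pixel integer time step at which the comparator output
    switches, i.e. the pixel passes its threshold (darker = later).
    Returns a list with 1 for each pixel identified as a LEP.
    """
    n = len(times)
    lep = [0] * n
    if n < 3:
        return lep
    for t in range(max(times) + 1):
        # Binarized line image at time t: 1 = has passed the threshold.
        passed = [1 if ti <= t else 0 for ti in times]
        # (101): both neighbors have passed, the center pixel has not.
        for i in range(1, n - 1):
            if passed[i - 1] == 1 and passed[i] == 0 and passed[i + 1] == 1:
                lep[i] = 1
    return lep


if __name__ == "__main__":
    times = [2, 5, 3, 3, 4, 1, 6, 2]
    print(leps_from_threshold_times(times))
```

In this sketch, a pixel is flagged exactly when its time-to-threshold is strictly larger than that of both nearest neighbors, which corresponds to a local intensity minimum under the stated assumptions.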