Image processing typically involves the processing of pixel values that are organized into an array. Here, a spatially organized two dimensional array captures the two dimensional nature of images (additional dimensions may include time (e.g., a sequence of two dimensional images), depth (e.g., of an intermediate layer in a convolutional neural network) and data type (e.g., colors)). In a typical scenario, the arrayed pixel values are provided by a camera that has generated a still image or a sequence of frames to capture images of motion. Traditional image processors typically fall on either side of two extremes.
A first extreme performs image processing tasks as software programs executing on a general purpose processor or general purpose-like processor (e.g., a general purpose processor with vector instruction enhancements). Although the first extreme typically provides a highly versatile application software development platform, its use of finer grained data structures combined with the associated overhead (e.g., instruction fetch and decode, handling of on-chip and off-chip data, speculative execution) ultimately results in larger amounts of energy being consumed per unit of data during execution of the program code.
A second, opposite extreme applies fixed function hardwired circuitry to much larger blocks of data. The use of larger (as opposed to finer grained) blocks of data applied directly to custom designed circuits greatly reduces power consumption per unit of data. However, the use of custom designed fixed function circuitry generally results in a limited set of tasks that the processor is able to perform. As such, the widely versatile programming environment (that is associated with the first extreme) is lacking in the second extreme.
A technology platform that provides for both highly versatile application software development opportunities combined with improved power efficiency per unit of data remains a desirable yet missing solution.