Embedded imaging applications on cost-sensitive platforms usually involve digital data processors such as general-purpose microprocessors, digital signal processors or specialized image coprocessors working with a limited amount of on-chip memory. For most of these applications the on-chip memory is not large enough to hold an entire image frame. These applications therefore typically use block processing, in which one small data block at a time passes through the stages of the processing algorithm. Image processing often has spatial dependency: each output pixel depends on a neighborhood of input pixels. Thus when an image is partitioned into same-sized output blocks, each output block requires a larger input block. When output blocks are produced in raster-scan order, the input blocks for horizontally adjacent output blocks overlap. Furthermore, the image processing flow from input image to output image often involves multiple spatially dependent steps, so the required input margin accumulates from step to step.
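The relationship between output block size, input block size and horizontal overlap can be illustrated with a short sketch. It assumes a chain of odd-sized square filters with no subsampling, which is only one possible form of spatial dependency. The function name `input_block_size`, the 16x16 block and the 3-tap and 5-tap kernel sizes are hypothetical values chosen for illustration.

```python
def input_block_size(out_w, out_h, kernel_sizes):
    """Input extent needed to produce an out_w x out_h output block.

    Assumption: each processing step is an odd-sized square filter, so a
    k-tap step needs (k - 1) // 2 extra pixels on every side.
    """
    margin = sum((k - 1) // 2 for k in kernel_sizes)
    return out_w + 2 * margin, out_h + 2 * margin, margin


# Hypothetical example: 16x16 output blocks through a 3-tap then a 5-tap filter.
out_w, out_h = 16, 16
in_w, in_h, margin = input_block_size(out_w, out_h, [3, 5])

# Output blocks advance by out_w columns, but each input block is in_w wide,
# so horizontally adjacent input blocks share in_w - out_w columns.
overlap_cols = in_w - out_w

print(f"input block: {in_w}x{in_h}, margin per side: {margin}")
print(f"columns shared by adjacent input blocks: {overlap_cols}")
```

Under these assumptions the margin grows with every additional spatially dependent step, which is why a multi-step flow makes the overlap between input blocks more significant.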
There are two conventional methods for processing and managing data arrays on-chip: over-processing and history buffering. Over-processing is simple in memory management but inefficient in computation, because data in the overlapping regions is processed again for each block. History buffering is more efficient in computation because it retains previously computed data for reuse, but it conventionally spends time moving data within the history buffer. Thus there is a need in the art for a memory management technique that is easy to implement, largely eliminates the need to move data, and thus achieves good computation efficiency.
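The trade-off between the two conventional methods can be estimated with a simple cost model. The sketch below is an illustration under stated assumptions, not a description of any particular implementation: it assumes odd-sized square filters, data reuse in the horizontal direction only, one unit of work per computed pixel, and no image-border effects. The function name `per_block_cost` and the example sizes are hypothetical.

```python
def per_block_cost(out_w, out_h, kernel_sizes):
    """Rough per-block pixel counts for over-processing vs. history buffering.

    Assumptions (illustrative only): odd square kernels, horizontal reuse
    only, image borders ignored.
    """
    margins = [(k - 1) // 2 for k in kernel_sizes]
    over_computed = 0  # pixels computed per block with over-processing
    hist_computed = 0  # pixels computed per block with a history buffer
    hist_moved = 0     # history pixels shifted per block (conventional scheme)

    for i, m in enumerate(margins):
        later = sum(margins[i + 1:])   # margin still needed by later steps
        rows_out = out_h + 2 * later   # rows this step must produce
        rows_in = rows_out + 2 * m     # rows this step reads

        # Over-processing recomputes the columns shared with the neighbor block.
        over_computed += (out_w + 2 * later) * rows_out
        # History buffering computes only the out_w new columns.
        hist_computed += out_w * rows_out
        # It keeps 2 * m input columns per step, which are conventionally
        # moved to the start of the buffer before the next block.
        hist_moved += 2 * m * rows_in

    return over_computed, hist_computed, hist_moved


# Hypothetical example: 16x16 output blocks, 3-tap then 5-tap filter.
over, hist, moved = per_block_cost(16, 16, [3, 5])
print(f"over-processing computes {over} pixels per block")
print(f"history buffering computes {hist} pixels and moves {moved}")
```

In this model the history buffer saves computation in every step that is followed by another spatially dependent step, but pays for it with data movement, which is the cost a better memory management technique would aim to remove.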