Image-processing applications typically include multiple functional processing blocks, hereinafter referred to as “nodes,” that are executed sequentially to convert raw image data into the final images presented to the user and/or to analyze the image data to extract information about the conditions or objects captured in the images. In such applications, the algorithm that governs the signal flow connecting the nodes (i.e., that manages the input, output, temporary data storage, and data transfer for the various functional blocks) generally forms the core of the application and often consumes a significant part of the processing power, in particular when implemented on a digital signal processor (DSP) or in hardware. FIG. 1 illustrates an exemplary signal flow of an algorithm for foreground “blob” detection, which may be used, e.g., to detect people, vehicles, or other objects in images. The first node 100 (‘ABS DIFF’) computes the pixel-wise absolute difference in image values (e.g., grayscale values) between an image and a background reference image. In the second node 102 (‘BINARY THRESHOLD’), the difference thus computed is compared against a fixed or adaptive threshold to produce a binarized image. The binarized image undergoes further post-processing in the ‘EROSION’ and ‘DILATION’ nodes 104, 106 to erode away noisy pixels and to enhance the binary image output. The final node 108 (‘CONNECTED LABELLING’) identifies connected pixels in the binary image and labels them as “blobs.”
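By way of illustration only, the five-node signal flow of FIG. 1 may be sketched in Python as follows. The sketch is a frame-based, unoptimized rendering under stated assumptions and is not the implementation described herein: all function names are hypothetical, images are assumed to be lists of rows of integer grayscale values, the threshold is assumed to be fixed, erosion and dilation are assumed to use a 3×3 neighborhood clipped at the image border, and connectivity is assumed to be 4-connected.

```python
from collections import deque


def abs_diff(image, background):
    """Node 100: pixel-wise absolute difference against the reference image."""
    return [[abs(p - b) for p, b in zip(row, bg_row)]
            for row, bg_row in zip(image, background)]


def binary_threshold(diff, threshold):
    """Node 102: 1 where the difference exceeds a fixed threshold, else 0."""
    return [[1 if v > threshold else 0 for v in row] for row in diff]


def _morph(binary, combine):
    """Apply min (erosion) or max (dilation) over a clipped 3x3 neighborhood."""
    h, w = len(binary), len(binary[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [binary[ny][nx]
                      for ny in range(max(0, y - 1), min(h, y + 2))
                      for nx in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = combine(window)
    return out


def erode(binary):
    """Node 104: remove isolated (noisy) foreground pixels."""
    return _morph(binary, min)


def dilate(binary):
    """Node 106: grow the foreground regions that survived erosion."""
    return _morph(binary, max)


def label_connected(binary):
    """Node 108: label 4-connected foreground regions with 1, 2, 3, ..."""
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if binary[y][x] and not labels[y][x]:
                count += 1
                labels[y][x] = count
                queue = deque([(y, x)])
                while queue:  # breadth-first flood fill of one blob
                    cy, cx = queue.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


def detect_blobs(image, background, threshold):
    """Run the nodes sequentially; returns (label image, number of blobs)."""
    diff = abs_diff(image, background)
    binary = binary_threshold(diff, threshold)
    cleaned = dilate(erode(binary))
    return label_connected(cleaned)
```

In this sketch each node allocates a complete intermediate image, which makes the input, output, and temporary storage managed by the signal flow explicit.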
Developing suitable program code to implement the signal and data flow (whether written in a low-level DSP language or a high-level language such as C or C++) is generally a daunting task for the algorithm or application programmer, and involves many levels of design optimization related to memory allocation, direct memory access, control, etc. It is, therefore, desirable to automate or semi-automate this task. Programming tools are available that auto-generate code from a diagrammatic representation of a signal flow created by the application developer in a graphical user interface (GUI). These tools generally support either sample-based or frame-based signal-flow architectures, in which the processing nodes operate on individual data samples or on entire frames, respectively. Sample-based tools are widely used for, e.g., audio-signal processing and motor control. However, they may be unsuitable for many image-processing applications for two reasons. First, such applications generally require higher sample-processing rates, e.g., because a single image already contains a large number of data samples (i.e., pixels). Second, they often include processing steps that operate on collections of samples rather than on individual samples. For example, an image-smoothing step may involve replacing each pixel with an average over a block of several pixels, and a one-dimensional Fourier transform inherently requires an entire row of the input image for each pixel of the output image. Other tools operate on entire image frames. Processing complete image frames is, however, unnecessary in many circumstances. Further, in real-world image-processing applications implemented on DSPs or other special-purpose processors with limited local memory (rather than on a general-purpose computer), frame-based architectures require frequent accesses to external (off-chip) memory that render the system inefficient.
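The image-smoothing example above can be made concrete with a minimal Python sketch, offered as an illustration only and not as any existing tool's behavior. It assumes a hypothetical function `smooth_rows`, a k×k box average with integer (floor) division, and output restricted to pixels whose full neighborhood lies inside the image. The sketch shows that such a step needs a collection of samples (k rows) for each output row, yet never needs the complete frame in memory.

```python
from collections import deque


def smooth_rows(rows, k=3):
    """Yield box-filtered rows while buffering only k input rows at a time.

    `rows` may be any iterable of equal-length rows, e.g. a generator that
    delivers one row at a time.
    """
    window = deque(maxlen=k)  # the oldest row is dropped automatically
    half = k // 2
    for row in rows:
        window.append(row)
        if len(window) < k:
            continue  # not enough rows buffered yet for one output row
        out = []
        for x in range(half, len(row) - half):
            total = sum(buf[nx]
                        for buf in window
                        for nx in range(x - half, x + half + 1))
            out.append(total // (k * k))
        yield out
```

For an image of width W, the buffer in this sketch holds k·W samples regardless of the image height, whereas a frame-based node would hold the entire frame.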
Accordingly, there is a need for signal-flow architectures that facilitate efficient image processing on DSPs and other hardware subject to memory and bandwidth limitations, as well as for tools that aid application developers in implementing such signal flows.