The invention relates to the cooperative operation of several computation units which together evaluate, in an iterative and cellular manner, the convergence values of a plurality of variables respectively associated with the various points of a predetermined grid.
The invention applies advantageously, but without limitation, to the processing of images, especially television images.
Algorithms of iterative type operating on a predetermined grid are encountered in a very large number of applications, including image processing. They make it possible to perform global processing over the whole of the grid but in a cellular manner, that is to say with local interaction only. Indeed, each variable associated with a point of the grid must satisfy a prespecified iterative relation between itself and only n neighbouring variables associated with n neighbouring grid points. Such iterative and cellular algorithms operating on a predetermined grid so as to culminate in convergence values are of the "relaxation" type according to the nomenclature commonly used by the expert.
Image processing algorithms of iterative and cellular type can thus for example operate on a global image spatially sampled by square grid cells with a neighbourhood of order 1. The prespecified relation to be satisfied by each variable, associated in this instance with each current image pixel, therefore involves five pixels, namely the current pixel at the previous iteration and the four immediate neighbours of this current pixel.
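By way of illustration only, the five-pixel relation described above can be sketched as follows in Python. The particular relation used here (a weighted mean of the current pixel and of its four immediate neighbours), the names `relax_step` and `relax`, the parameters `weight`, `tol` and `max_iter`, and the choice of holding the border pixels fixed are all assumptions made for the example; the text does not specify them.

```python
def relax_step(grid, weight=0.5):
    """One iteration over a square-cell grid with a neighbourhood of order 1.

    Each interior pixel is recomputed from five values of the previous
    iteration: itself and its four immediate neighbours. The weighted-mean
    relation is a hypothetical example, not a relation taken from the text.
    """
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]  # border pixels are held fixed (assumption)
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            neighbours = (grid[i - 1][j] + grid[i + 1][j]
                          + grid[i][j - 1] + grid[i][j + 1])
            new[i][j] = (1 - weight) * grid[i][j] + weight * neighbours / 4.0
    return new


def relax(grid, tol=1e-6, max_iter=10_000):
    """Iterate until the largest change in any pixel falls below tol.

    Returns the final grid (the approximate convergence values) and the
    number of iterations performed.
    """
    for iteration in range(1, max_iter + 1):
        new = relax_step(grid)
        change = max(abs(a - b)
                     for new_row, old_row in zip(new, grid)
                     for a, b in zip(new_row, old_row))
        grid = new
        if change < tol:
            return grid, iteration
    return grid, max_iter


if __name__ == "__main__":
    # 5 x 5 image: border pixels at 1.0, interior pixels initially at 0.0.
    image = [[1.0] * 5 for _ in range(5)]
    for i in range(1, 4):
        for j in range(1, 4):
            image[i][j] = 0.0
    result, iterations = relax(image)
    print(iterations, result[2][2])
```

With a constant border, the interior values of this example drift towards the border value, which illustrates how purely local interactions produce a global result over the whole grid.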
Iterative cellular models lend themselves a priori to a massively parallel implementation of each iteration by allotting an elementary processor of the architecture to each pixel. These processors will therefore together evaluate in an iterative and cellular manner the convergence values of the different variables respectively associated with the various pixels of the gridded image.
Parallel architectures of computation units are known in which the units operate fully synchronously with respect to one another. However, the very-large-scale implementation of an architecture that must operate in a strictly synchronous fashion poses problems as the clock frequency, which is common to all the computation units, increases. Indeed, the offsets in the clock edges between the various points of the circuit may become non-negligible with respect to the clock period and hence entail a global slowing down of the operation of the architecture. Thus, it is difficult to use a clock having a frequency of the order of a hundred MHz with a circuit whose size is of the order of one cm². Furthermore, although the synchronism of conventional architectures appears to the expert to be an obvious solution, it also poses convergence problems. Indeed, it turns out to be preferable not to update interdependent variables simultaneously, which leads to the use of a chessboard partition of the image. The resulting architecture of computation units no longer operates in a fully parallel manner, since only half the computation units are active simultaneously, supposing that one computation unit is assigned to each pixel.
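The chessboard partition mentioned above can be sketched as follows in Python, purely as an illustration. The pixels are split into two colours according to the parity of the sum of their row and column indices, so that the four immediate neighbours of a pixel always have the other colour and interdependent variables are never updated at the same time. The update relation (mean of the four neighbours), the fixed border and the names `chessboard_half_step` and `chessboard_sweep` are assumptions made for the example.

```python
def chessboard_half_step(grid, colour):
    """Update, in place, only the interior pixels of one chessboard colour.

    A pixel (i, j) has colour (i + j) % 2. Its four immediate neighbours all
    have the opposite colour, so no two pixels updated in the same half step
    depend on one another. Returns the number of pixels that were active.
    The mean-of-neighbours relation is a hypothetical example.
    """
    rows, cols = len(grid), len(grid[0])
    active = 0
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            if (i + j) % 2 == colour:
                grid[i][j] = (grid[i - 1][j] + grid[i + 1][j]
                              + grid[i][j - 1] + grid[i][j + 1]) / 4.0
                active += 1
    return active


def chessboard_sweep(grid):
    """One full iteration: the pixels of colour 0, then those of colour 1."""
    return chessboard_half_step(grid, 0), chessboard_half_step(grid, 1)


if __name__ == "__main__":
    # 6 x 6 image: border pixels at 1.0, the 16 interior pixels at 0.0.
    image = [[1.0] * 6 for _ in range(6)]
    for i in range(1, 5):
        for j in range(1, 5):
            image[i][j] = 0.0
    print(chessboard_sweep(image))  # active pixel counts per half step
```

On this 6 x 6 example each half step touches 8 of the 16 interior pixels, which is the halving of simultaneous activity noted above when one computation unit is assigned to each pixel.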
The second solution available to the expert for implementing such cellular iterative algorithms is a fully asynchronous one. In this approach, and assuming that it is possible to assign one processor per pixel of the image, each processor functions at its own rate, independently of the others, but must nevertheless, when computing its variable at a given iteration level, use the variables computed by the other processors at the appropriate iteration levels, bearing in mind the prespecified relations between the different variables. This requires handling of the asynchronism, together with enquiry/acknowledgement procedures for the dialogue between the various processors and for the exchange of the appropriate variables, so that each variable is correctly determined within each processor. No practical embodiment of such a solution is currently known, especially in the processing of real images with one processor per pixel. Indeed, processors suitable for handling such asynchronism in a fully general manner must be high-performance microprocessors and hence occupy a large surface area. In other words, the handling of the asynchronism and of the enquiry/acknowledgement procedures between the processors of such an architecture is perceived by the expert to be incompatible with a fine granularity (the granularity corresponding to the number of pixels handled by a computation unit, the finest granularity being the association of one computation unit with each pixel).