The present invention relates to a system enabling high-speed convolution processing of image data, for any scientific, industrial, medical, space or military application. For such applications, two techniques are known for processing data received from external sensors (in particular, viewing devices, but also acoustic sensors, etc.) into images: one whereby "scattered" data is processed into conventional images; and another whereby incoming data is already in the form of an image of the outside world, and processing consists of extracting and analysing the data contained in the image itself.
In both cases, much of the image conversion technology involved is based on convolution algorithms which, despite their extremely useful mathematical properties, introduce such lengthy processing times as to render application on standard computers impractical.
Particularly in industry, for example, robot viewing systems require extremely high image-processing capacity, especially at the early stages, for filtering spatial frequencies and improving contrast, or for detecting image characteristics required for further higher-level processing. The starting data, which is usually supplied by an electronic retina containing as many as 200,000 transducers or more, is scanned at MHz frequencies and must all be processed separately.
Of all algorithm-based methods, by far the most widely used are the digital techniques which obviously require discrete image conversion. The image itself may be defined as a two-dimensional sequence of brightness value terms, and may be thought of as being obtained by dividing a photograph (square for the sake of simplicity) into a mosaic of N×N squares (pixels) of equal size, each of which is assigned a grey level.
In the field in which the system according to the present invention is to be applied, filtering is of major importance, especially for "edge extraction", which is an essential function, particularly in processing systems required to distinguish objects from a given background and subsequently identify them.
A discrete image, in fact, supplies all the information acquired externally by an "artificial viewing system", which information, however, may only be put to practical use if converted into a more efficient form than a mere sequence of different pixel brightness levels. "Edge extraction", for example, is an image conversion process which preserves much of the structural image data, thus facilitating subsequent processing.
An "edge extraction" process output consists of a group of lines located at brightness variation points on the image. The effectiveness of such lines lies in their being closely related to the properties of the objects being viewed, in that, obviously, brightness variations are often located on the edges of the said objects.
A convolution is only one step in an "edge extraction" process; the subsequent operations in the process depend on the type of sequence employed.
The following is a brief summary of what is already known of the said convolution process.
If X is a sequence representing an image, and x(l,m) the generic term of the same sequence, wherein variables l and m (whole numbers between 0 and N-1) indicate the location (line or column) of the term in question, the sequence may obviously be considered a matrix of N lines and N columns, the elements of which are located according to the relative pixels from which they have been obtained.
In the case of a convolution, a starting sequence X produces a new sequence Y, each term of which is a weighted sum of terms in the starting sequence situated around the location of the term being computed. Convolution processing therefore depends on the weight sequence, K, employed.
A convolution operation is indicated:

Y = K * X
The generic term in the resulting sequence Y is:

y(l,m) = Σ (s=0..n_R−1) Σ (t=0..n_c−1) k(s,t)·x(l−s, m−t)

where the processing sequence K has n_R lines and n_c columns. Convolution processing is usually interpreted as filtering of spatial frequencies.
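As a purely illustrative sketch (no such code forms part of the present text), the direct two-dimensional convolution just defined may be written in Python as follows; the function name and the treatment of out-of-range terms as zero are assumptions:

```python
def convolve2d(x, k):
    """Direct 2-D convolution: y(l, m) = sum over (s, t) of k(s, t) * x(l - s, m - t).

    x is the image sequence and k the processing sequence, both given as
    lists of lists; terms of x falling outside the image count as zero.
    """
    rows, cols = len(x), len(x[0])
    n_r, n_c = len(k), len(k[0])
    y = [[0] * cols for _ in range(rows)]
    for l in range(rows):
        for m in range(cols):
            acc = 0
            for s in range(n_r):
                for t in range(n_c):
                    if 0 <= l - s < rows and 0 <= m - t < cols:
                        acc += k[s][t] * x[l - s][m - t]
            y[l][m] = acc  # each output term costs n_r * n_c multiply-accumulates
    return y
```

The four nested loops make the cost visible: each output term requires n_R·n_c multiply-accumulate operations, which is the count the following paragraphs set out to reduce.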
For reliable results to be obtained, even in the presence of noise, filtering matrices for "edge extraction" must in many cases be of relatively large size. Increasing the size of the K sequence, however, obviously increases the number of processing operations involved. Calculating each term in a sequence resulting from a two-dimensional-sequence convolution (having the usual characteristics) involves:
(i) knowing the n_R·n_c terms of the sequence for processing;
(ii) multiplying each term by an appropriate "weight" (the number of products thus being equal to n_R·n_c);
(iii) adding together the resulting products (when adding pairs of numbers, the number of additions equals: (number of products to be added) − 1 = n_R·n_c − 1 ≈ n_R·n_c).
For application purposes, processing work must be simplified as far as possible, i.e. by reducing the number of operations required for each result.
When the processing sequence K is separable, the whole process is considerably simpler. Separability for K means:

k(s,t) = f(s)·g(t)

in which f(s) and g(t) are the terms of two one-dimensional sequences of n_R and n_c terms respectively.
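As a concrete, well-known example (not drawn from the present text), the classic 3×3 Sobel edge-extraction kernel is separable in exactly this sense; the helper below simply forms k(s,t) = f(s)·g(t):

```python
def outer(f, g):
    """Build a separable 2-D processing sequence k(s, t) = f(s) * g(t)
    from two one-dimensional sequences f (n_R terms) and g (n_c terms)."""
    return [[fs * gt for gt in g] for fs in f]

# The classic Sobel kernel for vertical-edge extraction factors as:
sobel = outer([1, 2, 1], [1, 0, -1])
# sobel == [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
```

Not every 2-D sequence factors this way, but many of the masks used in practice do, which is what makes the reduction described below widely applicable.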
If separability exists:

y(l,m) = Σ (i=0..n_R−1) f(i)·w(l−i, m)

in which:

w(l,m) = Σ (j=0..n_c−1) g(j)·x(l, m−j)

The calculation may thus be broken down into two consecutive convolutions, each combining the two-dimensional data sequence with a one-dimensional processing sequence.
W is a two-dimensional sequence of intermediate values obtained from X by means of a first one-dimensional line convolution (so called because the w(l,m) calculation involves only x(l,m−j) values located on the same line). Starting from the said intermediate values, the final sequence Y is obtained by means of a second one-dimensional convolution, this time a column convolution (so called because it only involves values in the same column). A one-dimensional line (column) convolution may be considered a special case of convolution with a two-dimensional processing sequence having n_R = 1 (n_c = 1). The following extension may therefore be made:

g(j) → g_R(0,j)
f(i) → f_c(i,0)
(R or C indicates that the said two-dimensional sequences may only have terms other than zero along one line or column).
The separability condition already mentioned becomes a special case of:

K = F_c * G_R
For the entire calculation, therefore, the following equation may be employed:

Y = F_c * (G_R * X) = F_c * W, with W = G_R * X
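A minimal Python sketch of this two-pass scheme follows; the function names and the treatment of out-of-range terms as zero are assumptions, not part of the text:

```python
def row_convolve(x, g):
    """Line convolution W = G_R * X: w(l, m) = sum over j of g(j) * x(l, m - j)."""
    rows, cols = len(x), len(x[0])
    return [[sum(g[j] * x[l][m - j]
                 for j in range(len(g)) if 0 <= m - j < cols)
             for m in range(cols)]
            for l in range(rows)]

def col_convolve(w, f):
    """Column convolution Y = F_c * W: y(l, m) = sum over i of f(i) * w(l - i, m)."""
    rows, cols = len(w), len(w[0])
    return [[sum(f[i] * w[l - i][m]
                 for i in range(len(f)) if 0 <= l - i < rows)
             for m in range(cols)]
            for l in range(rows)]

def separable_convolve(x, f, g):
    """Y = F_c * (G_R * X): a line pass followed by a column pass."""
    return col_convolve(row_convolve(x, g), f)
```

Each output term now touches only n_c values in the line pass and n_R in the column pass, rather than all n_R·n_c at once.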
The first one-dimensional convolution, whereby W is calculated from starting sequence X, requires 1·n_c = n_c operations per term (the term "operation" is generally intended to mean acquiring a value of the sequence to be processed, multiplying it by an appropriate term in the processing sequence, and adding the product). The second convolution, whereby sequence Y is obtained from sequence W, requires n_R·1 = n_R operations per term.
This means the final sequence is calculated with a total of n_R + n_c operations per term, a considerable reduction as compared with direct processing, which would require n_R·n_c operations. And the higher the n_R and n_c values are, the greater the reduction will be.
What is more, actual performance of the calculation is simplified by replacing a two-dimensional convolution with two separate one-dimensional convolutions, each having only n_R or n_c terms.
The amount of processing work involved is a common problem in the case of direct numerical processing of discrete image sequence terms, on account of the large number of such terms involved. For example, in the case of an image having 512×512 pixels and a separable two-dimensional processing sequence of 32×32 terms, the number of operations to be performed would be over 3×10^7, half of which are multiplications.
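The figure can be checked with a few lines of arithmetic, counting multiplications and additions separately:

```python
# 512x512 image, separable 32x32 processing sequence:
pixels = 512 * 512               # terms in the final sequence
mults_per_term = 32 + 32         # n_R + n_c, thanks to separability
mults = pixels * mults_per_term  # 16,777,216 multiplications
adds = mults                     # roughly one addition per product
total = mults + adds             # 33,554,432: over 3x10^7, half multiplications
```

Without separability the same image and mask would need 512·512·32·32 ≈ 2.7×10^8 multiplications, sixteen times as many.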
During the early processing stages, the original image must very often be processed using convolution masks of different sizes and/or functions, to bring out characteristics of special interest. Employing such algorithms on a standard computer, processing time may easily range from a few tens of seconds to a few minutes, depending on how complex the processing operation is. This is obviously unacceptable, even at the algorithm-simulation stage, and even more so under real operating conditions, which generally require very fast processing times of a few milliseconds per acquired video image.
To combine image-conversion precision with high-speed processing, therefore, two-dimensional masks in industry are limited to those which may be separated, a restriction with no noticeable practical drawbacks. A two-dimensional convolution is therefore calculated by performing two consecutive one-dimensional convolutions in two perpendicular image directions.
The effectiveness of a dedicated machine for convolution processing, which may be employed on a computer in the same way as a standard peripheral unit, is therefore obvious. On a specialized machine, in fact, appropriate structuring may be selected for significantly reducing total processing time.