By using the wavelet representation of a signal it is possible not only to decompose the signal into different frequency bands, as is done by Fourier analysis, but also to obtain different representations of the signal at different scales. The extension of this decomposition to two-dimensional (2D) signals also provides information on the orientation of the irregularities present in the image.
The wavelet transform is well suited to signal and image processing because it provides a temporal, frequency and multiscale representation of a signal. This method of analysis is a logical development of Fourier transform analysis: the multiscale analysis associated with the wavelet transform enables the signal to be studied at different scales.
By contrast with the analysis of a continuous signal, the mother wavelet translation and expansion parameters vary in a discrete manner for the analysis of a discrete signal. The progressive increase of the expansion factor is obtained by applying the wavelet filtering operation recursively to the same discrete signal.
By way of illustration, a one-dimensional (1D) signal is decomposed by the following procedure: the original signal is decomposed into a detail signal Wd1, which emphasizes the high-frequency components of the signal, and an approximation signal Wa1, which provides a smoothed view of the original signal. The latter is, in turn, decomposed into a detail signal Wd2 and an approximation signal Wa2. This procedure is then repeated until the desired number of levels of decomposition has been obtained. A series of detail signals Wd1, ..., Wdn is obtained, together with an approximation signal Wan, all of which enable the input signal to be analyzed on different scales.
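By way of a minimal sketch, the recursive procedure above can be written as follows. The Haar filter pair is used here only because it is the shortest wavelet filter; it is an assumption of this example, as are the function names haar_step and decompose, and the signal length is assumed to be divisible by two at every level.

```python
import math

def haar_step(signal):
    """One decomposition level: returns (approximation, detail)."""
    scale = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal), 2)
    approx = [(signal[i] + signal[i + 1]) * scale for i in pairs]  # low-pass
    detail = [(signal[i] - signal[i + 1]) * scale for i in pairs]  # high-pass
    return approx, detail

def decompose(signal, levels):
    """Returns ([Wd1, ..., Wdn], Wan) for the requested number of levels."""
    details = []
    approx = list(signal)
    for _ in range(levels):
        # Only the approximation signal is decomposed again.
        approx, detail = haar_step(approx)
        details.append(detail)
    return details, approx

details, approx = decompose([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0], 3)
```

Each entry of details corresponds to one scale, the first being the finest.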
By using orthogonal decomposition filters it is possible to reconstruct the information and suppress the redundancy present in the approximation and detail bands. It is therefore possible to apply an operation of subsampling by a factor two to the output of each filter, thereby providing a concise representation of the information, the number of samples between the input and the output of the transform being conserved.
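The conservation of the number of samples and the possibility of exact reconstruction can be checked with a short sketch. The orthogonal Haar pair is again an assumption chosen for brevity, and the names analyze and synthesize are hypothetical; an even-length input is assumed.

```python
import math

SCALE = 1.0 / math.sqrt(2.0)

def analyze(x):
    """Filter and subsample by two: each band holds half of the samples."""
    pairs = range(0, len(x), 2)
    approx = [(x[i] + x[i + 1]) * SCALE for i in pairs]
    detail = [(x[i] - x[i + 1]) * SCALE for i in pairs]
    return approx, detail

def synthesize(approx, detail):
    """Inverse step: rebuilds the input from the two subsampled bands."""
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) * SCALE)
        x.append((a - d) * SCALE)
    return x

x = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
approx, detail = analyze(x)
rebuilt = synthesize(approx, detail)
```

Here len(approx) + len(detail) equals len(x), and rebuilt matches x up to floating-point rounding.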
2D decomposition of a signal is carried out by successive decomposition along the horizontal axis and then along the vertical axis. Three detail subimages LH, HL and HH are obtained, together with an approximation subimage. The signal HH, corresponding to the result of two successive high-pass filters, is commonly called the diagonal signal. Similarly, the signals LH and HL can be related to vertical and horizontal signals respectively. The approximation signal is decomposed again to obtain a multiscale decomposition. The resulting coefficients are interlaced in terms of both orientation and scale.
FIG. 1a shows an interlaced representation of the decomposition of an image. FIG. 1b shows a non-interlaced representation of the decomposition.
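One level of the 2D decomposition can be sketched as a horizontal pass followed by a vertical pass. This is an illustrative sketch only: it assumes an unnormalized Haar pair (average and half-difference), even image dimensions, and a naming convention in which the first letter of each subimage denotes the horizontal filter and the second the vertical filter. Naming conventions differ between authors, and the function names are hypothetical.

```python
def haar_1d(x):
    """Returns the subsampled (low, high) halves of a 1D signal."""
    pairs = range(0, len(x), 2)
    low = [(x[i] + x[i + 1]) / 2 for i in pairs]
    high = [(x[i] - x[i + 1]) / 2 for i in pairs]
    return low, high

def transpose(matrix):
    return [list(row) for row in zip(*matrix)]

def split_rows(image):
    """Applies haar_1d to every row; returns the low and high subimages."""
    lows, highs = [], []
    for row in image:
        low, high = haar_1d(row)
        lows.append(low)
        highs.append(high)
    return lows, highs

def dwt2_level(image):
    """One 2D level: horizontal pass, then vertical pass on each half."""
    low_h, high_h = split_rows(image)                    # horizontal axis
    ll, lh = (transpose(b) for b in split_rows(transpose(low_h)))
    hl, hh = (transpose(b) for b in split_rows(transpose(high_h)))
    return ll, lh, hl, hh

# Vertical stripes: the only variation is along the horizontal axis.
stripes = [[0, 4, 0, 4] for _ in range(4)]
ll, lh, hl, hh = dwt2_level(stripes)
```

A multiscale decomposition is obtained by calling dwt2_level again on the returned approximation subimage ll.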
2D architectures implementing the discrete wavelet transform (DWT) may be classified according to two different methods. The first method is called “row-column” and the second method is called “row-by-row”.
FIG. 2 shows the row-column decomposition method. The operation is divided into two 1D transforms, of which one, 200, is horizontal and the other, 201, is vertical. The horizontal transform operation 200 is applied initially to an input image 204. The transformed coefficients are placed in a short-term memory 202 with a size of N×M, corresponding to the size of the initial image. The vertical transform operation 201 is then applied in a second step, to provide a final image 205. This structure has the advantage of being easy to implement, since the two transforms 200, 201 are totally independent. A major drawback of this method is the large amount of memory required for this type of implementation. This memory, exclusively reserved for the transform operation, has a size of N×M and must be located close to the processing units to avoid costly reading and writing of the coefficients. This memory significantly increases the total surface area of the component executing the coding. Consequently, this type of structure is little used in practice.
FIG. 3 is a schematic illustration of the nature of a row-by-row transform. This type of transform can reduce the intermediate memory cost. This is because, in this type of implementation, a row memory 302 is used, with a size of L×M, where L is the size of the filter used. This memory replaces the image memory, and allows the short-term storage of the coefficients produced by the horizontal transform unit 300. By way of example, L=3 for the CDF 5/3 wavelet.
The vertical transform operation 301 is then launched as soon as the number of rows in memory is sufficient to apply the vertical filter. With this type of implementation, it is possible to change from an image memory with a size of N×M to a row memory with a size of L×M. For the 9/7 transform, five rows of coefficients must be stored. Only four rows of coefficients are required in some architectures. An inherent drawback of this method is that finer control is required, both in writing to and reading from the buffer memory 302, and in the launching of the vertical transform 301.
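The difference between the two methods can be illustrated by counting the rows that must be held between the horizontal and vertical passes. The sketch below is a simplified software model, not a description of any of the cited circuits. It assumes a Haar filter pair, whose vertical filter needs only two buffered rows (instead of the three or five rows mentioned for the 5/3 and 9/7 filters), and an even number of rows; all names are hypothetical.

```python
from collections import deque

ROWS_NEEDED = 2  # rows required by the vertical Haar filter (assumption)

def haar_row(row):
    """Horizontal pass: low-pass half followed by high-pass half."""
    half = len(row) // 2
    low = [(row[2 * i] + row[2 * i + 1]) / 2 for i in range(half)]
    high = [(row[2 * i] - row[2 * i + 1]) / 2 for i in range(half)]
    return low + high

def haar_column_pair(top, bottom):
    """Vertical pass on two buffered rows: one low row and one high row."""
    low = [(a + b) / 2 for a, b in zip(top, bottom)]
    high = [(a - b) / 2 for a, b in zip(top, bottom)]
    return low, high

def row_column(image):
    """Row-column method: all N filtered rows are stored before pass two."""
    frame_buffer = [haar_row(row) for row in image]
    pairs = [haar_column_pair(frame_buffer[i], frame_buffer[i + 1])
             for i in range(0, len(frame_buffer), 2)]
    return pairs, len(frame_buffer)

def row_by_row(image):
    """Row-by-row method: the vertical filter is launched as soon as
    enough rows are available, then the buffered rows are released."""
    line_buffer = deque()
    peak_rows = 0
    pairs = []
    for row in image:  # rows arrive one at a time
        line_buffer.append(haar_row(row))
        peak_rows = max(peak_rows, len(line_buffer))
        if len(line_buffer) == ROWS_NEEDED:
            top = line_buffer.popleft()
            bottom = line_buffer.popleft()
            pairs.append(haar_column_pair(top, bottom))
    return pairs, peak_rows

image = [[r * 4 + c for c in range(4)] for r in range(8)]
full_result, full_rows = row_column(image)
streamed_result, peak_rows = row_by_row(image)
```

Both functions return the same coefficients; the row-column model holds all eight rows, whereas the row-by-row model never holds more than ROWS_NEEDED rows. The extra control mentioned above appears here as the test on the buffer fill level.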
FIG. 4 shows an example of an architecture that requires no intermediate memory for storing the wavelet coefficients between the vertical and horizontal passes.
An architecture for reducing the memory requirements was proposed in the paper by Peng Cao, Chao Wang, Jun Yang and Longxing Shi entitled Area-efficient line-based two dimensional discrete wavelet transform architecture without data buffer, ICME 2009, IEEE International Conference on Multimedia and Expo, pages 1094-1097, 28 Jun.-3 Jul. 2009. This architecture does not require an intermediate memory, known as a transposition memory, for storing the wavelet coefficients between the vertical and horizontal passes. It also allows the execution of a 5/3 or 9/7 decomposition, forward or inverse according to the configuration. The input pixels 400 are scanned row by row, one pixel being transmitted in each calculation cycle. Two elementary modules 401, 402 are used. A first module 401 performs the vertical transform, using CPE (Column Processing Element) calculation elements 403, 404. A second module 402, receiving the results 407 from the first module 401, performs the horizontal transform, using RPE (Row Processing Element) calculation elements 405, 406. The resulting approximation coefficients are stored in a RAM (Random Access Memory) 408 to enable the next level of decomposition to be executed. Here again, a memory problem arises: in this example, a quarter of an image has to be stored in the RAM 408.
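The orders of magnitude discussed so far can be compared with a short calculation. The sketch below simply evaluates the sizes stated in the text (N×M for the image memory, L×M for the row memory, and a quarter of the image for the stored approximation coefficients); the function name, the dictionary keys and the 512×512 image size are hypothetical, and sizes are counted in coefficients rather than bits.

```python
def transform_memory(n_rows, m_cols, filter_rows):
    """Number of coefficients held by each storage scheme."""
    return {
        # Row-column method: full intermediate image of N x M coefficients.
        "image_memory": n_rows * m_cols,
        # Row-by-row method: L rows of M coefficients.
        "row_memory": filter_rows * m_cols,
        # Approximation subimage kept for the next level: a quarter image.
        "approximation_store": (n_rows // 2) * (m_cols // 2),
    }

sizes_53 = transform_memory(512, 512, filter_rows=3)  # L = 3 for CDF 5/3
sizes_97 = transform_memory(512, 512, filter_rows=5)  # L = 5 for CDF 9/7
```

For a 512×512 image, the image memory holds 262144 coefficients, the row memory 1536 (5/3) or 2560 (9/7), and the approximation store 65536.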
The paper by H. Liao, M. K. Mandal, and B. F. Cockburn, entitled Efficient architectures for 1-d and 2-d lifting-based wavelet transforms, IEEE Transactions on Signal Processing, 52(5): 1315-1326, 2004, proposes the execution of a multi-level 2D decomposition using a recursive implementation of the DWT. The transform module is composed of two 1D transform units, one horizontal and one vertical, separated by a plurality of FIFOs used for storing the coefficients of each level. These FIFOs accumulate a sufficient number of coefficients for the execution of the column-by-column transform at each transform level. The first level of decomposition is processed in only one out of two calculation cycles, so that the higher levels of decomposition can be inserted between the first-level decompositions. Although the architecture ultimately requires only four lines of short-term memory for the execution of the CDF 9/7 transform, a complicated control system is required to manage the various FIFOs as well as the delay registers. This architecture performs the multi-level decomposition of an image by ordering the decomposition tasks related to the different levels on the same row. Since all the levels of decomposition can be processed on the same row, the decomposition unit is used intensively during the processing of these rows, and not at all during the following rows, where there is no need to carry out several levels of decomposition. It is also necessary to reconfigure the data path of the processing unit at each calculation cycle while these rows are transformed, according to the current level of decomposition. Furthermore, the number of storage registers in the horizontal decomposition units must be increased, and specific addressing must be developed, so as not to overwrite the intermediate coefficients within a row.
Thus this choice greatly increases the critical path length, and reduces the overall performance of the transform module. This architecture uses two rows of memory to delay the data produced between the different levels of decomposition and a complex mechanism is used to upload and download the coefficients to and from these memories.
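The principle of processing the first level in one calculation cycle out of two, and inserting the higher levels in the remaining cycles, can be pictured with an idealized slot allocation. The sketch below is an assumption made purely for illustration and is not the schedule of the cited paper: the function name level_for_cycle and the rule used (the level is one plus the number of trailing one bits of the cycle index) are hypothetical.

```python
def level_for_cycle(cycle, max_levels):
    """Returns the decomposition level served in this cycle, or None
    when the slot is left idle."""
    level = 1
    while cycle & 1:      # each trailing one bit moves to the next level
        cycle >>= 1
        level += 1
    return level if level <= max_levels else None

schedule = [level_for_cycle(t, max_levels=3) for t in range(8)]
```

With three levels, the first eight slots are allocated as 1, 2, 1, 3, 1, 2, 1 and one idle slot: the first level occupies every other cycle, and each higher level occupies half as many cycles as the level below it. The need to switch levels from one cycle to the next is what requires the data path to be reconfigured at each cycle.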
Many on-board applications require the use of wavelet transforms. It is therefore important to find architectures that reduce the requisite memory size while remaining as simple as possible to implement.
In the rest of the description, the CDF 5/3 and CDF 9/7 filters are used as examples. These are wavelet filters introduced by Cohen, Daubechies and Feauveau. Each of these filters has specific properties. The 5/3 filter allows loss-free compression and decompression. The 9/7 filter offers a better compression ratio, but introduces losses. The equivalent size of these filters is three coefficients for the 5/3 filter and five coefficients for the 9/7 filter. Although these two filters are used as examples, the invention can be applied using any type of filter.
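The loss-free property of the 5/3 filter can be illustrated with its integer lifting formulation, in the form commonly used for reversible coding (a prediction step followed by an update step, each with integer rounding). The sketch below is given under stated assumptions: it handles even-length 1D integer signals only, uses symmetric extension at the borders, and the function names are hypothetical. It is not a description of the architecture of the invention.

```python
def cdf53_forward(x):
    """Integer 5/3 lifting: returns (approximation, detail)."""
    assert len(x) >= 2 and len(x) % 2 == 0
    even, odd = x[0::2], x[1::2]
    half = len(even)
    detail = []
    for i in range(half):
        # Prediction uses three input samples: two even neighbours, one odd.
        right = even[i + 1] if i + 1 < half else even[i]  # symmetric border
        detail.append(odd[i] - ((even[i] + right) >> 1))
    approx = []
    for i in range(half):
        left = detail[i - 1] if i > 0 else detail[0]      # symmetric border
        approx.append(even[i] + ((left + detail[i] + 2) >> 2))
    return approx, detail

def cdf53_inverse(approx, detail):
    """Undoes the update step, then the prediction step, exactly."""
    half = len(approx)
    even = []
    for i in range(half):
        left = detail[i - 1] if i > 0 else detail[0]
        even.append(approx[i] - ((left + detail[i] + 2) >> 2))
    x = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]
        x.append(even[i])
        x.append(detail[i] + ((even[i] + right) >> 1))
    return x

samples = [12, 7, -3, 40, 41, 0, 255, 128, 5, 5]
approx, detail = cdf53_forward(samples)
restored = cdf53_inverse(approx, detail)
```

Because every rounding performed in the forward steps is repeated identically in the inverse steps, restored is equal to samples bit for bit.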