The present invention relates generally to the two-dimensional discrete wavelet transform (2-D DWT), and more particularly to an architecture for performing the two-dimensional discrete wavelet transform.
Recently, the rapid development of very-large-scale integration (VLSI) circuits has brought out a wide variety of microprocessors capable of processing audio signals and video images simultaneously. The two-dimensional discrete wavelet transform (2-D DWT) is the core of the new generation of image compression standards, such as the JPEG-2000 still image compression standard. Thus, the 2-D DWT plays a decisive role in image compression/decompression systems. Research on the 2-D DWT is in progress in many applications, such as audio signal processing, computer graphics, numerical analysis, and radar target identification. In general, the basic architecture of the 2-D DWT is composed of multirate filters. Because the quantity of data to be processed in practical applications, e.g. digital cameras, is extremely large, it is desirable to develop a high-efficiency, low-cost architecture for performing the two-dimensional discrete wavelet transform.
The mathematical formulas of the 2-D DWT using the separable FIR filters for implementation are represented in the following equations:

$$x_{LL}^{J}(n_1, n_2) = \sum_{i_1=0}^{K-1} \sum_{i_2=0}^{K-1} g(i_1) \cdot g(i_2) \cdot x_{LL}^{J-1}(2n_1 - i_1,\, 2n_2 - i_2) \tag{1}$$

$$x_{LH}^{J}(n_1, n_2) = \sum_{i_1=0}^{K-1} \sum_{i_2=0}^{K-1} g(i_1) \cdot h(i_2) \cdot x_{LL}^{J-1}(2n_1 - i_1,\, 2n_2 - i_2) \tag{2}$$

$$x_{HL}^{J}(n_1, n_2) = \sum_{i_1=0}^{K-1} \sum_{i_2=0}^{K-1} h(i_1) \cdot g(i_2) \cdot x_{LL}^{J-1}(2n_1 - i_1,\, 2n_2 - i_2) \tag{3}$$

$$x_{HH}^{J}(n_1, n_2) = \sum_{i_1=0}^{K-1} \sum_{i_2=0}^{K-1} h(i_1) \cdot h(i_2) \cdot x_{LL}^{J-1}(2n_1 - i_1,\, 2n_2 - i_2) \tag{4}$$
where J is the decomposition level, K is the filter length, and g(n) and h(n) are the impulse responses of the low pass filter G(z) and the high pass filter H(z), respectively. $x_{LL}^{0}(n_1, n_2)$ represents the input image.
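Eqs. (1) to (4) can be evaluated directly, as in the following minimal sketch. Several choices here are assumptions made only for illustration and are not taken from the text: samples outside the image are treated as zero, the first index is mapped to rows, and the two-tap averaging/differencing filters in the usage note are example coefficients. The function name `dwt2_level` is hypothetical.

```python
def dwt2_level(x, g, h):
    """Evaluate Eqs. (1)-(4) for one level; x is a square list of lists."""
    n = len(x)
    k = len(g)          # filter length K (g and h are assumed equally long)
    half = n // 2       # each output band is (n/2) x (n/2)

    def sample(r, c):
        # Assumption: samples outside the image are zero (boundary handling
        # is not specified by the equations themselves).
        return x[r][c] if 0 <= r < n and 0 <= c < n else 0.0

    def band(f1, f2):
        out = [[0.0] * half for _ in range(half)]
        for n1 in range(half):
            for n2 in range(half):
                acc = 0.0
                for i1 in range(k):
                    for i2 in range(k):
                        acc += f1[i1] * f2[i2] * sample(2 * n1 - i1, 2 * n2 - i2)
                out[n1][n2] = acc
        return out

    # Filter pairs follow Eqs. (1)-(4): (g, g), (g, h), (h, g), (h, h).
    return {"LL": band(g, g), "LH": band(g, h),
            "HL": band(h, g), "HH": band(h, h)}
```

For example, with the illustrative filters `g = [0.5, 0.5]` and `h = [0.5, -0.5]`, a constant 4×4 image yields 2×2 bands whose interior high-pass outputs are zero, while the top-left LL sample is reduced by the zero-padding assumption.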
Please refer to FIG. 1, which illustrates a three-level architecture for performing the two-dimensional discrete wavelet transform. Each decomposition level includes two stages, wherein the first stage performs the horizontal filtering operation and the second stage performs the vertical filtering operation. In the first-level decomposition, the size of the input image is N×N, and the outputs are three decomposed subbands LH, HL, and HH, all having a size of N/2×N/2. In the second-level decomposition, the input is the LL band, and the outputs are three decomposed subbands LLLH, LLHL, and LLHH, all having a size of N/4×N/4. In the third-level decomposition, the input is the LLLL band, and the outputs are four decomposed subbands (LL)²LL, (LL)²LH, (LL)²HL, and (LL)²HH, all having a size of N/8×N/8. The results of the decomposition operation for levels above three can be deduced by analogy.
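The halving of the subband side length at each level can be tabulated with a short sketch. The function name `subband_sizes` and the 512-sample image side in the usage note are hypothetical, and N is assumed to be divisible by two at every level.

```python
def subband_sizes(n, levels):
    """Return (level, side length) pairs for an n x n input image."""
    sizes = []
    side = n
    for level in range(1, levels + 1):
        side //= 2                 # each level halves both dimensions
        sizes.append((level, side))
    return sizes
```

For an assumed 512×512 image and three levels, `subband_sizes(512, 3)` gives side lengths of 256, 128, and 64, matching N/2, N/4, and N/8.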
Among the existing architectures for performing the two-dimensional discrete wavelet transform, the most common and well-known is the parallel filter architecture. The design of the parallel filter architecture is based on the modified recursive pyramid algorithm (MRPA), which interleaves the computations of the second and subsequent levels into the computation of the first level. The MRPA was originally applied to the one-dimensional discrete wavelet transform (1-D DWT). Owing to the decimation operation, the quantity of data processed in each level is half of that in the previous level, and thus the total quantity of data processed is:

$$\sum_{L=1}^{J} \frac{N}{2^{L-1}} = N + \frac{N}{2} + \frac{N}{2^{2}} + \frac{N}{2^{3}} + \dots + \frac{N}{2^{J-1}} = 2\left(1 - 2^{-J}\right)N \tag{5}$$
where J is the number of levels, N is the quantity of data processed in the first level, N/2 is the quantity of data processed in the second level, . . . , and $N/2^{J-1}$ is the quantity of data processed in the Jth level. When the number of levels J is large enough, Eq. (5) can be simplified to Eq. (6):

$$2\left(1 - 2^{-J}\right)N \approx 2N = N + N \tag{6}$$
Because the quantity of data processed in the first level (N) is approximately equal to the total quantity processed in the second and subsequent levels (also N), the idle time slots of the first-level computation can be filled up, as shown in FIG. 2. The hardware is then fully utilized, and thus the MRPA is suitable for the one-dimensional discrete wavelet transform.
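The closed form in Eq. (5) can be checked numerically against the term-by-term sum. The sketch below uses exact rational arithmetic; the function names and the sample values N = 1024 and J = 5 in the usage note are assumptions chosen for illustration.

```python
from fractions import Fraction


def total_work_1d(n, levels):
    """Left-hand side of Eq. (5): sum of N / 2^(L-1) for L = 1..J."""
    return sum(Fraction(n, 2 ** (level - 1)) for level in range(1, levels + 1))


def closed_form_1d(n, levels):
    """Right-hand side of Eq. (5): 2 * (1 - 2^-J) * N."""
    return 2 * (1 - Fraction(1, 2 ** levels)) * n
```

With N = 1024 and J = 5 both functions return 1984, which is already close to the limit 2N = 2048 of Eq. (6).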
Nonetheless, we found that the MRPA is not suitable for the two-dimensional discrete wavelet transform. Please refer to FIG. 3, which shows the two-dimensional discrete wavelet transform employing the modified recursive pyramid algorithm (MRPA). Because the quantity of data processed in each level is one-fourth of that in the previous level, the total quantity of data processed is:

$$\sum_{L=1}^{J} \frac{N^{2}}{4^{L-1}} = N^{2} + \frac{N^{2}}{4} + \frac{N^{2}}{4^{2}} + \frac{N^{2}}{4^{3}} + \dots + \frac{N^{2}}{4^{J-1}} = \frac{4}{3}\left(1 - 4^{-J}\right)N^{2} \tag{7}$$
where J is the number of levels, $N^{2}$ is the quantity of data processed in the first level, $N^{2}/4$ is the quantity of data processed in the second level, . . . , and $N^{2}/4^{J-1}$ is the quantity of data processed in the Jth level. When the number of levels J is large enough, Eq. (7) can be simplified to Eq. (8):

$$\frac{4}{3}\left(1 - 4^{-J}\right)N^{2} \approx \frac{4}{3}N^{2} = N^{2} + \frac{1}{3}N^{2} \tag{8}$$
Because the quantity of data processed in the second and subsequent levels ($N^{2}/3$) is only one-third of that in the first level ($N^{2}$), the idle time slots of the first-level computation cannot be filled up, and the hardware enters an idle state. This results in low hardware utilization, and a complex control circuit is required to handle the interleaved data flow among the levels.
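The contrast between the 1-D and 2-D cases can be made concrete by computing the work of levels 2 through J relative to the work of level 1. This is a minimal sketch; the function name `later_levels_ratio` and its `decimation` parameter (2 for the 1-D case, 4 for the 2-D case) are introduced here for illustration only.

```python
from fractions import Fraction


def later_levels_ratio(levels, decimation):
    """Work of levels 2..J divided by the work of level 1.

    decimation is the per-level data reduction factor:
    2 for the 1-D DWT (Eq. 5), 4 for the 2-D DWT (Eq. 7).
    """
    return sum(
        (Fraction(1, decimation ** (level - 1)) for level in range(2, levels + 1)),
        Fraction(0),
    )
```

As J grows, the ratio approaches 1 for a decimation factor of 2, so the later levels can fill the first-level schedule, but it stays below 1/3 for a decimation factor of 4, which leaves the remaining time slots idle.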
Please refer to FIG. 4, which is a schematic diagram illustrating the parallel filter architecture. The parallel filter architecture includes four filters: Hor1, Hor2, Ver1, and Ver2. The transpose memories Storage1 and Storage2 are used to perform the transpose operation. Hor1 performs the horizontal filtering operation of the first level, Hor2 performs the horizontal filtering operation of the second and subsequent levels, and Ver1 and Ver2 perform the vertical filtering operations of all levels.
Please refer to FIG. 5, which illustrates the operating configuration of the architecture of FIG. 4. The individual hardware utilization of the four filters and the average hardware utilization can be evaluated by the following equations, where J is the number of levels:
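$$\text{Hor1}: \; 1 \tag{9}$$

$$\text{Ver1}: \; \sum_{L=1}^{J} \frac{1}{2 \cdot 4^{L-1}} = \frac{1}{2} + \frac{1}{8} + \frac{1}{32} + \dots + \frac{1}{2 \cdot 4^{J-1}} = \frac{2}{3}\left(1 - 4^{-J}\right) \tag{10}$$

$$\text{Ver2}: \; \sum_{L=1}^{J} \frac{1}{2 \cdot 4^{L-1}} = \frac{1}{2} + \frac{1}{8} + \frac{1}{32} + \dots + \frac{1}{2 \cdot 4^{J-1}} = \frac{2}{3}\left(1 - 4^{-J}\right) \tag{11}$$

$$\text{Hor2}: \; \sum_{L=2}^{J} \frac{1}{4^{L-1}} = 0 + \frac{1}{4} + \frac{1}{16} + \frac{1}{64} + \dots + \frac{1}{4^{J-1}} = \frac{1}{3}\left(1 - 4^{-(J-1)}\right) \tag{12}$$

$$\text{Average}: \; \frac{1}{4}\left(\text{Hor1} + \text{Ver1} + \text{Ver2} + \text{Hor2}\right) = \frac{2}{3}\left(1 - 4^{-J}\right) \tag{13}$$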
Table 1 lists the hardware utilization of the parallel filter architecture at different levels:
It can be seen from Table 1 that the hardware utilization of the parallel filter architecture is only 50% for a one-level decomposition. The hardware utilization converges to 66.67% as the number of levels increases, which indicates that the hardware utilization of this architecture is low.
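The utilization figures quoted above can be reproduced from Eqs. (9) to (13). The following sketch evaluates the sums exactly; the function name `utilization` is hypothetical, and the values it returns are computed from the equations rather than copied from Table 1.

```python
from fractions import Fraction


def utilization(levels):
    """Per-filter and average utilization from Eqs. (9)-(13) for J levels."""
    hor1 = Fraction(1)                                           # Eq. (9)
    ver = sum(
        (Fraction(1, 2 * 4 ** (lv - 1)) for lv in range(1, levels + 1)),
        Fraction(0),
    )                                                            # Eqs. (10), (11)
    hor2 = sum(
        (Fraction(1, 4 ** (lv - 1)) for lv in range(2, levels + 1)),
        Fraction(0),
    )                                                            # Eq. (12)
    average = (hor1 + ver + ver + hor2) / 4                      # Eq. (13)
    return {"Hor1": hor1, "Ver1": ver, "Ver2": ver,
            "Hor2": hor2, "Average": average}
```

For J = 1 the average is 1/2 (50%), for J = 2 it is 5/8 (62.5%), and for larger J it approaches but never reaches 2/3 (about 66.67%), in agreement with the closed form of Eq. (13).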
In conclusion, though the MRPA is suitable for the one-dimensional discrete wavelet transform (1-D DWT), it is not suitable for the two-dimensional discrete wavelet transform (2-D DWT). The drawbacks of the parallel filter architecture are irregular data flow, low hardware utilization, long computing time, and high control complexity. Therefore, it is necessary to develop an architecture for performing the 2-D DWT that has 100% hardware utilization, short computing time, regular data flow, and low control complexity, and that can perform decomposition to an unlimited number of levels without being limited by the coefficients of the filter.
The primary object of the present invention is to provide an architecture for performing the two-dimensional discrete wavelet transform that has 100% hardware utilization, short computing time, regular data flow, and low control complexity, and that can perform decomposition to an unlimited number of levels without being limited by the coefficients of the filter.
According to the present invention, the architecture adapted to perform the two-dimensional discrete wavelet transform for performing a multilevel decomposition operation to decompose an original image into a plurality of bands includes: a transform module for decomposing an input image into four bands, wherein among the four bands, the band having the low frequency in both the horizontal and vertical directions serves as the input image for the next-level decomposition operation; and a multiplexer for selecting the band having the lowest frequency in both the horizontal and vertical directions as the input image to feed into the transform module.
In accordance with the present invention, the architecture further includes a memory module for storing the band having the low frequency in both the horizontal and vertical directions.
In accordance with the present invention, the storage size of the memory module is one-fourth of the size of the original image.
In accordance with the present invention, the transform module further includes a first stage consisting of decimation filters for performing the horizontal filtering operation and a second stage consisting of decimation filters for performing the vertical filtering operation.
In accordance with the present invention, the polyphase decomposition technique is applied to the decimation filters of the first stage to segment the coefficients of the decimation filters of the first stage into an odd-numbered part and an even-numbered part.
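The general idea of polyphase decomposition for a decimate-by-two filter can be sketched as follows. This is a generic textbook formulation, not the circuit of the invention: the function names are hypothetical, samples outside the input are assumed to be zero, and with zero-based indexing the two coefficient groups are the taps at positions 0, 2, 4, . . . and 1, 3, 5, . . . .

```python
def decimate_direct(x, f):
    """y(n) = sum_i f(i) * x(2n - i), with zeros assumed outside x."""
    def at(idx):
        return x[idx] if 0 <= idx < len(x) else 0
    return [sum(f[i] * at(2 * n - i) for i in range(len(f)))
            for n in range(len(x) // 2)]


def decimate_polyphase(x, f):
    """Same output, computed from two half-rate branches."""
    f_even, f_odd = f[0::2], f[1::2]       # split the coefficients

    def at(idx):
        return x[idx] if 0 <= idx < len(x) else 0

    half = len(x) // 2
    x_even = [at(2 * n) for n in range(half)]        # x(2n)
    x_odd = [at(2 * n - 1) for n in range(half)]     # x(2n - 1)

    def branch(seq, taps, n):
        # Half-rate convolution: sum_m taps[m] * seq[n - m].
        return sum(t * (seq[n - m] if 0 <= n - m < len(seq) else 0)
                   for m, t in enumerate(taps))

    return [branch(x_even, f_even, n) + branch(x_odd, f_odd, n)
            for n in range(half)]
```

Both functions produce the same output samples, but the polyphase form never computes a sample that the decimation would discard, since each branch runs at half the input rate.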
In accordance with the present invention, the coefficient folding technique is applied to the decimation filters of the second stage such that every two coefficients of the decimation filters of the second stage share one set of a multiplier, an adder, and a register.
In accordance with the present invention, the register of the decimation filter of the second stage is a row register including: multiple register blocks, wherein the number of said register blocks is the number of decomposition levels of the architecture; multiple one-by-two demultiplexers, each of which is electrically connected between two register blocks for receiving the output of the previous register block as an input, wherein one output of each one-by-two demultiplexer serves as the input for the next register block and the other output of each one-by-two demultiplexer serves as a part of the output of the row register; and multiple select signal lines, each of which is electrically connected to one corresponding one-by-two demultiplexer for selecting the output of the corresponding one-by-two demultiplexer.
In accordance with another aspect of the present invention, an architecture adapted to perform the two-dimensional discrete wavelet transform for performing a single level decomposition operation to decompose an original image into four bands includes: a transform module for decomposing the original image into four bands.
In accordance with another aspect of the present invention, the transform module further includes a first stage consisting of decimation filters for performing the horizontal filtering operation and a second stage consisting of decimation filters for performing the vertical filtering operation.
In accordance with another aspect of the present invention, the polyphase decomposition technique is applied to the decimation filters of the first stage to segment the coefficients of the decimation filters of the first stage into an odd-numbered part and an even-numbered part.
In accordance with another aspect of the present invention, the coefficient folding technique is applied to the decimation filters of the second stage such that every two coefficients of the decimation filters of the second stage share one set of a multiplier, an adder, and a register.