1. Field of Invention
The present invention relates generally to the Discrete Wavelet Transform (DWT) technique used in image compression standards, and more particularly to a flipping algorithm for hardware realization of the lifting-based DWT.
2. Description of the Prior Art
Because the DWT provides very good time-frequency decomposition, research on DWT-based signal analysis and compression has produced abundant results. In particular, emerging image compression standards, such as JPEG2000 still image coding and MPEG-4 still texture coding, have adopted the DWT as their core algorithm. Compared with older transform methods such as the Discrete Cosine Transform (DCT), however, the DWT involves a larger volume of arithmetic operations. Moreover, whereas the DCT handles an image block by block, the DWT basically processes the whole image at once and therefore needs more memory space and wider bandwidth, which are the bottlenecks in hardware realization of the two-dimensional DWT (2-D DWT).
Because the DWT itself involves only one pair of wavelet filter operations, the most direct way to compute it is by convolution, which can be expressed mathematically as follows:
$$
x_L(n) = \sum_{i=0}^{K-1} h(i) \cdot x(2n - i), \qquad
x_H(n) = \sum_{i=0}^{K-1} g(i) \cdot x(2n - i)
\tag{1}
$$
Here x_L(n) and x_H(n) are the low-pass and high-pass signals respectively, and h(n) and g(n) are the coefficients of the low-pass and high-pass filters respectively. FIG. 1 shows a convolution-based DWT hardware architecture that takes two input samples and produces two output samples per clock cycle with the least latency and the minimum number of registers, wherein Tm is the timing delay of a multiplier, Ta is the timing delay of an adder, Cm is the hardware cost of a multiplier, Ca is the hardware cost of an adder, and K is the filter length. As can be seen in FIG. 1, the critical path is Tm+(K−1)Ta and the required hardware size is 2KCm+2(K−1)Ca; by using an adder tree, the critical path can be further reduced to Tm+⌈log2 K⌉·Ta.
However, because convolution involves a larger volume of calculation and a more complex control circuit for boundary extension, the lifting scheme is employed to reduce the amount of DWT calculation and the complexity of the control circuits for boundary extension and memory access. Furthermore, since a method of deriving the lifting scheme by factorizing the polyphase matrix became available, the lifting scheme has been widely employed in hardware and software realizations of the DWT. In the lifting scheme, any perfect-reconstruction DWT filter pair can be factorized into a series of lifting steps: the DWT polyphase matrix can be factorized into a series of upper and lower triangular matrices and one constant diagonal matrix, which can be expressed mathematically as follows:
$$
\begin{aligned}
h(z) &= h_e(z^2) + z^{-1} h_o(z^2) \\
g(z) &= g_e(z^2) + z^{-1} g_o(z^2) \\
P(z) &= \begin{bmatrix} h_e(z) & g_e(z) \\ h_o(z) & g_o(z) \end{bmatrix}
      = \left\{ \prod_{i=1}^{m}
        \begin{bmatrix} 1 & s_i(z) \\ 0 & 1 \end{bmatrix}
        \begin{bmatrix} 1 & 0 \\ t_i(z) & 1 \end{bmatrix} \right\}
        \begin{bmatrix} K & 0 \\ 0 & 1/K \end{bmatrix}
\end{aligned}
\tag{2}
$$
Here h(z) and g(z) are the low-pass and high-pass filters, and P(z) is the corresponding polyphase matrix. Taking the JPEG2000 (9,7) filter set as an example, the filter set can be decomposed into four lifting steps and one normalization step, which is expressed mathematically as follows:
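$$
P(z) =
\begin{bmatrix} 1 & a(1 + z^{-1}) \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ b(1 + z) & 1 \end{bmatrix}
\begin{bmatrix} 1 & c(1 + z^{-1}) \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ d(1 + z) & 1 \end{bmatrix}
\begin{bmatrix} K & 0 \\ 0 & 1/K \end{bmatrix}
\tag{3}
$$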
This can be expressed as the signal flow graph shown in FIG. 2, in which a black point is a computation node, a grey point is a register node, and a white point is an input node; each computation node sums all of its inputs. Because K and 1/K can be realized independently outside the lifting steps, or, when the stage following the DWT is data compression, the normalization step can be merged with the quantization procedure, only the realization of the lifting steps is discussed here.
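The four lifting steps of equation (3) can be sketched in software as follows. This is only an illustrative sketch under stated assumptions, not the hardware of FIG. 2: the numeric values of a, b, c and d are the commonly quoted JPEG2000 (9,7) lifting coefficients and are not given in the text; the symmetric (mirror) boundary handling, the choice of which neighbouring sample each step reads, and the function names are likewise assumptions. The normalization by K and 1/K is omitted, as in the discussion above.

```python
# Commonly quoted JPEG2000 (9,7) lifting coefficients (assumed values,
# not taken from the text above).
A = -1.586134342
B = -0.05298011854
C = 0.8829110762
D = 0.4435068522


def _lift_odd(s, d, coef):
    """d[n] + coef * (s[n] + s[n+1]); s is mirrored at the right edge."""
    m = len(d)
    return [d[n] + coef * (s[n] + s[min(n + 1, m - 1)]) for n in range(m)]


def _lift_even(s, d, coef):
    """s[n] + coef * (d[n-1] + d[n]); d is mirrored at the left edge."""
    return [s[n] + coef * (d[max(n - 1, 0)] + d[n]) for n in range(len(s))]


def lifting_97_forward(x):
    """Four lifting steps on an even-length signal; returns (low, high)."""
    if len(x) < 2 or len(x) % 2:
        raise ValueError("signal length must be even and at least 2")
    s, d = list(x[0::2]), list(x[1::2])  # even and odd polyphase components
    d = _lift_odd(s, d, A)    # step a
    s = _lift_even(s, d, B)   # step b
    d = _lift_odd(s, d, C)    # step c
    s = _lift_even(s, d, D)   # step d
    return s, d


def lifting_97_inverse(s, d):
    """Undo the four steps in reverse order with negated coefficients."""
    s = _lift_even(s, d, -D)
    d = _lift_odd(s, d, -C)
    s = _lift_even(s, d, -B)
    d = _lift_odd(s, d, -A)
    x = [0.0] * (2 * len(s))
    x[0::2] = s
    x[1::2] = d
    return x
```

Each step performs one multiplication and two additions per output sample, which is consistent with the count of 4 multipliers and 8 adders for the four lifting steps. Because every step is undone exactly by the same step with a negated coefficient, the inverse reconstructs the input regardless of the coefficient values chosen.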
To realize the convolution-based DWT (9,7) filter using an adder tree and exploiting coefficient symmetry, 4 multipliers, 14 adders and 7 registers are required. As shown in FIG. 2, realizing the lifting-based DWT (9,7) filter requires only 4 multipliers, 8 adders and 4 registers (the realization of K and 1/K is excluded here). Nevertheless, the convolution-based critical path is only Tm+4Ta, whereas the critical path in FIG. 2 is 4Tm+8Ta. Pipelining a lifting-based architecture can shorten the critical path effectively, but it increases the number of registers. For example, if FIG. 2 is cut into 4 pipeline stages, the critical path decreases to Tm+2Ta but 6 more registers are needed. This is a very serious problem for realizing the line-based two-dimensional DWT, because the size of the internal memory in the line-based two-dimensional DWT hardware architecture is proportional to the number of registers in the one-dimensional DWT architecture.
To minimize the memory size of a 2-D DWT realization, the line-based method can be employed: with adequate memory access management, the memory occupied by the whole image is reduced to only a few line buffers of image width. The line-based method can also reduce the number of accesses to the external frame memory.
This is achieved at the cost of additional internal line buffer, the size of which is proportional to the number of registers in the adopted one-dimensional DWT hardware. FIG. 4 is a schematic diagram of how registers are transformed into internal line buffers: FIG. 4(a) is the circuit in the one-dimensional DWT hardware, and FIG. 4(b) is the circuit transformed for the line-based two-dimensional DWT architecture, wherein R is a register, K0 is the number of one-dimensional DWT hardware units, and N is the width of the image. Thus the memory size is indeed proportional to the number of registers in the one-dimensional architecture, and minimizing the memory size is the first priority in realizing two-dimensional DWT hardware under a fixed hardware speed constraint. However, the trade-off between critical path and line-based memory in hardware realization of the DWT has never been revealed or discussed in any paper.
According to the foregoing, realizing the DWT in hardware with the lifting scheme has more merits than realizing it with convolution: it needs less line buffer, but it has a longer critical path. Although the critical path can be shortened by pipelining, more memory is then needed. Under these circumstances, using the lifting scheme to realize DWT hardware still faces certain difficulties. In light of the foregoing considerations, the present invention proposes a new flipping algorithm that solves this problem by taking the lifting scheme as its starting point and flipping some lifting steps to shorten the critical path while retaining all the merits of the lifting scheme. The flipping algorithm of the present invention is also more efficient than convolution in all aspects.
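The critical-path and hardware-size expressions quoted so far can be evaluated with a small model such as the one below. It is a sketch only: the function names are hypothetical, the delays Tm and Ta are supplied by the caller in arbitrary units, and the pipelined lifting formula assumes that the lifting steps are divided evenly among the pipeline stages and that each step contributes Tm+2Ta, which matches the two cases stated in the text (1 stage and 4 stages) but is an extrapolation for other stage counts.

```python
def ceil_log2(k):
    """Integer ceil(log2(k)) for k >= 1, avoiding floating-point rounding."""
    return (k - 1).bit_length()


def conv_critical_path(K, Tm, Ta, adder_tree=False):
    """Tm + (K-1)Ta for a chain of adders, Tm + ceil(log2 K)Ta for a tree."""
    depth = ceil_log2(K) if adder_tree else K - 1
    return Tm + depth * Ta


def conv_hardware_cost(K, Cm, Ca):
    """2K multipliers and 2(K-1) adders for the low- and high-pass filters."""
    return 2 * K * Cm + 2 * (K - 1) * Ca


def lifting_critical_path(Tm, Ta, steps=4, stages=1):
    """Assumed model: each lifting step costs Tm + 2Ta, and the steps are
    split evenly over the pipeline stages."""
    if stages < 1 or steps % stages:
        raise ValueError("steps must divide evenly into pipeline stages")
    return (steps // stages) * (Tm + 2 * Ta)


if __name__ == "__main__":
    Tm, Ta = 4, 1  # hypothetical delays, arbitrary units
    print("convolution, adder tree, K=9:", conv_critical_path(9, Tm, Ta, True))
    print("lifting, no pipelining:     ", lifting_critical_path(Tm, Ta))
    print("lifting, 4 pipeline stages: ", lifting_critical_path(Tm, Ta, stages=4))
```

With K=9 the adder-tree expression gives Tm+4Ta, the same figure quoted above for the convolution-based (9,7) filter.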
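The proportionality between registers and line-buffer memory can be made concrete with a rough calculation. The sketch below assumes, as a simplification of FIG. 4, that every register of the one-dimensional architecture becomes one line buffer of N words; the function name and the image width of 512 are hypothetical, and any further factor shown in FIG. 4 is not modelled.

```python
def line_buffer_words(num_registers, image_width):
    """Assumed model: each 1-D register turns into one line buffer
    holding image_width words."""
    return num_registers * image_width


if __name__ == "__main__":
    N = 512  # hypothetical image width
    # Register counts are the ones quoted in the text for the (9,7) filter.
    for name, regs in [("lifting", 4),
                       ("lifting, 4 pipeline stages", 4 + 6),
                       ("convolution", 7)]:
        print(f"{name}: {line_buffer_words(regs, N)} words")
```

Under this model, pipelining the lifting architecture into 4 stages raises the line-buffer requirement from 4N to 10N words, which is the trade-off between critical path and memory described above.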