The VC-2 video compression standard is an open, free-use video-decoding standard contributed by the British Broadcasting Corporation (BBC) to the Society of Motion Picture and Television Engineers (SMPTE). VC-2 uses the discrete wavelet transform (DWT) and interleaved exponential-Golomb (IEG) variable-length encoding to achieve the desired video compression. VC-2 was originally designed to compete with the prevailing H.264 standard, and the DWT is expected to produce fewer blocky artifacts than the prevailing discrete-cosine-transform (DCT)-based systems. To meet the low-delay requirement of a serial digital interface (SDI) transmission system, SMPTE standardized two levels of the low-delay profile: level-64, which uses the (2, 2) DWT, and level-65, which uses the overlapped (5, 3) DWT. It has been shown that level-65 compression is required to fit a high-definition (HD) video into a standard-definition SDI (SD-SDI) payload with excellent video quality.
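As an illustration of the IEG code mentioned above, the following Python sketch encodes and decodes unsigned integers. It assumes the usual interleaved construction: the value N is written as N+1 in binary, each bit after the leading 1 is preceded by a 0 "follow" bit, and a single 1 bit terminates the codeword. The function names and the use of text bit strings are hypothetical choices made for readability; signed values and bitstream packing are left out.

```python
from typing import Tuple


def ieg_encode_uint(n: int) -> str:
    """Return the interleaved exp-Golomb codeword for n >= 0 as a bit string."""
    if n < 0:
        raise ValueError("only unsigned values are handled in this sketch")
    data_bits = bin(n + 1)[3:]  # binary digits of n + 1 after the leading 1
    # Each data bit is preceded by a 0 follow bit; a lone 1 ends the codeword.
    return "".join("0" + b for b in data_bits) + "1"


def ieg_decode_uint(code: str, pos: int = 0) -> Tuple[int, int]:
    """Decode one codeword starting at pos; return (value, next position)."""
    value = 1
    while code[pos] == "0":          # follow bit 0: one more data bit comes next
        value = (value << 1) | int(code[pos + 1])
        pos += 2
    return value - 1, pos + 1        # skip the terminating 1 bit


if __name__ == "__main__":
    for n in range(7):
        print(n, ieg_encode_uint(n))
```

Because data bits and follow bits alternate, a decoder can recover the value in a single pass without first counting a prefix, which is one reason this form is convenient for simple hardware parsers.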
The VC-2 level-65 is a subset of the low-delay profile with the following attributes:
1. 4:2:2 10-bit sampling with supported resolutions 1920×1080i29.97, 1920×1080i25, 1280×720p59.94, 1280×720p50.
2. The codec uses only Low-Delay Profile.
3. The codec uses only the LeGall (5, 3) wavelet transform (wavelet index=1).
4. The wavelet depth is exactly 3 levels.
5. The slice size is fixed to be 16 (horizontal)×8 (vertical) in luminance and 8 (horizontal)×8 (vertical) in chrominance.
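To make the transform attributes above concrete, the following Python sketch applies a 1-D integer LeGall (5, 3) lifting step and repeats it three times on the low band. It is a minimal illustration under stated assumptions, not the normative VC-2 procedure: the function names are hypothetical, the boundary samples are handled by reusing the nearest available neighbour, and any precision shift applied before the transform is omitted.

```python
def legall53_forward(x):
    """One level of 1-D integer LeGall (5, 3) lifting on an even-length list."""
    n = len(x)
    if n < 2 or n % 2:
        raise ValueError("length must be even and at least 2")
    a = list(x)
    # Predict step: odd samples become high-pass coefficients.
    for i in range(1, n, 2):
        right = a[i + 1] if i + 1 < n else a[i - 1]  # assumed edge handling
        a[i] -= (a[i - 1] + right + 1) >> 1
    # Update step: even samples become low-pass coefficients.
    for i in range(0, n, 2):
        left = a[i - 1] if i - 1 >= 0 else a[i + 1]  # assumed edge handling
        a[i] += (left + a[i + 1] + 2) >> 2
    return a[0::2], a[1::2]


def legall53_inverse(low, high):
    """Undo legall53_forward by running the lifting steps in reverse order."""
    n = 2 * len(low)
    a = [0] * n
    a[0::2] = low
    a[1::2] = high
    for i in range(0, n, 2):
        left = a[i - 1] if i - 1 >= 0 else a[i + 1]
        a[i] -= (left + a[i + 1] + 2) >> 2
    for i in range(1, n, 2):
        right = a[i + 1] if i + 1 < n else a[i - 1]
        a[i] += (a[i - 1] + right + 1) >> 1
    return a


def legall53_multilevel(x, depth=3):
    """Apply the forward step `depth` times, re-transforming the low band."""
    low, highs = list(x), []
    for _ in range(depth):
        low, high = legall53_forward(low)
        highs.append(high)
    return low, highs


if __name__ == "__main__":
    row = [10, 12, 15, 15, 14, 9, 3, 0, 1, 4, 8, 8, 7, 7, 6, 5]  # 16 samples
    low, highs = legall53_multilevel(row, depth=3)
    print(len(low), [len(h) for h in highs])
```

With a depth of 3, a row of 16 luminance samples (one slice width) reduces to 2 coefficients in the lowest band, with 8, 4, and 2 coefficients in the successive high bands. Because each lifting step is undone exactly by the inverse, the integer transform is reversible regardless of the boundary rule chosen, provided the forward and inverse use the same rule.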
Conventionally, the overlapped DWT is used in the JPEG-2000 standard, which is used extensively in digital cameras and medical imaging systems. The literature contains many publications on reducing the implementation complexity of the 2-D DWT. A common property of these JPEG-2000-based implementations is that they use an external frame-buffer memory to hold the data processed by the on-chip DWT/IDWT. Such publications have therefore focused primarily on how to: minimize read and write accesses to the external memory; reduce the on-chip internal memory; speed up data processing; and choose a scan scheme that minimizes memory usage. However, an external memory typically increases the costs associated with chip package size and power consumption, as well as the overall system complexity and bill-of-material (BOM) costs.