1. Field of the Invention
The current invention relates to the fields of digital image and video compression. It further relates to the use of a novel data re-ordering scheme to improve the performance of the compression process.
2. Description of Related Art
Digital pictorial information, whether derived from an analogue source by a process of digitization or directly from a digital device, consists of huge volumes of data. As devices become able to capture higher resolution images, the amount of data required for their digital representation grows accordingly. If stored in raw format, a single image may well require tens of megabytes of disk space.
The problem is further exacerbated when considering digital video data, especially for high definition video. A two-hour movie stored in raw form at the highest resolution ATSC frame size (1920×1080 pixels at 30 frames per second) requires almost 641 Gbyte of disk space. At a data rate of almost 89 Mbyte/s, the bandwidth required for transmission far exceeds what is currently available.
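The figures quoted above can be checked with a short calculation. The sketch below is illustrative only and rests on two assumptions that the text does not state: 8-bit samples with 4:2:0 chroma subsampling (an average of 1.5 bytes per pixel), and a Mbyte of 2^20 bytes. Under those assumptions the data rate comes out at just under 89 Mbyte/s, and the two-hour total at roughly 640,700 Mbyte, which matches the quoted 641 Gbyte if a Gbyte is counted as 1000 Mbyte.

```python
# Raw storage estimate for two hours of 1920x1080 video at 30 frames per second.
WIDTH, HEIGHT = 1920, 1080
FRAMES_PER_SECOND = 30
DURATION_SECONDS = 2 * 60 * 60

# Assumption (not stated in the text): 8-bit 4:2:0 sampling, 1.5 bytes per pixel.
BYTES_PER_PIXEL = 1.5

# Assumption (not stated in the text): one Mbyte is 2**20 bytes.
MBYTE = 2 ** 20

bytes_per_frame = WIDTH * HEIGHT * BYTES_PER_PIXEL
bytes_per_second = bytes_per_frame * FRAMES_PER_SECOND
total_bytes = bytes_per_second * DURATION_SECONDS

rate_mbyte_per_s = bytes_per_second / MBYTE
total_mbyte = total_bytes / MBYTE

print(f"data rate: {rate_mbyte_per_s:.2f} Mbyte/s")
print(f"two-hour total: {total_mbyte:.0f} Mbyte")
```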
Image compression may be thought of as a special case of video compression where an image is considered to be a video sequence consisting of a single frame.
The encoding operation may be considered to be a three-stage process. First, a block predictor, created from data already available to the decoder, is subtracted from the original data to form a prediction error signal. Second, the prediction error is block transformed and quantized. Finally, the transform coefficients are entropy coded to form a binary bitstream that constitutes the compressed frame.
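The three stages described above can be sketched for a single block as follows. This is a minimal illustration, not the method of any particular codec: the function names `dct_2d` and `encode_block`, the flat predictor of value 128, and the quantizer step of 4 are all hypothetical, and `zlib` is used only as a stand-in for a real entropy coder.

```python
import math
import zlib


def dct_2d(block):
    """Orthonormal 2D DCT-II of a square block given as a list of rows."""
    n = len(block)
    basis = [
        [
            math.sqrt((1.0 if k == 0 else 2.0) / n)
            * math.cos((2 * i + 1) * k * math.pi / (2 * n))
            for i in range(n)
        ]
        for k in range(n)
    ]
    # Transform along the vertical axis, then along the horizontal axis.
    tmp = [
        [sum(basis[u][i] * block[i][j] for i in range(n)) for j in range(n)]
        for u in range(n)
    ]
    return [
        [sum(tmp[u][j] * basis[v][j] for j in range(n)) for v in range(n)]
        for u in range(n)
    ]


def encode_block(original, predictor, step):
    """Hypothetical three-stage encoder for one block."""
    n = len(original)
    # Stage 1: subtract the predictor to form the prediction error.
    residual = [
        [original[i][j] - predictor[i][j] for j in range(n)] for i in range(n)
    ]
    # Stage 2: block transform, then uniform quantization.
    coeffs = dct_2d(residual)
    levels = [[round(c / step) for c in row] for row in coeffs]
    # Stage 3: entropy coding. zlib is a stand-in, not a real video entropy
    # coder. Assumes every level fits in a signed 8-bit value.
    payload = bytes(v & 0xFF for row in levels for v in row)
    return levels, zlib.compress(payload)


# Example: a horizontal ramp predicted by a flat block of value 128.
original = [[120 + 2 * j for j in range(8)] for _ in range(8)]
predictor = [[128] * 8 for _ in range(8)]
levels, bitstream = encode_block(original, predictor, step=4)
print(levels[0])
print(len(bitstream), "bytes")
```

Because the example block varies only horizontally, every quantized coefficient outside the first row of `levels` is zero.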
The prediction stage may involve spatial or temporal prediction for video. For image compression, with no available temporal data, the only prediction mode available is spatial.
Many of the more successful algorithms have a two-dimensional block transform method at their core, partitioning each frame into rectangular blocks (usually 8×8 or 4×4) and applying the transform to each. Compression is achieved by coding the transform coefficients more efficiently than the original spatial data can be coded.
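The partitioning step described above can be sketched as follows. The helper name `partition_into_blocks` is hypothetical, and the sketch assumes the frame dimensions are exact multiples of the block size; practical encoders typically pad the frame when they are not.

```python
def partition_into_blocks(frame, block_size=8):
    """Split a frame (list of rows) into square blocks in raster order.

    Returns a list of ((top, left), block) pairs.
    """
    height = len(frame)
    width = len(frame[0])
    if height % block_size or width % block_size:
        raise ValueError("frame dimensions must be multiples of block_size")
    blocks = []
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            block = [
                row[left:left + block_size]
                for row in frame[top:top + block_size]
            ]
            blocks.append(((top, left), block))
    return blocks


# Example: a synthetic 32x16 frame split into 8x8 blocks.
frame = [[(x + y) % 256 for x in range(32)] for y in range(16)]
blocks = partition_into_blocks(frame, block_size=8)
print(len(blocks), "blocks")
```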
The premise of any compression algorithm employing a block transform is that greater coding efficiency is achieved by concentrating the signal energy into the smallest possible number of non-zero transform coefficients. A good transform and quantization strategy tends to ensure that it is the lower frequency coefficients that are non-zero, preserving structural form at the cost of losing textural detail. As the level of quantization increases, the aim is for the resulting image quality to degrade gracefully.
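The effect of the quantization level on the number of surviving coefficients can be illustrated with a small experiment. The sketch below assumes a uniform quantizer that rounds to the nearest integer, which is only one of many possible strategies, and uses a synthetic block (a smooth gradient plus a faint checkerboard texture). The names `dct_2d` and `nonzero_positions` are hypothetical. With this quantizer the set of non-zero coefficients can only shrink as the step grows, and for this block the survivors at the coarsest step are the lowest frequency ones.

```python
import math


def dct_2d(block):
    """Orthonormal 2D DCT-II of a square block given as a list of rows."""
    n = len(block)
    basis = [
        [
            math.sqrt((1.0 if k == 0 else 2.0) / n)
            * math.cos((2 * i + 1) * k * math.pi / (2 * n))
            for i in range(n)
        ]
        for k in range(n)
    ]
    tmp = [
        [sum(basis[u][i] * block[i][j] for i in range(n)) for j in range(n)]
        for u in range(n)
    ]
    return [
        [sum(tmp[u][j] * basis[v][j] for j in range(n)) for v in range(n)]
        for u in range(n)
    ]


def nonzero_positions(block, step):
    """Frequency positions (u, v) whose quantized coefficient is non-zero."""
    coeffs = dct_2d(block)
    return [
        (u, v)
        for u, row in enumerate(coeffs)
        for v, c in enumerate(row)
        if round(c / step) != 0
    ]


# Synthetic block: smooth gradient (structure) plus a faint checkerboard (texture).
block = [
    [100 + 4 * i + 3 * j + (3 if (i + j) % 2 else -3) for j in range(8)]
    for i in range(8)
]

steps = [1, 4, 16, 64]
survivors = [nonzero_positions(block, s) for s in steps]
counts = [len(p) for p in survivors]
for s, c in zip(steps, counts):
    print(f"step {s:2d}: {c} non-zero coefficients")
```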
The Discrete Cosine Transform (DCT) has received the most attention over the last thirty years or so, being the transform of choice in all of the MPEG video compression International Standards and in the original JPEG image compression International Standard.
From the human visual perception standpoint, the lower frequency components account for the major structural framework, and it is in the corresponding low frequency coefficients that most of the signal energy is concentrated. Higher frequency components fill in textural detail and enhance sharpness, but in real world material account for a relatively small proportion of the signal energy.
Blocks that contain very little textural detail and no sharply defined edges have comparatively little high frequency content, resulting in fewer non-zero coefficients of smaller magnitude. This in turn allows more efficient entropy coding.
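The difference between smooth and detailed blocks can be illustrated by counting non-zero quantized coefficients for each. This is a sketch under stated assumptions: both blocks are synthetic (a gentle horizontal ramp and a high contrast checkerboard), the quantizer is a uniform one with a step of 8, and the names `dct_2d` and `count_nonzero` are hypothetical.

```python
import math


def dct_2d(block):
    """Orthonormal 2D DCT-II of a square block given as a list of rows."""
    n = len(block)
    basis = [
        [
            math.sqrt((1.0 if k == 0 else 2.0) / n)
            * math.cos((2 * i + 1) * k * math.pi / (2 * n))
            for i in range(n)
        ]
        for k in range(n)
    ]
    tmp = [
        [sum(basis[u][i] * block[i][j] for i in range(n)) for j in range(n)]
        for u in range(n)
    ]
    return [
        [sum(tmp[u][j] * basis[v][j] for j in range(n)) for v in range(n)]
        for u in range(n)
    ]


def count_nonzero(block, step):
    """Number of coefficients that remain non-zero after uniform quantization."""
    return sum(
        1 for row in dct_2d(block) for c in row if round(c / step) != 0
    )


STEP = 8  # assumed quantizer step, chosen only for illustration

# Smooth block: a gentle horizontal ramp with no edges or texture.
smooth = [[128 + j for j in range(8)] for _ in range(8)]

# Detailed block: a high contrast checkerboard, i.e. many sharp edges.
textured = [
    [128 + (40 if (i + j) % 2 == 0 else -40) for j in range(8)]
    for i in range(8)
]

smooth_count = count_nonzero(smooth, STEP)
textured_count = count_nonzero(textured, STEP)
print("smooth block:  ", smooth_count, "non-zero coefficients")
print("textured block:", textured_count, "non-zero coefficients")
```

The smooth block leaves far fewer coefficients for the entropy coder to represent than the textured one.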