The Moving Picture Experts Group (MPEG) standard is an International Organization for Standardization (ISO) standard for compressing video data. Video compression is important in making video data files, such as full-length movies, more manageable for storage (e.g., in optical storage media), processing, and transmission. In general, MPEG compression is achieved by eliminating redundant and irrelevant information. Because video images typically consist of smooth regions of color across the screen, video information generally varies little in space and time. As such, a significant part of the video information in an image is predictable and therefore redundant. Hence, a first objective in MPEG compression is to remove the redundant information and leave only the true or unpredictable information. Irrelevant video image information, on the other hand, is information that cannot be seen by the human eye under certain reasonable viewing conditions. For example, the human eye is less sensitive to noise at high spatial frequencies than to noise at low spatial frequencies, and less sensitive to loss of detail immediately before and after a scene change. Accordingly, a second objective in MPEG compression is to remove irrelevant information. The combination of redundant information removal and irrelevant information removal allows for highly compressed video data files.
MPEG compression incorporates various well-known techniques to achieve the above objectives, including motion-compensated prediction, Discrete Cosine Transform (DCT), quantization, and Variable-Length Coding (VLC). DCT is an algorithm that converts pixel data into sets of spatial frequencies with associated coefficients. The DCT coefficients of an image are distributed non-uniformly: most of the non-zero coefficients tend to be clustered in one general area. VLC exploits this distribution characteristic to distinguish non-zero DCT coefficients from zero DCT coefficients. In so doing, redundant/predictable information can be removed. Additionally, because DCT decomposes the video image into spatial frequencies, the DCT coefficients associated with higher frequencies can be coded with less precision than the DCT coefficients associated with lower frequencies, thereby allowing irrelevant information to be removed. Hence, quantization may be generalized as a step that weights the DCT coefficients based on the amount of noise that the human eye can tolerate at each spatial frequency so that a reduced set of coefficients can be generated.
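The DCT and quantization steps described above can be illustrated with a minimal Python sketch. It is not taken from the MPEG standard: the direct-form transform is chosen for readability, and the `quantize` function with its `base_step` and `slope` parameters is a hypothetical quantizer assumed here only to show a step size that grows with spatial frequency.

```python
import math

N = 8  # block size, matching the 8x8 blocks mentioned in the text

def dct_2d(block):
    """Orthonormal 2D DCT-II of an N x N block (direct form, for clarity only)."""
    def alpha(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out

def quantize(coeffs, base_step=8.0, slope=4.0):
    """Hypothetical quantizer (an assumption, not the MPEG tables):
    the step size grows with the spatial frequency index u + v."""
    return [[int(round(coeffs[u][v] / (base_step + slope * (u + v))))
             for v in range(N)]
            for u in range(N)]

# A smooth block: brightness ramps gently from left to right.
block = [[100 + 2 * y for y in range(N)] for x in range(N)]
quantized = quantize(dct_2d(block))
nonzero = sum(1 for row in quantized for c in row if c != 0)
print(nonzero, "of", N * N, "quantized coefficients are non-zero")
```

For a smooth block such as this ramp, nearly all quantized coefficients come out as zero, which is the non-uniform distribution that the variable-length coding stage is described as exploiting.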
However, when a highly compressed video data file is decompressed, image degradations involving noise artifacts may occur in the decompressed video images. Generally, there are two types of degradation noise artifacts: blocking and ringing. A blocking artifact is typically a discontinuity between adjacent video pixel data blocks. Blocking artifacts are created when the DCT coefficients of video pixel blocks are quantized and processed independently, without regard to the correlation between pixels of adjacent blocks. A ringing artifact is typically a local flickering near an edge. Ringing artifacts are created when high-frequency DCT coefficients are truncated as a result of coarse quantization.
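A blocking artifact can be reproduced with a small, deliberately extreme Python sketch. It assumes a quantizer so coarse that only the average (DC) value of each 8-pixel block survives; the function name `keep_dc_only` and the test ramp are hypothetical and serve only as illustration.

```python
def keep_dc_only(pixels, block=8):
    """Assumed extreme case: each block is decoded as its own mean value,
    computed independently of the neighbouring block."""
    out = []
    for start in range(0, len(pixels), block):
        chunk = pixels[start:start + block]
        mean = sum(chunk) / len(chunk)
        out.extend([mean] * len(chunk))
    return out

# One row of pixels spanning two adjacent 8-pixel blocks: a smooth ramp.
ramp = [100 + 2 * i for i in range(16)]
decoded = keep_dc_only(ramp)

original_step = ramp[8] - ramp[7]        # change across the block boundary before coding
decoded_step = decoded[8] - decoded[7]   # change across the same boundary after decoding
print(original_step, decoded_step)
```

In the original row the two pixels on either side of the block boundary differ by 2, while in the decoded row they differ by 16. The smooth ramp has become a visible step exactly at the block edge, because each block was processed without reference to its neighbour.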
FIG. 1 is a block diagram illustrating a prior-art post-processing architecture to eliminate both blocking and ringing artifacts from decompressed video data. As shown in FIG. 1, post-processing architecture 100 includes one-dimensional (1D) blocking filter 120 and two-dimensional (2D) ringing filter 130, both of which are coupled to memory 110, which stores a pixmap of the decompressed (a.k.a. processed) video image data. Blocking filter 120 and ringing filter 130 can each access the decompressed video image data in memory 110 independently, so the de-blocking and de-ringing processes can be performed separately. Since blocking filter 120 is a 1D filter, horizontal and vertical blocking artifacts of decompressed pixel data are filtered at different times. On the other hand, since ringing filter 130 is a 2D filter, all ringing artifacts can be substantially filtered out concurrently.
FIG. 2 is a flow chart illustrating the operational steps of post-processing architecture 100. Starting with step 210, 1D blocking filter 120 receives from memory 110 a set of decompressed pixel data (e.g., 10 pixels of data) that corresponds to data between two adjacent data blocks (wherein each block is made up of, for example, 8×8 pixels) within an image frame. In step 220, 1D blocking filter 120 filters out the blocking artifacts in the horizontal direction. Next, blocking filter 120 sends the horizontally de-blocked pixel data back to a second location in memory 110 (step 230). Blocking filter 120 then receives the horizontally de-blocked pixel data that corresponds to the same block of the image frame from the second location in memory 110 (step 240). Blocking filter 120 filters out the blocking artifacts in the vertical direction (step 250). Blocking filter 120 then sends the horizontally and vertically de-blocked pixel data back to the second location in memory 110 (step 260). Next, ringing filter 130 begins the ringing filtering process by receiving a set (e.g., 18×18 pixels of data) of the horizontally and vertically de-blocked pixel data from memory 110 (step 270). Ringing filter 130 then filters ringing artifacts from the block of pixel data (step 280). The de-ringed pixel data is sent back to memory 110 (step 290). Next, a determination is made as to whether the blocking and ringing artifacts in all the blocks in the image frame have been filtered out (step 295). If not, steps 210-295 are repeated. Otherwise, the process ends. The same filtering process may then begin for the next frame.
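The flow of FIG. 2 can be sketched in Python to make the memory traffic visible. This is an illustration only: `CountingMemory` is a hypothetical stand-in for memory 110 that tallies accesses, `identity` is a placeholder for the actual filter kernels (which are not specified here), and the key names are assumptions.

```python
class CountingMemory:
    """Hypothetical stand-in for memory 110 that tallies read and write accesses."""

    def __init__(self):
        self.store = {}
        self.reads = 0
        self.writes = 0

    def read(self, key):
        self.reads += 1
        return self.store[key]

    def write(self, key, data):
        self.writes += 1
        self.store[key] = data


def identity(data):
    """Placeholder for a real filter kernel; returns the data unchanged."""
    return data


def post_process_block(mem, block_id):
    # Steps 210-230: read decompressed data, de-block horizontally, write back.
    data = mem.read(("decoded", block_id))
    mem.write(("work", block_id), identity(data))
    # Steps 240-260: read the result again, de-block vertically, write back.
    data = mem.read(("work", block_id))
    mem.write(("work", block_id), identity(data))
    # Steps 270-290: read the de-blocked data, de-ring it, write back.
    data = mem.read(("work", block_id))
    mem.write(("work", block_id), identity(data))


mem = CountingMemory()
mem.store[("decoded", 0)] = [[0] * 8 for _ in range(8)]  # one dummy 8x8 block
post_process_block(mem, 0)
print(mem.reads, mem.writes)
```

Each of the three filtering passes is bracketed by one read and one write, so processing a single block in this sketch costs three reads and three writes regardless of how cheap the filter kernels themselves are.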
As demonstrated above, post-processing architecture 100 requires 3 memory read accesses and 3 memory write accesses to filter the blocking and ringing artifacts from every block of video (pixel) data. If both blocking filter 120 and ringing filter 130 are 2D filters, then 2 memory read accesses and 2 memory write accesses are required for each block of video (pixel) data. If both blocking filter 120 and ringing filter 130 are 1D filters, then 4 memory read accesses and 4 memory write accesses are required for each block. More memory accesses mean that more time is required for the filtering process and that more resources (e.g., processor time to control and monitor the memory access process) must be devoted to it.
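The access counts quoted above follow from a simple rule, sketched below in Python under the assumption that a 1D filter needs two passes per block (horizontal, then vertical), a 2D filter needs one, and every pass costs one read and one write. The function name `accesses_per_block` is hypothetical.

```python
def accesses_per_block(blocking_dims, ringing_dims):
    """Return (reads, writes) per block for the given filter dimensionalities.

    Assumes one memory read and one memory write per filtering pass,
    with a 1D filter taking two passes and a 2D filter taking one.
    """
    passes = 0
    for dims in (blocking_dims, ringing_dims):
        if dims not in (1, 2):
            raise ValueError("filter dimensionality must be 1 or 2")
        passes += 2 if dims == 1 else 1
    return passes, passes


print(accesses_per_block(1, 2))  # 1D blocking filter, 2D ringing filter
print(accesses_per_block(2, 2))  # both filters 2D
print(accesses_per_block(1, 1))  # both filters 1D
```

The three calls correspond to the three configurations discussed in the text and yield 3, 2, and 4 reads and writes per block, respectively.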
Thus, a need exists for a more efficient and less memory intensive post-processing apparatus, system, and method to remove blocking and ringing artifacts from decompressed video image data.