1. Field of the Invention
This invention relates to object-based image/video coding using lapped orthogonal transforms (LOTs) and more specifically to using LOTs on arbitrarily shaped objects that are completely defined by rectangular and L-shaped regions.
2. Description of the Related Art
Current image/video coding standards such as JPEG, MPEG1 and MPEG2 use transforms such as the discrete cosine transform (DCT) to encode/decode digital imagery. Block transform based algorithms first subdivide the image into contiguous N×N blocks, typically 8×8 pixels, and then decorrelate each block by projecting it onto a set of orthogonal basis functions to create an N×N block of transform coefficients, which are quantized, entropy coded, and transmitted. The distortion in the reconstructed image is minimized for a given bit rate by allocating bits to the individual transform coefficients based upon their relative strengths. At moderate to high bit rates, the visual quality of the reconstructed image remains high because the quantization errors in the coefficients are distributed evenly throughout each transform block.
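The block transform step described above can be illustrated with a minimal sketch. It assumes an orthonormal DCT-II basis and a single uniform quantizer step for all coefficients, which is a simplification of the strength-based bit allocation mentioned in the text. The helper names (`dct_matrix`, `forward`, `inverse`, `quantize`) and the sample block are hypothetical and chosen only for illustration.

```python
import math

N = 8  # block size typically used by the standards cited above


def dct_matrix(n=N):
    """Orthonormal DCT-II basis; row k holds the k-th basis function."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


C = dct_matrix()
CT = transpose(C)


def forward(block):
    """Project an N x N pixel block onto the separable 2-D basis."""
    return matmul(matmul(C, block), CT)


def inverse(coeffs):
    """Rebuild the pixel block from its transform coefficients."""
    return matmul(matmul(CT, coeffs), C)


def quantize(coeffs, step):
    """Uniform quantization (assumed here; real coders vary the step)."""
    return [[step * round(v / step) for v in row] for row in coeffs]


# Hypothetical 8 x 8 block of pixel intensities.
block = [[(3 * r + 5 * c) % 256 for c in range(N)] for r in range(N)]
coeffs = forward(block)
recon = inverse(quantize(coeffs, step=16.0))
max_err = max(abs(recon[r][c] - block[r][c])
              for r in range(N) for c in range(N))
```

Because the basis is orthonormal, the quantization error energy in the coefficients equals the error energy in the reconstructed pixels, so a coarser `step` trades bit rate directly for distortion.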
However, at low bit rates (high compression ratios), the coarse quantization of the transform coefficients produces artifacts between the blocks in the reconstructed image, which degrade image quality. These blocking effects are caused by the hard truncation of the blocks' basis functions, which produces discontinuities at the blocks' shared edges. The effect of the discontinuities is masked at higher bit rates.
As shown in FIG. 1, blocking effects are reduced by overlapping the transforms between adjacent blocks 10 in the image 12. This increases the number of encoding/decoding computations but does not increase the number of transform coefficients, and hence the bit rate is unaffected. The lapped orthogonal transform (LOT) is created by first extending each block's region of support 14 by amounts ε, δ in the horizontal and/or vertical directions, depending upon whether the block is an interior, edge or corner block. The extension forms rectangular overlapping regions 16 about the blocks' shared interior edges 18. Next, each block's basis functions are extended over its extended region of support in a manner that maintains their orthogonality such as an odd/even extension. A window function is defined on the block's extended region of support that has a value of one inside the block excluding the rectangular overlapping regions, a value of zero outside the block's extended region of support, and tapers from one to zero across the overlapping region, typically along a sinusoidal curve. To maintain orthogonality, the adjacent blocks' overlapping window functions are symmetric about their shared edges and preserve the energy of the LOT in the overlapping regions, i.e. the sum of the squares of the window functions equals one. The lapped orthogonal basis functions for the forward LOT are the product of the extended basis functions and their tapered window functions.
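The two window conditions stated above (symmetry about the shared edge and unit sum of squares) can be checked numerically. The following is a one-dimensional sketch only. It assumes a shared edge at x = 0, an overlap of half-width `eps` on each side, and one particular sinusoidal taper. The function names `rising` and `falling` are hypothetical.

```python
import math


def rising(x, eps):
    """Window of the block on the right of a shared edge at x = 0.

    Zero left of the overlap, one right of it, sinusoidal in between.
    """
    if x <= -eps:
        return 0.0
    if x >= eps:
        return 1.0
    return math.sin(math.pi / 4 * (1 + x / eps))


def falling(x, eps):
    """Window of the block on the left: the mirror image about the edge."""
    return rising(-x, eps)


eps = 2.0
samples = [-eps + 0.25 * i for i in range(17)]  # spans the overlap region
energy = [rising(x, eps) ** 2 + falling(x, eps) ** 2 for x in samples]
```

Every entry of `energy` equals one to within rounding, because the falling window reduces to the cosine of the same angle inside the overlap.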
The forward LOT weights the lapped orthogonal basis functions by the pixel intensity values and integrates them over the block's extended region of support. Because the forward LOTs extend into adjacent blocks, the information about the intensity values of the pixels in a block is distributed among the transform coefficients for that block and for the adjacent blocks whose regions of support extend into the given block. Consequently, to reconstruct the block, the inverse LOT requires the transform coefficients from each of these blocks. Dropping any of the transform coefficients may introduce artifacts into the reconstructed image.
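The dependence of each block on its neighbors' coefficients can be demonstrated in one dimension. The sketch below does not implement the LOT of FIG. 1. As a stand-in it uses the modulated lapped transform (MDCT with a sine window), a well-known lapped orthogonal transform in which each frame of 2N samples overlaps its neighbors by N samples. All names and the test signal are hypothetical.

```python
import math

N = 4  # coefficients per frame; each frame spans 2N samples (assumed overlap)

# Sine window; satisfies w[n]^2 + w[n + N]^2 = 1 across the overlap.
WINDOW = [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]


def basis(k, n):
    """Extended cosine basis function k evaluated at sample n of a frame."""
    return math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))


def forward(frame):
    """2N windowed samples -> N transform coefficients."""
    return [sum(WINDOW[n] * frame[n] * basis(k, n) for n in range(2 * N))
            for k in range(N)]


def inverse(coeffs):
    """N coefficients -> 2N windowed samples that still contain aliasing."""
    return [2.0 / N * WINDOW[n] * sum(coeffs[k] * basis(k, n) for k in range(N))
            for n in range(2 * N)]


def analyze(signal):
    """Transform overlapping frames that start every N samples."""
    return [forward(signal[s:s + 2 * N]) for s in range(0, len(signal) - N, N)]


def synthesize(frames, length):
    """Overlap-add the inverse transforms; aliasing cancels between frames."""
    out = [0.0] * length
    for i, coeffs in enumerate(frames):
        for n, v in enumerate(inverse(coeffs)):
            out[i * N + n] += v
    return out


signal = [float((7 * i) % 11) for i in range(5 * N)]  # hypothetical samples
frames = analyze(signal)
recon = synthesize(frames, len(signal))

# Samples N .. 4N-1 are covered by two frames and are recovered exactly.
exact = max(abs(recon[i] - signal[i]) for i in range(N, 4 * N))

# Dropping one frame's coefficients corrupts every sample that frame covers.
damaged = [list(f) for f in frames]
damaged[1] = [0.0] * N  # frame 1 covers samples N .. 3N-1
recon_damaged = synthesize(damaged, len(signal))
hit = max(abs(recon_damaged[i] - signal[i]) for i in range(N, 3 * N))
untouched = max(abs(recon_damaged[i] - signal[i]) for i in range(3 * N, 4 * N))
```

With all frames present, `exact` is zero to within rounding. After frame 1 is dropped, `hit` is nonzero while `untouched` remains at rounding level, which mirrors the statement that dropping coefficients may introduce artifacts in the blocks they support.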
The emerging MPEG4 video coding standard supports object-based coding, which is particularly useful in video telephony, teleconferencing, and news broadcasting applications. Object-based coding increases SNR performance and provides the flexibility to only encode/decode, enhance, scale, or zoom specific objects. If blocked transforms are used, each image frame 12 is subdivided into N×N blocks 10 as shown in FIG. 2 and segmented into a plurality of arbitrarily shaped objects 20 defined on the N×N grid. The objects include boundaries 22, the object blocks 24 inside the boundary, and the motion vectors that describe their interframe motion. This approach improves compression and reduces visual artifacts in the reconstructed background, but suffers from blocking effects at low bit rates.
The known LOT defined on rectangular regions cannot be used to encode the arbitrarily shaped object 20 without either A) block coding edges in L-shaped regions of the object, thereby incurring blocking artifacts along those edges, B) discarding some transform coefficients associated with non-object blocks 26 outside the boundary 22 that contribute to the reconstruction of pixels inside the object in order to maintain the bit rate, thereby introducing edge artifacts into the reconstructed image, or C) retaining those transform coefficients to reconstruct the image, thereby effectively double coding those non-object blocks and increasing the total bit rate. The problem occurs where the object 20 has an exterior and concave corner 28. The standard rectangular extension from the pair of interior edges 30 that meet at corner 28 would define an overlapping region that lies partly outside the object's boundary. Option A avoids the problem by block coding the edges 30 in the L-shaped region 32 around the corner 28, option B discards a portion of the transform coefficients needed to reconstruct the object, and option C retains the coefficients at the cost of an increased bit rate.
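The corner condition described above can be located on the block grid with a short sketch. It assumes the object is given as a set of (row, column) block indices and treats a grid vertex as a concave corner when exactly three of the four blocks meeting there belong to the object, so that the three object blocks form an L around the missing one. This representation and the name `concave_corners` are hypothetical.

```python
def concave_corners(object_blocks):
    """Return (vertex, missing_block) pairs for each concave corner.

    Vertex (r, c) is the grid point shared by blocks (r-1, c-1), (r-1, c),
    (r, c-1) and (r, c). A rectangular overlap region centered on such a
    vertex would extend into the one block that is outside the object.
    """
    rows = [r for r, _ in object_blocks]
    cols = [c for _, c in object_blocks]
    corners = []
    for r in range(min(rows), max(rows) + 2):
        for c in range(min(cols), max(cols) + 2):
            around = [(r - 1, c - 1), (r - 1, c), (r, c - 1), (r, c)]
            missing = [b for b in around if b not in object_blocks]
            if len(missing) == 1:
                corners.append(((r, c), missing[0]))
    return corners


# Hypothetical L-shaped object: a top row of three blocks plus a left column.
l_shape = {(0, 0), (0, 1), (0, 2), (1, 0), (2, 0)}
corners = concave_corners(l_shape)
```

For this object the only corner found is the vertex (1, 1), and the block outside the object there is (1, 1), which is where the rectangular extension from the two interior edges would spill over the boundary.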
Therefore, when LOTs are used, the objects 20 are masked using a rectangular mask 33 that completely covers the object 20. This approach reduces blocking effects at all of the edges 18,30 inside the object 20 but wastes bits coding non-object blocks 26 inside the mask 33 at higher bit rates than they would be encoded as part of the background and may produce artifacts between the non-object blocks on opposite sides of the mask's boundary 22.
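The cost of the rectangular mask can be made concrete by counting the non-object blocks it encloses. This sketch assumes the object is a set of (row, column) block indices and that the mask is the smallest axis-aligned rectangle covering it. The name `bounding_mask` and the example object are hypothetical.

```python
def bounding_mask(object_blocks):
    """Smallest rectangle of blocks that completely covers the object."""
    rows = [r for r, _ in object_blocks]
    cols = [c for _, c in object_blocks]
    return {(r, c)
            for r in range(min(rows), max(rows) + 1)
            for c in range(min(cols), max(cols) + 1)}


# Hypothetical L-shaped object made of five blocks.
object_blocks = {(0, 0), (0, 1), (0, 2), (1, 0), (2, 0)}
mask = bounding_mask(object_blocks)

# Non-object blocks inside the mask are coded along with the object.
wasted = mask - object_blocks
overhead = len(wasted) / len(object_blocks)
```

Here the mask covers nine blocks for a five-block object, so four non-object blocks are coded with the object, an overhead of 80 percent in block count for this example.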