This invention relates generally to the field of video processing, and in particular to the field of compressing, encoding, and decoding video images.
Video images are encoded and compressed to reduce the size of the data sets needed to communicate these images. The size of the compressed encoding of an image can affect various aspects of a video system's performance. Storage requirements, bandwidth requirements, and transmission rate requirements are all directly correlated to the size of the encoded image. The size of the compressed encoding can also have an effect on image quality. MPEG, for example, is a lossy encoding: if the encoding exceeds a space or time constraint imposed by an MPEG standard, the encoding is truncated to fit the available space. That is, if an encoding is too complex, in terms of the amount of information that needs to be transferred in the time allowed for the transfer, a loss of quality occurs. Similarly, transmission of images over relatively low bit rate channels requires a reduction in the amount of data to be transferred, which is usually effected by reducing the resolution and quality of the transmitted image.
In general, the size of the compressed encoding of an image is dependent upon the content of the image, and the techniques used to encode the image. Traditionally, the fields of video processing and graphics image processing use different processes and techniques to provide images to potential viewers. Video image processing is primarily raster based. An image is scanned using a predetermined pattern to produce a modulation of a signal, the signal is communicated to a receiver, and the receiver applies the modulation to a display to recreate the image. Various techniques are used to compress the encoding of the raster image to optimize transmission efficiency and speed, including MPEG and other encodings.
Graphics image processing, on the other hand, is primarily object based. An image is composed of a variety of objects, each object occupying a particular area of the graphics image. The objects may correspond directly to actual objects in the image, such as boxes and circles, or may correspond to created objects, such as a multitude of triangular segments of areas of the image having similar characteristics. The graphics encoding of an image includes the identification of the type of object, such as line, circle, triangle, etc., and the location in the image at which the object appears. Also associated with each object are parameters that describe how the object is to be rendered at the specified location, such as the object's color, size, texture, shading, translucency, etc. These parameters may be included in the identification of the type of object (e.g. a red circle), or in the specification of the location of the object in the image (a circle at (x,y), color=red).
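By way of illustration only, an object-based encoding of the kind described above might be sketched as follows. The class name `GraphicsObject`, its field names, and the helper `describe` are hypothetical and are not taken from any particular graphics standard; the sketch merely shows an object type, a location, and a set of rendering parameters travelling together.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class GraphicsObject:
    """One object in a graphics encoding (hypothetical structure)."""
    kind: str                      # type of object, e.g. "line", "circle", "triangle"
    location: Tuple[int, int]      # (x, y) position of the object in the image
    params: Dict[str, object] = field(default_factory=dict)  # color, size, texture, ...


def describe(obj: GraphicsObject) -> str:
    """Render the encoding as text, e.g. 'circle at (10, 20), color=red'."""
    parts = [f"{obj.kind} at {obj.location}"]
    parts += [f"{name}={value}" for name, value in sorted(obj.params.items())]
    return ", ".join(parts)


# A two-object image: the whole scene is a short list of descriptions,
# independent of how many pixels each object covers when rendered.
scene = [
    GraphicsObject("circle", (10, 20), {"color": "red", "size": 5}),
    GraphicsObject("square", (40, 40), {"color": "blue", "size": 100}),
]
```

In this sketch the size of the encoding depends on the number of objects and parameters, not on the area each object occupies in the image.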
In both the graphics and raster compressed encodings, large areas of uniform characteristics are efficiently encoded. A large blue square in an image is encoded in a graphics encoding as a square of a given size having a color of blue located at a given coordinate in the image space. The raster scan of a monochromatic area, such as a large blue square, produces bands of relatively constant modulation; these constant modulations are efficiently compressed during the discrete cosine transformation (DCT) process that is common to compression schemes such as MPEG.
Textured areas, on the other hand, will not necessarily be efficiently encoded by a DCT, because the modulation is not constant. For example, a brick wall that has red bricks and gray mortar between the bricks will produce differing modulations as the raster scan traverses each red area and each gray area during the scanning process. Similarly, a marbled surface, consisting of random streaks of grain of differing colors amongst gray-white clouds of varying intensity, will produce a non-uniform modulation pattern. Such areas, however, can be efficiently encoded as graphics objects having particular texture characteristics (e.g. brick wall at (x,y), colors=red, gray). Conversely, images containing somewhat randomly placed objects may be more efficiently encoded as a compressed raster encoding. An outdoor scene, for example, may be efficiently compressed by a DCT, but may not be efficiently encoded as a graphics encoding of each object that forms the image, such as each leaf on a tree in the scene.
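The contrast between uniform and textured areas can be illustrated with a minimal sketch that applies a direct, unoptimized 8x8 DCT to two blocks and counts the coefficients of significant magnitude. The block size, the pixel values, the brick-like pattern, and the threshold are assumptions chosen for illustration, and the sketch covers only the transform step, not the quantization and entropy coding that a scheme such as MPEG applies afterwards.

```python
import math

N = 8  # block size assumed for illustration


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct, unoptimized)."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def significant(coeffs, threshold=1.0):
    """Count coefficients whose magnitude exceeds the threshold."""
    return sum(1 for row in coeffs for value in row if abs(value) > threshold)


BRICK, MORTAR = 180, 120  # hypothetical luminance values


def brick_pixel(y, x):
    """Brick-like texture: mortar rows every 4 lines, staggered mortar columns."""
    if y % 4 == 3:
        return MORTAR
    offset = 0 if (y // 4) % 2 == 0 else 2
    return MORTAR if (x + offset) % 4 == 3 else BRICK


uniform = [[200] * N for _ in range(N)]
textured = [[brick_pixel(y, x) for x in range(N)] for y in range(N)]

uniform_count = significant(dct_2d(uniform))    # only the DC term survives
textured_count = significant(dct_2d(textured))  # energy spreads into AC terms
```

For the uniform block only the single DC coefficient is significant, whereas the brick-like block spreads its energy across additional coefficients, each of which must be represented in the encoding.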
Conventional video processing of an image produces an encoding of the image that is independent of the display that may be used to display the image. In general, the image is raster scanned at a predetermined horizontal and vertical frequency and resolution and encoded so as to retain as much image information as possible. This detailed image information is provided to a 3″ portable display, or a 36″ wall display, regardless of the display's ability to reproduce this information. Within the same display, also, the same detailed image information is processed regardless of whether the image is displayed on the full screen, or a portion of the screen, such as a picture-in-picture window. In addition to the inherent inefficiency of this information transfer, the conversion of high resolution image information for display on a lower resolution display, or a small area of a high resolution display, also requires the use of anti-aliasing filtering techniques to remove the excess information prior to display. In the red brick wall with gray mortar example above, a low resolution display with appropriate anti-aliasing will display the wall as a uniform area of an off-red color. Attempting to display the details of the gray mortar, without anti-aliasing, will typically result in a display of a red wall with arbitrary gray moire patterns.
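The brick wall example can be sketched on a single scan line. The colors, the four-pixel brick period, the reduction factor of five, and the moving-average filter below are all assumptions chosen for illustration; the filter width is deliberately matched to the brick period so that the result is exactly uniform, which a practical anti-aliasing filter would only approximate.

```python
RED = (200, 40, 40)      # hypothetical brick color
GRAY = (128, 128, 128)   # hypothetical mortar color
PERIOD = 4               # three brick pixels followed by one mortar pixel


def scan_line(width):
    """One high resolution scan line across the brick wall."""
    return [GRAY if x % PERIOD == PERIOD - 1 else RED for x in range(width)]


def point_sample(line, step):
    """Reduce resolution without filtering: keep every step-th pixel."""
    return line[::step]


def box_filter(line, taps):
    """Moving average over `taps` consecutive pixels (a crude low-pass filter)."""
    out = []
    for start in range(len(line) - taps + 1):
        window = line[start:start + taps]
        out.append(tuple(sum(p[ch] for p in window) // taps for ch in range(3)))
    return out


line = scan_line(80)

# Without anti-aliasing: the sampling step beats against the brick period,
# so gray pixels appear at a spacing unrelated to the actual mortar.
aliased = point_sample(line, 5)

# With anti-aliasing: filter first, then sample. Every window holds three
# brick pixels and one mortar pixel, so every output pixel is the same off-red.
smoothed = point_sample(box_filter(line, PERIOD), 5)
```

In this sketch `aliased` contains a mixture of red and gray pixels, while `smoothed` consists solely of the off-red value (182, 62, 62).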
Thus it is seen that neither conventional video processing nor conventional graphics image processing provides superior performance and efficiency compared to the other under all circumstances. It is also seen that conventional video processing does not provide for an encoding scheme that is optimal for differing display devices.
Therefore, a need exists for an encoding technique that provides the advantages of both video processing and graphics image processing. In particular, a need exists for an image encoding technique that allows for a minimal sized encoding of an image, without the loss of quality or resolution that conventionally occurs when video encodings are reduced in size. A need also exists for an image encoding technique that allows for a decoding process that is dependent upon the characteristics of the display that is used to render the decoded image.