The production and transmission of information have undergone drastic changes in recent years. This evolution is mainly due to the availability of reliable and sophisticated digital communications networks, digital storage media, and digital compression specifications, which have facilitated the transmission and management of a wide array of digital assets such as motion video, image, text, audio, data, and graphic information. Motion video, due to its widespread application in various chains of digital infrastructure and the abundant information that it carries, has received significant attention from the research and development community. As a result, a number of methods have been developed for encoding moving pictures at various spatial and temporal sampling rates. These methods are intended to promote the use of digital video in industry, encourage the enhancement of current products, and accelerate the definition of future products.
For example, the MPEG-2 international standard developed by the Moving Picture Experts Group (MPEG), and described in ISO/IEC 13818-2, “Information Technology—Generic Coding of Moving Pictures and Associated Audio Information: Video, 1996,” which is hereby incorporated herein by reference in its entirety, adopts the tool-kit approach of “profiles” and “levels” to encompass the needs of many factions within the broadcast, consumer, and entertainment sectors. A “profile” defines the subset of tools available to encode a video sequence, while a “level” deals with the spatio-temporal resolution of a video source.
The book by B. G. Haskell, A. Puri, and A. N. Netravali, Digital Video: An Introduction to MPEG-2, Chapman and Hall, New York, 1997, which is hereby incorporated herein by reference in its entirety, explains the various components of an MPEG-2 encoder in detail. Most digital video encoders rely on some form of image analyzer, such as the Discrete Cosine Transform (DCT), to exploit intra-picture pixel-to-pixel redundancies, and on motion estimation/compensation units to remove inter-picture pixel-to-pixel redundancies. Since hardware realization of the above image processing techniques is more practical for rectangularly-shaped groups of pixels, the majority of specifications for digital video compression adopt a block-based approach to processing the image data.
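As an illustration of the block-based transform step described above, the following is a minimal sketch of an orthonormal two-dimensional DCT applied to a single 8x8 block of pixels. The function name `dct_2d` and the direct (non-fast) summation are assumptions made for clarity; this is not the arithmetic mandated by any standard, and practical encoders use fast factorizations or dedicated hardware.

```python
import math

N = 8  # block size; 8x8 is the DCT block size used in MPEG-2


def dct_2d(block):
    """Return the orthonormal 2-D DCT-II of an N x N block (list of lists).

    Illustrative direct-form implementation, O(N^4) per block.
    """
    def alpha(k):
        # Normalization factor that makes the transform orthonormal.
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            acc = 0.0
            for x in range(N):
                for y in range(N):
                    acc += (block[x][y]
                            * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                            * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * acc
    return out


# A perfectly flat block has no pixel-to-pixel variation, so all of its
# energy lands in the single DC coefficient out[0][0].
flat_block = [[100] * N for _ in range(N)]
coefficients = dct_2d(flat_block)
```

For smooth image regions most coefficients are close to zero after the transform, which is the intra-picture redundancy the encoder exploits.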
A very efficient form of digital video compression is achieved by classifying a plurality of pictures into intra-coded and predicted (or inter-coded) pictures. For an intra-coded picture, only information from the same picture is used to perform the encoding procedure. The image data in an inter-coded picture, on the other hand, is predicted by displacing information from other pictures within a defined search area. Searching for the best prediction is known in the art as motion estimation. The difference between the prediction and the picture is then encoded. Decoding of inter-coded pictures therefore requires adding the decoded picture difference to the displaced picture. Displacing pictures during the decoding procedure is known in the art as motion compensation.
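The search, difference, and reconstruction steps described above can be sketched as follows. This is a hedged illustration only: it assumes an exhaustive (full) search with the sum of absolute differences (SAD) as the matching criterion, whole-pixel displacements, and frames stored as lists of rows. The names `motion_estimate`, `predict`, `residual`, and `reconstruct` are hypothetical and do not come from the source or from any standard.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between an n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(n) for x in range(n))


def motion_estimate(cur, ref, bx, by, n, search):
    """Encoder side: full search over a +/- `search` pixel window.
    Returns (dx, dy, cost) for the best-matching displacement."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference picture.
            if not (0 <= bx + dx and bx + dx + n <= width
                    and 0 <= by + dy and by + dy + n <= height):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


def predict(ref, bx, by, dx, dy, n):
    """Displaced block taken from the reference picture."""
    return [[ref[by + dy + y][bx + dx + x] for x in range(n)]
            for y in range(n)]


def residual(cur, pred, bx, by, n):
    """Encoder side: difference between the picture and its prediction."""
    return [[cur[by + y][bx + x] - pred[y][x] for x in range(n)]
            for y in range(n)]


def reconstruct(pred, resid):
    """Decoder side (motion compensation): prediction plus difference."""
    return [[p + r for p, r in zip(prow, rrow)]
            for prow, rrow in zip(pred, resid)]
```

In this sketch the encoder would transmit the displacement `(dx, dy)` together with the coded residual, and the decoder would call `predict` followed by `reconstruct`. The transform and quantization of the residual are omitted.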
The use of motion estimation and motion compensation in inter-coded pictures greatly reduces the number of bits consumed. Where a good prediction cannot be found for a region of a picture, the encoder can revert to the intra-coded method to compress that particular region. An intra versus inter switch can easily be derived for the video encoder. For ease of discussion, intra-coded pictures are referred to as I coded and predicted pictures as P coded. The foregoing description of a digital video encoder will be clear to those with knowledge of the art of video compression. It is further clear that a predicted picture consumes far fewer bits than an intra-coded picture. This methodology, although very efficient for producing professional quality video, requires a large encoder or decoder buffer and consequently imposes a longer system delay. This is because, first, the large intra-coded pictures of the bit-stream have to fit in the decoder buffer and, second, it takes longer for all the bits of a picture of this type to arrive in the buffer. On the other hand, I pictures are very useful since they facilitate random access and bound how long a corrupted region of a picture can propagate into the rest of the compressed video stream.
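Two points in the preceding paragraph lend themselves to a short sketch: the intra versus inter switch, and the delay contributed by a large picture. The decision rule below (compare the prediction error against the block's deviation from its own mean) is only one plausible heuristic, assumed here for illustration; the source does not specify how the switch is derived. The function names and the picture sizes and channel rate in the usage lines are likewise hypothetical.

```python
def choose_mode(cur_block, pred_block):
    """Return "inter" if the motion-compensated prediction is at least as
    good as a crude estimate of intra coding cost, else "intra".

    Assumed heuristic: intra cost is approximated by the sum of absolute
    deviations of the block from its mean; inter cost is the SAD between
    the block and its prediction.
    """
    pixels = [p for row in cur_block for p in row]
    mean = sum(pixels) / len(pixels)
    intra_cost = sum(abs(p - mean) for p in pixels)
    inter_cost = sum(abs(c - p)
                     for crow, prow in zip(cur_block, pred_block)
                     for c, p in zip(crow, prow))
    return "inter" if inter_cost <= intra_cost else "intra"


def transfer_delay_seconds(picture_bits, channel_bits_per_second):
    """Time for all bits of one coded picture to arrive over a
    constant-rate channel."""
    return picture_bits / channel_bits_per_second


# Hypothetical numbers: a large I picture versus a small P picture
# delivered over the same constant-rate channel.
i_delay = transfer_delay_seconds(400_000, 4_000_000)
p_delay = transfer_delay_seconds(50_000, 4_000_000)
```

With these assumed figures the I picture takes eight times as long to arrive as the P picture, which illustrates why the buffer size and the system delay are governed by the largest coded pictures.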
A unique application for any type of digital video encoder is in the area of real-time video communications, where video-conferencing, video-phone, and monitoring compression systems with low encoding/decoding delay can be realized. Such products require a special set of features in order to be practical and cost effective.