With the advent of High Definition broadcasting, the delivery of HD video to cell phones, high definition television, and the popularity of DVD movies, the term “make it real” has new meaning. Many high definition broadcasts bring a realism that can only be matched by looking through a window to watch the actual event unfold before you.
In order to make the transfer of high definition video more efficient, different video coding schemes have tried to get the best picture from the least amount of data. The Moving Picture Experts Group (MPEG) has created standards that allow an implementer to supply as good a picture as possible based on a standardized data sequence and algorithm. The emerging H.264 (MPEG-4 Part 10) “Advanced Video Coding” (AVC) standard typically delivers an improvement in coding efficiency by a factor of two over MPEG-2, the most widely used video coding standard today. The quality of the video depends on the manipulation of the data in the picture and the rate at which the picture is refreshed. If the rate drops below about 30 pictures per second, the human eye can detect “unnatural” motion.
Due to the coding structure of the current video compression standards, picture rate control consists of three steps: 1. Group of Pictures (GOP) level bit allocation; 2. picture level bit allocation; and 3. macroblock (MB) level bit allocation. The picture level rate control involves distributing the GOP budget among the picture frames to achieve a maximal and uniform visual quality. Although Peak Signal to Noise Ratio (PSNR) does not fully represent the visual quality, it is the measure most commonly used to quantify it. Using PSNR as the criterion, various rate-control schemes have been proposed based on either iterative search or assumed theoretical rate/distortion models.
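As an illustration of the PSNR criterion mentioned above, the following is a minimal sketch using the conventional definition, PSNR = 10 log10(peak^2 / MSE). The function name `psnr`, the flat-list picture representation, and the 8-bit peak value of 255 are assumptions made for this example and are not taken from any particular encoder.

```python
import math


def psnr(reference, decoded, peak=255):
    """Return the PSNR in dB between two equal-length sample sequences.

    `reference` and `decoded` are flat sequences of sample values
    (assumed 8-bit here, hence the default peak of 255).
    """
    if len(reference) != len(decoded):
        raise ValueError("pictures must have the same number of samples")
    if len(reference) == 0:
        raise ValueError("pictures must not be empty")

    # Mean squared error between the original and the decoded samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)

    if mse == 0:
        return math.inf  # identical pictures: no distortion at all

    return 10.0 * math.log10(peak ** 2 / mse)
```

A higher value indicates less numerical distortion, which is not necessarily the same as better perceived quality.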
A GOP is made up of a series of pictures starting with an Intra picture. The Intra picture is the reference picture that the GOP is based on. It may represent a video sequence that has a similar theme or background. The Intra picture requires the largest amount of data because it cannot be predicted from other pictures and all of the detail for the sequence is based on the foundation that it represents. The next picture in the GOP may be a Predicted picture or a Bidirectional predicted picture. The names may be shortened to I-picture, P-picture and B-picture or I, P, and B. The P-picture uses less data than the I-picture, and some of the change between the two pictures is predicted based on certain references in the picture.
The use of P-pictures maintains a level of picture quality based on small changes from the I-picture. The B-picture has the least amount of data to represent the picture. It depends on information from two other pictures, the I-picture that starts the GOP and a P-picture that is within a few pictures of the B-picture. The P-picture that is used to construct the B-picture may come earlier or later in the sequence. The B-picture requires “pipeline processing”, meaning the data cannot be displayed until information from a later picture is available for processing.
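The “pipeline processing” requirement can be pictured as a reordering step: a B-picture is handled only after the later reference picture it depends on. The sketch below is a simplified illustration, assuming a display-order string with one letter per picture (such as "IBPBP"); the function name `coding_order` is hypothetical, and trailing B-pictures with no later reference in the sequence are simply emitted last.

```python
def coding_order(display_pattern):
    """Reorder pictures so each B-picture follows its later reference.

    `display_pattern` is a string of 'I', 'P' and 'B' in display order.
    Returns a list of (display_index, picture_type) tuples in the order
    in which the pictures could be processed.
    """
    ordered = []
    pending_b = []  # B-pictures waiting for a later I- or P-picture

    for index, picture_type in enumerate(display_pattern):
        if picture_type == "B":
            pending_b.append((index, picture_type))
        elif picture_type in ("I", "P"):
            # The reference picture goes first, then the B-pictures
            # that were waiting for it.
            ordered.append((index, picture_type))
            ordered.extend(pending_b)
            pending_b = []
        else:
            raise ValueError(f"unknown picture type: {picture_type!r}")

    # Assumption: B-pictures at the end have no later reference here,
    # so they are appended as-is.
    ordered.extend(pending_b)
    return ordered
```

For example, `coding_order("IBPBP")` places each P-picture ahead of the B-picture that precedes it in display order.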
In order to achieve the best balance of picture quality and picture rate performance, different combinations of picture sequences have been attempted. The MPEG-2 standard may use an Intra-picture followed by a Bidirectional predicted picture followed by a Predicted picture (IBP). The combination of the B-picture and the P-picture may be repeated as long as the quality is maintained (IBPBP). When the scene changes or the quality and/or picture rate degrades, another I-picture must be introduced into the sequence, starting a new GOP.
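To make the idea of distributing a GOP budget among I-, P- and B-pictures concrete, the following sketch splits a bit budget over a sequence such as IBPBP in proportion to per-type weights. The function name `allocate_gop_budget` and the weights (4 for I, 2 for P, 1 for B) are purely hypothetical; they only mirror the qualitative ordering described above (I largest, B smallest) and are not values from any standard or encoder.

```python
def allocate_gop_budget(pattern, gop_bits, weights=None):
    """Split `gop_bits` among the pictures of `pattern` by type weight.

    `pattern` is a string of 'I', 'P' and 'B', one letter per picture.
    Returns a list of integer bit budgets whose sum equals `gop_bits`.
    """
    if not pattern:
        raise ValueError("pattern must contain at least one picture")
    if weights is None:
        # Hypothetical weights: only the ordering I > P > B is assumed.
        weights = {"I": 4, "P": 2, "B": 1}

    total_weight = sum(weights[t] for t in pattern)

    # Integer share for each picture, rounded down.
    shares = [gop_bits * weights[t] // total_weight for t in pattern]

    # Give the bits lost to rounding to the first picture of the GOP.
    shares[0] += gop_bits - sum(shares)
    return shares
```

A practical rate controller would adapt these shares to picture complexity and buffer state; the fixed weights here are only meant to show the bookkeeping.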
The theoretical model based methods do not consider the content dependency between the reference frame and the current frame, while iterative search based methods have unacceptably high complexity. Moreover, most model based methods use an almost constant quantization scale throughout the whole picture. Although they claim good performance, that claim cannot be guaranteed when visual quality based MB rate control is utilized.
Video coding in the MPEG-2 and MPEG-4 standards uses a Quantization Parameter (QP) to control the video quality. A lower QP value provides better picture quality and a higher QP value provides lower picture quality. In practice, most MPEG-2 or MPEG-4 encoders use a simple strategy: I-pictures and P-pictures use the same QP if no buffer overflow or underflow occurs, and a B-picture uses QP+2 as its quantization parameter. This strategy provides good results for pictures with very light complexity. However, this quantization parameter determination, or picture bit allocation, cannot maintain picture clarity over long GOPs.
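The simple strategy described above can be sketched in a few lines. The function name `picture_qp` and the `b_offset` parameter are hypothetical, and the sketch deliberately ignores buffer overflow or underflow handling and any clamping to a valid QP range.

```python
def picture_qp(picture_type, base_qp, b_offset=2):
    """Return the QP for one picture under the simple fixed-offset rule.

    I- and P-pictures share `base_qp`; a B-picture uses
    `base_qp + b_offset` (QP+2 by default).
    """
    if picture_type in ("I", "P"):
        return base_qp
    if picture_type == "B":
        return base_qp + b_offset
    raise ValueError(f"unknown picture type: {picture_type!r}")


# QP assigned to each picture of an IBPBP sequence with a base QP of 10.
qps = [picture_qp(t, 10) for t in "IBPBP"]
```

Because the offset is fixed regardless of picture content, nothing in this rule reacts to picture complexity.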
According to previous research results, the claim that minimizing the PSNR difference between frames minimizes visual quality fluctuation is not true. Visual quality may vary greatly even with very little change in PSNR.
Thus, a need still remains for a less complex video coding system that can maintain good picture quality and clarity across long GOPs. These advanced coding schemes are required to bring High Definition video to personal video players, PDAs and video conferencing systems. In view of the increasing demand for viewing “real” events, the ever-increasing commercial competitive pressures, growing consumer expectations, and the diminishing opportunities for meaningful product differentiation in the marketplace, it is increasingly critical that answers be found to these problems. Additionally, the need to save costs, improve efficiencies and performance, and meet competitive pressures adds an even greater urgency to the critical necessity for finding answers to these problems.
Solutions to these problems have long been sought but prior developments have not taught or suggested any solutions and, thus, solutions to these problems have long eluded those skilled in the art.