The invention relates to video encoding, and more particularly, to systems and methods for changing a rate-control setting during video encoding.
A video sequence (VS) can be viewed as a series of static frames, and it requires considerable storage capacity and transmission bandwidth. A 90-min full-color video stream, for example, having 640×480 pixels/frame and 15 frames/second, requires a bandwidth of 640×480(pixels/frame)×3(bytes/pixel)×15(frames/sec)=13.18(MB/sec) and a file size of 13.18(MB/sec)×90×60=69.50(GB). Such a sizeable digital video stream is difficult to store and to transmit in real time; thus, many encoding techniques have been introduced to reduce the required memory size and transmission bandwidth.
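The arithmetic above can be reproduced with a short sketch. It assumes binary units (1 MB = 2^20 bytes, 1 GB = 2^30 bytes), which is the interpretation that yields 13.18 MB/sec; the function and constant names are illustrative only. Carrying the unrounded rate through the second step gives roughly 69.52 GB, slightly above the 69.50 GB obtained from the rounded 13.18 MB/sec.

```python
# Raw (uncompressed) bandwidth and storage for the example stream in the text.
# Assumption: MB and GB are binary units (2**20 and 2**30 bytes).
WIDTH, HEIGHT = 640, 480   # pixels per frame
BYTES_PER_PIXEL = 3        # full color, 3 bytes per pixel
FPS = 15                   # frames per second
DURATION_S = 90 * 60       # 90 minutes, in seconds

MB = 2 ** 20
GB = 2 ** 30


def raw_bytes_per_second(width, height, bytes_per_pixel, fps):
    """Bytes needed per second of uncompressed video."""
    return width * height * bytes_per_pixel * fps


bytes_per_s = raw_bytes_per_second(WIDTH, HEIGHT, BYTES_PER_PIXEL, FPS)
total_bytes = bytes_per_s * DURATION_S

print(f"{bytes_per_s / MB:.2f} MB/sec")  # 13.18 MB/sec
print(f"{total_bytes / GB:.2f} GB")      # 69.52 GB (unrounded intermediate)
```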
MPEG standards ensure that video encoders create standardized files that can be opened and played on any system with a standards-compliant decoder. Digital video contains spatial and temporal redundancies, which encoding can remove without significant sacrifice in quality. MPEG coding is a generic standard, intended to be independent of any specific application, in which encoding is based on statistical redundancies in the temporal and spatial directions. Spatial redundancy is based on the similarity between adjacent pixels. MPEG encoding employs intra-picture spatial compression, using the DCT (Discrete Cosine Transform), to remove spatial redundancy. Temporal redundancy refers to identical pixels shown repeatedly in adjacent video frames, providing smooth, realistic motion in video. MPEG relies on prediction, more precisely motion-compensated prediction, for temporal encoding between frames. For temporal encoding, MPEG utilizes I-frames, P-frames and B-frames. An I-frame is an intra-coded frame, a single image heading a sequence. I-frames are encoded only to reduce spatial redundancy within the frame, with no reference to previous or subsequent frames. P-frames are forward-predicted frames, encoded with reference to a previous I- or P-frame, with pointers to information in that previous frame. B-frames are encoded with reference to a previous reference frame, a subsequent reference frame, or both. The motion vectors employed may be forward, backward, or both.
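The reference relationships among I-, P- and B-frames can be illustrated with a minimal sketch. It assumes a GOP given as a string of frame types in display order, and the function name reference_frames is hypothetical. For each B-frame it lists the candidate reference frames on either side; an actual encoder may use one or both of them.

```python
def reference_frames(gop):
    """Map each frame index to the indices of the frames it may reference.

    gop is a string of frame types in display order, e.g. "IBBPBBP".
    I- and P-frames act as reference (anchor) frames.
    """
    anchors = [i for i, kind in enumerate(gop) if kind in "IP"]
    refs = {}
    for i, kind in enumerate(gop):
        prev_anchor = max((a for a in anchors if a < i), default=None)
        next_anchor = min((a for a in anchors if a > i), default=None)
        if kind == "I":
            refs[i] = []  # intra-coded, no reference to other frames
        elif kind == "P":
            refs[i] = [prev_anchor] if prev_anchor is not None else []
        elif kind == "B":
            refs[i] = [a for a in (prev_anchor, next_anchor) if a is not None]
        else:
            raise ValueError(f"unknown frame type: {kind!r}")
    return refs


for index, references in reference_frames("IBBPBBP").items():
    print(index, "IBBPBBP"[index], references)
```

For "IBBPBBP", frame 3 (P) references frame 0 (I), while frames 1 and 2 (B) may reference both frame 0 and frame 3.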
MPEG achieves encoding by quantizing the coefficients produced by applying the DCT to 8×8 blocks of pixels in an image, and through motion compensation. Quantization is basically division of each DCT coefficient by a quantization scale related to the quality level, with a higher scale giving better encoding efficiency but lower quality, and a lower scale the reverse.
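The effect of the quantization scale can be sketched as follows. This is a simplified illustration, not the MPEG quantizer itself: it assumes a single uniform scale applied to every coefficient (an actual MPEG encoder also applies a per-coefficient quantization matrix), and the sample block and helper names are hypothetical. A larger scale leaves fewer nonzero coefficients to encode but permits a larger reconstruction error.

```python
import math

N = 8  # the DCT is applied to 8x8 blocks of pixels


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct, unoptimized form)."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out


def quantize(coeffs, scale):
    """Divide each coefficient by the scale and round to an integer level."""
    return [[int(round(value / scale)) for value in row] for row in coeffs]


def dequantize(levels, scale):
    """Decoder side: multiply each level back by the scale."""
    return [[level * scale for level in row] for row in levels]


def nonzero_count(levels):
    return sum(1 for row in levels for level in row if level != 0)


def max_abs_error(a, b):
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


# Hypothetical sample block: a smooth gradient of pixel values.
block = [[8 * x + 4 * y for y in range(N)] for x in range(N)]
coeffs = dct_2d(block)

for scale in (2, 16):
    levels = quantize(coeffs, scale)
    error = max_abs_error(coeffs, dequantize(levels, scale))
    print(f"scale={scale}: nonzero={nonzero_count(levels)}, "
          f"max coefficient error={error:.2f}")
```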
Typical approaches for an MPEG video encoder utilize a constant bit-rate (CBR) for each group of pictures (GOP) regardless of the complexity of each video interval. The bit-rate determines the video quality and defines how much physical space one second of video requires, in bits. The CBR technique assumes equal weighting of the bit distribution among GOPs, which reduces the degree of freedom of the encoding task. CBR encoding outputs a bitstream whose output rate is kept almost the same regardless of the content of the input video. As a result, for a video interval with simple content, the encoding quality will be good; however, for a video interval with complex content, the encoding quality will be poor. Generally speaking, the encoding quality of CBR encoding is not smooth.
Since the VS is inherently variable, a better encoding approach has been introduced that employs a variable bit-rate (VBR) encoder algorithm. Generally speaking, a VBR encoder produces a non-constant output bit-rate over a period of time, and a complex frame consumes a higher bit-rate than a plain frame. As a result, the encoding quality of VBR encoding is smoother than that of CBR encoding.
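The contrast between CBR and VBR bit distribution can be illustrated with a toy model. It is a sketch under stated assumptions, not an actual rate-control algorithm: each video interval is given a hypothetical complexity score, VBR is modeled as allocation proportional to that score, and bits per unit of complexity serves as a rough stand-in for encoding quality. All names and numbers are illustrative.

```python
def allocate_cbr(complexities, total_bits):
    """Equal share of the bit budget for every interval, ignoring content."""
    share = total_bits / len(complexities)
    return [share for _ in complexities]


def allocate_vbr(complexities, total_bits):
    """Share of the bit budget proportional to each interval's complexity."""
    total_complexity = sum(complexities)
    return [total_bits * c / total_complexity for c in complexities]


def quality_proxy(bits, complexities):
    """Bits available per unit of complexity (higher suggests better quality)."""
    return [b / c for b, c in zip(bits, complexities)]


def spread(values):
    return max(values) - min(values)


# Hypothetical complexity scores for four intervals, and a total bit budget.
complexities = [1.0, 4.0, 2.0, 8.0]
total_bits = 6_000_000

cbr_bits = allocate_cbr(complexities, total_bits)
vbr_bits = allocate_vbr(complexities, total_bits)

print("CBR quality proxy:", quality_proxy(cbr_bits, complexities))
print("VBR quality proxy:", quality_proxy(vbr_bits, complexities))
```

In this model both allocations spend the same total number of bits, but the CBR quality proxy swings widely between simple and complex intervals, while the VBR proxy stays level.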