1. Field
The invention relates to multimedia signal processing and, more particularly, to video encoding and decoding.
2. Background
Multimedia processing systems, such as video encoders, may encode multimedia data using encoding methods based on international standards such as the MPEG-x and H.26x standards. Such encoding methods generally are directed to compressing the multimedia data for transmission and/or storage. Compression is broadly the process of removing redundancy from the data.
A video signal may be described in terms of a sequence of pictures, which include frames (each an entire picture) or fields (e.g., an interlaced video stream comprises fields of alternating odd or even lines of a picture). As used herein, the term “frame” is broadly used to refer to a picture, a frame, or a field. Multimedia processors, such as video encoders, may encode a frame by partitioning it into blocks or “macroblocks” of, for example, 16×16 pixels. The encoder may further partition each macroblock into subblocks. Each subblock may in turn comprise additional subblocks. For example, subblocks of a macroblock may include 16×8 and 8×16 subblocks. Subblocks of the 8×16 subblocks may include 8×8 subblocks, and so forth. As used herein, the term “block” refers to either a macroblock or a subblock.
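The partitioning described above may be illustrated with a minimal sketch. It is not drawn from any standard or reference encoder: the function names `partition_frame` and `split_block`, the mode names, the (x, y, w, h) tuple layout, and the clipping of edge blocks are all assumptions made for illustration.

```python
def partition_frame(width, height, block_size=16):
    """Tile a frame into blocks, returned as (x, y, w, h) in raster order.

    Assumption of this sketch: blocks on the right and bottom edges are
    clipped when the frame size is not a multiple of block_size.
    """
    blocks = []
    for y in range(0, height, block_size):
        for x in range(0, width, block_size):
            w = min(block_size, width - x)
            h = min(block_size, height - y)
            blocks.append((x, y, w, h))
    return blocks


def split_block(block, mode):
    """Split one block into subblocks (hypothetical mode names).

    "horizontal": two subblocks stacked vertically (16x16 -> two 16x8)
    "vertical":   two subblocks side by side       (16x16 -> two 8x16)
    "quad":       four equal subblocks             (16x16 -> four 8x8)
    Block dimensions are assumed to be even.
    """
    x, y, w, h = block
    half_w, half_h = w // 2, h // 2
    if mode == "horizontal":
        return [(x, y, w, half_h), (x, y + half_h, w, half_h)]
    if mode == "vertical":
        return [(x, y, half_w, h), (x + half_w, y, half_w, h)]
    if mode == "quad":
        return [
            (x, y, half_w, half_h),
            (x + half_w, y, half_w, half_h),
            (x, y + half_h, half_w, half_h),
            (x + half_w, y + half_h, half_w, half_h),
        ]
    raise ValueError("unknown split mode: " + mode)


# A 32x16 frame yields two 16x16 macroblocks; the first is split into two
# 8x16 subblocks, and one of those is split again into two 8x8 subblocks.
macroblocks = partition_frame(32, 16)
tall_halves = split_block(macroblocks[0], "vertical")
squares = split_block(tall_halves[0], "horizontal")
```

In this sketch a subblock is represented the same way as a macroblock, which mirrors the use of the term “block” for either.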
Video encoding methods compress video signals by using lossless or lossy compression algorithms to compress each frame or blocks of the frame. Intra-frame coding refers to encoding a frame using data from that frame. Inter-frame coding refers to predictive encoding schemes such as schemes that comprise encoding a frame based on other, “reference,” frames. For example, video signals often exhibit temporal redundancy in which frames near each other in the temporal sequence of frames have at least portions that match or at least partially match each other. Encoders can take advantage of this temporal redundancy to reduce the size of encoded data.
Encoders may exploit this temporal redundancy by encoding a frame in terms of the difference between the frame and one or more reference frames. For example, video encoders may use motion compensation based algorithms that match blocks of the frame being encoded to portions of one or more other frames. A block of the frame being encoded may be shifted in the frame relative to the matching portion of the reference frame. This shift is characterized by a motion vector. Any differences between the block and the partially matching portion of the reference frame may be characterized in terms of a residual. The encoder may thus encode a frame as data that comprises one or more of the motion vectors and residuals for a particular partitioning of the frame. A particular partition of blocks for encoding the frame may be selected by approximately minimizing a cost function that, for example, balances encoding size against the distortion to the content of the frame that results from the encoding.
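The relationship among a block, its motion vector, its residual, and a cost function may be illustrated with a minimal sketch of an exhaustive block-matching search. Several choices here are assumptions made for illustration and are not taken from any standard: frames are lists of rows of integer luma samples, distortion is measured as a sum of absolute differences (SAD), the cost of coding a motion vector is crudely approximated by |dx| + |dy|, the weight `lam` is a hypothetical tuning parameter, and the motion vector is expressed as the offset from the block position to the matching position in the reference frame.

```python
def sad(frame, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block and a shifted reference area."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(frame[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def motion_search(frame, ref, bx, by, size, search_range, lam=1.0):
    """Exhaustively search a window of the reference frame for one block.

    Returns ((dx, dy), residual, cost), where cost = SAD + lam * (|dx| + |dy|).
    The rate term is a stand-in for the bits needed to code the vector.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates that fall outside the reference frame.
            if bx + dx < 0 or by + dy < 0:
                continue
            if bx + dx + size > width or by + dy + size > height:
                continue
            cost = sad(frame, ref, bx, by, dx, dy, size) + lam * (abs(dx) + abs(dy))
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    cost, dx, dy = best
    # The residual is what remains after subtracting the matched prediction.
    residual = [
        [frame[by + y][bx + x] - ref[by + dy + y][bx + dx + x] for x in range(size)]
        for y in range(size)
    ]
    return (dx, dy), residual, cost


# Synthetic example: the current frame equals the reference frame sampled
# 2 pixels to the right and 1 pixel down, so the block matches exactly there.
ref = [[7 * x + 13 * y for x in range(8)] for y in range(8)]
cur = [[7 * (x + 2) + 13 * (y + 1) for x in range(8)] for y in range(8)]
vector, residual, cost = motion_search(cur, ref, bx=2, by=2, size=4, search_range=2)
```

In this synthetic case the search finds an exact match, so the residual is all zeros and only the motion vector would need to be coded. A practical encoder would typically use faster search patterns and a more accurate rate estimate than this sketch.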
Reference frames may include one or more prior frames of the video signal or one or more frames that follow the frame in the video signal in terms of output order. The H.264 standard, for example, includes a configuration that uses five reference frames in searching for the best matching block. In general, searching more reference frames increases the ability of the encoder to find a portion of one of the reference frames that closely matches the block of the frame being encoded. Better matches leave a smaller difference to encode, which generally results in a more compact encoding. However, encoding such matches may still require a significant amount of bandwidth. Thus, a need exists for better ways of encoding video data.
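The effect of searching more than one reference frame may be illustrated with a minimal sketch. It is not the search procedure of H.264 or of any particular encoder: the function names, the use of a sum of absolute differences (SAD) as the matching criterion, and the representation of frames as lists of rows of integer samples are assumptions made for illustration.

```python
def block_sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block and a shifted reference area."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(size)
        for x in range(size)
    )


def best_in_reference(cur, ref, bx, by, size, search_range):
    """Return (sad, dx, dy) for the best in-bounds match in one reference frame."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            in_bounds = (
                0 <= bx + dx and bx + dx + size <= width
                and 0 <= by + dy and by + dy + size <= height
            )
            if not in_bounds:
                continue
            score = block_sad(cur, ref, bx, by, dx, dy, size)
            if best is None or score < best[0]:
                best = (score, dx, dy)
    return best


def search_references(cur, refs, bx, by, size, search_range):
    """Return (sad, ref_index, dx, dy) for the best match over all references."""
    best = None
    for index, ref in enumerate(refs):
        score, dx, dy = best_in_reference(cur, ref, bx, by, size, search_range)
        if best is None or score < best[0]:
            best = (score, index, dx, dy)
    return best


# Synthetic example: the first reference is blank, while the second contains
# the block content displaced by 2 pixels horizontally and 1 pixel vertically.
cur = [[7 * (x + 2) + 13 * (y + 1) for x in range(8)] for y in range(8)]
blank_ref = [[0] * 8 for _ in range(8)]
shifted_ref = [[7 * x + 13 * y for x in range(8)] for y in range(8)]
one_ref = search_references(cur, [blank_ref], 2, 2, 4, 2)
two_refs = search_references(cur, [blank_ref, shifted_ref], 2, 2, 4, 2)
```

Because the result is a minimum over all searched references, adding a reference frame can never worsen the best match in this sketch, although it increases the search effort and requires the chosen reference index to be signaled alongside the motion vector.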