Adaptive Intra/Inter prediction is a coding technique widely used in video coding, in which a frame or an image unit is either Intra coded or Inter coded. Before encoding, each video frame or image unit is assigned a frame (or image unit) type for compression. The frame (or image unit) types used for compression are the Intra coded frame (I-frame), the Predicted frame (P-frame) and the Bi-directional predicted frame (B-frame). An I-frame is an Intra-frame whose image is coded into the data stream and can be reconstructed without reference to any other frame. During encoding, an I-frame serves as a key frame that other frames can use as a reference frame. P-frames and B-frames are Inter-frames, which are predicted from one or more other frames. A P-frame encodes the changes in the image relative to a previous frame and therefore contains motion-compensated difference information from the preceding I-frame or P-frame. A B-frame contains difference information from both the preceding and the following I- or P-frames. In traditional video coding methods, the frame type structure, which specifies the order in which Intra and Inter frames are arranged, is fixed before encoding. The frame type structure of a video sequence can be represented by a Group Of successive Pictures (GOP) with pre-determined frame types and reference information, wherein the reference information comprises the number of reference frames, the reference direction and the reference distance. For example, the frame type structure for a conventional four-picture GOP may be IBBP, IBPB, IPPP or IIII.
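As an illustration of how a pre-determined frame type structure carries its reference information, the following minimal Python sketch expands a pattern string such as IBBP into per-frame reference lists. The names `FrameSpec` and `build_gop` are hypothetical and do not come from any particular codec. The sketch assumes a closed GOP in which a P-frame references the nearest preceding I- or P-frame and a B-frame references the nearest I- or P-frame on each side. Each signed offset encodes the reference direction (sign) and the reference distance (magnitude), and the length of the list gives the number of reference frames.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FrameSpec:
    """One entry of a GOP (hypothetical structure for illustration)."""
    frame_type: str                                        # "I", "P" or "B"
    ref_offsets: List[int] = field(default_factory=list)   # signed distances to references


def build_gop(pattern: str) -> List[FrameSpec]:
    """Expand a pattern such as "IBBP" into frames with reference offsets.

    Assumption: closed GOP, so no frame references a picture outside the pattern.
    """
    # I- and P-frames are the only frames that may serve as references here.
    anchors = [i for i, t in enumerate(pattern) if t in "IP"]
    gop = []
    for i, t in enumerate(pattern):
        before = [a for a in anchors if a < i]
        after = [a for a in anchors if a > i]
        if t == "I":
            refs = []                                  # Intra coded: no references
        elif t == "P":
            if not before:
                raise ValueError("P-frame at index %d has no preceding reference" % i)
            refs = [max(before) - i]                   # nearest preceding I/P (negative offset)
        elif t == "B":
            refs = []
            if before:
                refs.append(max(before) - i)           # backward direction
            if after:
                refs.append(min(after) - i)            # forward direction
        else:
            raise ValueError("unknown frame type %r" % t)
        gop.append(FrameSpec(t, refs))
    return gop


for spec in build_gop("IBBP"):
    print(spec.frame_type, spec.ref_offsets)
```

In this representation the whole structure is fixed as soon as the pattern string is chosen, which corresponds to the conventional approach of settling the frame type structure before encoding.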
After the frame type for each frame and the frame type structure for the video sequence are determined, a lambda table is assigned to each Inter-frame (such as a P-frame or B-frame). The λ values in the lambda table are used for mode decision in video coding. During mode decision, the video compression cost is estimated for each candidate coding mode so that the mode with the minimum cost can be selected as the best compression mode. The cost corresponding to a coding mode is calculated from the distortion and the quantity of coded bits, using the following equation:
Cost = a*Distortion + λ*Bits  (1)
in which ‘a’ is a constant weight applied to the distortion and ‘λ’ is the weight applied to the quantity of coded bits. The λ value is determined by a given Quantization Parameter (QP) value, according to the following equation:
λ = f(QP)  (2)
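The mode decision described by equation (1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `rd_cost` and `best_mode`, the mode labels and the distortion and bit figures are all hypothetical and are not taken from any encoder.

```python
from typing import Dict, Tuple


def rd_cost(distortion: float, bits: float, lam: float, a: float = 1.0) -> float:
    """Equation (1): Cost = a*Distortion + lambda*Bits."""
    return a * distortion + lam * bits


def best_mode(candidates: Dict[str, Tuple[float, float]], lam: float, a: float = 1.0) -> str:
    """Return the candidate mode whose cost under equation (1) is smallest.

    `candidates` maps a mode name to a (distortion, bits) pair.
    """
    return min(candidates, key=lambda mode: rd_cost(*candidates[mode], lam, a))


# Hypothetical (distortion, bits) measurements for one coding unit.
candidates = {
    "intra_16x16": (400.0, 120.0),
    "inter_16x16": (550.0, 40.0),
    "skip": (900.0, 2.0),
}

print(best_mode(candidates, lam=5.0))
```

With λ = 5 and a = 1, the three costs are 1000, 750 and 910, so the Inter mode is selected even though the Intra mode has the lowest distortion.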
If a coding unit of an image is coded with a low QP, the distortion will be lower and the quantity of coded bits will be higher. Conversely, if the coding unit is coded with a high QP, the quantity of coded bits will be lower and the distortion will be higher. The λ value plays an important role in mode decision. When the λ value is high, the video compression cost depends more on the quantity of coded bits. When the λ value is low, the cost depends more on the distortion (i.e., the image quality). The λ value is therefore a weighting factor between the distortion and the quantity of coded bits. In traditional methods, the lambda table for each frame of a video sequence is determined at the sequence level, and it is fixed once the frame type structure of the GOP is determined.
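The weighting role of λ can be illustrated with a small Python sketch that builds a fixed lambda table indexed by QP and then looks up λ during mode decision. The text above does not specify the function f in λ = f(QP); the exponential form used here, 0.85 * 2^((QP-12)/3), is the one commonly associated with H.264 reference encoders and serves only as an assumed stand-in. The QP range of 0 to 51, the names `lambda_from_qp`, `build_lambda_table` and `pick_mode`, and the candidate figures are likewise assumptions. For simplicity the distortion and bit figures are held constant across QP, which would not be the case in a real encoder.

```python
from typing import Dict, List, Tuple

MAX_QP = 51  # assumption: H.264-style QP range 0..51


def lambda_from_qp(qp: int) -> float:
    """Assumed stand-in for f(QP); not specified by the text above."""
    return 0.85 * 2.0 ** ((qp - 12) / 3.0)


def build_lambda_table(max_qp: int = MAX_QP) -> List[float]:
    """Precompute one lambda value per QP; the table stays fixed afterwards."""
    return [lambda_from_qp(qp) for qp in range(max_qp + 1)]


def pick_mode(candidates: Dict[str, Tuple[float, float]], lam: float, a: float = 1.0) -> str:
    """Choose the mode minimising a*Distortion + lambda*Bits."""
    return min(candidates, key=lambda m: a * candidates[m][0] + lam * candidates[m][1])


# Hypothetical (distortion, bits) pairs, held constant across QP for illustration.
CANDIDATES = {
    "intra_16x16": (400.0, 120.0),
    "inter_16x16": (550.0, 40.0),
    "skip": (900.0, 2.0),
}

LAMBDA_TABLE = build_lambda_table()

for qp in (12, 18, 36):
    lam = LAMBDA_TABLE[qp]
    print(qp, round(lam, 2), pick_mode(CANDIDATES, lam))
```

Under these assumptions the selected mode moves from the low-distortion Intra mode at QP 12, to the Inter mode at QP 18, to the low-bit skip mode at QP 36, as the growing λ shifts the weight of the cost from distortion toward coded bits.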
The traditional video coding method, with its pre-determined frame type structure and fixed lambda table, has some shortcomings. A pre-determined frame type structure with a fixed lambda table may not compress all scenes well. A pre-determined frame type structure, i.e., a non-adaptive frame type video coding process, may cause image quality degradation, particularly in fast motion scenes. The degradation is caused in part by the fact that each picture is treated as a reference picture or a non-reference picture according to its picture type alone, regardless of the image characteristics. For example, if the frame type structure is “IPPP”, each P-frame is predicted from its previous frame (i.e., an I-frame or a P-frame). In other words, each P-frame is coded as a reference picture except for the last P-frame in a group of pictures coded using “IPPP”. If fast motion or a scene change occurs in the first P-frame of a sequence with the “IPPP” coding structure, the first P-frame will require a higher bit-rate to code. However, the conventional coding technique uses a fixed coding structure and may not be able to use the available bit-rate budget efficiently. Moreover, if a high data rate is allocated to the first P-frame in this case, the system may not have enough bit-rate left to allocate to the subsequent P-frames. The fixed frame type structure therefore cannot adapt to the frame characteristics. The lambda table used in a conventional coding system does not take the frame characteristics into account either, so it cannot adapt to them by placing more weight on the image quality (or distortion) or on the coded bits.
For a picture that is referenced by another picture in the image sequence, the encoder has to reconstruct the referenced picture so that it can serve as a reference picture. A referenced picture therefore consumes more processing power and more memory bandwidth than a non-referenced picture. However, in a conventional coding system, the required computational power and system bandwidth are approximately fixed for a fixed frame type structure, so the system cannot adapt to the particular computational power or memory bandwidth of a given coding system. It is therefore desirable to develop an adaptive method for determining the frame type structure and the lambda table for video coding, in order to improve performance or to adapt to limitations in system processing power or system bandwidth.