Video encoding is the process of encoding a moving picture, which includes digitizing an analog picture signal. Video encoding can compress the picture bandwidth, reduce or eliminate redundant information between digital pictures, and lower the channel capacity required for video transmission below that required for analog transmission.
Video encoding is generally performed by a video encoder. Traditional video encoders usually categorize coding frames (e.g., for encoding and decoding) into three types: I Frame, P Frame and B Frame. I Frame is a frame type specified in video encoding and decoding standards. I Frame employs an intra-frame prediction encoding mode, so that all the data of a picture is preserved within the frame itself. During a decoding process, an I Frame can therefore be decoded into a single complete picture on its own, independently of other frames.
P Frame is a forward prediction frame. P Frame does not include data of a complete picture, but includes a difference between the P Frame and a preceding I Frame or a preceding P Frame. During a decoding process, a final picture is generated by superimposing the preceding I Frame or the preceding P Frame with the current P Frame.
B Frame is a bi-directional difference frame, and records differences between the current B Frame and preceding and subsequent frames. During a decoding process, both a preceding picture and a subsequent picture are obtained, and these pictures are superimposed with the differences recorded in the current B Frame to obtain a final picture.
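The three decoding behaviours described above can be illustrated with a minimal sketch. It is a toy model under stated assumptions, not an actual codec: a picture is a flat list of integer pixel values, a "difference" is a per-pixel offset, and the B Frame prediction is assumed to be the integer average of the preceding and subsequent pictures. The names `decode_i`, `decode_p` and `decode_b` are hypothetical and are introduced only for illustration.

```python
from typing import List

Picture = List[int]  # toy picture: a flat list of pixel values (assumption)


def decode_i(data: Picture) -> Picture:
    """An I Frame carries the whole picture, so decoding needs no other frame."""
    return list(data)


def decode_p(reference: Picture, diff: Picture) -> Picture:
    """Superimpose the stored difference on the preceding I or P picture."""
    return [r + d for r, d in zip(reference, diff)]


def decode_b(preceding: Picture, subsequent: Picture, diff: Picture) -> Picture:
    """Superimpose the stored difference on a prediction from both neighbours.

    Assumption: the prediction is the integer average of the two pictures.
    """
    return [(p + s) // 2 + d for p, s, d in zip(preceding, subsequent, diff)]


i_pic = decode_i([10, 20, 30, 40])
p_pic = decode_p(i_pic, [2, 0, -2, 4])         # requires the decoded I picture
b_pic = decode_b(i_pic, p_pic, [0, 1, 0, -1])  # requires both decoded neighbours
```

Note that `decode_p` and `decode_b` cannot be called until the pictures they refer to have been decoded, which is the dependency that distinguishes P Frames and B Frames from I Frames.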
FIG. 1 is an example diagram showing a reference relationship among I Frame, P Frame and B Frame. “I” represents I Frame, “B” represents B Frame, and “P” represents P Frame. I Frame may be decoded into a complete picture by itself, without reference to any other frames. P Frame is a forward prediction frame, and refers to a preceding P Frame or a preceding I Frame. B Frame is a bi-directional difference frame, and refers to a preceding frame and a subsequent frame. Thus, a P Frame and a B Frame both need to refer to other frames, and have dependency on other frames. A P Frame or a B Frame cannot be decoded into a complete picture solely by itself. Usually, an I Frame together with its related P Frames and B Frames is jointly referred to as a Group of Pictures (GOP).
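A reference relationship of this kind can be sketched as a small dependency map. The sketch below makes two assumptions that are not stated in the text: the frame sequence `"IBBPBBP"` is a hypothetical example rather than the content of FIG. 1, and each B Frame is assumed to refer to the nearest preceding and nearest subsequent I Frame or P Frame. The function name `reference_map` is likewise hypothetical.

```python
from typing import Dict, List


def reference_map(frame_types: str) -> Dict[int, List[int]]:
    """Map each frame index (display order) to the frame indices it refers to."""
    # Assumption: only I Frames and P Frames serve as reference (anchor) frames.
    anchors = [i for i, t in enumerate(frame_types) if t in "IP"]
    refs: Dict[int, List[int]] = {}
    for i, t in enumerate(frame_types):
        before = [a for a in anchors if a < i]
        after = [a for a in anchors if a > i]
        if t == "I":
            refs[i] = []                         # decodable by itself
        elif t == "P":
            refs[i] = before[-1:]                # nearest preceding I or P
        elif t == "B":
            refs[i] = before[-1:] + after[:1]    # one preceding, one subsequent
        else:
            raise ValueError(f"unknown frame type: {t!r}")
    return refs


refs = reference_map("IBBPBBP")
for index, targets in refs.items():
    print(index, "IBBPBBP"[index], "refers to", targets)
```

Only the entry for the I Frame is empty, so in this toy sequence it is the only frame that can be decoded without first decoding another frame.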
A picture frame in a video encoding stream may be accessed randomly. To do so, the I Frame of the GOP that includes the target frame is located, and then all I Frames, P Frames and B Frames before the target frame are decoded sequentially until the target frame can be decoded. However, a video encoding stream often contains a large number of coding frames, and decoding a P Frame or a B Frame requires the frames it refers to. Thus, to achieve random access of a target frame, a large number of coding frames may have to be decoded, so that the cost of decoding is high and the decoding efficiency is low.
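The cost of this random access procedure can be illustrated with a rough sketch. It assumes the frame types are listed in decoding order, so that every frame a target depends on appears before it, and it counts every frame from the located I Frame up to and including the target. The function name `frames_to_decode` and the example stream are hypothetical.

```python
from typing import List


def frames_to_decode(frame_types: str, target: int) -> List[int]:
    """Return the frame indices decoded to reach `target`.

    Follows the procedure in the text: locate the I Frame of the GOP that
    contains the target, then decode sequentially up to the target.
    Assumption: `frame_types` is given in decoding order.
    """
    if not 0 <= target < len(frame_types):
        raise IndexError("target frame is outside the stream")
    # Last I Frame at or before the target marks the start of its GOP.
    start = frame_types.rfind("I", 0, target + 1)
    if start == -1:
        raise ValueError("no I Frame at or before the target frame")
    return list(range(start, target + 1))


stream = "IPPPPPPPIPPP"  # hypothetical stream with two GOPs
late_in_gop = frames_to_decode(stream, 7)    # last frame of the first GOP
early_in_gop = frames_to_decode(stream, 9)   # second frame of the second GOP
print(len(late_in_gop), len(early_in_gop))
```

In this sketch the number of frames decoded grows with the distance between the target and the I Frame of its GOP, which is the inefficiency the paragraph above describes.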
Hence, it is highly desirable to improve techniques for video coding.