1. Field of the Invention
Methods and apparatuses consistent with the present invention relate to encoding/decoding an image, and more particularly, to encoding/decoding an image by dividing an image sequence into sub-groups and determining the encoding modes applied to bi-directional pictures included in each sub-group using correlations between the bi-directional pictures and reference pictures, in order to improve encoding efficiency.
2. Description of the Related Art
When video is encoded, spatial redundancy and temporal redundancy of an image sequence are removed so as to compress the image sequence. To remove temporal redundancy, a reference picture, that is, a picture located temporally before or after a currently encoded picture, is searched for an area similar to an area of the currently encoded picture. Then, an amount of motion between the corresponding areas of the currently encoded picture and the reference picture is detected, and a residue between the currently encoded picture and a prediction image obtained by motion compensation based on the detected amount of motion is encoded.
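The motion search and residue computation described above can be sketched as follows. This is only an illustrative full-search block matcher using the sum of absolute differences (SAD) as the similarity measure; the function names, the SAD criterion, the block size, and the search range are assumptions made for the example and are not taken from any particular standard or encoder.

```python
def sad(cur, ref, cx, cy, rx, ry, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (cx, cy) and the n x n block of `ref` at (rx, ry)."""
    return sum(
        abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
        for j in range(n)
        for i in range(n)
    )


def motion_search(cur, ref, x, y, n, search_range):
    """Full search over a square window (hypothetical helper).

    Returns the displacement (dx, dy) into `ref` that minimizes the SAD
    for the block of `cur` whose top-left corner is (x, y).
    """
    height, width = len(ref), len(ref[0])
    best = (0, 0)
    best_cost = sad(cur, ref, x, y, x, y, n)
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = x + dx, y + dy
            # Skip candidate blocks that fall outside the reference picture.
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, x, y, rx, ry, n)
            if cost < best_cost:
                best_cost, best = cost, (dx, dy)
    return best


def residual(cur, ref, x, y, mv, n):
    """Residue between the current block and its motion-compensated
    prediction taken from `ref` at displacement `mv`."""
    dx, dy = mv
    return [
        [cur[y + j][x + i] - ref[y + dy + j][x + dx + i] for i in range(n)]
        for j in range(n)
    ]
```

In a real encoder the residue would subsequently be transformed, quantized, and entropy-coded; those stages are outside the scope of this sketch.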
Video standards such as Moving Picture Experts Group 1 (MPEG-1), MPEG-2, MPEG-4, H.264/Advanced Video Coding (AVC) and the like classify each picture of the image sequence as an I picture, a P picture, or a B picture according to a prediction encoding method. The I picture is encoded using information of the currently encoded picture itself without inter-prediction. The P picture is prediction-encoded with reference to one previously processed picture located temporally before or after the currently encoded picture. The B picture is prediction-encoded with reference to two previously processed pictures located temporally before or after the currently encoded picture.
FIGS. 1A through 1C illustrate a variety of encoding modes according to encoding of B pictures included in a group of pictures (GOP). Referring to FIG. 1A, video encoding that was suggested before the H.264/AVC standard was defined does not use the B pictures as reference pictures of other pictures, but uses only I pictures or P pictures, which are called key pictures, as reference pictures of other pictures. In an encoding mode (hereinafter referred to as a “B picture non-reference mode”) where the B pictures are not used as the reference pictures of other pictures, the B pictures are prediction-encoded by using, as the reference pictures, the previously processed I pictures or P pictures that are located temporally before or after the B pictures. For example, the B picture B1 is prediction-encoded using the I picture I0 and the P picture P4, which are previously encoded and then restored during an encoding procedure.
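As a rough illustration of the B picture non-reference mode, the following sketch maps each B picture to the nearest key pictures on either side of it in display order. The function name and the list-of-type-letters representation are hypothetical, and the sketch assumes that every B picture has a key picture both before and after it within the sequence.

```python
def non_reference_mode_refs(picture_types):
    """Map each B picture index to (previous key index, next key index).

    `picture_types` lists the picture types in display order, e.g.
    ["I", "B", "B", "B", "P"] for I0 B1 B2 B3 P4. Only I and P pictures
    (key pictures) are ever used as references in this mode.
    """
    key_positions = [i for i, t in enumerate(picture_types) if t in ("I", "P")]
    refs = {}
    for i, t in enumerate(picture_types):
        if t != "B":
            continue
        # Assumption: a key picture exists on both sides of every B picture.
        prev_key = max(k for k in key_positions if k < i)
        next_key = min(k for k in key_positions if k > i)
        refs[i] = (prev_key, next_key)
    return refs


# Display order I0 B1 B2 B3 P4: every B picture refers to I0 and P4.
example_refs = non_reference_mode_refs(["I", "B", "B", "B", "P"])
```

Because no B picture depends on another B picture in this mode, the B pictures between two key pictures may be encoded in any order once both key pictures have been encoded and restored.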
Video standards such as H.264/AVC can use the B pictures as the reference pictures of other pictures in order to improve encoding efficiency, since the reference picture can be controlled using a picture order count type parameter transmitted in a sequence parameter set (SPS). When the B pictures can be used as the reference pictures of other pictures, the encoding of the B pictures included in the GOP is classified into an encoding mode (hereinafter referred to as a “B picture reference mode”) where all the B pictures can be used as the reference pictures of other pictures, and an encoding mode (hereinafter referred to as a “pyramid mode”) where the B pictures having predetermined locations in the GOP are hierarchically prediction-encoded.
Referring to FIG. 1B, in the B picture reference mode, previously processed B pictures can be used as reference pictures in addition to I pictures and P pictures. For example, a B picture B2 is prediction-encoded using a P picture P4 and a B picture B1, which are previously encoded and then restored. Although not shown, an I picture I0 and the B picture B1 that are previously encoded and then restored can also be used as the reference pictures of the B picture B2.
Referring to FIG. 1C, in the pyramid mode, B pictures are prediction-encoded by using, as reference pictures, the key pictures (I pictures or P pictures) located temporally before or after consecutive B pictures and the B picture in the center of the consecutive B pictures. For example, an I picture I0 and a P picture P4 that are previously encoded and then restored are used as the reference pictures so as to prediction-encode a B picture B2 in the center of consecutive B pictures. The I picture I0 and the B picture B2 are used as the reference pictures so as to prediction-encode a B picture B1. The B picture B2 and the P picture P4 are used as the reference pictures so as to prediction-encode a B picture B3.
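The hierarchical ordering of the pyramid mode can be sketched with a short recursive routine. The example reproduces the I0, B1 through B3, P4 case given above; the function name is hypothetical, and applying the same center-first rule recursively to longer runs of B pictures is an assumption of this sketch rather than something stated in the description.

```python
def pyramid_order(left, right):
    """Return [(picture, (ref_before, ref_after)), ...] in encoding order
    for the B pictures lying strictly between display indices `left` and
    `right`, both of which are assumed to be already encoded and restored.
    """
    if right - left < 2:
        return []  # no B picture lies between the two references
    mid = (left + right) // 2
    # Encode the center B picture first, then each half, reusing the
    # center picture as one of the two references.
    return (
        [(mid, (left, right))]
        + pyramid_order(left, mid)
        + pyramid_order(mid, right)
    )


# Key pictures I0 and P4 enclose B1, B2, and B3.
example_order = pyramid_order(0, 4)
# B2 is encoded from (I0, P4), then B1 from (I0, B2), then B3 from (B2, P4).
```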
The performance of the various encoding modes depends on the characteristics of the image sequence to be encoded. The encoding order of the B pictures and their reference pictures varies according to the encoding mode, which causes differences between the prediction images, and thus between the prediction errors of the B pictures, according to the encoding mode. In the related art, an image is encoded by applying only one of the encoding modes to the pictures included in a GOP, and thus the image sequence cannot be encoded adaptively according to the characteristics of the image.