1. Field of the Invention
Methods and apparatuses consistent with the present invention relate to encoding and decoding an image, and more particularly, to a method and apparatus for encoding and decoding an image that can improve prediction efficiency by performing motion prediction and compensation on pictures in a group of pictures by selectively using a high-quality key picture that has been previously encoded and then restored, and a previous picture that has been previously encoded and then restored.
2. Description of the Related Art
During video encoding, compression is performed by removing spatial redundancy and temporal redundancy in an image sequence. In order to remove temporal redundancy, a picture located in front of or behind the currently encoded picture is used as a reference picture, and an area of the reference picture that is similar to an area of the currently encoded picture is searched for. A motion vector between the area of the currently encoded picture and the corresponding area of the reference picture is detected, a prediction picture is obtained by performing a motion compensation process based on the detected motion vector, and a residual between the currently encoded picture and the prediction picture is encoded.
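The search for a similar area and the formation of a residual can be illustrated with a minimal sketch. It assumes a full search over a small window with the sum of absolute differences (SAD) as the matching cost; the function names, the block size `n`, and the search range `search` are hypothetical and are not taken from any particular standard or from the present description.

```python
def sad(cur, ref, cy, cx, ry, rx, n):
    """Sum of absolute differences between the n x n block of `cur`
    at (cy, cx) and the n x n block of `ref` at (ry, rx)."""
    return sum(
        abs(cur[cy + i][cx + j] - ref[ry + i][rx + j])
        for i in range(n)
        for j in range(n)
    )


def find_motion_vector(cur, ref, cy, cx, n, search):
    """Full search within +/- `search` pixels of the block position.
    Returns ((dy, dx), cost) for the lowest-cost candidate block that
    lies entirely inside the reference picture."""
    h, w = len(ref), len(ref[0])
    # Start from the zero displacement, which is always in bounds.
    best_mv, best_cost = (0, 0), sad(cur, ref, cy, cx, cy, cx, n)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ry, rx = cy + dy, cx + dx
            if 0 <= ry <= h - n and 0 <= rx <= w - n:
                cost = sad(cur, ref, cy, cx, ry, rx, n)
                if cost < best_cost:
                    best_mv, best_cost = (dy, dx), cost
    return best_mv, best_cost


def residual_block(cur, ref, cy, cx, mv, n):
    """Residual between the current block and the motion-compensated
    prediction block taken from the reference picture."""
    dy, dx = mv
    return [
        [cur[cy + i][cx + j] - ref[cy + dy + i][cx + dx + j] for j in range(n)]
        for i in range(n)
    ]
```

In this sketch, pictures are plain lists of rows of integer sample values. When the current block is an exact displaced copy of a reference area inside the search window, the residual block is all zeros, so only the motion vector carries information.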
According to video standards such as Moving Picture Experts Group-1 (MPEG-1), MPEG-2, MPEG-4, and H.264/AVC (Advanced Video Coding), each picture in an image sequence is classified as an I picture, a P picture, or a B picture according to a prediction encoding method. An I picture denotes a picture that is encoded by using only information in the picture without performing prediction encoding between pictures; a P picture denotes a picture that is prediction encoded by referring to one picture that is located in front of or behind the currently encoded picture and has already been processed; and a B picture denotes a picture that is prediction encoded by referring to two pictures that are located in front of or behind the currently encoded picture and have already been processed.
Generally, the pictures in an image sequence are divided into groups of pictures (GOPs), each formed of a predetermined number of pictures. According to a coding order, the first picture in each GOP is encoded as an I picture, and the remaining pictures in each GOP are encoded as P pictures or B pictures. For example, when a GOP is formed of 7 pictures, the pictures may be encoded as IPPPPPP or IBBPBBP. In the case of the pictures encoded as IPPPPPP, the first I picture is intra prediction encoded by using only information in the I picture without referring to any other picture. Each of the remaining P pictures is encoded by using a temporally adjacent previous picture that has been previously encoded and then restored. In the case of the pictures encoded as IBBPBBP, the B pictures between the first I picture and the fourth P picture are bi-directionally prediction encoded by using the first I picture and the fourth P picture as two reference pictures, and the B pictures between the fourth P picture and the seventh P picture are bi-directionally prediction encoded by using the fourth P picture and the seventh P picture as two reference pictures.
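The reference relationships in the two example GOP structures can be written out with a short sketch. The function name `reference_indices` is hypothetical, and pictures are indexed from 0 in display order. The sketch assumes that a P picture refers to the nearest preceding I or P picture (which, for IPPPPPP, is the temporally adjacent previous picture), that a B picture refers to the nearest I or P picture on each side, and that the pattern ends with an I or P picture.

```python
def reference_indices(gop):
    """For a GOP pattern such as "IBBPBBP", return a list whose i-th
    entry holds the display-order indices of the pictures that
    picture i refers to (an empty list for an I picture)."""
    # I and P pictures serve as anchors that other pictures may refer to.
    anchors = [i for i, kind in enumerate(gop) if kind in "IP"]
    refs = []
    for i, kind in enumerate(gop):
        if kind == "I":
            refs.append([])
        elif kind == "P":
            # Assumption: nearest preceding I or P picture.
            refs.append([max(a for a in anchors if a < i)])
        elif kind == "B":
            # Assumption: nearest anchor before and nearest anchor after.
            before = max(a for a in anchors if a < i)
            after = min(a for a in anchors if a > i)
            refs.append([before, after])
        else:
            raise ValueError(f"unknown picture type: {kind!r}")
    return refs
```

For "IBBPBBP", the two B pictures at indices 1 and 2 refer to pictures 0 and 3, and the two B pictures at indices 4 and 5 refer to pictures 3 and 6, which matches the description above with 1-based numbering replaced by 0-based indices.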
When the pictures in the GOP are encoded as IPPPPPP or IBBPBBP, the overall bit rate may be reduced since the correlation between temporally adjacent pictures can be high, but a prediction error may accumulate from picture to picture, thereby reducing the overall peak signal-to-noise ratio (PSNR).
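For reference, PSNR is commonly computed from the mean squared error (MSE) between an original picture and its restored version as 10·log10(MAX² / MSE), where MAX is the largest possible sample value. The following is a minimal sketch of that computation, assuming 8-bit samples (MAX = 255) and pictures stored as lists of rows; the function name `psnr` is hypothetical.

```python
import math


def psnr(original, restored, max_value=255):
    """Peak signal-to-noise ratio in dB between two equally sized
    pictures given as lists of rows of sample values."""
    squared_error = 0
    count = 0
    for row_o, row_r in zip(original, restored):
        for o, r in zip(row_o, row_r):
            squared_error += (o - r) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        # Identical pictures: PSNR is unbounded.
        return math.inf
    return 10 * math.log10(max_value ** 2 / mse)
```

A larger accumulated prediction error raises the MSE of the restored pictures and therefore lowers the PSNR.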