1. Field of the Invention
Apparatuses and methods consistent with the present invention relate to encoding and decoding a multi-view video, and more particularly, to encoding and decoding a multi-view video in a manner that improves the compressibility of the multi-view video.
2. Description of the Related Art
The new H.264 video coding standard is noted for high encoding efficiency compared to conventional standards. This efficiency depends on various new features, such as variable block sizes between 16×16 and 4×4, a quadtree structure for motion compensation, an in-loop deblocking filter, multiple reference frames, intra prediction, and context-adaptive entropy coding, as well as general bi-directional (B) estimation slices. Unlike in the MPEG-2 standard, the MPEG-4 Part 2 standard, etc., B slices can refer to different slices while using multiple predictions obtained from the same direction (forward or backward). However, the above-described features require a large amount of data for motion information, including a motion vector and/or a reference picture, in an estimation mode of the H.264 video coding standard.
In order to relieve this problem, a skip mode and a direct mode are respectively introduced into predicted (P) slices and B slices. The skip and direct modes allow motion estimation of an arbitrary block of a picture to be currently encoded, using motion vector information previously encoded. Accordingly, additional motion data for macroblocks (MBs) or blocks is not encoded. Motions for these modes are obtained using spatial (skip) or temporal (direct) correlation of motions of adjacent MBs or pictures.
FIG. 1 is a view for explaining a direct mode of a B picture.
In the direct mode, when the motion of an arbitrary block of a B picture to be currently encoded is estimated, a forward motion vector and a backward motion vector are obtained using a motion vector of a co-located block of a temporally following P picture.
In order to calculate a forward motion vector MVL0 and a backward motion vector MVL1 of a direct mode block 102 whose motion is to be estimated in a B picture 110, a motion vector MV is detected. The motion vector MV is the vector by which a co-located block 104, which is at the same position as the direct mode block 102 in a reference list 1 picture 120 (a temporally following picture), refers to a reference list 0 picture 130. The forward motion vector MVL0 and the backward motion vector MVL1 of the direct mode block 102 of the B picture 110 are then calculated using the following Equations 1 and 2.
$$\overrightarrow{MV}_{L0} = \frac{TR_B}{TR_D} \times \overrightarrow{MV} \qquad (1)$$

$$\overrightarrow{MV}_{L1} = \frac{(TR_B - TR_D)}{TR_D} \times \overrightarrow{MV} \qquad (2)$$

where $\overrightarrow{MV}$ represents the motion vector of the co-located block 104 of the reference list 1 picture 120, $TR_D$ represents a distance between the reference list 0 picture 130 and the reference list 1 picture 120, and $TR_B$ represents a distance between the B picture 110 and the reference list 0 picture 130.
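As an illustration of Equations 1 and 2, the following minimal sketch scales the motion vector of a co-located block by the two temporal-distance ratios. The function name `direct_mode_vectors` and its parameter names are hypothetical and chosen only for this example. Exact rational arithmetic is used here for clarity; this is an assumption of the sketch, and a practical codec would typically use integer arithmetic with rounding instead.

```python
from fractions import Fraction


def direct_mode_vectors(mv, tr_b, tr_d):
    """Return (mv_l0, mv_l1) for a direct mode block.

    mv   : (x, y) motion vector MV of the co-located block
    tr_b : distance TR_B between the B picture and the list 0 picture
    tr_d : distance TR_D between the list 0 and list 1 pictures
    All names are hypothetical; results are exact fractions.
    """
    if tr_d == 0:
        raise ValueError("TR_D must be non-zero")

    scale_l0 = Fraction(tr_b, tr_d)          # TR_B / TR_D, Equation 1
    scale_l1 = Fraction(tr_b - tr_d, tr_d)   # (TR_B - TR_D) / TR_D, Equation 2

    mv_l0 = (mv[0] * scale_l0, mv[1] * scale_l0)
    mv_l1 = (mv[0] * scale_l1, mv[1] * scale_l1)
    return mv_l0, mv_l1


# Example: the B picture lies one picture interval after the list 0 picture,
# and the list 0 and list 1 pictures are three intervals apart.
forward, backward = direct_mode_vectors((6, -3), tr_b=1, tr_d=3)
print(forward)   # forward vector: one third of MV
print(backward)  # backward vector: minus two thirds of MV
```

With these example values the forward vector is (2, -1) and the backward vector is (-4, 2), so the backward vector points in the direction opposite to MV, as expected when TR_B is smaller than TR_D.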
FIG. 2 is a view for explaining a method of estimating a motion vector in a spatial area.
According to the H.264 standard used for encoding video data, a frame is divided into blocks, each having a predetermined size, and the most similar block is searched for with reference to an adjacent, previously encoded frame or frames. In this process, a median value of the motion vectors of a left lower macroblock 4, an upper middle macroblock 2, and an upper right macroblock 3 of a current macroblock c is determined as an estimation value of the corresponding motion vector. The motion vector estimation can be expressed by Equation 3.
$$\begin{cases} pmvx = \mathrm{MEDIAN}(mvx2,\ mvx3,\ mvx4) \\ pmvy = \mathrm{MEDIAN}(mvy2,\ mvy3,\ mvy4) \end{cases} \qquad (3)$$
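Equation 3 can be illustrated with the following minimal sketch, which takes the median of the neighboring motion vectors separately for the horizontal and vertical components. The helper names `median3` and `predict_motion_vector` are hypothetical and are introduced only for this example.

```python
def median3(a, b, c):
    """Return the median (middle value) of three numbers."""
    return sorted((a, b, c))[1]


def predict_motion_vector(mv2, mv3, mv4):
    """Component-wise median of three neighboring motion vectors.

    mv2, mv3, mv4 are (x, y) motion vectors of macroblocks 2, 3, and 4
    around the current macroblock c, following Equation 3.
    """
    pmvx = median3(mv2[0], mv3[0], mv4[0])
    pmvy = median3(mv2[1], mv3[1], mv4[1])
    return (pmvx, pmvy)


# Example with three arbitrary neighboring vectors.
pmv = predict_motion_vector((4, -2), (1, 5), (3, 7))
print(pmv)
```

For these example vectors the estimation value is (3, 5). Because the median is taken per component, the result need not coincide with the motion vector of any single neighboring macroblock.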
As such, a method of encoding a video using spatial correlation as well as temporal correlation has been proposed. However, a method of enhancing the compressibility and processing speed of a multi-view video, which has a significantly greater amount of information than a general video, is still required.