As disclosed in Non-Patent Literature 1, Multiview Video Coding (MVC), which is an extension of Advanced Video Coding (AVC), has been standardized as an international standard. Also, as described in Non-Patent Literature 2, Multi-View High Efficiency Video Coding (MV-HEVC), which is an extension of High Efficiency Video Coding (HEVC), is in the process of standardization as an international standard.
Further, in addition to the above, an encoding system targeting a multi-view video image (textures) and depth maps has been studied as described in Non-Patent Literature 3.
FIG. 10 is a view conceptually showing a data configuration of said multi-view video image with associated depth maps targeted for encoding. As illustrated, textures 1 to n are provided as images shot at respective viewpoints 1 to n corresponding to respective camera positions, depth maps 1 to n are prepared for the textures 1 to n, respectively, and such data is prepared at respective times t.
Each depth map i (i=1, 2, . . . , n) provides, at each pixel location (x, y) of the corresponding texture i, the depth value d(x, y) of the object being shot, as seen from the camera performing the shooting. Such textures and depth maps can be used in various ways, for example, to generate a video image at an arbitrary viewpoint other than the respective viewpoints 1 to n.
In addition, depth maps are prepared by an existing method. For example, the respective depth maps i can be prepared by using the positional relationship between the respective viewpoints, the respective camera parameters, and the like, and then performing processing such as associating identical characteristic points between the respective textures i.
In an encoding system targeting said textures i and the depth maps i at the respective viewpoints i (i=1, 2, . . . , n), as described in Non-Patent Literature 3, the following methods can be applied. First, for the textures, an ordinary video encoding system (system in Non-Patent Literatures 1, 2, and the like) constituted of intra-prediction, motion compensation, and transformation/quantization of a prediction residue can be used.
On the other hand, because the depth maps can also be regarded as “images” having depth values as pixel values, the above-described video encoding system (ordinary system) identical to that for the textures can be applied thereto. Further, in place of said ordinary system, or in combination with said ordinary system, depth intra-prediction dedicated to the depth maps may also be used.
In addition, here, the depth maps have signal characteristics that are greatly different from those of the textures, and therefore, dedicated depth intra-prediction is prepared therefor. One of the signal characteristics of the depth maps is that a sharp edge occurs because of a difference in depth at an object boundary, while an object surface has relatively small changes in depth.
That is, there is a characteristic that average depths differ greatly between different objects, as in a case where a “person” exists at a near side as a first object and a “wall” exists behind the person as a second object serving as a background.
Hereinafter, a method for depth intra-prediction described in Non-Patent Literature 3 will be described. Said method consists of the following [procedure 1] to [procedure 4]. FIG. 11 includes views for describing said method.
[Procedure 1] Prepare wedge list
[Procedure 2] Calculate average value for every region
[Procedure 3] Determine wedgelet by SAD calculations
[Procedure 4] Generate depth value prediction image
In [procedure 1], a wedge list listing a plurality of wedgelets to serve as search targets is prepared. The wedgelets are object boundaries (such as, for example, a boundary between a person and a wall being a background) in a block within a depth map that have been modeled as straight lines.
In FIG. 11(1), a block B0 serving as a prediction target is shown. From the four sides L1 to L4 of said block B0, two points (pixel locations) that belong to two different sides are selected and connected to form a line segment, which serves as a wedgelet. As shown in, for example, FIG. 11(2), a line segment W1 connecting a point P1 on the side L4 and a point P2 on the side L3 serves as a wedgelet, and said wedgelet W1 partitions the block B0 into regions R1 and R2.
By enumerating as candidates all possible wedgelets each constituted as said line segment connecting two points on two different sides, a wedge list is prepared in [procedure 1].
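The enumeration in [procedure 1] can be illustrated with a minimal Python sketch. The function names `sides_of` and `enumerate_wedgelets` are hypothetical, as is the mapping of L1 to L4 onto the top, right, bottom, and left sides, which the text does not fix. The sketch also assumes that two border pixels lying on a common side are not connected, since such a segment would run along the side and would not partition the block.

```python
from itertools import combinations

def sides_of(x, y, n):
    """Return the names of the block sides that pixel (x, y) lies on."""
    sides = set()
    if y == 0:
        sides.add("L1")  # top side (assumed naming)
    if x == n - 1:
        sides.add("L2")  # right side (assumed naming)
    if y == n - 1:
        sides.add("L3")  # bottom side (assumed naming)
    if x == 0:
        sides.add("L4")  # left side (assumed naming)
    return sides

def enumerate_wedgelets(n):
    """List every segment (p, q) joining two border pixels that share no side."""
    border = [(x, y) for y in range(n) for x in range(n) if sides_of(x, y, n)]
    return [
        (p, q)
        for p, q in combinations(border, 2)
        # Assumption: endpoints on a common side are skipped (degenerate segment).
        if not (sides_of(*p, n) & sides_of(*q, n))
    ]

wedge_list = enumerate_wedgelets(4)
print(len(wedge_list))  # 42 candidates for a 4x4 block
```

Because every pair of border pixels is considered, the number of candidates grows roughly with the square of the block side length, which is why the later search over the wedge list dominates the cost of this prediction method.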
In [procedure 2], for each of the wedgelets in the wedge list, average values of texture signals (or depth signals) are determined in the two partitioned regions, respectively. For example, for the wedgelet W1 in FIG. 11(2), an average value m(R1) of the partitioned region R1 and an average value m(R2) of the partitioned region R2 are determined.
In [procedure 3], first, for each of the wedgelets in the wedge list, a sum of absolute differences (SAD) between the texture signal values (or depth signal values) and the average value calculated in [procedure 2] is determined in each of the two partitioned regions, and the SAD with that wedgelet for the prediction target block as a whole is determined as the sum of said SADs in the two regions.
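As a rough illustration of [procedure 2], the following Python sketch partitions a block by one wedgelet and computes the two region averages. The names `partition` and `region_means` are hypothetical. How pixels are assigned to R1 or R2 is not specified in the text, so the sketch assumes a simple rule: the sign of a 2D cross product decides the side, and pixels exactly on the line go to R2.

```python
def partition(n, p, q):
    """Return an n x n grid of region labels (1 or 2) for the wedgelet p -> q."""
    (px, py), (qx, qy) = p, q
    labels = []
    for y in range(n):
        row = []
        for x in range(n):
            # The sign of the cross product tells which side of the line (x, y) is on.
            cross = (qx - px) * (y - py) - (qy - py) * (x - px)
            row.append(1 if cross > 0 else 2)  # on-line pixels go to R2 (assumption)
        labels.append(row)
    return labels

def region_means(block, labels):
    """Return (m(R1), m(R2)), the mean signal value of each region."""
    sums, counts = {1: 0.0, 2: 0.0}, {1: 0, 2: 0}
    for block_row, label_row in zip(block, labels):
        for s, r in zip(block_row, label_row):
            sums[r] += s
            counts[r] += 1
    # An empty region gets mean 0.0 so that the sketch never divides by zero.
    return tuple(sums[r] / counts[r] if counts[r] else 0.0 for r in (1, 2))

# 4x4 block with a vertical depth edge between columns 1 and 2.
block = [[10, 10, 50, 50] for _ in range(4)]
m_r1, m_r2 = region_means(block, partition(4, (2, 0), (2, 3)))
print(m_r1, m_r2)  # 10.0 50.0
```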
For example, a SAD[W1] with the wedgelet W1 in FIG. 11(2) is provided by the following (Expression 1) where a signal value at a location (x, y) in the block B0 is provided as s(x, y). (The signal value is a texture or depth signal value.)
[Equation 1]

$$\mathrm{SAD}[W1] = \sum_{(x,y)\in R1} \left| s(x,y) - m(R1) \right| + \sum_{(x,y)\in R2} \left| s(x,y) - m(R2) \right| \qquad \text{(Expression 1)}$$
Further, in [procedure 3], a SAD[Wi] is calculated in the same manner as the above (Expression 1) for each of the wedgelets Wi (i=1, 2, . . . , N) in the wedge list, and the wedgelet Wi that minimizes SAD[Wi] is determined as the one to be used for prediction.
In [procedure 4], a prediction image of depth values for the prediction target block is generated using the wedgelet Wi determined in the above [procedure 3] as the one providing the minimum value. Said prediction image is generated by assigning, to each of the two regions partitioned by the wedgelet Wi, its respective representative value (determined by the respective methods of an encoder/decoder). For the representative values, the average values of the respective regions are typically used, but other values calculated in said regions may be adopted.
In addition, the prediction “image” of depth values (depth signals) is called an “image” because the predicted depth values are mapped to pixel positions. Alternatively, this may be called a prediction depth map.
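The search in [procedure 3] can be sketched in Python as follows. The names `sad_for_wedgelet` and `best_wedgelet` are hypothetical, and the rule that assigns pixels to R1 or R2 (sign of a cross product, with on-line pixels placed in R2) is an assumption made only for illustration. The wedge list is passed in as a list of endpoint pairs.

```python
def sad_for_wedgelet(block, p, q):
    """Evaluate (Expression 1) for the wedgelet p -> q on a square block."""
    (px, py), (qx, qy) = p, q
    regions = {1: [], 2: []}
    for y, row in enumerate(block):
        for x, s in enumerate(row):
            # Assumed partition rule: the cross-product sign picks the region.
            cross = (qx - px) * (y - py) - (qy - py) * (x - px)
            regions[1 if cross > 0 else 2].append(s)
    sad = 0.0
    for values in regions.values():
        if values:  # skip an empty region
            mean = sum(values) / len(values)
            sad += sum(abs(s - mean) for s in values)
    return sad

def best_wedgelet(block, wedge_list):
    """Return the wedgelet in wedge_list with the smallest SAD."""
    return min(wedge_list, key=lambda w: sad_for_wedgelet(block, *w))

block = [[10, 10, 50, 50] for _ in range(4)]
candidates = [((0, 2), (3, 2)), ((0, 0), (3, 3)), ((2, 0), (2, 3))]
print(best_wedgelet(block, candidates))  # ((2, 0), (2, 3))
```

In this example the vertical wedgelet matches the depth edge exactly, so its SAD is zero, while the horizontal candidate mixes both depth levels in each region and has a SAD of 320.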
For example, when the wedgelet W1 in FIG. 11 is determined as one to provide the minimum value, a prediction image of the block B0 is an image having signal values at all locations in the region R1 that correspond to its representative value dR1 and having signal values at all locations in the region R2 that correspond to its representative value dR2.
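A minimal Python sketch of [procedure 4] follows. The name `predict_block` is hypothetical, the region averages are used as the representative values dR1 and dR2 (the typical choice mentioned above), and the pixel-to-region assignment by cross-product sign is an assumption, since the text does not fix one.

```python
def predict_block(block, p, q):
    """Fill each region cut by the wedgelet p -> q with its representative value."""
    (px, py), (qx, qy) = p, q
    n = len(block)
    # Label each pixel 1 (R1) or 2 (R2); on-line pixels go to R2 (assumption).
    labels = [
        [1 if (qx - px) * (y - py) - (qy - py) * (x - px) > 0 else 2
         for x in range(n)]
        for y in range(n)
    ]
    rep = {}
    for r in (1, 2):
        values = [block[y][x] for y in range(n) for x in range(n)
                  if labels[y][x] == r]
        # Representative value: the region mean (0.0 if the region is empty).
        rep[r] = sum(values) / len(values) if values else 0.0
    return [[rep[labels[y][x]] for x in range(n)] for y in range(n)]

# Slightly noisy block with a vertical depth edge between columns 1 and 2.
block = [[10, 12, 50, 50] for _ in range(4)]
for row in predict_block(block, (2, 0), (2, 3)):
    print(row)  # each row: [11.0, 11.0, 50.0, 50.0]
```

The prediction is piecewise constant, so only the wedgelet and the two representative values are needed to describe it; the remaining difference from the original block (here, plus or minus 1 in the left region) is the prediction residue.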
As is apparent from the above [procedure 1] to [procedure 4], depth intra-prediction by a wedgelet is a prediction method suited to a case where, in a prediction target block, an object boundary crosses said block almost linearly and the depth value changes sharply at said boundary.