The 360-degree video, also known as immersive video, is an emerging technology that can provide a “feeling as sensation of present”. The sense of immersion is achieved by surrounding a user with a wrap-around scene covering a panoramic view, in particular a 360-degree field of view. The “feeling as sensation of present” can be further improved by stereographic rendering. Accordingly, panoramic video is being widely used in Virtual Reality (VR) applications.
Immersive video involves capturing a scene using multiple cameras to cover a panoramic view, such as a 360-degree field of view. The immersive camera usually uses a panoramic camera or a set of cameras arranged to capture a 360-degree field of view. Typically, two or more cameras are used for the immersive camera. All videos must be taken simultaneously, and separate fragments (also called separate perspectives) of the scene are recorded. Furthermore, the set of cameras is often arranged to capture views horizontally, although other arrangements of the cameras are possible.
The 360-degree virtual reality (VR) images may be captured using a 360-degree spherical panoramic camera or multiple images arranged to cover all fields of view around 360 degrees. The three-dimensional (3D) spherical image is difficult to process or store using conventional image/video processing devices. Therefore, the 360-degree VR images are often converted to a two-dimensional (2D) format using a 3D-to-2D projection method. For example, equirectangular projection (ERP) and cubemap projection (CMP) are commonly used projection methods. Accordingly, a 360-degree image can be stored in an equirectangular projected format. The equirectangular projection maps the entire surface of a sphere onto a flat image. The vertical axis is latitude and the horizontal axis is longitude. FIG. 1A illustrates an example of projecting a sphere 110 into a rectangular image 120 according to equirectangular projection, where each longitude line is mapped to a vertical line of the ERP picture. FIG. 1B illustrates an example of ERP picture 130. For ERP, the areas at the north and south poles of the sphere are stretched more severely (i.e., from a single point to a line) than areas near the equator. Furthermore, due to the distortions introduced by the stretching, especially near the two poles, predictive coding tools often fail to make good predictions, which reduces coding efficiency. FIG. 2 illustrates a cube 210 with six faces, where a 360-degree virtual reality (VR) image can be projected onto the six faces of the cube according to cubemap projection. There are various ways to lift the six faces off the cube and repack them into a rectangular picture. The example shown in FIG. 2 divides the six faces into two parts (220a and 220b), where each part consists of three connected faces. The two parts can be unfolded into two strips (230a and 230b), where each strip corresponds to a continuous picture.
The two strips can be joined to form a rectangular picture 240 according to one CMP layout as shown in FIG. 2. This layout is not very efficient, however, since some blank areas exist. Accordingly, a compact layout 250 is used, where a boundary 252 is indicated between the two strips (250a and 250b). The picture contents are nevertheless continuous within each strip.
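The ERP and CMP mappings described above can be sketched in a few lines. The following is a minimal illustration only, not the mapping of any particular codec or standard: the function names, the pixel-coordinate convention (north pole at the top row, longitude -180° at the left column) and the cube-face labels are all assumptions made for this example.

```python
import math

def sphere_to_erp(lon_deg, lat_deg, width, height):
    """Map longitude [-180, 180] and latitude [-90, 90] to ERP coordinates.

    Assumed convention: horizontal axis is longitude, vertical axis is
    latitude, with the north pole at the top row.
    """
    u = (lon_deg + 180.0) / 360.0 * width
    v = (90.0 - lat_deg) / 180.0 * height
    return u, v

def erp_horizontal_stretch(lat_deg):
    """Horizontal stretch of ERP relative to the equator: 1 / cos(latitude).

    A circle of latitude shrinks by cos(latitude) on the sphere but always
    occupies a full picture row, so the stretch grows without bound at the
    poles (a single point becomes a line).
    """
    c = math.cos(math.radians(lat_deg))
    return math.inf if c < 1e-12 else 1.0 / c

def direction_to_cube_face(x, y, z):
    """Select the cube face hit by direction (x, y, z).

    Returns (face, s, t) with s and t in [-1, 1]. The face labels and the
    orientation of (s, t) on each face are hypothetical conventions.
    """
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax == 0 and ay == 0 and az == 0:
        raise ValueError("direction must be non-zero")
    if ax >= ay and ax >= az:      # dominant x component
        return ("+x" if x > 0 else "-x"), y / ax, z / ax
    if ay >= az:                   # dominant y component
        return ("+y" if y > 0 else "-y"), x / ay, z / ay
    return ("+z" if z > 0 else "-z"), x / az, y / az

print(sphere_to_erp(0.0, 0.0, 4096, 2048))
print(erp_horizontal_stretch(60.0), erp_horizontal_stretch(90.0))
print(direction_to_cube_face(1.0, 0.5, -0.25))
```

The stretch function makes the pole distortion concrete: it equals 1 at the equator, is about 2 at 60° latitude, and is unbounded at the poles, whereas each cube face covers a limited angular range and so avoids this unbounded stretch.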
Besides the ERP and CMP formats, there are various other VR projection formats, such as octahedron projection (OHP), icosahedron projection (ISP), segmented sphere projection (SSP) and rotated sphere projection (RSP), that are widely used in the field.
FIG. 3A illustrates an example of octahedron projection (OHP), where a sphere is projected onto faces of an 8-face octahedron 310. The eight faces 320 lifted from the octahedron 310 can be converted to an intermediate format 330 by cutting open the face edge between faces 1 and 5 and rotating faces 1 and 5 to connect to faces 2 and 6 respectively, and applying a similar process to faces 3 and 7. The intermediate format can be packed into a rectangular picture 340. FIG. 3B illustrates an example of octahedron projection (OHP) picture 350, where discontinuous face edges 352 and 354 are indicated. As shown in layout format 340, discontinuous face edges 352 and 354 correspond to the shared face edge between face 1 and face 5 as shown in layout 320.
FIG. 4A illustrates an example of icosahedron projection (ISP), where a sphere is projected onto faces of a 20-face icosahedron 410. The twenty faces 420 from the icosahedron 410 can be packed into a rectangular picture 430 (referred to as a projection layout), where the discontinuous face edges are indicated by thick dashed lines 432. An example of the converted rectangular picture 440 via the ISP is shown in FIG. 4B, where the discontinuous face boundaries are indicated by white dashed lines 442.
Segmented sphere projection (SSP) has been disclosed in JVET-E0025 (Zhang et al., “AHG8: Segmented Sphere Projection for 360-degree video”, Joint Video Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 5th Meeting: Geneva, CH, 12-20 Jan. 2017, Document: JVET-E0025) as a method to convert a spherical image into an SSP format. FIG. 5A illustrates an example of segmented sphere projection, where a spherical image 500 is mapped into a North Pole image 510, a South Pole image 520 and an equatorial segment image 530. The boundaries of the three segments correspond to latitudes 45° N (502) and 45° S (504), where 0° corresponds to the equator (506). The North and South Poles are mapped into two circular areas (i.e., 510 and 520), and the projection of the equatorial segment can be the same as ERP or equal-area projection (EAP). The diameter of the circle is equal to the width of the equatorial segment because both the Pole segments and the equatorial segment have a 90° latitude span. The North Pole image 510, South Pole image 520 and the equatorial segment image 530 can be packed into a rectangular image 540 as shown in the example of FIG. 5B, where discontinuous boundaries 542, 544 and 546 between different segments are indicated.
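The segmentation at 45° N and 45° S can be illustrated with a short sketch. The code below is not the mapping defined in JVET-E0025; it assumes, purely for illustration, that the distance of a point from the centre of a pole circle grows linearly with its angular distance from the pole, and the function names and the orientation of the circle are hypothetical.

```python
import math

def ssp_segment(lat_deg):
    """Classify a latitude into one of the three SSP segments."""
    if lat_deg > 45.0:
        return "north"
    if lat_deg < -45.0:
        return "south"
    return "equator"

def ssp_pole_point(lon_deg, lat_deg, diameter):
    """Map a point of a polar cap into a circle of the given diameter.

    Assumption (not taken from JVET-E0025): the radial distance is linear
    in the angular distance from the pole, so the pole lands at the centre
    and the 45-degree boundary lands on the rim of the circle.
    """
    if abs(lat_deg) < 45.0:
        raise ValueError("point is not in a polar cap")
    radius = diameter / 2.0
    r = radius * (90.0 - abs(lat_deg)) / 45.0   # 0 at the pole, radius at 45 degrees
    phi = math.radians(lon_deg)
    x = radius + r * math.sin(phi)
    y = radius - r * math.cos(phi)              # hypothetical orientation
    return x, y

print(ssp_segment(60.0), ssp_segment(0.0), ssp_segment(-80.0))
print(ssp_pole_point(0.0, 90.0, 100.0))
```

Under this assumption the cap extends 45° from the pole in every direction, so a diameter of the circle spans 90° of latitude, matching the 90° span of the equatorial segment between 45° S and 45° N.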
FIG. 5C illustrates an example of rotated sphere projection (RSP), where the sphere 550 is partitioned into a middle 270°×90° region 552, and a residual part 554. These two parts of RSP can be further stretched on the top side and the bottom side to generate a deformed part 556 having oval-shaped boundaries 557 and 558 on the top part and bottom part as indicated by the dashed lines. FIG. 5D illustrates an example of RSP picture 560, where discontinuous boundaries 562 and 564 between two rotated segments are indicated by dashed lines.
Since the images or video associated with virtual reality may take a lot of space to store or a lot of bandwidth to transmit, image/video compression is often used to reduce the required storage space or transmission bandwidth. However, when the three-dimensional (3D) virtual reality image is converted to a two-dimensional (2D) picture, some boundaries between faces may exist in the packed pictures produced by various projection methods. For example, a horizontal boundary 252 exists in the middle of the converted picture 250 according to the CMP in FIG. 2. Boundaries between faces also exist in pictures converted by other projection methods, as shown in FIG. 3 through FIG. 5. As is known in the field, image/video coding usually results in some distortions between the original image/video and the reconstructed image/video, which manifest as visible artifacts in the reconstructed image/video.
FIG. 6A illustrates an example of artifacts in a reconstructed 3D picture on a sphere for the ERP. An original 3D sphere image 610 is projected to a 2D frame 620 for compression, which may introduce artifacts. The reconstructed 2D frame is projected back to a 3D sphere image 630. In this example, the picture contents are continuous from the left edge to the right edge. However, the video compression technique used usually disregards this fact. When the two edges are projected back to a 3D sphere image, the discontinuity at the seam corresponding to the two edges may become noticeable, as indicated by the line with crosses 632. FIG. 6B illustrates an example of a visible artifact, as indicated by arrows, at a seam of a discontinuous boundary. When this seam is projected to a 2D ERP frame, the artifact will be noticeable if the seam falls in a non-boundary part of the 2D ERP frame. For other projections, one or more discontinuous boundaries exist within the 2D frame.
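The seam effect described above can be reproduced with a toy example. The sketch below is not a video codec: it uses a hypothetical stand-in for lossy coding (each block of samples is replaced by its quantized mean) applied to a single ERP row whose content is periodic in longitude, so that the first and last samples are neighbours on the sphere. All names and parameter values are assumptions chosen for illustration.

```python
import math

def make_erp_row(width):
    """One ERP row with content that is periodic in longitude."""
    return [128.0 + 100.0 * math.sin(2.0 * math.pi * (i + 0.5) / width)
            for i in range(width)]

def toy_block_codec(row, block, step):
    """Stand-in for lossy coding: replace each block by its quantized mean.

    Each block is processed independently, so the first and last blocks of
    the row are coded without any knowledge that they touch on the sphere.
    """
    recon = []
    for start in range(0, len(row), block):
        chunk = row[start:start + block]
        mean = sum(chunk) / len(chunk)
        recon.extend([step * round(mean / step)] * len(chunk))
    return recon

def seam_jump(row):
    """Absolute difference between the two samples that meet at the seam."""
    return abs(row[0] - row[-1])

original = make_erp_row(64)
reconstructed = toy_block_codec(original, block=8, step=16)
print(seam_jump(original), seam_jump(reconstructed))
```

In this toy setting the jump between the leftmost and rightmost samples is small in the original row and larger after reconstruction. In the 2D frame these samples sit at opposite picture edges, so the jump only becomes visible once the frame is wrapped back onto the sphere.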
Therefore, it is desirable to develop methods that can reduce the visibility of artifacts at a seam of a discontinuous boundary.