360-degree video, also known as immersive video, is an emerging technology that can provide a “feeling as sensation of present”. The sense of immersion is achieved by surrounding a user with a wrap-around scene covering a panoramic view, in particular a 360-degree field of view. The “feeling as sensation of present” can be further improved by stereographic rendering. Accordingly, panoramic video is widely used in Virtual Reality (VR) applications.
Immersive video involves capturing a scene using multiple cameras to cover a panoramic view, such as a 360-degree field of view. The immersive camera usually uses a set of cameras arranged to capture a 360-degree field of view. Typically, two or more cameras are used for the immersive camera. All videos must be taken simultaneously, and separate fragments (also called separate perspectives) of the scene are recorded. Furthermore, the set of cameras is often arranged to capture views horizontally, although other arrangements of the cameras are possible.
FIG. 1 illustrates an exemplary processing chain for 360-degree spherical panoramic pictures. The 360-degree spherical panoramic pictures may be captured using a 360-degree spherical panoramic camera. Spherical image processing unit 110 accepts the raw image data from the camera to form 360-degree spherical panoramic pictures. The spherical image processing may include image stitching and camera calibration. Spherical image processing is known in the field, and the details are omitted in this disclosure. An example of a 360-degree spherical panoramic picture from the spherical image processing unit 110 is shown in picture 112. The top side of the 360-degree spherical panoramic picture corresponds to the vertical top (or sky) and the bottom side points to the ground if the camera is oriented so that the top points up. However, if the camera is equipped with a gyro, the vertical top side can always be determined regardless of how the camera is oriented. In the 360-degree spherical panoramic format, the contents of the scene appear distorted. Often, the spherical format is projected onto the surfaces of a cube as an alternative 360-degree format. The conversion can be performed by a projection conversion unit 120 to derive the six face images 122 corresponding to the six faces of a cube. On the faces of the cube, these six images are connected at the edges of the cube. Since the 360-degree image sequences may require large storage space or high bandwidth for transmission, video encoding by a video encoder 130 may be applied to the video sequence consisting of a sequence of six-face images. At a receiver side or display side, the compressed video data is decoded using a video decoder 140 to recover the sequence of six-face images for display on a display device 150 (e.g., a VR (virtual reality) display).
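For illustration only, the projection conversion from the spherical format to six cube faces may be sketched as follows. This is a minimal sketch and not the implementation of projection conversion unit 120. It assumes that the spherical picture is stored as an equirectangular array (longitude across columns, latitude down rows), that sampling is nearest-neighbour, and that the faces carry the hypothetical labels "front", "back", "left", "right", "top" and "bottom" rather than the numbers 1 to 6 of the figures. The function names and the axis convention are likewise assumptions.

```python
import math

# Hypothetical face labels; they are not claimed to match faces 1-6 of the figures.
FACES = ("front", "back", "left", "right", "top", "bottom")


def face_direction(face, u, v):
    """Return the 3D viewing direction for point (u, v) in [-1, 1]^2 on a cube face.

    Assumed axes: x points right, y points up, z points toward the front face.
    u grows to the right of the face image and v grows toward its top.
    """
    if face == "front":
        return (u, v, 1.0)
    if face == "back":
        return (-u, v, -1.0)
    if face == "right":
        return (1.0, v, -u)
    if face == "left":
        return (-1.0, v, u)
    if face == "top":
        return (u, 1.0, -v)
    if face == "bottom":
        return (u, -1.0, v)
    raise ValueError(face)


def direction_to_equirect(x, y, z, width, height):
    """Map a 3D direction to fractional (col, row) in a width x height panorama."""
    lon = math.atan2(x, z)                              # [-pi, pi], 0 at the front
    lat = math.asin(y / math.sqrt(x * x + y * y + z * z))  # [-pi/2, pi/2]
    col = (lon / (2.0 * math.pi) + 0.5) * width
    row = (0.5 - lat / math.pi) * height                # row 0 is the sky side
    return col, row


def project_face(pano, face, size):
    """Build a size x size face image by nearest-neighbour sampling of pano."""
    height, width = len(pano), len(pano[0])
    out = []
    for j in range(size):
        v = 1.0 - 2.0 * (j + 0.5) / size                # row 0 is the face's top edge
        line = []
        for i in range(size):
            u = 2.0 * (i + 0.5) / size - 1.0
            col, row = direction_to_equirect(*face_direction(face, u, v), width, height)
            line.append(pano[min(int(row), height - 1)][int(col) % width])
        out.append(line)
    return out


# Toy panorama: upper half is "sky" (1), lower half is "ground" (0).
pano = [[1] * 8, [1] * 8, [0] * 8, [0] * 8]
faces = {name: project_face(pano, name, 2) for name in FACES}
```

In this toy case the "top" face samples only sky pixels and the "bottom" face only ground pixels, while each side face contains both. A practical converter would typically use interpolation instead of nearest-neighbour sampling.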
FIG. 2A illustrates an example of projection conversion, where the spherical picture is projected onto the six faces of a cube. The six faces of the cube are numbered from 1 to 6. The three visible sides 210 (i.e., 1, 4 and 5) and three invisible sides 220 are shown in FIG. 2A. The orientation of each side is indicated by its corresponding side number. The side numbers in dashed circles indicate see-through images, since these images are on the back sides of the cube. The six cubic faces are continuous from one face to a connected face at the connection edge. For example, face 1 is connected to face 5 at edge 214. Therefore, the top edge of face 1 extends continuously into the bottom edge of face 5, as shown in FIG. 2B. In another example, face 4 is connected to the right side of face 5 at edge 212. Therefore, the top edge of face 4 extends continuously into the right side of face 5, as shown in FIG. 2C. The thin gaps between face 1 and face 5 and between face 4 and face 5 are intended to illustrate the image boundary between two faces.
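The continuity of connected faces at a shared edge can be checked numerically. The following minimal sketch assumes a cube centred at the origin with x pointing right, y pointing up and z pointing toward the viewer-facing side. The labels "front", "right" and "top" are hypothetical stand-ins for three mutually connected faces and are not claimed to correspond to the face numbers in FIG. 2A. The sketch verifies that the top edge of one side face and the adjoining edge of the top face consist of the same 3D points, which is why the image content is continuous across the edge.

```python
def face_point(face, u, v):
    """Return the 3D point for (u, v) in [-1, 1]^2 on an assumed cube face.

    u grows to the right of the face image and v grows toward its top.
    """
    if face == "front":
        return (u, v, 1.0)
    if face == "right":
        return (1.0, v, -u)
    if face == "top":
        return (u, 1.0, -v)
    raise ValueError(face)


steps = [-1.0, -0.5, 0.0, 0.5, 1.0]

# Top edge of "front" (v = +1) versus bottom edge of "top" (v = -1).
front_top = [face_point("front", t, 1.0) for t in steps]
top_bottom = [face_point("top", t, -1.0) for t in steps]

# Top edge of "right" (v = +1) versus right edge of "top" (u = +1).
right_top = [face_point("right", t, 1.0) for t in steps]
top_right = [face_point("top", 1.0, t) for t in steps]

shared_front_top = front_top == top_bottom
shared_right_top = right_top == top_right
```

Both comparisons evaluate to true under these assumptions. Note that the second pair matches a horizontal edge of one face with a vertical edge of the other, so one of the two face images has to be rotated before the pair can be placed side by side in a flat picture.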
In order to allow an image processing system or a video processing system to exploit spatial and/or temporal correlation or redundancy between the six cubic faces, it is desirable to develop a method to assemble these six cubic faces into an assembled rectangular image for efficient processing or compression.