This section is intended to provide a background or context to the invention that is recited in the claims. The description herein may include concepts that could be pursued, but are not necessarily ones that have been previously conceived or pursued. Therefore, unless otherwise indicated herein, what is described in this section is not prior art to the description and claims in this application and is not admitted to be prior art by inclusion in this section.
With recent advances in video capture and display technologies, three-dimensional (3D) video communication and entertainment services are quickly becoming a reality and will enter the consumer domain in the near future. 3D video communication and entertainment services will likely revolutionize the way users enjoy and interact with content, and will open the door for many new and exciting services. Such services may include stereoscopic video broadcasting, 3D-TV, and a wide variety of others.
3D-TV applications require special 3D displays whose characteristics can vary. For example, if a 3D display is auto-stereoscopic, then users do not need to wear any special lenses or glasses to see a different view with each eye. Moreover, a 3D display could support several views being displayed simultaneously, so that the user sees different views depending on the user's location with respect to the screen. This feature is referred to as head motion-parallax and is regarded as one of the most realistic and immersive experiences available to users.
In order to enable these types of applications, new paradigms are required for the representation, transmission, and rendering of 3D-video content, which can be very different from typical 2D-video use-cases. Recently, the Joint Video Team (JVT) of the ITU-T/VCEG and ISO/MPEG standardization groups undertook an effort toward a Multi-View Video Coding (MVC) standard. MVC is an encoding framework for multi-view sequences, which are produced either by a camera system comprising multiple cameras capturing the same event from different locations or by a single camera capable of capturing a 3D scene. In applications exploiting MVC, the user can enjoy realistic and immersive experiences, as the multi-view video represents a three-dimensional scene in real space. The principal difference between an MVC coder and a traditional 2D-video coder is that MVC exploits inter-view redundancy in addition to temporal and spatial redundancies. Therefore, in multi-view video coding structures, inter-view dependencies exist between pictures.
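The kinds of references an MVC-style coder may draw on can be illustrated with a small sketch. This is a simplified, hypothetical model (the function name and chain-of-views structure are illustrative, not taken from the MVC specification): each picture may be predicted temporally from the previous picture in its own view and, for non-base views, from the picture of the view below it at the same time instant.

```python
def references_for(view, t):
    """Return the (view, time) pairs from which picture (view, t) may be
    predicted, in a simplified chain-structured multi-view sequence.

    view -- view index, with view 0 as the base view
    t    -- temporal instant, starting at 0
    """
    refs = []
    if t > 0:
        refs.append((view, t - 1))   # temporal prediction within the same view
    if view > 0:
        refs.append((view - 1, t))   # inter-view prediction from the view below
    return refs
```

For example, under this model the first picture of the base view, `references_for(0, 0)`, has no references, while a picture in a higher view draws on both a temporal and an inter-view reference.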
Unfortunately, inter-view dependencies between pictures pose serious complexity and parallelism problems for a video system. These issues arise because two pictures in different views must be decoded sequentially. This is especially problematic for 3D-TV use-cases, where many views need to be displayed simultaneously. To understand these issues, it is helpful to consider a 3D-TV system that simultaneously displays two views, where the views are coded using the coding structure illustrated in FIG. 1. In order to decode a picture in “View-1” at any temporal instant, the picture in “View-0” at the same temporal instant must be decoded first. With this structure, the only way to display two views at the same time is therefore to have the decoder run twice as fast as a regular video decoder. Even if two independent decoders running on different platforms were available, both decoders would need to run twice as fast as a regular 2D video decoder. The situation gets worse as the number of views supported by the 3D display increases. Currently, there are commercial displays which can display 100 views simultaneously, and if all of the views depend on each other, the decoder must run 100 times faster.
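The scaling problem above can be made concrete with a short sketch (hypothetical helper names; it assumes the simplified case in which every view depends on the view below it, as in the chain structure of FIG. 1): pictures at one time instant form a strictly sequential decode chain, so real-time display of all views requires a proportionally faster decoder.

```python
def decode_schedule_at_instant(num_views):
    """Order in which pictures at one time instant must be decoded when
    each view is predicted from the view below it: the base view first,
    then each dependent view in turn -- a strictly sequential chain."""
    return [f"View-{v}" for v in range(num_views)]

def required_speed_factor(num_views):
    """To display all views of such a dependency chain in real time, a
    single decoder must run num_views times faster than a decoder that
    serves a single view."""
    return num_views
```

Under this model, a two-view display needs a 2x-faster decoder and a 100-view display with fully chained views needs a 100x-faster decoder, matching the figures discussed above.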
In light of the above, it is clear that parallel decoding of separate views is a crucial requirement for 3D-TV systems. One solution for increasing parallelism is to code each view independently. However, it has been found that this kind of simulcast approach results in a significant, undesirable penalty in coding efficiency.
It would therefore be desirable to develop a system and method for improving the parallel decoding of separate views in 3D-TV systems and other systems, without suffering from the coding efficiency penalty described above.