Field of the Disclosure
The present disclosure relates generally to the storing and/or presenting of image and/or video content, and more particularly, in one exemplary aspect, to encoding, decoding, and/or transmission of panoramic video content.
Description of Related Art
Image and/or video content may be characterized by an angle of view or field of view (FOV) (e.g., a diagonal view angle of about 63° for a 35-mm focal length lens on an FX format camera). Image and/or video content may be presented on a display that may be characterized by a smaller view angle compared to the view angle of the captured content. Such captured content may be referred to as panoramic content, wherein captured image dimensions (in pixels) may be greater than the dimensions of the view window during content presentation. In some implementations, panoramic content characterized by a full circle FOV may be referred to as 360° and/or spherical content.
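The view angle quoted above may be checked with the standard angle-of-view relation for a rectilinear lens focused at infinity, FOV = 2·atan(d / 2f), where d is the sensor diagonal and f is the focal length. The following is a minimal sketch only; the function name is hypothetical, and the 36 mm × 24 mm sensor size is an assumption used here as a nominal value for an FX (full-frame) sensor.

```python
import math


def diagonal_fov_degrees(focal_length_mm, sensor_width_mm=36.0, sensor_height_mm=24.0):
    """Diagonal angle of view, in degrees, of a rectilinear lens focused at infinity.

    The default sensor size (36 mm x 24 mm) is an assumed nominal
    full-frame size, not a value taken from the disclosure.
    """
    # Sensor diagonal from its width and height.
    diagonal_mm = math.hypot(sensor_width_mm, sensor_height_mm)
    # Angle of view: 2 * atan(d / (2 * f)), converted to degrees.
    return math.degrees(2.0 * math.atan(diagonal_mm / (2.0 * focal_length_mm)))


fov = diagonal_fov_degrees(35.0)
print(f"Diagonal FOV at 35 mm: {fov:.1f} degrees")
```

Under these assumptions, the result for a 35-mm focal length is close to the figure of about 63° given above.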
360-degree and virtual reality (VR) video/image content usually involves very high resolution capture of images over a wide field of view. To provide a good viewing experience, image resolution may be high (up to 8K resolution per eye). Current state-of-the-art video compression codecs such as H.264, HEVC, and VP9 (by themselves) may not be well suited for encoding/decoding VR and/or panoramic content. Use of traditional codecs may prove impractical for delivering VR and/or panoramic content over the Internet and/or mobile networks.
Current 360-degree and VR video delivery and decoding systems may employ a number of different techniques. For example, a decoding device may receive and decode the entire highest resolution native 360-degree image and keep it in memory. As the user moves their device, the decoder/renderer moves a cropped viewport to reflect where the viewer wants to look. This method has limitations, such as requiring the entire 360-degree image to be sent at the highest resolution (from the server), which results in high bandwidth requirements. As a result, playback over the Internet may result in buffering issues. Additionally, the decoding device must have powerful processing capabilities to decode the highest resolution 360-degree image. Moreover, the processing burden can result in significant battery usage. As a result, only a limited amount of content can be consumed before the device has to be charged.
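The cropped-viewport approach described above may be illustrated with a simplified sketch. It assumes the decoded 360-degree image is stored in an equirectangular layout (360° across the width, 180° across the height) and approximates the viewport as a rectangular pixel window rather than performing a true perspective reprojection. The function name, frame size, and view angles are hypothetical values chosen for illustration.

```python
def viewport_window(frame_w, frame_h, yaw_deg, pitch_deg, hfov_deg, vfov_deg):
    """Return (columns, rows) of an equirectangular frame covered by a viewport.

    Simplifying assumption: the viewport is a rectangular pixel window,
    not a reprojected perspective view.
    """
    # Window size in pixels, proportional to the angular extent of the viewport.
    win_w = round(frame_w * hfov_deg / 360.0)
    win_h = round(frame_h * vfov_deg / 180.0)

    # Viewport center in pixels; yaw 0 maps to column 0, pitch +90 to row 0.
    center_x = (yaw_deg % 360.0) / 360.0 * frame_w
    center_y = (90.0 - pitch_deg) / 180.0 * frame_h

    left = int(round(center_x - win_w / 2))
    top = int(round(center_y - win_h / 2))
    # Rows are clamped to the frame; columns wrap around the 360-degree seam.
    top = max(0, min(frame_h - win_h, top))

    columns = [(left + i) % frame_w for i in range(win_w)]
    rows = list(range(top, top + win_h))
    return columns, rows


cols, rows = viewport_window(3840, 1920, yaw_deg=0.0, pitch_deg=0.0,
                             hfov_deg=90.0, vfov_deg=60.0)
fraction = (len(cols) * len(rows)) / (3840 * 1920)
print(f"Viewport covers {fraction:.3f} of the decoded pixels")
```

With these example numbers the window is 960 × 640 pixels, or one twelfth of the 3840 × 1920 frame, even though the entire frame had to be transmitted and decoded.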
In another example, the server sends (and the decoder decodes) only partial high resolution video. The area where the user is looking is rendered in high resolution and the rest of the image is rendered in low resolution. When the viewer moves his/her viewport, the decoder asks the server to transmit video data corresponding to the updated viewport. In this case, either the server has to transmit an intra-frame in order for the decoder to decode the current frame, or the decoder has to receive and decode all reference frames following the last intra-frame. Each approach has its own limitations. Transmitting an intra-frame can lead to network congestion because intra-frames are usually much larger than inter-frames. Having the decoder receive and decode all prior reference frames in a closed group of pictures (GOP) will increase latency when updating the new image to a high resolution. This may also cause high bandwidth utilization.
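The trade-off between the two options above may be made concrete with a rough sketch. It assumes a simple closed GOP in which an intra-frame starts each GOP and every subsequent inter-frame references only the frame immediately before it. The function name and the frame sizes are hypothetical values chosen for illustration, not measurements.

```python
def viewport_switch_cost(frame_index, gop_length, intra_bits, inter_bits):
    """Compare the two ways of reaching a decodable frame after a viewport change.

    Assumes an I-P-P-P closed GOP: frame 0 of each GOP is an intra-frame and
    each inter-frame depends only on its immediate predecessor.
    """
    # Number of inter-frames between the last intra-frame and the target frame.
    offset = frame_index % gop_length

    # Option A: the server encodes and sends a fresh intra-frame for the new area.
    fresh_intra = {"frames_to_decode": 1, "bits": intra_bits}

    # Option B: the decoder fetches the last intra-frame plus every inter-frame
    # leading up to the target frame, and decodes all of them.
    catch_up = {"frames_to_decode": offset + 1,
                "bits": intra_bits + offset * inter_bits}
    return fresh_intra, catch_up


# Hypothetical numbers: 30-frame GOP, intra-frame 8x the size of an inter-frame.
option_a, option_b = viewport_switch_cost(frame_index=44, gop_length=30,
                                          intra_bits=800_000, inter_bits=100_000)
print("fresh intra-frame:", option_a)
print("catch-up decode:  ", option_b)
```

In this example the viewport change arrives 14 frames into the GOP. The first option sends a single large frame as a burst, while the second requires 15 frames to be received and decoded before the high resolution image can be shown.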
Within this context, possible areas for improvement may leverage the limited viewing aspect; e.g., a viewer does not see the entire 360-degree world simultaneously. New algorithms are needed that minimize latency when the user moves his/her viewport, while still achieving high compression and low battery consumption. Furthermore, ideal solutions would modify the encoding process so that existing hardware decoders can be reused (without requiring special new hardware at the consumption side).
Panoramic (e.g., 360°) content may be viewed on a resource-restricted device (e.g., a smartphone, tablet, and/or other device that may be characterized by a given amount of available energy, data transmission bandwidth, and/or computational capacity). Resources available to such a resource-restricted device may prove inadequate for receiving and/or decoding full resolution and/or full frame image content.