Embodiments of the present invention relate to a method to capture, record, store, distribute, and share panoramic images or videos. Instead of using a single processed image (e.g., an equirectangular image) representing the full captured environment, this method proposes to use the original native images captured by the device to represent the panoramic image or video. This method maintains optimal image quality, allows a higher image compression ratio, requires less processing and lower power consumption on capture devices, and enables real-time streaming and real-time immersive rendering on display devices and/or cloud servers.
Panoramic images or videos are commonly used to represent a wide field of view scene in a digital format. They are created by processing together multiple images coming from multiple camera modules, or from a single camera module, pointing in different directions. This processing combines stitching and image projection algorithms. Different projections, including cylindrical, equidistant cylindrical, equirectangular, cube map, pyramid map, etc., are commonly used to represent wide field of view image contents, up to a 360°×360° field of view scene, as a panoramic image, as claimed in European Patent No. EP 2031561 A1 and European Patent No. EP 1909226 B1. Currently, panoramic images or videos are stored, distributed, and shared in these specific projected formats as preprocessed images. Those projections transform the original image contents, adding extra processing to the pipeline compared with the usual image or video processing pipeline embedded in narrow-angle capture devices. Many panoramic image capture devices process the native image to create these projected panoramic images or videos, including Ricoh Theta, 360FLY, Samsung Gear 360, Allie cam, Nokia OZO, Giroptic and Kodak SP360. Some devices, like Ricoh Theta, also save the original captured images from multiple cameras on the device when they record video. However, these original images or videos are always stored locally. When users decide to distribute and share the panoramic images or videos, these contents are processed, stitched, and projected to the said projected format. U.S. Pat. Nos. 6,002,430 A and 6,795,113 B1 proposed to convert two hemispherical images into a ‘seamless spherical image’. Although these patents did not specify the method to create the 360°×360° image, they defined the image as a special format, and some special processing has to be applied to create the image from one or multiple images.
Existing panoramic content or virtual reality (VR) content sharing websites, as well as VR and immersive viewing applications, accept only specific projection formats as input. For example, YouTube, Facebook, and Deep Inc Liquid Cinema support only the equirectangular projection. Some VR or panoramic player apps, such as QuickTime VR, PT Viewer, KR Pano, and ImmerVision Pure Player, support more projection formats, such as dome, equirectangular, cylinder, cube map, etc. Those formats are not the original image format captured by the devices (the image projected by the lens onto the sensor). These projection conversions add unnecessary image processing on the capture device or application and degrade the quality of the original image. These projections are mainly used because they are long-established formats, developed since the time of Marinus of Tyre and Ptolemy.
One common 360° panoramic capture device uses back-to-back cameras, as proposed by U.S. Pat. No. 6,002,430 A and miniaturized in U.S. Pat. No. 8,730,299 B. The device embeds two wide-angle lenses, each with a field of view (FoV) larger than 180°, capturing front and back images, and each image contains about a half-sphere FoV (~180°×360°). These two images are captured from wide-angle lenses, fisheye lenses, or panomorph lenses. The resulting panoramic image is created by stitching and projecting these two images together. Ricoh Theta, Samsung Gear 360 and Allie cam produce this kind of output. As mentioned before, a projected panoramic image, such as an equirectangular projected image, is created from one or more images and then stored, distributed, and shared in this projected format. To create the projected panoramic image, the processing applied rearranges the pixels of the original images, modifies the distortions, and projects each original image pixel onto a resulting image pixel. This process includes pixel manipulations and interpolations that degrade the sharpness and original quality of the image and modify the pixel density in certain areas. In some areas of the field of view, the pixels are stretched to cover more pixels in the resulting image than in the original image. This stretching does not create more information, and it reduces the image quality through pixel interpolation. In other areas, the pixels are compressed compared to the original image, reducing the pixel density (pixels per degree of FoV) and thereby reducing the image resolution, sharpness, and quality. For example, for an equirectangular projection, the nadir and zenith areas of the full spherical field of view are highly stretched, and there are extensive image content redundancies in these areas.
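The stretching near the nadir and zenith of an equirectangular image can be quantified with a short calculation. The following is a minimal illustrative sketch, not taken from any of the cited patents or products. It assumes an ideal equirectangular image in which every row has the same number of pixels and rows are evenly spaced in latitude; the function names are hypothetical. Because a circle of latitude shrinks in proportion to the cosine of the latitude while the row width stays constant, each row is oversampled horizontally by a factor of 1/cos(latitude) relative to the equator.

```python
import math


def horizontal_stretch(latitude_deg: float) -> float:
    """Horizontal oversampling of an equirectangular row relative to the equator.

    Assumes an ideal equirectangular image: every row has the same pixel
    count, while the circle of latitude it represents shrinks as cos(latitude).
    """
    c = math.cos(math.radians(latitude_deg))
    if c <= 1e-12:
        # Exactly at the zenith or nadir a single scene point fills the whole row.
        return math.inf
    return 1.0 / c


def row_latitude(row: int, height: int) -> float:
    """Latitude in degrees at the centre of a row (row 0 is the top row)."""
    return 90.0 - (row + 0.5) * 180.0 / height


def mean_oversampling(height: int) -> float:
    """Ratio of stored pixels to the pixels needed at uniform angular density.

    A row at a given latitude only needs cos(latitude) times the equator
    row's pixels, so the ratio tends towards pi / 2 for a tall image.
    """
    needed = sum(
        math.cos(math.radians(row_latitude(r, height))) for r in range(height)
    )
    return height / needed


if __name__ == "__main__":
    for lat in (0, 30, 60, 80, 89):
        print(f"latitude {lat:>2} deg: stretch x{horizontal_stretch(lat):.2f}")
    print(f"mean oversampling: x{mean_oversampling(1000):.3f}")
```

Under these assumptions, a row at 60° latitude is stored with about twice as many pixels as its angular extent requires, and the image as a whole holds roughly π/2 (about 1.57) times the pixels that a uniform angular sampling of the sphere would need. This is consistent with the redundancy described above, although a real pipeline also adds interpolation losses that this sketch does not model.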
Although an image compression algorithm can reduce some redundancies, the image file size is still increased by this projection, which is undesirable in resource-sensitive cases such as network sharing and live streaming. In addition to the image quality deterioration, this projection processing consumes a significant amount of processing power on the capture device or on the device where the control application is running. On most panoramic capture devices, this projection cannot be done in real time due to CPU, GPU, and battery limitations. If this projection is done by post-processing, it prevents real-time display, live streaming, or instantaneous sharing. Using a remote server or cloud computing to perform the projection processing could be an option, but it does not eliminate the latency and the image deterioration associated with this process, and there are significant costs related to this type of cloud computing. U.S. Patent Application Publication No. 2015/0281507 A1 proposes to automatically define the system behavior or user experience by recording, sharing, and processing information associated with an image. The patent application shows how to use markers on an image to record different information associated with the image or with multiple images coming from a capture device (imager). This method provides a convenient way to synchronize, store, distribute, and share metadata with the original image content. Although this patent application mentions that the multiple marked images can be stitched together later, it does not specifically disclose an efficient image assembly method.
To overcome the issues described above, embodiments of the present invention propose a method to capture, record, stream, share, and display panoramic images or videos while reducing as much as possible the image processing related to projection and stitching, in order to maintain the image quality and to allow the full process to be executed in real time on low-power devices.