1. Field of the Invention
The invention herein relates to data compression of light field imaging information used by light field electronic displays for the display of ultra-high resolution 3D images utilizing techniques such as holography, integral imaging, stereoscopy, multi-view imaging, video and the like. The invention has unique application to light field displays having common, industry-standard interfaces, such as HDMI, DisplayPort, MIPI, etc., for which the data transfer bandwidth required to deliver the imaging information into the light field displays is known to be challenging.
2. Prior Art
In prior art light fields, the light field is composed of holographic elements (hogels), each of which emits a set of directional light samples (anglets); neighboring hogels exhibit similar anglet data. Each hogel appears to the viewer as a single point source, which could be implemented as a single lens of a micro-lens array above the light field display pixels (see U.S. Pat. No. 8,928,969). The reproduced 3D image, also known as a light field frame, consists of the complete set of hogels generated by the light field display. A light field video consists of a time-sequence of light field frames. Typically, an application processor pre-processes the input light field image data, such as real images acquired by cameras and/or rendered computer-generated images, and transfers the data to light field displays. In order to provide the necessary bandwidth between the application processor and light field displays having common interfaces currently available, such as HDMI, DisplayPort, MIPI, etc., the input signal must be divided among several interfaces, which is cumbersome, if not infeasible, due to data size limitations.
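To make the bandwidth problem concrete, a light field frame can be viewed as a 4D (plus color) array indexed by hogel position and anglet direction. The following sketch uses hypothetical dimensions (the hogel grid, angular resolution, and variable names are illustrative, not taken from any cited reference):

```python
import numpy as np

# Hypothetical light field frame: a grid of Hx x Hy hogels,
# each containing Ax x Ay directional samples (anglets) of RGB data.
Hx, Hy = 100, 100   # hogel grid (e.g., one hogel per micro-lens)
Ax, Ay = 32, 32     # angular samples (anglets) per hogel

frame = np.zeros((Hx, Hy, Ax, Ay, 3), dtype=np.uint8)

# Raw size of a single frame at these (modest) dimensions:
raw_bytes = frame.nbytes  # 100 * 100 * 32 * 32 * 3 = 30,720,000 bytes
```

Even at these modest dimensions, a 60 Hz video of such frames would require on the order of 1.8 GB/s uncompressed, well beyond a single common display interface, which is why the input signal must otherwise be divided among several interfaces.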
Data compression prior to transmission is employed to cope with the extreme volume of light field image data used by light field displays. Recently published methods for light field compression, such as the ones in Magnor, M. and Girod, B., “Data Compression for Light-Field Rendering,” IEEE Trans. on Circuits and Systems for Video Technology, 10(3), 338-343 (2000) and Conti, C.; Lino, J.; Nunes, P.; Soares, L. D.; Lobato Correia, P., “Spatial prediction based on self-similarity compensation for 3D holoscopic image and video coding,” in Image Processing (ICIP), 2011 18th IEEE International Conference on, pp. 961-964, Sep. 2011, follow the usual approach of prediction, transformation and residue quantization, similar to the methods adopted by prior art 3D video coding standards (Ohm, J.-R., “Overview of 3D video coding standardization,” in International Conference on 3D Systems and Applications, Osaka, 2013). The drawback of these compression approaches is that they process the incoming data in frame buffers, which become extremely large when compressing high-resolution (and thus high-volume) data, and they necessarily introduce undesirable video latency for real-time display applications.
Another prior art solution for light field data compression is to “sub-sample” the views during image generation and reconstruct the suppressed views directly at the light field display. For example, in Yan, P.; Xianyuan, Y., “Integral image compression based on optical characteristic,” Computer Vision, IET, vol. 5, no. 3, pp. 164-168, May 2011 and Yan Piao; Xiaoyuan Yan, “Sub-sampling elemental images for integral imaging compression,” Audio Language and Image Processing (ICALIP), 2010 International Conference on, pp. 1164-1168, Nov. 2010, the light field is sub-sampled based on the optical characteristics of the display system. A formal approach to light field sampling is described in Jin-Xiang Chai, Xin Tong, Shing-Chow Chan, and Heung-Yeung Shum, “Plenoptic sampling,” in Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '00), 2000 and Gilliam, C.; Dragotti, P. L.; Brookes, M., “Adaptive plenoptic sampling,” Image Processing (ICIP), 2011 18th IEEE International Conference on, pp. 2581-2584, Sep. 2011. Although these prior art methods provide a significant reduction in bit rates, the compression rate is highly content-dependent, which is undesirable. Moreover, these methods usually rely on complicated view synthesis algorithms (for example, see Graziosi et al., “Methods For Full Parallax Compressed Light Field 3D Imaging Systems,” United States Patent Application Publication No. 2015/0201176 A1, published Jul. 16, 2015; “View Synthesis Reference Software (VSRS) 3.5,” wg11.sc29.org, March 2010; C. Fehn, “3D-TV Using Depth-Image-Based Rendering (DIBR),” in Proceedings of Picture Coding Symposium, San Francisco, Calif., USA, December 2004; Mori Y, Fukushima N, Yendo T, Fujii T, Tanimoto M (2009) View generation with 3D warping using depth information for FTV. Signal Processing: Image Communication 24(1-2):65-72; and Tian D, Lai P, Lopez P, Gomila C (2009) View synthesis techniques for 3D video. In: Proceedings Applications of Digital Image Processing XXXII, Vol. 7443, pp. 74430T-1-11) that require very large frame buffers, floating-point logic units, and several memory transfers. Thus, sub-sampling solutions require considerable display device computational resources (Bhaskaran, V., “65.1: Invited Paper: Image/Video Compression—A Display Centric Viewpoint,” SID Symposium Digest of Technical Papers, vol. 39, no. 1, 2008).
Some compression methods have been developed specifically for stereoscopic video displays. For example, frame-compatible encoding methods for left and right views are described in Vetro, A.; Wiegand, T.; Sullivan, G. J., “Overview of the Stereo and Multiview Video Coding Extensions of the H.264/MPEG-4 AVC Standard,” in Proceedings of the IEEE, vol. 99, no. 4, pp. 626-642, April 2011. These methods encode 3D stereoscopic video by down-sampling the video via bundling two contiguous frames into one new frame, either temporally or spatially (horizontally or vertically). Examples of frame-packing include side-by-side, where two frames are horizontally down-sampled and arranged next to each other, and top-bottom frame packing, where the two frames are vertically down-sampled and arranged on top of each other. By bundling two frames into one, the rate is reduced by half. Another advantage of this approach is that the decoding method is a very simple view reconstruction that can be implemented directly at the stereoscopic display. However, these encoding methods always perform the same data sub-sampling regardless of the image content, which results in less than optimal image quality.
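The frame-packing operation described above can be sketched as follows. This is an illustrative example of side-by-side packing only (the naive column decimation stands in for the filtered down-sampling a real encoder would use; the frame dimensions are hypothetical):

```python
import numpy as np

def pack_side_by_side(left, right):
    """Horizontally down-sample two stereo views by a factor of 2 and
    pack them into one frame of the original resolution, halving the
    data rate at the cost of horizontal resolution."""
    half_l = left[:, ::2]    # keep every other column (naive decimation;
    half_r = right[:, ::2]   # real encoders low-pass filter first)
    return np.concatenate([half_l, half_r], axis=1)

# Two hypothetical 1080p stereo views:
left = np.random.randint(0, 256, (1080, 1920, 3), dtype=np.uint8)
right = np.random.randint(0, 256, (1080, 1920, 3), dtype=np.uint8)
packed = pack_side_by_side(left, right)
```

The packed frame has the same size as a single source view, so the rate is halved; as noted above, the decimation is applied regardless of content, which is what limits image quality.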
In Graziosi, D. B., Alpaslan, Z. Y. and El-Ghoroury, H. S., “Compression for full-parallax light field displays”, Proceedings of SPIE-IS&T Electronic Imaging, 9011, (2014); Graziosi, D. B., Alpaslan, Z. Y. and El-Ghoroury, H. S., “Depth assisted compression of full parallax light fields”, Proceedings of SPIE-IS&T Electronic Imaging, 9011, (2015); and Graziosi et al., “Methods For Full Parallax Compressed Light Field 3D Imaging Systems”, United States Patent Application Publication No. 2015/0201176 A1, a more sophisticated method for light field compression is described. The prior art compression method therein analyzes the composition of the entire light field scene and selects a subset of hogels, from among all the hogels associated with the light field, for transmission to the light field display; the suppressed hogels are then generated at the display from the received hogels. To achieve even higher compression ratios, these prior art compression methods adopt transform and entropy encoding. The methods of these three references would benefit from an enhanced compression method that reduces the required decoding processing by performing a piece-wise analysis of the scene and omitting the transform and entropy encoding step. The reduction in decoding time and processing would beneficially lead to a smaller memory footprint and reduced latency, which is ideal for display interfaces using commonly available memory and processors.
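The transmit-a-subset/reconstruct-the-rest idea can be illustrated with a deliberately naive sketch. This is not the scene-analysis-driven selection or depth-assisted synthesis of the cited references; it simply keeps a regular grid of hogels and fills each suppressed hogel from its nearest transmitted neighbor (all names and shapes are hypothetical):

```python
import numpy as np

def subsample_hogels(frame, step=2):
    """Keep every `step`-th hogel along each hogel axis for transmission."""
    return frame[::step, ::step]

def reconstruct_hogels(subset, step, shape):
    """Naive reconstruction: each suppressed hogel copies its nearest
    transmitted neighbor. Real methods instead synthesize suppressed
    hogels using depth information and view synthesis."""
    Hx, Hy = shape
    out = np.empty((Hx, Hy) + subset.shape[2:], dtype=subset.dtype)
    for x in range(Hx):
        for y in range(Hy):
            out[x, y] = subset[min(x // step, subset.shape[0] - 1),
                               min(y // step, subset.shape[1] - 1)]
    return out
```

Even this crude scheme cuts the transmitted hogel count by `step**2`; the quality gap between such nearest-neighbor copying and proper view synthesis is exactly the decoder-side computational cost discussed above.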
As is known in the prior art, there are extremely high-resolution displays that require the use of multiple interfaces to receive source image data. In Alpaslan, Z. Y., El-Ghoroury, H. S., “Small form factor full parallax tiled light field display,” in SPIE Conference on Stereoscopic Displays and Applications XXVI, 2015, a high-resolution light field display formed by tiling multiple small pixel-pitch devices (U.S. Pat. Nos. 7,623,560, 7,767,479, 7,829,902, 8,049,231, 8,243,770 and 8,567,960) is described. The light field display described therein incorporates multiple input interfaces to compensate for the bandwidth limitation of the individual display interfaces commonly used. The lack of high-bandwidth interfaces motivated subsequent development of compression algorithms. In the prior art of FIG. 1, application processor 101 processes and formats light field (hogel) data 111 for transfer to light field display 102. Image generation 103 digitizes the light field (hogel) data 111 acquired by one or more light field cameras and also renders any computer-generated scenes required. Encoder 104 compresses the input image to a size that fits the bandwidth limitations of transmission (TX) interface 105 for wireline or wireless transmission. Link 106 between application processor 101 and light field display 102 is labeled with the number B, identifying the required link bandwidth. At the light field display 102, the compressed data received (RX) by the RX interface 107 is transferred to the decoder 108 for reconstruction of the compressed anglets. The reconstructed light field image is then modulated by the display photonics 109.
The Video Electronics Standards Association (VESA) Display Stream Compression (DSC) algorithm is a proposed standard for compression of raw video data to be sent to high-resolution displays. The VESA DSC encoder is visually faithful; i.e., the artifacts introduced by compression are hardly perceived by the viewer. The VESA DSC algorithm utilizes sophisticated prediction techniques mixed with very simple entropy encoding methods and was designed with display interfaces in mind; hence, it performs all of its processing on a line-by-line basis and has a very precise rate control procedure to maintain the bit rate below the limited bandwidth of common display interfaces. However, the VESA DSC algorithm does not utilize the block coding structure approach used in common video compression methods and does not take advantage of the highly correlated image structure present in light fields, both of which can provide significant compression gains.
In applications where the intensities of light rays do not change perceptibly as the rays propagate, the light field can be parameterized using two parallel planes, or equivalently four variables (Levoy, M. and Hanrahan, P., “Light Field Rendering,” Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 96). This parameterization was used by Levoy and Hanrahan to capture a light field and reconstruct novel viewpoints of the light field by utilizing light ray interpolation. In order to obtain reconstructed views with high quality and realistic results, oversampling of the variables was required. This imposes a high demand on the capturing and transmission procedures, which then must generate and transmit a huge amount of data. The use of compression methods such as the VESA DSC can reduce the data requirements for transmission interfaces. Nevertheless, this procedure is still based on prediction and entropy coding, which increases the computational resources required at the display driver. Furthermore, the procedure does not take advantage of the structure of light field images with the high degree of correlation between hogels.
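In the two-plane parameterization, a ray is indexed by its intersections (u, v) with one plane and (s, t) with the other, and novel viewpoints are reconstructed by interpolating between sampled rays. The sketch below shows bilinear interpolation over the (u, v) plane at fixed integer (s, t); the array dimensions are hypothetical and the interpolation is simplified from the quadrilinear blending used in practice:

```python
import numpy as np

# Hypothetical discretized two-plane light field L[u, v, s, t]:
# (u, v) indexes one plane (e.g., the camera plane), (s, t) the other.
rng = np.random.default_rng(0)
L = rng.random((8, 8, 16, 16))

def interpolate_ray(L, u, v, s, t):
    """Reconstruct a ray at fractional (u, v) by bilinearly blending the
    four nearest sampled rays. Under-sampling in (u, v) is what forces
    the oversampling (and hence data volume) discussed above."""
    u0, v0 = int(u), int(v)
    fu, fv = u - u0, v - v0
    c00 = L[u0, v0, s, t]
    c10 = L[u0 + 1, v0, s, t]
    c01 = L[u0, v0 + 1, s, t]
    c11 = L[u0 + 1, v0 + 1, s, t]
    return ((1 - fu) * (1 - fv) * c00 + fu * (1 - fv) * c10
            + (1 - fu) * fv * c01 + fu * fv * c11)
```

At integer (u, v) the interpolation reduces to a direct lookup; between samples it blends neighboring rays, which is accurate only when the light field is sampled densely enough.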
The aforementioned prior art fails to provide the high-quality, low-computational-load, high-resolution light field transmission methods required for practical implementation of a full parallax light field display. What is needed is a compression method that takes advantage of the correlation between hogels and that avoids the computational loading and latency associated with prior art compression methods.