The human visual system is capable of sampling information from a wide field of view. Immersive display systems, which allow a viewer to sample information from a wide field of view, must be perceived as displaying high resolution information across the viewer's entire field of view. The data requirements for maintaining high resolution information across the entire display can be substantial. For example, a highly immersive display may require that information be displayed within a 120 degree vertical by 180 degree horizontal field of view. Assuming 60 pixels are required for the display of one linear degree of visual angle, the display spans 7,200 by 10,800 pixels, so the immersive display system must allow nearly 78 million pixels or about 230 Mbytes of information to be retrieved, transmitted and displayed for a single three color, 8 bit still image. This amount of information is multiplied by 30 or more per second when displaying the sequential frames of video information. Unfortunately, current information retrieval and transmission systems do not allow the transmission of this amount of information in real time.
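The arithmetic behind these figures can be checked with a short sketch. The function name and parameters below are illustrative assumptions and are not part of any described system; the sketch simply assumes uncompressed storage with one byte per color channel per pixel.

```python
def still_image_size(v_deg, h_deg, pixels_per_degree=60,
                     channels=3, bits_per_channel=8):
    """Return (pixel count, uncompressed size in bytes) for one still image.

    Hypothetical helper: assumes a uniform pixel density across the
    whole field of view and no compression.
    """
    rows = v_deg * pixels_per_degree   # vertical resolution in pixels
    cols = h_deg * pixels_per_degree   # horizontal resolution in pixels
    pixels = rows * cols
    size_bytes = pixels * channels * bits_per_channel // 8
    return pixels, size_bytes


pixels, size_bytes = still_image_size(120, 180)
print(pixels)           # 77760000  (nearly 78 million pixels)
print(size_bytes)       # 233280000 (about 230 Mbytes)
print(size_bytes * 30)  # 6998400000 bytes for 30 sequential frames
```

At 30 frames, the uncompressed data volume approaches 7 Gbytes, which illustrates why a uniform-fidelity approach is impractical for real-time retrieval and transmission.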
Many image compression techniques have been discussed within the existing art that can reduce the amount of memory needed to store an image and the bandwidth needed to retrieve and transmit an image. Unfortunately, commonly used techniques, such as JPEG or JPEG 2000 compression, typically reduce the amount of information required by a factor of 50 or less, which is not sufficient. Motion image compression schemes such as MPEG are also limited when attempting to compress images for truly immersive display systems. All of these compression schemes attempt to provide images with equivalent fidelity across a viewer's entire field of view. However, it is well known that the human visual system is not isotropic and that the resolution of the eye decreases rapidly with increased eccentricity from the point of gaze. This property of the visual system provides an opportunity for more efficient display systems.
Display systems have been discussed in the prior art that take advantage of the non-isotropic properties of the human visual system. These systems make use of foveated images, where the fidelity of the image is highest at the point of gaze and then decreases away from the point of gaze. For example, Girod in Eye Movements and Coding of Video Sequences, SPIE: Visual Communications and Image Processing, 1988, vol. 1001, pp. 398–405 discusses the possibility of constructing a gaze contingent display system without providing details on the implementation of such a system.
Geisler et al. in International Publication WO 98/33315 published Jul. 30, 1998, discuss the use of gaze contingent information to reduce the transmission bandwidth of imagery in remote pilotage applications. While Geisler et al. discuss the filtering of the high resolution image to produce a foveated image, this filtering occurs immediately after capture, and information that is not required for transmission is discarded. Wallace et al. in U.S. Pat. No. 5,175,617 issued Dec. 29, 1992, discuss a similar system for the real-time transmission of spatially non-isotropic imagery.
Loschky, et al. in Perceptual Effects of a Gaze-Contingent Multi-Resolution Display Based on a Model of Visual Sensitivity, prepared through collaborative participation in the Advanced Displays and Interactive Displays Fed Lab Consortium, sponsored by the US Army Research Lab, pp. 53–58, also discuss the use of non-isotropic images. However, in their implementation, a different set of image data is stored for every potential point of gaze within the image. This implementation, when combined with proper encoding technology, may decrease the bandwidth required for image retrieval and transmission, but it significantly increases the required storage, as all possible foveated images must be stored for a given image.
It should also be understood that each of the systems described in the prior art assumes that only a single viewer will view a display at a time. However, immersive display systems with a very large field of view may still achieve significant bandwidth savings even when the image is rendered to provide multiple points of gaze. The generation of multiple regions of interest within a single image has been discussed within other contexts. For example, Andrew T. Duchowski in Acuity-Matching Resolution Degradation Through Wavelet Coefficient Scaling, IEEE Transactions on Image Processing, 9(8), pp. 1437–1440, describes a method for creating multiple regions of interest in an image, which roughly correspond to multiple areas of high-resolution imagery in an image in which all surrounding imagery is of lower resolution. However, the author does not discuss a means for selecting the regions of interest using numerous points of gaze.
It should also be pointed out that the prior art in gaze contingent display technology does not recognize that all eye tracking devices have some error when determining point of gaze. There is also no prior art that discusses countermeasures to be taken when the system retrieval rate is not adequate to support the optimal image fidelity.
There is a need therefore for a system that utilizes an improved method for efficiently retrieving and transmitting image data such that different spatial regions of the image have different fidelity as a function of the distance from a viewer's point of gaze. Further, there is a need for this system to react to other system issues such as multiple viewers, inaccurate eye tracking devices, and extreme bandwidth limitations.