Head-mounted optical see-through displays are a class of device that enables visual Augmented Reality, referred to herein simply as Augmented Reality (AR). AR is a mode of Human-Computer Interaction in which virtual content is overlaid in real time onto a user's perception of the real environment such that the virtual content appears to be physically present in a scene. However, while many advancements have been made, the prior art has failed to provide a display system capable of overlaying high-resolution virtual content over the entire range of a user's natural Field of View (FOV) while maintaining a comfortable form factor.
Historically, AR display systems have fallen into two categories: video see-through and optical see-through. In video see-through systems, an opaque display is mounted over the eye, intentionally occluding the user's natural FOV, but is used in conjunction with outward-facing cameras such that a near real-time image of the user's environment is presented to the user. In contrast, optical see-through displays present an image to the user without occluding the user's natural FOV, which is most commonly achieved by projecting a digital image onto a largely transparent surface such as a half-silvered beam splitter mounted over the eye.
The present invention relates specifically to head-worn AR displays. In both AR display variants, when mounted in a head-worn configuration, the displays are typically accompanied by multiple sensors which detect both aspects of the user's environment and motion of the user within the environment. Sensors may include, but are not limited to, Inertial Measurement Units (IMU), single video cameras, Global Positioning System (GPS) sensors, stereoscopic cameras, plenoptic cameras, laser ranging systems, LIDAR, Time of Flight cameras, Infrared (IR) cameras, and Red Green Blue-Depth (RGB-D) cameras. The objective of the sensors is to provide input to computer algorithms which can localize the user's position and orientation within an environment, as well as generate a virtual model of the environment (i.e., a map) with which virtual content can be aligned. Many algorithms are used to achieve this objective, the most notable to date being Simultaneous Localization and Mapping (SLAM) and Parallel Tracking and Mapping (PTAM) algorithms. Ongoing research in this field seeks to approach near real-time and deterministic output of tracking and mapping algorithms.
The prior art offers many different approaches to optical see-through near-eye display architectures, most commonly image projection onto an optical combiner lens, image transmission via holographic waveguide coupling, and free-form optic eyepieces with optical compensators. As described in U.S. patent application Ser. No. 13/426,379 entitled “Increasing Field of View of Reflective Waveguide” to Robbins et al., the field of view of systems that rely solely on waveguide material properties is inherently limited by the waveguide material critical angle. An alternative to pure optical image relay systems is the scanning Virtual Retinal Display (VRD) disclosed in the seminal U.S. Pat. No. 5,467,104 entitled “Virtual Retinal Display” to Furness et al., which is capable of scanning an image directly onto the retina by modulating the intensity of a collimated light beam in synchrony with the deflected raster scanning of the beam. However, the requirement for an intermediary projection optic, resulting in an image that converges on the eye pupil, in conjunction with the bulky system architecture near the user's line of sight, precludes the practical use of this display for AR applications.
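By way of illustration only, the critical-angle limit noted above for waveguide-based displays can be sketched numerically. The following minimal Python sketch computes the critical angle for total internal reflection from Snell's law and the resulting range of internal propagation angles that remain guided. The function names, the example refractive indices, and the default upper bound of 90 degrees are assumptions chosen for illustration and are not taken from the cited references; the field of view actually delivered to the eye further depends on the coupling design, which is not modeled here.

```python
import math


def critical_angle_deg(n_waveguide: float, n_outside: float = 1.0) -> float:
    """Critical angle for total internal reflection, in degrees.

    Follows from Snell's law: sin(theta_c) = n_outside / n_waveguide.
    Angles are measured from the surface normal inside the waveguide.
    """
    if n_waveguide <= n_outside:
        raise ValueError("Total internal reflection requires n_waveguide > n_outside")
    return math.degrees(math.asin(n_outside / n_waveguide))


def guided_angular_range_deg(
    n_waveguide: float,
    max_internal_angle_deg: float = 90.0,  # assumed upper bound (grazing incidence)
    n_outside: float = 1.0,
) -> float:
    """Width of the band of internal angles that stay guided, in degrees.

    Rays below the critical angle leak out of the waveguide, so only angles
    between the critical angle and the assumed upper bound can carry the image.
    """
    theta_c = critical_angle_deg(n_waveguide, n_outside)
    return max(0.0, max_internal_angle_deg - theta_c)


if __name__ == "__main__":
    # Hypothetical refractive indices, for illustration only.
    for n in (1.5, 1.7, 1.9):
        print(
            f"n = {n:.1f}: critical angle = {critical_angle_deg(n):.2f} deg, "
            f"guided range = {guided_angular_range_deg(n):.2f} deg"
        )
```

Under these assumptions, a higher-index waveguide material has a smaller critical angle and therefore supports a wider band of guided angles, which is consistent with the observation that the material itself bounds the achievable field of view.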
Later derivations of the VRD, such as U.S. Pat. No. 7,365,892 entitled “Scanned Light Display System Using Array of Collimating Elements in Conjunction with Large Numerical Aperture Light Emitter Array” to Sprague et al., disclose a VRD array for the presentation of a tiled image to the eye. However, the architecture requires a dense array of embedded addressable emission points with coupled collimating optics, the presence of which, without costly compensation unaddressed in the art, results in optical aberrations of transmitted ambient light. Thus, while suitable for occluded displays, the architecture is unsuitable for near-eye optical see-through displays.
Of note, the non-scanning VRD architecture described in U.S. Pat. No. 9,594,247 entitled “System, Method, and Computer Program Product for a Pinlight See-Through Near-Eye Display” projects an image from a sparse, and thus transparent, array of Lambertian emitters, the light from which is subsequently filtered by a tiled Spatial Light Modulator (SLM). Similar to Sprague's architecture, the requirement for the diffractive components of an SLM in the user's line of sight results in unsatisfactory optical aberrations of transmitted light, such as double images and rainbow patterns.
Optical see-through devices may display an image to a single eye, as in the case of “eye-tap” or monocular devices, or they may display a stereoscopic image pair to both eyes, as in the case of stereoscopic displays. The present invention relates generally to binocular optical see-through systems but may also be applied to monocular systems.
While stereoscopic images are capable of presenting virtual content that appears to be at a specified distance from the user, display systems offered in the prior art typically present purely collimated images to each eye. To maintain a simple architecture, most near-eye displays strive only to maintain collimation of the projected image light, which can be easily focused by the eye lens in its relaxed state.
When collimated light from an image display reaches the eye, the light forms a flat wavefront which does not require the eye lens to deform (i.e., accommodate), and a sharply focused image can be formed on the retina. From the perspective of the eye lens, all light from the display, and hence all image content presented by the display, is perceived by each eye individually to be located at optical infinity (i.e., farther than about 8 m). However, to emulate a nearby point (e.g., closer than about 8 m), the stereoscopic image pair will prompt the eyes to rotate inward such that their respective lines of sight converge upon a point. In response, the brain, expecting to receive an incident divergent wavefront, sends a signal to the muscles controlling the eye lens (i.e., the accommodation-vergence reflex) to deform the lens appropriately to yield a sharply focused image on the retina. Thus, the application of a stereoscopic pair to present a virtual image of a nearby object without concurrently emulating the natural curvature of the incident wavefront leads to an accommodation-vergence conflict that can be a source of physical discomfort for users and diminishes the realism of displayed 3D content due to the lack of accommodative depth cues.
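By way of illustration only, the magnitude of the accommodation-vergence conflict described above can be estimated with simple geometry. The following minimal Python sketch computes the vergence angle that a stereoscopic pair induces for a virtual point at a given distance, the accommodation that a real object at that distance would demand (the wavefront curvature in diopters, equal to the reciprocal of the distance in meters), and the mismatch relative to a collimated display, which presents zero diopters. The interpupillary distance of 63 mm and all function names are assumptions introduced for illustration and do not appear in the cited art.

```python
import math

# Assumed interpupillary distance in meters (illustrative value only).
ASSUMED_IPD_M = 0.063


def vergence_angle_deg(distance_m: float, ipd_m: float = ASSUMED_IPD_M) -> float:
    """Angle between the two lines of sight converging on a point straight ahead."""
    if distance_m <= 0:
        raise ValueError("distance_m must be positive")
    return math.degrees(2.0 * math.atan(ipd_m / (2.0 * distance_m)))


def accommodation_demand_diopters(distance_m: float) -> float:
    """Wavefront curvature a real object at this distance presents to the eye."""
    if distance_m <= 0:
        raise ValueError("distance_m must be positive")
    return 1.0 / distance_m


def conflict_diopters(virtual_distance_m: float, display_focus_diopters: float = 0.0) -> float:
    """Mismatch between the focus cued by vergence and the focus the display supplies.

    A collimated display supplies a flat wavefront, i.e. 0 diopters (the default).
    """
    return abs(accommodation_demand_diopters(virtual_distance_m) - display_focus_diopters)


if __name__ == "__main__":
    # Hypothetical virtual object distances in meters.
    for d in (0.5, 2.0, 8.0):
        print(
            f"d = {d:.1f} m: vergence = {vergence_angle_deg(d):.2f} deg, "
            f"conflict with collimated display = {conflict_diopters(d):.3f} D"
        )
```

Under these assumptions, a virtual object rendered at 0.5 m yields a mismatch of 2 diopters, whereas one rendered at 8 m yields only 0.125 diopters, which is consistent with treating distances beyond about 8 m as optical infinity.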