Virtual reality (VR) allows simulation and training providers to deliver rich and immersive virtual content. Mixed reality blends virtual scenes and real scenes into a single three-dimensional immersive scene. Mixed reality generally relies on real-time video processing that extracts foreground imagery from the background and generates a blended scene for a user display, combining desired real-world foreground objects with a virtual background. Mixed reality user training enhances VR by engaging user muscle memory and providing tactile feedback, which are critical components of learning. Mixed reality allows a trainee to handle the real equipment that the trainee would use in the field, and allows for multi-user training scenarios where teammates can see each other in the same three-dimensional virtual environment.
Mixed reality systems require foreground separation. Foreground separation involves identifying which real objects in a user's field of view are to be included in a mixed reality scene and which real objects are to be hidden by a virtual environment. Ideally, foreground separation should be performed on a real-time, frame-by-frame basis and with an absolute minimum of latency in order to preserve the real-time, immersive feel of the blended environment. Currently implemented foreground separation methods, however, have serious drawbacks, which limit the quality of the mixed reality experience. For example, chroma keying (often referred to as “greenscreen” substitution), which is often used in weather broadcasts and in computer-generated imagery (CGI) movies, is a fast and efficient algorithm that uses a specified color to identify all of the background. Chroma keying is cumbersome to work with, however, because it requires a fixed site, which in turn requires, for example, significant setup time, precise lighting to ensure uniform background color, and regular cleaning. Additionally, chroma keying fails to support the easy deployment goals of the “train anywhere” Department of Defense (DoD) initiative. Other currently implemented foreground separation methods utilize depth-sensing technology; however, depth-sensing technology lacks the resolution and low latency required to support real-time mixed reality processing. Currently implemented foreground separation methods that utilize depth-sensing technology cannot be used with mixed reality systems to achieve desired levels of realism, such as 4K video resolution at 90 frames per second (FPS). Additionally, infrared-based depth-sensing technology is highly dependent on scene lighting and on the materials in the scene, because infrared-based depth sensing relies on reflection of infrared light back to a sensor. As such, foreground objects that do not reflect infrared light well tend to get dropped from the scene.
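As an illustration of the chroma keying approach described above, the following is a minimal sketch in Python, not an implementation taken from any particular system. It treats every pixel whose color lies within a distance threshold of a specified key color as background and substitutes the corresponding virtual pixel. The function name `chroma_key`, the default key color, and the `tolerance` value are assumptions chosen for illustration only.

```python
def chroma_key(real_frame, virtual_frame, key_color=(0, 255, 0), tolerance=60):
    """Blend two equally sized frames of (R, G, B) tuples.

    Pixels of real_frame that are close to key_color are treated as
    background and replaced by the matching pixel of virtual_frame.
    key_color and tolerance are illustrative assumptions.
    """
    blended = []
    for real_row, virtual_row in zip(real_frame, virtual_frame):
        out_row = []
        for real_px, virtual_px in zip(real_row, virtual_row):
            # Squared Euclidean distance in RGB space to the key color.
            dist_sq = sum((c - k) ** 2 for c, k in zip(real_px, key_color))
            is_background = dist_sq <= tolerance ** 2
            out_row.append(virtual_px if is_background else real_px)
        blended.append(out_row)
    return blended


# Tiny 2x2 example: three green-ish background pixels, one foreground pixel.
real = [[(0, 255, 0), (200, 150, 120)],
        [(10, 240, 15), (0, 255, 0)]]
virtual = [[(1, 1, 1), (2, 2, 2)],
           [(3, 3, 3), (4, 4, 4)]]
result = chroma_key(real, virtual)
```

The per-pixel test is cheap, which is why the technique is fast, but it only works when the background is uniformly close to the key color, which is the source of the lighting and fixed-site constraints noted above.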
Further, stereoscopic vision foreground separation methods, which utilize two-dimensional convolutions of left eye versus right eye images to find common features, are so compute-intensive that they cannot keep pace with specified frame rates.
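To give a rough sense of the processing load implied by the 4K, 90 FPS example mentioned above, the following sketch computes the per-frame time budget and the pixel throughput for a single video stream. It assumes that "4K" means 3840 x 2160 pixels; the source does not specify the exact dimensions, and a stereoscopic system with one stream per eye would double the throughput.

```python
# Assumption: "4K" is taken to be 3840 x 2160 pixels (not stated in the source).
WIDTH, HEIGHT = 3840, 2160
FPS = 90

pixels_per_frame = WIDTH * HEIGHT          # pixels in one frame
pixels_per_second = pixels_per_frame * FPS # pixels to classify each second
frame_budget_ms = 1000.0 / FPS             # time available per frame

print(f"{pixels_per_frame:,} pixels per frame")
print(f"{pixels_per_second:,} pixels per second")
print(f"{frame_budget_ms:.2f} ms per frame")
```

Under these assumptions, any foreground separation method must classify roughly 8.3 million pixels in about 11 ms per frame, before accounting for capture, rendering, and display latency.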