Location and head-tracking within an indoor environment for multiple mobile dismounted users, e.g., soldiers on different levels of a multi-story facility, each having a head-mounted display (HMD), may be achieved by several methods, each of which has particular weaknesses and limitations. For example, a virtual reality (VR) system such as the Oculus Rift uses inertial and optical sensors to track head pose data (e.g., the position of the wearer's head as well as its orientation relative to a given frame of reference), but requires an external optical reference to establish the reference frame. Further, the Oculus system is physically tethered to a computer for processing; the system is therefore constrained to a single room and thus insufficiently mobile for a complex multi-level environment.
By way of another example, the Microsoft HoloLens system uses a depth sensor (e.g., time-of-flight) to map its surroundings, and then uses point-cloud processing in combination with inertial sensors to maintain and track head pose during dynamic head movement. However, the HoloLens system requires room calibration and measurement prior to use and is also range-limited due to its dependence on measuring the return time of reflected infrared (IR) pulses. In addition, many surfaces frustrate or resist mapping via depth sensors due to their material properties, e.g., surface reflectivity.
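By way of a non-limiting illustration, the relationship between the return time of a reflected pulse and the measured range may be sketched as follows. The sketch assumes an idealized pulsed time-of-flight measurement in which range is half the round-trip distance traveled at the speed of light; the function names are hypothetical, and nothing here is taken from any particular depth sensor or from the HoloLens system.

```python
# Speed of light in vacuum, in metres per second (exact by SI definition).
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def tof_range_m(elapsed_s: float) -> float:
    """Range to a surface given the round-trip time of a reflected pulse.

    Idealized model (an assumption of this sketch): the pulse travels to
    the surface and back, so the range is half the distance covered.
    """
    if elapsed_s < 0:
        raise ValueError("round-trip time must be non-negative")
    return SPEED_OF_LIGHT_M_PER_S * elapsed_s / 2.0


def round_trip_time_s(range_m: float) -> float:
    """Round-trip time of a pulse reflected from a surface at range_m."""
    if range_m < 0:
        raise ValueError("range must be non-negative")
    return 2.0 * range_m / SPEED_OF_LIGHT_M_PER_S
```

Under this idealized model, a surface 5 m away returns a pulse after roughly 33.4 ns, and a timing error of 1 ns corresponds to a range error of about 0.15 m, which suggests how sensitive such a measurement is to the quality of the returned pulse.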
Optical features may be tracked using cameras or image sensors in conjunction with inertial sensors and image processing. Similarly, physical features may be tracked using depth sensors, inertial sensors, and point cloud processing. In both cases, however, such systems may be sensitive to obscuration of the objects or markers that must be detected to establish the reference frame. In addition, scene geometry alone makes it difficult to maintain precise position and orientation without pre-calibration: a map of optical objects and references must be known in advance so that position and orientation may be ascertained from the reference points, and if reference objects or markers are instead placed in an uncalibrated environment, significant infrastructure setup and configuration are required. The necessary processing load also presents significant size, weight, power, and cost (SWaP-C) challenges for mobile implementation. Finally, depth scanning devices are sensitive to surface composition, which can disperse the structured light or IR light pulses on which they rely.