Mobile devices which are capable of autonomous operations such as navigation and mapping are being developed for a number of applications. Such applications range from basic devices which perform simple tasks such as vacuuming to more complex devices which interact more substantially with humans.
As the complexity of the operations increases, the need for the device to understand the surrounding environment may also increase. Thus, while a vacuuming device may only need to avoid obstacles, a device travelling across an area of varied terrain may need to quantify the degree of difficulty presented by various obstacles so as to select the optimum route.
The ability to ascertain both the location of a mobile device and the nature of the surrounding environment is complicated by the fact that various navigation systems used with mobile devices require data acquisition over an extended period of time in order to obtain enough data to provide a good understanding of the surrounding environment. In certain systems, such data acquisition requires the device to be immobile as the data is obtained and processed. Such systems introduce undesired delay in the movement of the device.
Other mapping and localization methods have been developed which do not always require the device to be immobile during data acquisition and processing. Such systems may utilize simultaneous localization and mapping (SLAM) and 3-D perception. While effective, this approach requires significant hardware investment and a significant amount of computational processing.
Three-dimensional (3-D) models of objects can provide information useful for a variety of applications including navigation control and mapping functions. The creation of a 3-D model of an object, a structure, or a scene, however, can require the involvement of highly skilled professionals with extensive artistic knowledge, expensive modeling equipment, and time-intensive efforts.
Various approaches to simplifying the generation of a 3-D model of an object or scene have evolved. One common approach incorporates a triangulation system. A triangulation system projects beams of light onto an object, typically using a laser. The emitted light is then reflected off the object at an angle relative to the light source, and an imaging component which is spaced apart from the light source collects the reflection information. Based upon an association of the emitted beam and the reflected beam, the system then determines the coordinates of the point or points of reflection by triangulation. A single dot system projects a single beam of light which, when reflected, produces a single dot of reflection. A scanning line system sends a plane of light against the object. The plane of light is reflected as a curvilinear-shaped set of points describing one contour line of the object. The location of each point in that curvilinear set of points can be determined by triangulation.
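By way of illustration only, the geometry of a single dot triangulation system may be sketched as follows. The sketch assumes a planar arrangement in which the light source sits at the origin, the imaging component sits on the x-axis at a known baseline distance, and both angles are measured from the baseline. The function name and parameters are hypothetical and are not taken from any particular system.

```python
import math

def triangulate_point(baseline, emit_angle, view_angle):
    """Locate a reflection point from one emitted beam and one observed ray.

    Assumed geometry (illustrative only): the light source is at (0, 0),
    the imaging component is at (baseline, 0), and both angles are the
    interior angles, in radians, between the baseline and the respective ray.
    Returns (x, z), where z is the perpendicular distance from the baseline.
    """
    apex = emit_angle + view_angle
    if math.isclose(math.sin(apex), 0.0, abs_tol=1e-12):
        raise ValueError("rays are parallel; no unique intersection")
    # Law of sines: the side opposite view_angle is the source-to-point range.
    source_range = baseline * math.sin(view_angle) / math.sin(apex)
    return (source_range * math.cos(emit_angle),
            source_range * math.sin(emit_angle))

x, z = triangulate_point(baseline=1.0,
                         emit_angle=math.radians(45),
                         view_angle=math.radians(45))
```

A scanning line system could, under the same assumptions, apply this computation once for each point along the reflected contour.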
Another commonly used 3-D modeling approach is a stereoscopic system employing one or more imaging systems located at known locations or distances from each other to take multiple images of a 3-D object. The captured images are processed with a pattern recognition system that relates the various points of the object in the multiple images and triangulates to extract depth information of these points, thereby obtaining the shape/contour information of the 3-D object.
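As a minimal sketch of the depth extraction step, and assuming two identical, rectified pinhole cameras whose optical axes are parallel (an assumption not stated above), the depth of a matched point follows from its disparity between the two images. The function name and parameters are hypothetical.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Depth of a matched point for an assumed rectified stereo pair.

    focal_length_px: focal length expressed in pixels
    baseline_m:      distance between the two camera centers, in meters
    disparity_px:    horizontal shift of the point between the two images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    # Similar triangles: depth / baseline = focal_length / disparity.
    return focal_length_px * baseline_m / disparity_px

depth_m = depth_from_disparity(focal_length_px=700.0,
                               baseline_m=0.1,
                               disparity_px=35.0)
```

Because depth varies inversely with disparity, a fixed matching error of one pixel produces a larger depth error for distant points than for near ones.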
The systems described above are costly, bulky, and may require substantial knowledge to operate. Accordingly, while the systems provide sufficient data to produce an accurate 3-D model, their usefulness is limited. Various coding schemes have been developed to allow fewer images to be used in obtaining sufficient data for generating a 3-D model. One such coding scheme uses binary codes like the Gray code to produce only one-to-one correspondences between projector and camera coordinates. This approach is costly (in the sense of projection quality and measuring time) since high resolution patterns are needed to achieve the desired spatial resolution.
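The Gray code scheme may be illustrated with the following sketch, which assumes one projected stripe pattern per code bit and an idealized camera that reads each bit without error. All names are hypothetical. The sketch shows how the sequence of bits observed at a camera pixel maps back to exactly one projector column.

```python
def binary_to_gray(n):
    """Reflected binary (Gray) code of a non-negative integer."""
    return n ^ (n >> 1)

def gray_to_binary(g):
    """Invert binary_to_gray by folding in successively shifted copies."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

def column_patterns(width):
    """One stripe pattern per bit, most significant bit first.

    patterns[b][c] is 1 where projector column c is lit in pattern b.
    """
    bits = max(1, (width - 1).bit_length())
    return [[(binary_to_gray(c) >> b) & 1 for c in range(width)]
            for b in reversed(range(bits))]

def decode_column(observed_bits):
    """Recover the projector column from the bits seen at one camera pixel."""
    gray = 0
    for bit in observed_bits:
        gray = (gray << 1) | bit
    return gray_to_binary(gray)

patterns = column_patterns(8)                      # 3 patterns for 8 columns
column = decode_column([p[5] for p in patterns])   # bits observed at column 5
```

Under these assumptions the number of patterns grows with the base-2 logarithm of the number of columns, and the final pattern carries the narrowest stripes, which is consistent with the cost in projection quality and measuring time noted above.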
Another coding scheme incorporates one-dimensional discrete Fourier transforms (DFT). One-dimensional (1-D) DFT phase demodulation requires the projected 1-D fringes to be aligned with the camera. Accordingly, the depth range of the system is limited and the system is susceptible to errors when incorporated into a non-fixed system. Two-dimensional DFT phase demodulation systems similarly suffer from limited depth ranges. In principle, the direction of the spectral components needed to retrieve the desired phase information in a two-dimensional DFT phase demodulation system can be determined. The automated design of a suitable frequency filter is, however, complicated and rather slow.
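The principle of 1-D DFT phase demodulation may be sketched, in greatly simplified form, as reading the phase of the DFT bin at the fringe carrier frequency. The sketch assumes a synthetic fringe signal with a constant phase offset and a carrier that completes a whole number of cycles across the sampled line. A practical system would instead recover a spatially varying phase and would require the frequency filtering discussed above. All names are hypothetical.

```python
import cmath
import math

def fringe_signal(n_samples, cycles, phase, offset=0.5, amplitude=0.4):
    """Synthetic intensity along one camera row: offset plus a cosine fringe."""
    return [offset + amplitude * math.cos(2 * math.pi * cycles * n / n_samples + phase)
            for n in range(n_samples)]

def dft_bin(signal, k):
    """Single coefficient of the discrete Fourier transform at bin k."""
    n_samples = len(signal)
    return sum(s * cmath.exp(-2j * math.pi * k * n / n_samples)
               for n, s in enumerate(signal))

def carrier_phase(signal, cycles):
    """Phase of the fringe, read from the DFT bin at the carrier frequency."""
    return cmath.phase(dft_bin(signal, cycles))

row = fringe_signal(n_samples=64, cycles=4, phase=0.7)
estimated_phase = carrier_phase(row, cycles=4)
```

If the carrier does not fall on a whole bin, for example because the fringes are not aligned with the camera, energy leaks into neighboring bins and the phase estimate degrades, which illustrates the alignment sensitivity noted above.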
Another class of coding techniques is Moiré techniques, which are based on the interference of two patterns. This interference can be generated optically or within a computing system. Moiré techniques require high carrier frequencies and are therefore sensitive to focal blur. Optical interference can be mitigated by incorporation of an alignment procedure using a reference grid or, if using the discrete structure of the camera sensor itself, by incorporating a costly system calibration. Computing system interference is mitigated using a signal analysis that is similar to DFT phase demodulation, and which suffers the same shortcomings as the DFT phase demodulation schemes.
What is needed is a system that can be used to generate the 3-D data required to support navigation or mapping functions. A system which is inexpensive and which does not require significant computational processing power would be beneficial.