The autonomous vehicle revolution (e.g., the emergence of self-driving cars) has created a need for accurate and precise vehicle localization during the vehicle's drive-time. That is, in order to successfully arrive at a particular destination via a safe, legal, and at least somewhat optimized route, an autonomous vehicle must avoid non-drivable surfaces, stay within a proper lane of a drivable surface (e.g., a road), and navigate intersections, turns, and curves in the drivable surface. An autonomous vehicle must also continually re-assess and optimize its route based on observed drive-time environmental conditions (e.g., traffic congestion, road closures, weather conditions, and the like). Thus, an autonomous vehicle is required to continually (and in real-time) determine and/or update its position on the surface of the Earth, as well as its orientation. Furthermore, in order to ensure wide adoption of autonomous vehicles by the public, such determinations (i.e., localizations) must be precise and accurate enough to achieve a safer and more efficient driving performance than that of a human driver. That is, in order to be viable, autonomous vehicles must perform at least as well as the average human driver.
Some conventional approaches for localizing autonomous vehicles have relied on various Global Navigation Satellite Systems (GNSSs), such as the Global Positioning System (GPS), Galileo, and GLONASS. However, such satellite-based approaches, which determine a position on the surface of the Earth via triangulating satellite-emitted signals, have performance constraints that limit their applicability to autonomous vehicle localization. For instance, the accuracy and precision of various GNSS methods are on the order of several meters, which may not be sufficient for autonomous vehicle applications. Furthermore, the civilian-accessible versions of such systems are even less accurate than their military counterparts. Also, environmental conditions such as thick cloud cover or tree cover attenuate the satellite-emitted signals, which further degrades localization performance. Additional degradations in GNSS-based localization performance may result from transmitter-receiver line-of-sight issues (e.g., urban canyons) and multi-path effects (e.g., signal reflections from buildings and other urban structures). Thus, conventional GNSS-based localization may perform particularly poorly in urban areas, as well as in other domains. Furthermore, such triangulation-based methods cannot provide an orientation of a vehicle unless the vehicle is in motion (and is assumed not to be traveling in reverse).
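The motion dependence noted above can be illustrated with a minimal sketch. It assumes a spherical Earth and derives a course over ground from two consecutive position fixes using the standard initial-bearing formula. The function names and the 5-meter movement threshold (`min_move_m`) are hypothetical choices for illustration and are not taken from any particular GNSS receiver or library.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; spherical approximation (assumption)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two fixes given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def course_over_ground(lat1, lon1, lat2, lon2, min_move_m=5.0):
    """Bearing in degrees clockwise from north, from fix 1 to fix 2.

    Returns None when the displacement is below min_move_m (a hypothetical
    threshold on the order of the position error), because a stationary
    vehicle yields no usable direction from position fixes alone.
    """
    if haversine_m(lat1, lon1, lat2, lon2) < min_move_m:
        return None
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlmb)
    return math.degrees(math.atan2(y, x)) % 360.0
```

Note that the returned value is the direction of travel, not the direction the vehicle faces. A vehicle backing up along a northbound lane would produce a course near 180 degrees while its nose still points north, which is why the forward-motion assumption is needed.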
Other conventional methods have employed cellular towers and other stationary signal-emitters (e.g., Wi-Fi routers and/or repeaters) as sources of signals from which to triangulate and determine a location. However, whether these terrestrial signals are used as an alternative to satellite-based signals or as a supplement to them, such methods still suffer from poor performance. That is, such methods may not provide the localization accuracy and precision required for safe and efficient navigation of autonomous vehicles.
Still other conventional methods of vehicle localization have attempted to correlate three-dimensional (3D) visual features within maps and drive-time images. In order to determine the vehicle's location and orientation, such visual-domain approaches may employ a 3D visual map of a vehicle's environment, generated prior to drive-time, and drive-time generated 3D visual images of the vehicle's environment. In these conventional methods, a vehicle may have access to the previously-generated visual 3D map of its environment. During drive-time, the vehicle captures 3D visual images of its environment via a light detection and ranging (LIDAR) camera (or other 3D imaging devices). These approaches correlate visual features in the 3D visual map with visual features in the 3D visual images, and locate the vehicle within the map via the correlation. That is, corresponding 3D visual features such as edges, surface textures, and geometric shapes are matched between the 3D visual map and the 3D visual drive-time images. Such visual-feature matching, together with knowledge of the optics of the vehicle's cameras, enables a determination of the perspective (and hence the vehicle's location) from which the drive-time images were generated.
However, the performance of such visual-domain feature matching approaches is also limited in autonomous vehicle applications. The 3D visual maps, as well as the drive-time 3D images, require significant amounts of storage and computational processing. The data encoding such conventional maps and drive-time images may be structured as spatially-discretized locations in 3D space and stored via a 3D array of visual features inferred from pixel values. These methods may store the 3D array of features, via vectorized representations, as well as the pixel values. The inclusion of all three spatial dimensions is informationally expensive. Even though some of the 3D information associated with these conventional methods may be somewhat “sparsified” or compressed, the amount of storage required to encode 3D visual maps and 3D images may still result in intractable storage and computational requirements.
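A rough back-of-the-envelope sketch shows how the third spatial dimension multiplies storage for a spatially-discretized map. It assumes a dense, uncompressed grid with a fixed number of bytes per cell. The extents, the 10 cm resolution, and the 4 bytes per cell are hypothetical values chosen only for illustration; sparsified or compressed encodings would reduce the totals.

```python
def cells_along_axis(extent_m, resolution_cm):
    """Number of grid cells along one axis (integer ceiling division)."""
    extent_cm = extent_m * 100
    return -(-extent_cm // resolution_cm)


def dense_map_bytes(extents_m, resolution_cm, bytes_per_cell):
    """Storage for a dense grid covering the given extents (one entry per axis)."""
    total_cells = 1
    for extent_m in extents_m:
        total_cells *= cells_along_axis(extent_m, resolution_cm)
    return total_cells * bytes_per_cell


# Hypothetical scenario: a 1 km x 1 km area, 20 m tall, 10 cm cells, 4 bytes per cell.
flat_bytes = dense_map_bytes((1000, 1000), 10, 4)
volume_bytes = dense_map_bytes((1000, 1000, 20), 10, 4)
growth_factor = volume_bytes // flat_bytes

print(f"2D grid: {flat_bytes / 1e9:.1f} GB")
print(f"3D grid: {volume_bytes / 1e9:.1f} GB ({growth_factor}x the 2D grid)")
```

Under these assumptions the growth factor equals the number of vertical layers, so every added meter of mapped height at this resolution adds ten more copies of the 2D footprint.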
Furthermore, matching features in the 3D visual domain (e.g., image registration that correlates 3D edges, surface textures, and geometric shapes) is computationally expensive. Again, the inclusion of the third dimension significantly increases the computational expense. In addition, the visual domain includes numerous complex features, including, but not limited to, various characterizations of edges, surfaces, and shapes, and matching such numerous and complex features in three dimensions compounds that expense. Because the visual-domain feature correlation must be performed in real-time, the computational overhead (e.g., memory and processing time) required for such visual-domain feature matching may be unacceptable for some real-time applications, such as the drive-time localization of an autonomous vehicle.
Also, generating and updating the 3D visual maps is expensive. Conventional 3D visual maps are often generated via a combination of a significant number of 3D visual images taken from a perspective similar to that of the vehicles that will later employ the map. The 3D maps may include 3D point or feature clouds. To obtain the required 3D visual images (i.e., 3D point clouds), fleets of survey vehicles may be required to survey the environment and capture 3D visual images. Furthermore, to obtain images of adequate resolution required to generate a 3D map, LIDAR cameras, or other laser-based cameras, such as but not limited to time-of-flight (TOF) cameras, are conventionally employed. LIDAR cameras increase both the expense of acquiring the images and the amount of information encoded in the image data. Also, the performance of LIDAR and TOF cameras may suffer due to inclement weather conditions, such as rain, snow, fog, smoke, smog, and the like. For example, a scanning laser may be attenuated and/or multiply reflected or refracted via particulates in the atmosphere, and the imaged textures of surfaces may be degraded via the presence of moisture on the surface. Due to changing environmental conditions, these 3D maps require a continual process of updating and propagating the updates to each copy of a 3D map. For example, the shape and/or geometries of the environment may change due to construction activities, or the like. Such environmental changes may also change textures of the surfaces, which will affect the performance of LIDAR camera-based approaches. Such updating and syncing requirements are complex to implement.