The data collected by LIDAR, time-of-flight imagers, laser scanners, stereo imagers, or other related sensors contains millions of data points, each of which stores its spatial coordinates along with any other information, such as RGB color information. Advances in sensor technology have enabled such colorized point cloud data to be routinely collected for large urban scenes using both ground-based and airborne LIDAR sensor platforms.
LIDAR (Light Detection and Ranging) is an optical remote sensing technology that measures properties of scattered light to find range and/or other information of a distant target. The prevalent method to determine distance to an object or surface is to use laser pulses. The result of scanning an urban scene, for example, with a LIDAR is millions of data points pn, each having three-dimensional x, y, and z spatial coordinates pn=(x,y,z).
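As an illustration of pulse-based ranging and of the point representation pn=(x,y,z), the following minimal Python sketch may be helpful. It assumes the standard time-of-flight relationship, in which the range is half the round-trip distance travelled by the pulse. The names pulse_range_m and Point, the optional rgb field, and the sample coordinates are hypothetical and are introduced here only for illustration; they are not taken from any particular sensor or library.

```python
from typing import NamedTuple, Optional, Tuple

# Speed of light in vacuum, in metres per second (exact by SI definition).
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def pulse_range_m(round_trip_time_s: float) -> float:
    """Return the range to a target from the round-trip time of one pulse.

    The pulse travels to the target and back, so the one-way range is
    half of the total distance covered in the measured time.
    """
    if round_trip_time_s < 0:
        raise ValueError("round-trip time must be non-negative")
    return SPEED_OF_LIGHT_M_PER_S * round_trip_time_s / 2.0


class Point(NamedTuple):
    """One data point pn = (x, y, z), optionally carrying RGB color."""
    x: float
    y: float
    z: float
    rgb: Optional[Tuple[int, int, int]] = None  # hypothetical color field


# A tiny, made-up point cloud; a real scan would hold millions of points.
cloud = [
    Point(12.0, 4.5, 0.1),
    Point(12.1, 4.5, 3.2, rgb=(128, 128, 120)),
    Point(30.0, -2.0, 0.0),
]

# Range for a pulse that returns one microsecond after emission.
example_range_m = pulse_range_m(1e-6)
```

In this sketch a real scan would differ only in scale: the same three coordinates per point are stored, but for millions of points rather than three.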
Once the millions of points have been collected, the problem is to recognize meaningful objects, such as buildings, trees, and streets, from those points. Humans do not see millions of points, but instead seemingly effortlessly break the scene down into buildings, trees, cars, etc. Humans are further assisted by prior knowledge of the world, which enables sifting through the seemingly infinite number of possibilities to determine a few plausible ones. For example, humans know that objects such as buildings rest on the ground, and so humans use this information to determine the ground plane in the vicinity of the objects.
Estimating where the ground plane is from the millions of points collected by a sensor is a challenge. However, if this can be done with reasonable accuracy, then the groundwork is laid to recognize other meaningful objects in the collected points.
Some prior art techniques to recognize objects, as well as the ground plane, rely on strict assumptions on the serial ordering of the collected 3D scan lines. One approach is to try to reconstruct surface meshes by triangulation of the points, which can be slow and sensitive to noise, and which makes assumptions about sampling density. The prior art also attempts to directly process the individually collected data points, which introduces scalability issues.
Another approach in the prior art is to build intermediate representations that reduce resolution and may be sensitive to quantization. Yet another approach that has been tried is to use level sets and other continuous approximations like B-splines, which have lower memory requirements, but cannot easily handle sharp edges or peaks in the data.
Yet another approach using mesh-based representations requires non-trivial processing to construct and cannot be updated with new incoming data. Other implicit geometry representations such as voxels allow efficient processing, but may be sensitive to missing information and empty cells since they only store local statistics.
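The remark that voxels only store local statistics can be illustrated with a minimal Python sketch. It is an assumption-laden example rather than any specific prior art method: the function name voxelize, the choice of point count and mean height as the per-cell statistics, and the sample coordinates are all hypothetical.

```python
from collections import defaultdict
from math import floor


def voxelize(points, voxel_size):
    """Bin (x, y, z) points into cubic voxels of edge length voxel_size.

    Returns a dict mapping each occupied voxel index (i, j, k) to a
    (count, mean_z) pair. Unoccupied voxels get no entry at all.
    """
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")
    # Each cell accumulates [number of points, sum of z values].
    cells = defaultdict(lambda: [0, 0.0])
    for x, y, z in points:
        key = (floor(x / voxel_size),
               floor(y / voxel_size),
               floor(z / voxel_size))
        cell = cells[key]
        cell[0] += 1
        cell[1] += z
    return {key: (n, total / n) for key, (n, total) in cells.items()}


# Made-up sample: two nearby points and one point several metres away.
sample_points = [(0.2, 0.2, 0.1), (0.4, 0.6, 0.3), (5.5, 0.1, 0.2)]
grid = voxelize(sample_points, voxel_size=1.0)

# Voxel (0, 0, 0) summarizes two points; voxel (5, 0, 0) holds one.
# Voxels (1, 0, 0) through (4, 0, 0) are simply absent from the grid.
```

Because each cell keeps only a summary of the points that fell inside it, the grid by itself does not indicate whether the absent cells between the two occupied ones are truly empty space or a gap in the scan, which is one way to read the sensitivity to missing information noted above.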
All of these approaches attempt to find objects such as buildings, trees, cars and streets, and also attempt to find other more obscure objects such as poles, powerlines and posts. These approaches also attempt to estimate the ground plane. However, all of these prior art approaches have disadvantages and are not robust.
Recent advances in range measurement devices have created new challenges for fast 3D modeling of large-scale outdoor environments. The prior art has not been shown to work effectively on high resolution aerial and terrestrial data to recognize a wide spectrum of object types, including objects such as powerlines, posts, poles, and construction cranes, as well as merged cars that occur because they are closely parallel parked, for example, or parked side by side in a parking lot. Also, prior art approaches for detecting woods and forested regions have relied on 2D imagery rather than 3D sensor data. Thus, these prior art techniques do not work well in recognizing these objects in 3D sensor data.
What is needed is a method for estimating the ground plane and recognizing objects such as buildings, powerlines, posts, poles, construction cranes, and merged cars. Also needed are methods for detecting woods and forested regions from millions of collected 3D data points. The embodiments of the present disclosure answer these and other needs.