Low-cost depth cameras and 3D sensors, such as Microsoft Kinect®, can be used in robotics and computer vision applications. Typically, the sensors acquire a set of 3D points, i.e., a 3D point cloud, of a scene. Those 3D point clouds are generally noisy and redundant, and do not characterize high-level semantics in the scene.
Typically, primitives are extracted from the 3D point clouds to model the scene compactly and semantically. In particular, planes are one of the most important primitives because many man-made structures have planar surfaces.
Plane Extraction
A typical method used for extracting planes from a point cloud is based on RANdom SAmple Consensus (RANSAC). That method first hypothesizes several candidate planes, each of which is generated by randomly sampling three points from the point cloud and determining a set of points (referred to as inliers) that are on the plane defined by the three points. That method then selects from the candidates an optimal plane that is supported by the largest number of inliers. After removing the inliers from the point cloud, that method iterates the process to extract multiple planes. Because that method requires a relatively long time, several variants are known. For example, Hough transformation and connected component analysis can be applied to the point cloud for pre-segmentation, and then RANSAC can be applied to each of the segments. RANSAC can also be applied to local regions of the point cloud, after which connected points on the plane are found from each of the local regions, a process called region growing.
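The basic RANSAC procedure described above can be sketched as follows. This is a minimal illustration, not any particular prior-art implementation: the function names (`fit_plane`, `ransac_plane`, `extract_planes`) and the parameters `n_iters`, `threshold`, `max_planes`, and `min_inliers` are hypothetical, and their default values are assumptions chosen for the example. Points are assumed to be given as an N x 3 NumPy array.

```python
import numpy as np

def fit_plane(p0, p1, p2):
    """Return (unit normal n, offset d) with n.x + d = 0, or None if the points are collinear."""
    n = np.cross(p1 - p0, p2 - p0)
    norm = np.linalg.norm(n)
    if norm < 1e-12:
        return None
    n = n / norm
    return n, -np.dot(n, p0)

def ransac_plane(points, n_iters=200, threshold=0.01, rng=None):
    """Hypothesize candidate planes from random 3-point samples; keep the one with most inliers."""
    rng = np.random.default_rng() if rng is None else rng
    best_plane = None
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        idx = rng.choice(len(points), size=3, replace=False)
        plane = fit_plane(*points[idx])
        if plane is None:
            continue
        n, d = plane
        # Inliers are points whose distance to the candidate plane is below the threshold.
        inliers = np.abs(points @ n + d) < threshold
        if inliers.sum() > best_inliers.sum():
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers

def extract_planes(points, max_planes=3, min_inliers=50, **kwargs):
    """Repeatedly extract the best plane, remove its inliers, and iterate on the remainder."""
    planes = []
    remaining = points
    for _ in range(max_planes):
        if len(remaining) < 3:
            break
        plane, inliers = ransac_plane(remaining, **kwargs)
        if plane is None or inliers.sum() < min_inliers:
            break
        planes.append((plane, remaining[inliers]))
        remaining = remaining[~inliers]
    return planes
```

Each iteration evaluates every remaining point against one candidate plane, so the cost grows with both the number of hypotheses and the size of the point cloud, which is consistent with the relatively long run time noted above.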
A similar but much slower variant is voxel growing. Regions can also be grown from line segments extracted from individual scan lines. Another method clusters the points in a normal space and further clusters according to distances to an origin. To avoid normal estimation per point, graph-based segmentation using a self-adaptive threshold can be used.
Applications Using Planes
Planes extracted in this manner are used in various applications, such as robotics, computer vision, augmented reality, and 3D modeling. Compact and semantic modeling of scenes provided by planes is useful in indoor and outdoor 3D reconstruction, visualization, and building information modeling. Extracting major planes is a common strategy for table-top manipulation using robots, because planes help to segment objects placed on the planar table top. Planes have also been used for simultaneous localization and mapping (SLAM) and place recognition systems, where planes can be used as landmarks of the scene.
Although features in the form of planes are generally more accurate than point features, at least three planes whose normals span a 3D space are required to determine a 6-degrees-of-freedom (DoF) camera pose in such SLAM and place recognition systems. To avoid the degeneracy caused by an insufficient number of planes, points and planes can be used together as landmarks.
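The requirement that the plane normals span a 3D space can be checked numerically. The following is a minimal sketch under stated assumptions: the function name `normals_span_3d` and the tolerance `tol` are hypothetical, and the test used here (the smallest singular value of the stacked unit normals being above a tolerance) is one common way to test rank, not a method stated in the text.

```python
import numpy as np

def normals_span_3d(normals, tol=1e-6):
    """Return True if the given plane normals (k x 3) span 3D space, i.e., have rank 3."""
    N = np.asarray(normals, dtype=float)
    if N.ndim != 2 or N.shape[1] != 3 or N.shape[0] < 3:
        return False  # fewer than three planes can never span 3D
    # Normalize so the tolerance does not depend on the scale of the normals.
    N = N / np.linalg.norm(N, axis=1, keepdims=True)
    singular_values = np.linalg.svd(N, compute_uv=False)
    # The smallest singular value is zero (up to rounding) when the normals are degenerate.
    return bool(singular_values[-1] > tol)
```

For example, a floor and two perpendicular walls pass the check, whereas a floor, a ceiling, and a single wall do not, because the floor and ceiling normals are parallel and the three normals span only a 2D subspace.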