Automatic urban scene object recognition refers to the process of segmenting objects of interest in an image and classifying them into predefined semantic labels, such as “building”, “tree” or “road”. This typically involves a fixed number of object categories, each of which requires a trained model for classifying image segments. While many techniques for two-dimensional (2D) object recognition have been proposed, the accuracy of these systems remains to some extent unsatisfactory, because 2D image cues are sensitive to varying imaging conditions such as lighting and shadow.
Three-dimensional (3D) object recognition systems using laser scanning, such as Light Detection And Ranging (LiDAR), output 3D point clouds. 3D point clouds can be used for a number of applications, such as rendering appealing visual effects based on the physical properties of 3D structures, and cleaning raw input 3D point clouds, e.g. by removing moving objects (cars, bikes, persons). Other 3D object recognition applications include robotics, intelligent vehicle systems, augmented reality, transportation maps and geological surveys, where high-resolution digital elevation maps help in detecting subtle topographic features.
However, identifying and recognizing objects despite appearance variation (e.g. changes in texture, color or illumination) has turned out to be a surprisingly difficult task for computer vision systems. In the field of 3D sensing technologies (such as LiDAR), a further challenge lies in organizing and managing the data, owing to the huge amount of 3D point cloud data combined with the limitations of computer hardware.