Computer vision enables machines, such as computers, to process a scene within a field of view. Based on the processed scene, a machine can initiate one or more actions or operations. Computer vision systems can operate on two-dimensional (2D) data or on three-dimensional (3D) data.
One industry in which computer vision is used is the manufacturing industry. To illustrate, the manufacturing industry uses 2D computer vision image processing for many tasks, such as defect inspection and object recognition. For sophisticated and/or complex tasks, such as bin picking (e.g., navigating a robot to select a target object from a bin of randomly placed and stacked objects), 2D computer vision image processing is generally insufficient to achieve efficient and effective recognition and selection of desired target objects. To illustrate, in a bin picking situation where objects of one or more object types are stacked randomly, in different orientations, one on top of another, 2D computer vision image processing is computationally time consuming and has difficulty recognizing objects in various orientations and various positions, as illustrative, non-limiting examples. For such sophisticated and complex tasks, attempts to implement computer vision have extended into the 3D space.
However, use of 3D computer vision processing in sophisticated and complex environments and applications, such as bin picking in the manufacturing industry, poses several difficulties and challenges in accurately and efficiently classifying a 3D object and evaluating the 3D orientation and position of the object. For example, industrial components or parts are totally random in orientation and position, which makes conventional multi-view methods (e.g., methods using image data of a scene from two or more image capture devices) complex and inaccurate. As another example, features of some objects are similar in a multi-model scene for industrial parts, which makes texture-based 2D recognition methods incorporated into 3D computer vision processing less feasible. As a further example, incomplete feature extraction from 3D data due to occlusion and excessive light reflection reduces the reliability of the acquired 3D data and thus reduces the effectiveness of 3D computer vision processing in accurately recognizing objects. As yet another example, it has proven difficult for 3D computer vision processing in bin picking to achieve both efficiency and good accuracy in industrial applications.
One conventional approach to 3D computer vision processing uses depth image data that is generated based on 2D color features of a red-green-blue (RGB) image. This depth image data approach relies on the color and texture of objects to perform multi-view detection of an object. In industrial bin picking applications, where objects (e.g., parts) often lack color and/or texture, the depth image data approach lacks accuracy in identification and recognition of objects because the multi-view processing used to evaluate the orientation of an object is complex during a template generating phase, has quantization errors, and cannot accurately process and identify textureless objects.
Another conventional approach to 3D computer vision processing uses point cloud processing in which 3D features are extracted from 3D edges and surfaces of the objects to perform template matching. However, the conventional point cloud processing can be time consuming because data in all three dimensions are extracted and processed. Additionally, the conventional point cloud processing cannot adequately handle object recognition if objects do not have shape features (e.g., rich edges and/or curvatures), such as relatively planar parts that do not have sharp 3D features, so that unreliable 3D feature extraction may occur during the recognition process. Therefore, traditional point cloud based methods are slow and cannot adapt to planar parts, as planar parts do not have rich 3D features.
In light of the above, a method for planar object recognition based on a 3D point cloud approach for random bin picking applications is proposed in this invention. Specifically, this invention teaches a method with features of converting 3D objects to 2D objects for detection and further using an enhanced 3D object for the recognition process in order to enhance the efficiency and accuracy of planar object recognition.