Industrial robots repeat the same task with high accuracy and precision. In some industrial applications, such as manufacturing and assembly, robots pick parts (objects) and place them for subsequent processing. To do so, the robots need to know the pose (position and orientation) of the objects. Any deviation from the expected pose can result in suboptimal performance, or even damage to the robotic arm or object.
Typically, custom designed mechanical and electro-mechanical systems are used to pick the objects with a known pose. In some applications, the objects are first sorted manually to facilitate the picking by the robot.
Robots can use computer vision techniques to determine the pose of the objects before the objects are picked. However, deployment of computer vision enabled robots continues to be limited because of numerous technical difficulties. Current systems can only pick a single non-occluding object from a bin of objects, or well separated objects. Systems have been designed to pick stacked objects, but the precise stacking of objects also needs a complex mechanical system, or human intervention.
Most computer vision systems lack reliability, accuracy and robustness and use expensive sensors and hardware. Current systems lack the capability of picking objects that are randomly arranged in a haphazard manner on top of each other in a pile or in a bin.
The problem of object picking is not new. Some systems use electro-mechanical devices. Typically, the robot arm is equipped with a specially designed grasper for the object to be picked. However, the robot arm grasper needs to know the pose of the object to be picked. Methods such as precise positioning can be used to present the object in a specific pose to the robot arm. These systems are expensive, lack inter-operability because they need to be designed specifically for each object, and cannot handle objects randomly arranged in a bin.
Computer vision based systems typically use multiple cameras and illumination devices to analyze the scene and locate the object and to provide feedback to the robot arm for subsequent picking operations.
Most 2D computer vision systems can determine the in-plane orientation and location of an object, but cannot determine the out-of-plane rotation of the object or the distance to it. Typically, those systems require objects to be non-overlapping and placed on a flat surface. Thus, those systems cannot operate on a random pile of objects, or a bin of objects.
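As an illustration of what a 2D system can recover, the following minimal sketch estimates the in-plane location and orientation of an isolated object from the coordinates of its foreground pixels using second-order image moments. This is a generic textbook technique chosen for illustration, not a method described in the cited systems; the function name `in_plane_pose` and the pixel-list input format are assumptions.

```python
import math


def in_plane_pose(pixels):
    """Estimate (cx, cy, angle_rad) from foreground pixel coordinates.

    `pixels` is a hypothetical input: an iterable of (x, y) coordinates
    belonging to one segmented, non-overlapping object.
    """
    pts = list(pixels)
    n = len(pts)
    if n == 0:
        raise ValueError("no foreground pixels")

    # Centroid gives the in-plane location.
    cx = sum(x for x, _ in pts) / n
    cy = sum(y for _, y in pts) / n

    # Central second-order moments describe the spread of the shape.
    mu20 = sum((x - cx) ** 2 for x, _ in pts) / n
    mu02 = sum((y - cy) ** 2 for _, y in pts) / n
    mu11 = sum((x - cx) * (y - cy) for x, y in pts) / n

    # Orientation of the principal axis, measured from the x axis.
    angle = 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)
    return cx, cy, angle


# A thin bar lying along the image diagonal.
diag_cx, diag_cy, diag_angle = in_plane_pose([(i, i) for i in range(10)])

# A thin bar lying along the x axis.
flat_cx, flat_cy, flat_angle = in_plane_pose([(i, 0) for i in range(10)])
```

The result contains only two translations and one rotation in the image plane. Nothing in it constrains the out-of-plane rotation or the distance to the object, and the centroid and moments are only meaningful when the pixels belong to a single unoccluded object.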
Some computer vision systems augment the 2D vision system by also calculating the distance to the object from changes in the size of the object in images. However, those systems cannot determine the out-of-plane rotation, and their depth estimates are often unreliable. 3D computer vision systems typically use sensors for determining the 3D geometry of the scene.
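The size-based depth estimate mentioned above can be sketched with a pinhole camera model, in which an object of known width W at distance Z appears with width w = f W / Z pixels. The sketch below is illustrative only: the function names, the camera parameters, and the weak-perspective treatment of tilt (foreshortening by the cosine of the tilt angle) are all assumptions, not details taken from any cited system.

```python
import math


def apparent_width_px(focal_px, real_width_m, depth_m, tilt_rad=0.0):
    """Image width of an object under a weak-perspective approximation.

    Tilting the object out of the image plane foreshortens its width
    by cos(tilt). This is an assumed simplification.
    """
    return focal_px * real_width_m * math.cos(tilt_rad) / depth_m


def depth_from_size(focal_px, real_width_m, image_width_px):
    """Invert the pinhole relation w = f * W / Z, assuming no tilt."""
    if image_width_px <= 0:
        raise ValueError("image width must be positive")
    return focal_px * real_width_m / image_width_px


# Hypothetical camera and part: 800 px focal length, 0.05 m wide part at 2 m.
FOCAL_PX, PART_WIDTH_M, TRUE_DEPTH_M = 800.0, 0.05, 2.0

# Fronto-parallel part: the estimate matches the true depth.
w_flat = apparent_width_px(FOCAL_PX, PART_WIDTH_M, TRUE_DEPTH_M)
z_flat = depth_from_size(FOCAL_PX, PART_WIDTH_M, w_flat)

# Same part tilted 60 degrees out of plane: it looks narrower, so the
# size-based estimate mistakes the tilt for extra distance.
w_tilt = apparent_width_px(FOCAL_PX, PART_WIDTH_M, TRUE_DEPTH_M, math.radians(60))
z_tilt = depth_from_size(FOCAL_PX, PART_WIDTH_M, w_tilt)
```

Under these assumptions the tilted part is estimated to be about twice as far away as it really is, which illustrates why an out-of-plane rotation and a change in distance cannot be separated from apparent size alone.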
Stereo vision systems use two cameras to determine the depth of the object. Corresponding features are localized in the images acquired by the two cameras, and the geometric relationship between the cameras can be used to identify the depth of feature points. However, finding corresponding features is a challenging problem, especially for machine objects, which are often shiny and have a homogeneous featureless texture. In addition, stereo vision systems have a high degree of sensitivity to noise during feature localization. Another problem with stereo systems is that the depths are only recovered at the feature points, and not for the entire object. The reduced accuracy can be tolerated for certain applications such as unracking large body panels in body shops, but not for accurate bin picking of small objects with mirror-like surfaces.
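The geometric relationship and the noise sensitivity described above can be made concrete for the simplest case of a rectified stereo pair, where depth follows Z = f B / d for focal length f (pixels), baseline B, and disparity d (pixels). The following is a minimal sketch under that assumption; the function names and the rig parameters are hypothetical.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth of a matched feature in a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_error(focal_px, baseline_m, depth_m, disparity_noise_px):
    """First-order depth error for a given disparity error.

    Differentiating Z = f * B / d gives |dZ| ~= Z**2 / (f * B) * |dd|,
    so the error grows with the square of the depth.
    """
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_noise_px


# Hypothetical rig: 800 px focal length, 0.1 m baseline.
FOCAL_PX, BASELINE_M = 800.0, 0.1

# A feature matched with a disparity of 40 px.
z = depth_from_disparity(FOCAL_PX, BASELINE_M, 40.0)

# Effect of a half-pixel feature localization error at that depth.
err = depth_error(FOCAL_PX, BASELINE_M, z, 0.5)

# The same localization error at twice the depth is four times worse.
err_far = depth_error(FOCAL_PX, BASELINE_M, 2.0 * z, 0.5)
```

With these assumed numbers, a half-pixel localization error at 2 m corresponds to a depth error of roughly 2.5 cm. Note also that the computation yields depth only at the matched feature points, not over the whole object surface.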
Laser triangulation uses structured light to generate a pattern on the surface of the object, which is imaged by a camera, see U.S. application Ser. No. 11/738,642, “Method and System for Determining Objects Poses from Range Images,” filed Apr. 23, 2007. Laser triangulation can recover a 3D point cloud on the object surface. That technology has been used for applications involving edge tracking for welding, sealing, glue deposition, grinding, waterjet cutting and deburring of flexible and dimensionally unstable objects. Laser based systems require registration and accounting for shadows and occlusions. Laser systems have not been commercialized successfully for general random bin picking. In addition, the use of lasers leads to safety issues when deployed in close proximity to operators.
U.S. patent application Ser. No. 11/936,416, “Method and System for Locating and Picking Objects Using Active Illumination,” filed Nov. 7, 2007, by Ramesh Raskar et al., describes a bin picking system that connects depth edges to form contours, and then uses an occlusion graph to match the contours to obtain the pose. However, that system only tries to find unoccluded objects in the scene and has difficulties when a large portion of the object is occluded. That system also requires an additional segmentation step, which can itself be prone to error.