The invention generally relates to robotics, and more specifically to an eye-in-hand robot.
Many tasks, such as washing glasses and silverware in a sink full of running water or assisting a surgeon by picking a metal tool from a metal tray, require a robot to detect and manipulate shiny and transparent objects. However, existing approaches to object detection struggle with these so-called “non-Lambertian objects” because the shiny reflections create vivid colors and gradients that change dramatically with camera position, fooling methods that are based on only a single camera image.
Because it can move its camera, a robot can obtain new views of the object, increasing robustness and avoiding difficulties in any single view. To benefit from this technique, the robot must integrate information across multiple observations. One approach is to use feature-based methods on individual images, but this approach does not incorporate information about the viewing angle and can still struggle with non-Lambertian objects. Other approaches create three-dimensional (3D) meshes from multiple views but do not work well with non-Lambertian objects, which are still considered an open problem.