Robotic systems, such as a robotic manipulator containing a gripping component, may be used for applications involving picking up or moving objects. For instance, a robotic device may be used to fill a container with objects, create a stack of objects, or unload objects from a truck bed. In some cases, all of the objects may be of the same type. In other cases, a container or truck may contain a mix of different types of objects, such as boxed items, cans, tires, or other stackable objects. Such robotic systems may direct a robotic manipulator to pick up objects based on predetermined knowledge of where objects are in the environment.
In some examples, a robotic system may use computer vision techniques to determine a representation of three-dimensional (3D) scene geometry. By way of example, a robotic system may triangulate information observed from at least two known viewpoints to determine a representation of 3D scene geometry. For instance, a stereo imaging system with two cameras can be used to determine the depth to points in a scene, as measured from the center point of the line between the cameras' focal points (i.e., the baseline). If corresponding features in two or more images of an object are identified, the rays that pass from each viewpoint through the corresponding image points may be intersected to find the 3D position of the object or the depth to the object. In some instances, the 3D scene geometry may be represented in a depth map or depth image, which contains information relating to the distance of surfaces of objects in the scene from a camera's focal point.
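As an illustration of the ray intersection described above, the following is a minimal sketch rather than the method of any particular system. It assumes each viewpoint is given as a ray origin (the focal point) and a direction toward the matched feature, both expressed in a common coordinate frame. Because measured rays rarely intersect exactly, the sketch returns the midpoint of the shortest segment joining the two rays. The function names `triangulate_midpoint` and `depth_from_baseline_center` are hypothetical and chosen only for this example.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def _sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _add(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def _scale(a: Vec3, s: float) -> Vec3:
    return (a[0] * s, a[1] * s, a[2] * s)


def _dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def triangulate_midpoint(o1: Vec3, d1: Vec3, o2: Vec3, d2: Vec3) -> Vec3:
    """Estimate the 3D point observed along two rays.

    Each ray is origin + parameter * direction. The result is the midpoint
    of the shortest segment connecting the two rays, which equals the true
    intersection when the rays actually meet.
    """
    w = _sub(o1, o2)
    a = _dot(d1, d1)
    b = _dot(d1, d2)
    c = _dot(d2, d2)
    d = _dot(d1, w)
    e = _dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        # Parallel rays carry no depth information.
        raise ValueError("rays are parallel; cannot triangulate")
    s = (b * e - c * d) / denom  # parameter of closest point on ray 1
    t = (a * e - b * d) / denom  # parameter of closest point on ray 2
    p1 = _add(o1, _scale(d1, s))
    p2 = _add(o2, _scale(d2, t))
    return _scale(_add(p1, p2), 0.5)


def depth_from_baseline_center(o1: Vec3, o2: Vec3, point: Vec3) -> float:
    """Distance from the center of the baseline (o1 to o2) to the point."""
    center = _scale(_add(o1, o2), 0.5)
    diff = _sub(point, center)
    return math.sqrt(_dot(diff, diff))


# Two focal points 0.2 units apart, both observing a feature at (0, 0, 2).
left, right = (-0.1, 0.0, 0.0), (0.1, 0.0, 0.0)
target = (0.0, 0.0, 2.0)
estimate = triangulate_midpoint(
    left, _sub(target, left), right, _sub(target, right)
)
depth = depth_from_baseline_center(left, right, estimate)
```

In this example the two rays meet exactly, so `estimate` recovers the target position and `depth` is the distance from the baseline's center point to it. With noisy correspondences, the midpoint serves as a simple compromise between the two rays; other estimators could be substituted.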