Computer vision is the scientific discipline of making machines that can “see”, so that they can extract information from an image and, based on that information, perform a task or solve a problem. The image data can take many forms, such as still images, video, views from multiple cameras, or multi-dimensional data from a medical scanner.
Known robotic perception systems achieve the desired performance and reliability by engineering specific lighting conditions, structuring viewing conditions, and exploiting process configuration. They are flexible only within a narrow range of conditions that covers a subset of real-world conditions, and may break down with minor changes in the surrounding environment. In addition, the processing speed of known systems and related techniques is not sufficient for efficient real-time processing. Turnkey commercial machine vision systems can be slow when wider flexibility is introduced, and are made to work robustly by rigorously structuring the domain. For example, processing a large field-of-view (FOV) to search for objects in unexpected orientations that occupy 5-10% of the FOV could take several seconds or more. This delay is further compounded when searching for front/back/side views to precisely find an object's location and pose. Furthermore, the cost associated with structuring the surroundings for known automation solutions in robot material transfer and handling applications can be three to ten times the cost of the robotic device itself. The range of products that known automation systems can handle efficiently can be limited, and is often restricted to just a handful of styles. Such systems are also cumbersome to retool and slow to reconfigure for a different class of products. Thus, existing automation solutions are not readily applicable to assembly operations that deal with a wide diversity of parts, due to issues related to investment, operations cost, flexibility, and reconfigurability.