The present disclosure relates generally to object description. More specifically, the present disclosure relates to describing objects in images using descriptors based on features associated with edge pixels.
Object detection is among the most widely studied topics in computer vision. Techniques used for object detection or recognition from images usually have several characteristics, including: invariance to specific transformations (e.g., similarity transformations, affine transformations, etc.); robustness to photometric distortions and noise; computational efficiency; and, depending on the particular task, the ability to generalize to object categories. Many of the widely used existing object-detection techniques are based on descriptors, i.e., compact representations of local features in images, such as blobs, corners, and other types of salient regions extracted from images. These descriptors may be matched to an archival data structure (which is sometimes referred to as a ‘model library’) to detect objects.
In order for the existing object-detection techniques to be effective, there usually needs to be sufficient information on the object surface for features to be detected and accurately described. In particular, a sufficient amount of this information (which is henceforth referred to as ‘texture’) typically needs to be present in order for the features to be detected in a repeatable manner and to allow the informative features to be described. In addition, the texture often needs to be specific to each object so that successive matching and detection stages have discriminative power.
However, the performance of most existing descriptor-based object-detection techniques is often dramatically degraded with texture-less objects. In particular, the existing description techniques usually provide poor image descriptions and inaccurate matching for texture-less objects. This is a problem because texture-less objects are very common. For example, texture-less objects occur in many computer-vision tasks related to advanced manufacturing, such as visual inspection for process or quality control, as well as robot guidance. Another emerging application in which the objects of interest often may lack feature-rich surface texture is visual perception for service robotics, such as where personal robots interact with typical household materials. As a consequence, texture-less object detection is an active area of research in computer vision.
In contrast with the blob-like image regions in textured objects, the natural and distinctive features of texture-less objects are often edges and their geometric relations. Therefore, many proposed techniques for detecting texture-less objects involve edge-based template matching. In principle, these proposed object-detection techniques can seamlessly detect both textured and texture-less objects. However, the performance of the proposed object-detection techniques is often degraded when there is significant occlusion and clutter. For example, if a proposed object-detection technique is to tolerate a high degree of occlusion (i.e., only a small fraction of the template edges needs to match in order to trigger detection), the resulting cue or feature is often not unique, so that a large number of false detections occur when the image is significantly cluttered.
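The trade-off between occlusion tolerance and false detections can be illustrated with a minimal sketch. It is not taken from any particular proposed technique: the function names, the toy edge maps, and the thresholds 0.9 and 0.5 are all assumptions chosen for illustration, and edge maps are simplified to sets of (row, column) pixel coordinates.

```python
def edge_match_fraction(template_edges, image_edges, offset):
    """Fraction of template edge pixels that coincide with an image edge
    pixel when the template is shifted by `offset` (rows, columns)."""
    dy, dx = offset
    hits = sum((y + dy, x + dx) in image_edges for (y, x) in template_edges)
    return hits / len(template_edges)


def detect(template_edges, image_edges, search_offsets, min_fraction):
    """Return every offset whose matching-edge fraction reaches the threshold."""
    return [
        offset
        for offset in search_offsets
        if edge_match_fraction(template_edges, image_edges, offset) >= min_fraction
    ]


# Hypothetical template: an L-shaped outline made of 7 edge pixels.
template = {(0, 0), (0, 1), (0, 2), (0, 3), (1, 0), (2, 0), (3, 0)}

# The object sits at offset (10, 10), but its vertical stroke is occluded,
# so only the 4 pixels of its top row remain visible.
visible_object = {(10, 10), (10, 11), (10, 12), (10, 13)}

# Clutter: an unrelated horizontal edge on row 2, columns 0 through 7.
clutter = {(2, x) for x in range(8)}

image = visible_object | clutter
offsets = [(dy, dx) for dy in range(12) for dx in range(12)]

# A strict threshold rejects the clutter but also misses the occluded object.
strict = detect(template, image, offsets, min_fraction=0.9)

# A lenient threshold recovers the object, yet the clutter now matches too.
lenient = detect(template, image, offsets, min_fraction=0.5)

print("strict:", strict)
print("lenient:", lenient)
```

In this toy scene the strict threshold returns no detections at all, while the lenient threshold returns the true offset together with several offsets that lie on the clutter edge, which is the ambiguity described above.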
In addition, it can be difficult to scale the proposed object-detection techniques to larger model libraries. In particular, while efficient search techniques, as well as careful hardware-related optimization, have been used to speed up the look-up process, in general object detection usually involves matching a large set of views (as determined by the desired degree of pose invariance) for each sought object to the input image. Therefore, the search time often grows linearly with the size of the model library. As a consequence, when a relatively large pose space is explored, only a few models can usually be handled by the edge-based template matching in many proposed object-detection techniques.