The following description includes information that may be useful in understanding the present invention. It is not an admission that any of the information provided herein is prior art or relevant to the presently claimed invention, or that any publication specifically or implicitly referenced is prior art.
There have been many recent advances in image processing techniques to recognize objects. One fairly old example is the Scale Invariant Feature Transform (SIFT; U.S. Pat. No. 6,711,293 to Lowe titled “Method and Apparatus for Identifying Scale Invariant Features in an image and Use of the Same for Locating an object in an image”, filed Mar. 6, 2000). Objects can be identified within image data by using SIFT-based descriptors derived from the image data to look up content information related to known objects, where the content information has been indexed according to the descriptor space. Additional examples of recognizing objects include co-owned U.S. Pat. Nos. 7,016,532, 8,224,077, 8,224,078, and 8,218,873.
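By way of illustration only, the descriptor-based lookup described above can be sketched as a nearest-neighbor search over an index of known descriptors. This is a minimal sketch under stated assumptions and is not the method of any referenced patent: the names `nearest_object`, `INDEX`, and `max_distance` are hypothetical, the index is a plain list searched linearly, and the toy descriptors have four dimensions for readability.

```python
import math

def nearest_object(query, index, max_distance):
    """Return the label of the closest indexed descriptor, or None.

    `index` is a list of (descriptor, label) pairs. A match is accepted
    only if its Euclidean distance to `query` is within `max_distance`.
    All names here are hypothetical and chosen for illustration.
    """
    best_label, best_dist = None, float("inf")
    for descriptor, label in index:
        dist = math.dist(query, descriptor)  # Euclidean distance
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist <= max_distance else None

# Toy 4-dimensional descriptors (assumed values); SIFT descriptors
# are conventionally 128-dimensional.
INDEX = [
    ((0.0, 0.0, 1.0, 1.0), "mug"),
    ((1.0, 1.0, 0.0, 0.0), "logo"),
]
```

A practical system would likely replace the linear scan with a spatial index (e.g., a k-d tree) so that lookup time does not grow linearly with the number of known objects.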
Such traditional image processing and object recognition techniques are quite suitable for recognizing well understood, specific objects (e.g., a person's face, a fingerprint, a bar code, etc.). However, they often fail when applied to generic objects lacking sufficient features for identification (e.g., a logo, a cup or mug, etc.). Furthermore, known methods of edge detection are not suitable for use in consumer grade products due to excessive computational resource requirements, especially when edges are used for object recognition, object classification, object tracking, or other types of object image data analysis. Further, the number of false positives generated by known techniques renders the techniques less than usable in markets where consumers have high expectations for accuracy. One approach that could aid in classifying objects represented in image data could leverage information relating to the apparent edges of objects.
Some effort has been directed to identifying edges and quantifying them for use in identifying objects. One example includes the techniques described by Damen et al. titled “Real-Time Learning & Detection of 3D Textureless Objects: A Scalable Approach”, 2012. Damen describes using a Line-Segment Detector and a Canny Edge Map to identify edgelets in image data. The edgelets are used to form constellations of edgelets, which can be used to search for related objects. Unfortunately, the Damen approach is unsuitable for use in resource-limited embedded systems (e.g., cell phones, etc.) because the time required to identify and process edges is too long to support frame rate video (e.g., greater than 20 fps) on an embedded device. Further, the Damen approach generates an inordinate number of false positives, which is unsuitable for use in a consumer market that demands high accuracy.
Some progress has been made toward analyzing image data to identify characteristics of object geometries as described by “A Computational Framework for Segmentation and Grouping” Medioni et al. Copyright 2000, Elsevier Science B. V., ISBN 0 444 50353 6. Medioni describes using derived tensor fields from image data to identify geometric properties of objects represented by the image data. The geometric properties are then used to identify shapes within the image data where shapes can be considered to better conform to how a human would perceive the shapes. Although useful for identifying presence of shapes, Medioni fails to provide insight into how to resolve the issues related to high consumer demand. For example, the tensor algorithms used in the Medioni approach are quite computationally intensive.
Interestingly, the Medioni tensor-based approach provides a saliency measure that represents a perceived importance for geometrical structures. However, the saliency measure is only used internally when determining the geometric structures. What has yet to be appreciated is that a saliency measure can be leveraged beyond mere identification of geometrical features. As described by the Applicant's work below, saliency, among other metrics, can also be used to indicate which edges are perceived as being most important to work with, thus decreasing compute time and decreasing false positives.
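As a purely illustrative sketch of the idea above, candidate edges that already carry a saliency score can be ranked so that only the most salient edges are passed to later, more expensive processing. The sketch assumes the saliency values have been computed elsewhere; the names `Edge`, `select_salient_edges`, and `keep` are hypothetical and do not describe the Applicant's or Medioni's implementation.

```python
from typing import List, NamedTuple

class Edge(NamedTuple):
    """A candidate edge with a precomputed saliency score (hypothetical type)."""
    edge_id: int
    saliency: float

def select_salient_edges(edges: List[Edge], keep: int) -> List[Edge]:
    """Return the `keep` edges with the highest saliency, most salient first."""
    ranked = sorted(edges, key=lambda e: e.saliency, reverse=True)
    return ranked[:keep]

# Example input with assumed saliency values.
candidates = [Edge(0, 0.2), Edge(1, 0.9), Edge(2, 0.5)]
top_edges = select_salient_edges(candidates, keep=2)
```

Because downstream work scales with the number of edges retained, limiting that number bounds the compute time, and discarding low-saliency edges removes candidates that are more likely to be spurious.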
All publications herein are incorporated by reference to the same extent as if each individual publication or patent application were specifically and individually indicated to be incorporated by reference. Where a definition or use of a term in an incorporated reference is inconsistent or contrary to the definition of that term provided herein, the definition of that term provided herein applies and the definition of that term in the reference does not apply.
Thus, there is still a need for improved edge-based recognition systems capable of quickly reducing false positives.