Technical Field
The present description relates to object recognition.
One or more embodiments may apply to recognizing objects for which color information may be discriminating.
One or more embodiments may apply to recognizing objects in a mobile communication environment.
Description of the Related Art
Object recognition is a branch of computer vision concerned with identifying objects within images or video sequences.
For instance, searching content among, say, billions of images may be a complex task; conventional approaches still widely used, such as text-based, low-level or semantic approaches, may prove hardly satisfactory when dealing interactively with massive amounts of data.
Attention has been paid to the possibility of processing queries by other means, such as Content-Based Image Retrieval (CBIR). A basic concept underlying CBIR is analyzing the visual content of images, rather than relying on metadata; the visual content of images may be analyzed by resorting to various algorithms and techniques developed in different fields such as statistics, pattern recognition and computer vision.
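By way of illustration only, one classic content-based descriptor is a color histogram compared via histogram intersection. The following minimal sketch (plain Python; the function names and synthetic patches are hypothetical, introduced here purely for illustration and not taken from any standard) shows how two images may be compared on visual content alone, without metadata:

```python
def color_histogram(pixels, bins=4):
    """Quantize each RGB channel into `bins` levels and count pixels per joint
    bin, normalized so that the histogram sums to 1."""
    hist = {}
    for r, g, b in pixels:
        key = (r * bins // 256, g * bins // 256, b * bins // 256)
        hist[key] = hist.get(key, 0) + 1
    total = len(pixels)
    return {k: v / total for k, v in hist.items()}

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: sum over bins of the per-bin minima of the two
    normalized histograms; 1 means identical color distributions."""
    return sum(min(h1.get(k, 0.0), h2.get(k, 0.0)) for k in set(h1) | set(h2))

# Two synthetic 100-pixel patches that differ only in color.
red_patch = [(250, 10, 10)] * 100
blue_patch = [(10, 10, 250)] * 100
sim_same = histogram_intersection(color_histogram(red_patch), color_histogram(red_patch))
sim_diff = histogram_intersection(color_histogram(red_patch), color_histogram(blue_patch))
```

In this sketch, identical patches score a similarity of 1 while patches of different colors score 0; practical CBIR systems combine many such descriptors with more robust matching.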
Over the years, CBIR has attracted increasing attention in the research community and is now expanding towards commercial applications. For instance, adapting CBIR techniques to a mobile scenario or environment may lead to Mobile Visual Search (MVS) applications, which may provide an intuitive, seamless and direct way of presenting information, enabling Augmented Reality (AR) from a completely new perspective, e.g., by interacting directly with an object. For instance, a user may take a photograph of a certain (rigid) object and receive information about it, with (possibly augmented) audio, video or 3D graphics content.
The Tesco Homeplus Virtual Subway Store in Korea provides a non-conventional way of buying supermarket goods: images of objects available for purchase are displayed on subway walls and, while waiting for trains, subway users can purchase goods with their mobile phones by using QR code markers.
These objects might otherwise be recognized while dispensing with barcodes or QR codes, for example by taking advantage of the MPEG CDVS (Compact Descriptors for Visual Search) standard.
Various algorithms permit visual feature extraction and compression, e.g., by extracting features from gray-level images, that is, by taking into account only the luma (grayscale) information. Interest point detector and descriptor techniques may make it possible to recognize objects with certain characteristics (e.g., rigid, non-deformable objects). Grayscale interest point detectors and descriptors can thus be used in visual retrieval systems where people interact with objects in an augmented reality scenario. Visual retrieval systems based on interest point detectors and descriptors may permit, e.g., recognizing goods in a supermarket.
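As an illustration of grayscale interest point detection, the following sketch computes a Harris-style corner response on a synthetic luma image (the Harris detector is a well-known example of this class of techniques; it is used here only for illustration, the function name is hypothetical, and this is not the CDVS algorithm). Corners of the bright square score positive, edge midpoints negative, and flat regions zero:

```python
def harris_response(img, k=0.04):
    """Harris corner response R = det(M) - k * trace(M)^2 at each interior
    pixel, where M sums gradient products over a 3x3 window and gradients
    are simple central differences on the grayscale image."""
    h, w = len(img), len(img[0])
    Ix = [[0.0] * w for _ in range(h)]
    Iy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            Ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            Iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0
    R = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            sxx = syy = sxy = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    gx, gy = Ix[y + dy][x + dx], Iy[y + dy][x + dx]
                    sxx += gx * gx
                    syy += gy * gy
                    sxy += gx * gy
            R[y][x] = (sxx * syy - sxy * sxy) - k * (sxx + syy) ** 2
    return R

# Synthetic 14x14 grayscale image: an 8x8 bright square on a dark background.
img = [[255 if 2 <= y <= 9 and 2 <= x <= 9 else 0 for x in range(14)]
       for y in range(14)]
R = harris_response(img)
```

In a full detector, local maxima of the response above a threshold would become interest points, each then encoded by a descriptor computed from its grayscale neighborhood.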
A relevant factor in applications such as the "virtual supermarket" hinted at in the foregoing lies in that certain articles as presented in supermarkets or stores may be quite similar and differ only in the combination of colors of the package (e.g., the distribution of colors across the image, the dominant colors and so on). For instance, two CDs may actually be two versions of the same album, an "old" release and a "new" release, possibly remastered and/or including one or more bonus tracks, with the two versions differing only in color and/or some small lettering on the box/cover. This may well apply also to many other types of man-made objects.
Also, certain visual retrieval systems, like the CDVS Test Model discussed in "CDVS Test Model 6: Compact Descriptors for Visual Search, W13564," Incheon, Korea, 2013, may be unable to reliably distinguish so-called challenging objects, that is, very similar objects belonging to the same class.
While certain interest point detector and descriptor algorithms can also extract and codify color information, their possible application to, e.g., a mobile communication context may be limited by factors such as bandwidth or traffic requirements, or by the desire to ensure compatibility with previous implementations.