The exemplary embodiment relates to object localization and finds particular application in connection with a system and method which uses segmentation information from similar images to localize a target object region and to segment a selected image.
Object localization (OL) relates to determining the location of a target object in an image. Image segmentation approaches have been developed for locating specific types of objects in photographic images. For example, photographs of vehicles may be subjected to segmentation techniques to identify the region of the image which corresponds to a license plate. Optical character recognition (OCR) techniques may then be applied to this region to identify the license number or to determine whether it matches another license plate.
Existing segmentation techniques are based on heuristics which exploit the a priori known characteristics of the object to be segmented, such as characteristics of text. For example, some exploit the frequent presence of horizontal and vertical edges. See, for example, Wonder Alves, et al., “Text localization in scene images by morphological filters,” in SIBGRAPI, 2009, and Toan Dinh Nguyen, et al., “Tensor voting based text localization in natural scene images,” IEEE Signal Processing Letters, 17, July 2010. Others rely on high local contrast or constant stroke width. See, for example, Paolo Comelli, et al., “Optical recognition of motor vehicle license plates,” IEEE Trans. on VT, 44, November 1995, and Boris Epshtein, et al., “Detecting text in natural scenes with stroke width transform,” in CVPR, pages 2963-2970, 2010. These techniques have rather narrow applicability. Because the prior knowledge of the images of interest is incorporated into the software, such methods do not generalize well to other segmentation tasks.
One approach for data-driven object localization (DDOL) is described in copending application Ser. No. 13/351,038, filed on Jan. 16, 2012, entitled IMAGE SEGMENTATION BASED ON APPROXIMATION OF SEGMENTATION SIMILARITY, by José Antonio Rodriguez Serrano (hereinafter, “the '038 application”), the disclosure of which is incorporated herein by reference in its entirety. As described therein, the object location for an image is determined by first computing the similarity between the image and each of a set of database images in which a similar object of interest has been localized, and then either transferring the location information of the object in the most similar image to the input image or combining the locations for the top-k most similar images. In the case of license plates, for example, the method yields high accuracy when the database images are small (approximately 200 pixels wide), the target text region occupies a significant portion of the image, and the text region is biased to be near the center of the image.
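By way of illustration only, the similarity-and-transfer idea described above can be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation of the '038 application: the function names `cosine` and `transfer_location` are hypothetical, each image is assumed to be already represented by a fixed-length feature vector, cosine similarity stands in for whatever similarity measure is actually used, and a similarity-weighted average is assumed as one possible way of combining the locations of the top-k most similar images.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors.

    Assumption: cosine similarity is used here only as a stand-in
    for the similarity measure of the actual method.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def transfer_location(query_feat, database, k=1):
    """Estimate an object box for a query image from annotated database images.

    database: list of (feature_vector, box) pairs, where box is
    (x0, y0, x1, y1) in coordinates relative to the image size
    (a hypothetical layout chosen for this sketch).
    With k=1 the box of the most similar image is transferred;
    with k>1 the top-k boxes are averaged, weighted by similarity.
    """
    if not database:
        raise ValueError("database must contain at least one annotated image")

    # Score every database image against the query, keep the k best.
    scored = sorted(
        ((cosine(query_feat, feat), box) for feat, box in database),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]

    # Negative similarities are clipped so they cannot flip the average.
    weights = [max(sim, 0.0) for sim, _ in scored]
    total = sum(weights)
    if total == 0:
        # Fall back to a plain average if no image is positively similar.
        weights = [1.0] * len(scored)
        total = float(len(scored))

    return tuple(
        sum(w * box[i] for w, (_, box) in zip(weights, scored)) / total
        for i in range(4)
    )


# Toy database: three images with two-dimensional features and known boxes.
db = [
    ([1.0, 0.0], (0.1, 0.1, 0.5, 0.5)),
    ([0.0, 1.0], (0.5, 0.5, 0.9, 0.9)),
    ([1.0, 1.0], (0.3, 0.3, 0.7, 0.7)),
]
nearest_box = transfer_location([1.0, 0.1], db, k=1)
blended_box = transfer_location([1.0, 0.1], db, k=2)
```

In this toy example, `nearest_box` is the box of the first database image, whose feature vector is closest to the query, while `blended_box` lies between the boxes of the two most similar images. Boxes are expressed in relative coordinates so that a transferred location remains meaningful when the query and database images differ in size.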
It would be desirable to be able to apply the DDOL approach to more complex localization situations, such as locating the license plate given an image of an entire vehicle. Here, the object of interest only occupies a relatively small region of the image. The exemplary embodiment provides a two-stage method of object localization which is particularly useful for such situations.