The personalization and customization of images as a way to add value to documents has been gaining interest in recent times. This is especially true in transactional and promotional marketing applications, but is also gaining traction in more image-intensive markets such as photo finishing, whereby personalized calendars, photobooks, greeting cards, and the like are created. One approach to personalizing an image is to incorporate a personalized text message into the image, with the effect that the text appears to be a natural part of the image. Several technologies currently exist to personalize images in this fashion, offered by vendors such as XMPie, DirectSmile, and AlphaPictures, for example. In such applications, a photorealistic result is desired in order to convey the intended effect. However, these approaches are cumbersome and complicated, requiring sophisticated design tools and input from designers with image processing experience. For this reason, designers are often hired to create libraries of stock personalization templates for customers to use. This limits the images that the customer can draw from for personalization.
A natural choice for incorporating personalized text into an image is a location where text already exists, such as a street sign, store sign, or banner. The automatic detection of text in images is a very interesting and broadly studied problem. The problem can be divided into two subcategories: detecting and recognizing text in documents, and finding text in natural scenes. Document text detection has been tackled by researchers, and is a precursor to optical character recognition (OCR) and other document recognition technologies. However, text detection techniques applicable to documents work poorly at best, and often not at all, on text found in real image scenes, as the text can typically bear different perspectives and can vary significantly in many different respects, such as size, location, shading, font, etc. Furthermore, the detection algorithm may be confused by other details and structures in the image. State-of-the-art techniques generally make assumptions and thus constrain themselves to a subset of the general problem. For example, in license plate recognition, the license plate images are usually obtained in a controlled environment with little variation in perspective, location, angle, distance, etc. Furthermore, many of these algorithms are computationally costly, which renders them ill-suited for real-time or interactive applications.
What are therefore needed are convenient and automated systems and methods that facilitate automatically detecting text regions in natural scenes in electronic images for use in image personalization and other applications.