1. Field of the Invention
The specification relates to a system and method for performing word detection. In particular, the specification relates to detecting words for optical character recognition (OCR) from an image, invariant to the local scale, rotation and position of the words.
2. Description of the Background Art
There is a gap between printed and electronic media. Software currently exists for bridging the gap by performing OCR on an image to identify text and then performing a subsequent action on the identified text. One such action is submitting the identified text to a database to find a matching result. For example, a user can capture an image of an object with a camera or a smart phone and send the image to the software. The software identifies the object in the image and provides the user with a website for purchasing the object or learning more about it.
Recognizing text from an image is useful because the text not only provides high-level semantic information about the content of the image but can also be used to search for related information. However, recognizing the text is challenging because the text in an image is often distorted. For example, in-plane rotational distortion is rotation about the normal vector, that is, the vector perpendicular to the plane containing the text. Out-of-plane rotational distortion is rotation about any other vector and can introduce perspective deformation of the text. When the text in an image has a single dominant orientation, current OCR approaches may work well. But current OCR approaches expect all of the text to be in the same orientation and, as a result, fail to identify text when multiple orientations are present in one image. For example, an image of a book could include text on the front cover and text along the spine of the book. In another example, an image could include both horizontal text and vertical text. In such cases, the outputs of current OCR approaches become unreliable.
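The difference between the two kinds of rotational distortion can be illustrated with a small numerical sketch. The following is not part of the described system; it assumes a simple pinhole camera model, and the names `rotation_matrix`, `project`, `edge_heights`, the box dimensions, and the camera distance are hypothetical choices made only for illustration. It rotates the four corners of a flat text box about the plane normal (in-plane) and about an axis lying in the plane (out-of-plane), then compares the projected heights of the left and right edges of the box.

```python
import numpy as np

def rotation_matrix(axis, angle_rad):
    """Rotation by angle_rad about the given axis (Rodrigues' formula)."""
    axis = np.asarray(axis, dtype=float)
    x, y, z = axis / np.linalg.norm(axis)
    K = np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])
    return np.eye(3) + np.sin(angle_rad) * K + (1.0 - np.cos(angle_rad)) * (K @ K)

def project(points, focal_length=1.0, camera_distance=5.0):
    """Pinhole projection; the text plane sits camera_distance in front of the camera (assumed value)."""
    depth = points[:, 2] + camera_distance
    return focal_length * points[:, :2] / depth[:, None]

def edge_heights(image_points):
    """Projected lengths of the left and right edges of the box."""
    left = np.linalg.norm(image_points[3] - image_points[0])
    right = np.linalg.norm(image_points[2] - image_points[1])
    return left, right

# Corners of a flat text box in the plane z = 0 (width 4, height 1), counter-clockwise.
corners = np.array([[-2.0, -0.5, 0.0], [2.0, -0.5, 0.0],
                    [2.0, 0.5, 0.0], [-2.0, 0.5, 0.0]])
angle = np.deg2rad(40.0)

# In-plane: rotate about the plane normal (z axis). Every corner keeps the same depth.
in_plane = project(corners @ rotation_matrix([0, 0, 1], angle).T)
# Out-of-plane: rotate about the y axis, which lies in the text plane.
out_of_plane = project(corners @ rotation_matrix([0, 1, 0], angle).T)

left_in, right_in = edge_heights(in_plane)
left_out, right_out = edge_heights(out_of_plane)
print("in-plane edge heights:    ", left_in, right_in)
print("out-of-plane edge heights:", left_out, right_out)
```

Under these assumptions, the in-plane rotation leaves the two edges the same projected length, so the box is only turned within the image. The out-of-plane rotation moves one edge closer to the camera than the other, so the two edges project to different lengths, which is the perspective deformation described above.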