Mobile devices such as smartphones and tablets are increasingly used to capture images of documents such as tax forms, insurance claims, bank transactions, and receipts. One important function in such applications is binarization, which converts a color or grayscale document image to a bi-tonal image. Binarization can significantly reduce image storage size, and is also a precursor for most optical character recognition (OCR) algorithms. While considerable strides have been made in document image binarization, significant challenges remain in the scenario of mobile capture, wherein images can be captured under a variety of environmental conditions. A common example is capture under low-light conditions, which can produce images of low contrast and low signal-to-noise ratio. Binarization of such images yields broken, fragmented, or connected characters, leading to poor readability and OCR performance. Supplementing ambient light with additional illumination, such as a camera flash, will usually significantly improve overall capture quality. However, a strongly directed flash illumination often produces a specular reflection resulting in a “flash spot” 10 (FIG. 2A) wherein the printed content is no longer discernible. This artifact may result in the loss of a critical piece of information in a document (for example, the monetary amount on a check). Thus, in low-light environments, the result can be either an image with low contrast and noise, or a bright image with a hot-spot where content is unreadable.
Prior known methods to solve these problems include capturing a pair of images, one with and one without flash, in rapid succession. The two images are aligned, binarized, and fused in the vicinity of the flash spot region (“FSR”). Fusion is such that the flash image is retained everywhere except in the FSR, where content from the no-flash image is incorporated. This method does considerably improve output quality under low-light conditions. However, one limitation is that image quality within the FSR is only as good as that of the no-flash image. If the quality of the no-flash image is poor, the fused image exhibits artifacts such as changed stroke width, broken characters, and noise, and the overall experience of reading and using the document suffers. It is preferable to perform fusion in such a way that the blended region exhibits structural characteristics similar to those of the remainder of the document, with a smooth transition in quality between different regions of the document. Where strong differences occur between the binarized versions of the flash and no-flash images, the result is visually disturbing transitions in the vicinity of the FSR. There is thus a need for an improved system that can overcome these problems.
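By way of illustration only, the prior flash/no-flash fusion described above may be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of any particular prior method: the two images are assumed to be already aligned and of equal size, binarization is assumed to be a simple global threshold, and the FSR is assumed to be supplied as a precomputed mask. The names `binarize`, `fuse_binary`, and `fsr_mask`, and the threshold values, are hypothetical.

```python
from typing import List

Image = List[List[int]]   # grayscale rows, pixel values 0..255
Bitmap = List[List[int]]  # bi-tonal rows, 1 = ink, 0 = background


def binarize(gray: Image, threshold: int = 128) -> Bitmap:
    """Assumed global threshold: pixels darker than `threshold` become ink."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def fuse_binary(flash_bin: Bitmap, noflash_bin: Bitmap, fsr_mask: Bitmap) -> Bitmap:
    """Keep the flash pixel everywhere except inside the FSR (mask == 1),
    where the no-flash pixel is used instead. Inputs must be aligned."""
    shape = [len(row) for row in flash_bin]
    if [len(r) for r in noflash_bin] != shape or [len(r) for r in fsr_mask] != shape:
        raise ValueError("flash, no-flash, and mask must have identical dimensions")
    return [
        [nf if m else f for f, nf, m in zip(f_row, nf_row, m_row)]
        for f_row, nf_row, m_row in zip(flash_bin, noflash_bin, fsr_mask)
    ]
```

Because pixels inside the mask are copied directly from the no-flash binarization, any broken strokes or noise in that image carry over unchanged into the fused result, and the hard switch at the mask boundary is where abrupt transitions can appear.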