The present invention relates to optical character recognition (OCR) systems and, in particular, to a new method of binarization in such an optical character recognition system.
Binarization of scanned gray scale images is the first step used in a document image analysis system such as an optical character recognition system. It consists in labeling each pixel as either text or background.
In automatic parcel sorting systems, the address is read and decoded from a gray scale image. The image is captured by a camera located above the parcel while the parcel travels on a conveyor. Such a system is typically located in an open space building and operates under harsh conditions. The gray scale images suffer from unstable lighting, distortions due to inclination and pixel smearing, poor contrast, and a resolution that varies with the height of the parcel. Moreover, parcels are covered with plastics which cause reflections, tapes which truncate the address label, logos, textures and graphics. All these variables make the binarization process in a sorting system very complicated, and the choice of the "best" binarization method is very difficult. A binarization method of several steps is required in order to handle this large range of images.
An evaluation of the different methods of binarization (see the article "Goal-directed evaluation of binarization methods" by O. Trier et al., published in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 17, No. 12, December 1995) has shown that the Niblack method with a postprocessing step appears to be the best one. However, in this method, the estimated threshold is an absolute threshold above which the pixel is set to text, the rest being set to background. The width of the text stroke is not considered in such an estimation and, conversely, the background pixels contribute to the estimation, which results in an inaccurate threshold.
Accordingly, the main object of the invention is to provide a method of binarization in an OCR system which uses an estimation of the relative threshold between the intensity of the text and that of the background, based only upon the pixels of the text.
The invention therefore relates to a method of binarization in an optical character recognition system wherein a scanned gray scale image contains text to be recognized in the form of strokes having a known stroke width corresponding to several image pixels. Such a method consists in determining text pixels by checking, for each pixel, that the difference between its value and the values of a plurality of pixels located at a predetermined distance therefrom is greater than a relative threshold corresponding to the difference in intensities between the text and the background of the image; subsampling the image at a rate corresponding to at least two pixels in order to detect kernels of text; and binarizing the image pixels only in tiles which contain text kernels and whose sides are several stroke widths long, by using in each tile an absolute threshold estimated in this tile.
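The last two steps of the method stated above (kernel detection by subsampling, then tile-wise binarization) can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the implementation of the invention: the function name, the parameters tile_factor and step, the reading of a "kernel" as a sampled position marked as text, and the midpoint estimator for the absolute threshold are all hypothetical, and dark text on a light background is assumed. The text mask is taken as given, i.e. as the result of the relative-threshold test.

```python
def binarize_in_text_tiles(image, text_mask, stroke_width, tile_factor=4, step=2):
    """Binarize only the tiles that contain a text kernel.

    image: gray levels as a list of rows. text_mask: same shape, True where
    the relative-threshold test marked a text pixel. Returns 1 for text and
    0 for background. Assumes dark text on a light background.
    """
    height, width = len(image), len(image[0])
    tile = tile_factor * stroke_width  # tile side: several stroke widths
    output = [[0] * width for _ in range(height)]
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            rows = range(top, min(top + tile, height))
            cols = range(left, min(left + tile, width))
            # Subsampling: only every `step`-th pixel is inspected for kernels.
            if not any(text_mask[y][x] for y in rows[::step] for x in cols[::step]):
                continue  # no text kernel: the whole tile stays background
            text = [image[y][x] for y in rows for x in cols if text_mask[y][x]]
            back = [image[y][x] for y in rows for x in cols if not text_mask[y][x]]
            # Assumed estimator: midpoint between mean text and mean background.
            if back:
                threshold = (sum(text) / len(text) + sum(back) / len(back)) / 2
            else:
                threshold = max(text)
            for y in rows:
                for x in cols:
                    output[y][x] = 1 if image[y][x] <= threshold else 0
    return output


# A vertical stroke (columns 2 and 3) plus a dark speck in a tile without text.
image = [[50 if x in (2, 3) else 200 for x in range(16)] for _ in range(8)]
image[4][12] = 60
mask = [[x in (2, 3) for x in range(16)] for _ in range(8)]
result = binarize_in_text_tiles(image, mask, stroke_width=2)
```

In this example the speck at row 4, column 12 is dark enough to pass a global threshold, but it lies in a tile without any text kernel and is therefore left as background.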
According to a feature of the invention, the step of determining text pixels consists, for each analyzed pixel, in checking that either one of the differences between the value of the analyzed pixel and the values of the two pixels located at the intersections of a circle, centered at the location of the analyzed pixel and having a radius equal to the stroke width, with each one of the row line, the column line and both lines at an angle of 45 degrees, is greater than the relative threshold.
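The test described above can be sketched as follows, under one possible reading: a pixel is taken as text when, for at least one of the four lines, both pixels at the circle intersections differ from it by more than the relative threshold. The function names are hypothetical, dark text on a light background is assumed, the diagonal intersections are rounded to the pixel grid, and lines whose intersections fall outside the image are simply skipped.

```python
import math


def direction_offsets(stroke_width):
    """Offsets (dy, dx) to one of the two circle intersections on each line."""
    # On the 45-degree lines the circle of radius stroke_width is reached
    # after stroke_width / sqrt(2) pixels along each axis (rounded).
    d = max(1, round(stroke_width / math.sqrt(2)))
    return [(0, stroke_width), (stroke_width, 0), (d, d), (d, -d)]


def is_text_pixel(image, y, x, stroke_width, rel_threshold):
    """True if, on at least one line, both opposite neighbours at distance
    stroke_width are brighter than pixel (y, x) by more than rel_threshold."""
    height, width = len(image), len(image[0])
    for dy, dx in direction_offsets(stroke_width):
        y1, x1, y2, x2 = y + dy, x + dx, y - dy, x - dx
        inside = (0 <= y1 < height and 0 <= x1 < width
                  and 0 <= y2 < height and 0 <= x2 < width)
        if not inside:
            continue  # assumption: ignore lines that leave the image
        # The smaller of the two differences decides for this line.
        if min(image[y1][x1], image[y2][x2]) - image[y][x] > rel_threshold:
            return True
    return False


# A vertical stroke three pixels wide (columns 3 to 5) on a light background.
stroke_image = [[50 if 3 <= x <= 5 else 200 for x in range(9)] for _ in range(9)]
```

With this image, is_text_pixel(stroke_image, 4, 4, 3, 100) accepts the stroke pixel through the row line, whose two intersections both fall on the background, while a background pixel such as (4, 1) is rejected.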
According to a preferred embodiment of the invention, the relative threshold is the threshold corresponding to the tail of the dominant lobe of a histogram. This histogram gives, as a function of a threshold value, the number of tiles having a side of a predetermined size, preferably equal to the stroke width, which are fully filled with pixels determined as text pixels. The threshold value is the minimal difference between the value of the analyzed pixel and the values of the two neighboring pixels located at the intersections of the circle and one of the lines for which the conditions to set a pixel to "text" are fulfilled.
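One way to read this estimation is sketched below. It is an interpretation, not the procedure of the invention: each tile is given the smallest per-pixel minimal difference found inside it (the largest threshold at which the tile is still fully filled with text pixels), these values are histogrammed, and the tail is found by walking from the peak of the histogram toward lower values until the count falls to a small fraction of the peak. The function names, bin_width and tail_fraction are hypothetical, and the choice of the lower tail is an assumption.

```python
from collections import Counter


def tile_min_contrasts(contrast, side):
    """Smallest per-pixel contrast inside each full side-by-side tile."""
    height, width = len(contrast), len(contrast[0])
    return [
        min(contrast[y][x]
            for y in range(top, top + side)
            for x in range(left, left + side))
        for top in range(0, height - side + 1, side)
        for left in range(0, width - side + 1, side)
    ]


def tail_of_dominant_lobe(tile_contrasts, bin_width=8, tail_fraction=0.1):
    """Lower edge of the dominant histogram lobe, or None without text tiles."""
    positive = [c for c in tile_contrasts if c > 0]
    if not positive:
        return None
    hist = Counter(c // bin_width for c in positive)
    # Dominant lobe: the most populated bin (lowest bin wins a tie).
    peak = max(hist, key=lambda b: (hist[b], -b))
    cutoff = tail_fraction * hist[peak]
    b = peak
    # Walk toward lower contrasts while the neighbouring bin is still populated.
    while hist.get(b - 1, 0) > cutoff:
        b -= 1
    return b * bin_width


# Two low-contrast noise tiles and a dominant lobe of text tiles.
samples = [5, 6] + [140] * 3 + [150] * 10 + [160] * 4
relative_threshold = tail_of_dominant_lobe(samples)
```

For these sample values the peak bin covers 144 to 151, the adjacent bin covering 136 to 143 still belongs to the lobe, and the estimate is 136; the isolated low-contrast tiles do not influence it.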