1. Field of the Invention
The present invention relates to a method for composing a global histogram of values of pixels during transformation of a hardcopy original to an electronic representation by means of an opto-electronic conversion device. The method of the present invention includes the steps of establishing a division of a detection area in stripes, each stripe comprising several scanlines; for each stripe composing a local histogram of values of pixels within the stripe; determining from the local histogram if the stripe originates from the hardcopy original; and composing the global histogram from the various local histograms.
The present invention further relates to a reprographic apparatus and an electronic component.
2. Background of the Invention
Hardcopy originals are commonly digitized by scanner systems in which light from a light source is reflected by the original and converted to an electric signal by an opto-electronic conversion device, hereinafter also referred to as an image sensor. The image sensor is, e.g., a CCD array or a CMOS chip, both of which are examples of linear image sensors. The digitization of this electric signal takes place in an analog-to-digital converter (ADC) with a certain number of bits for each pixel, which determines an available dynamic range for the digitized signals. Many scanners use a fixed, transparent plate on which the original is placed and move the light source and optical system, comprising mirrors and diaphragms, in such a way that lines of the original are successively projected on the image sensor. At regular time intervals, the signals of the image sensor are digitized by the ADC. This divides the image of the original into scanlines parallel to the image sensor. The discrete elements of the image sensor divide each scanline into pixels.
In principle, the reflectivity of the original at the position of a pixel, also called the optical density, determines the digital output value of the pixel. Important factors influencing the transfer function of the optical density to the digital output value are the strength of the light source, the sensitivity of the image sensor, the exposure time for each scanline, and the geometry of the hardware. Both well reflecting originals, such as office documents on white paper, and less reflecting documents, such as newspapers, are supposed to give a digital value that will be used to generate a “no ink” value, whereas positions on the original that reflect little light will be used to generate a digital value corresponding to “ink”. The “no ink” value is referred to as “white,” and the “ink” value is referred to as “black”. If the image sensor has color filters for different channels the disclosed matter refers to the signal of each color channel separately or to a combination of signals in the color channels. The ratio of the extreme values for the reflectivity is called the dynamic range of the original. This dynamic range is much smaller for newspaper originals than for office documents printed on white paper.
The dynamic range of an original is established by a histogram of occurring values for the pixels in the original. A histogram contains the occurrence frequency for digital values. The adjustment of the dynamic range of an original to the available, fixed dynamic range of the values of the pixels after the ADC (in the case of an N-bit ADC this dynamic range is 2^N), is commonly done in the digital domain, where algorithms exist that stretch a histogram of obtained digital values of a specific original. An example of such an algorithm is the derivation of a whitepoint and a blackpoint for an original, followed by a linear expansion of the digital values using these derived values.
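By way of illustration only, the whitepoint and blackpoint derivation followed by a linear expansion may be sketched as below. The sketch assumes that a higher digital value corresponds to a lighter pixel, that the blackpoint and whitepoint are simply the lowest and highest occurring values (optionally after clipping a small fraction at each end), and that the function names and the `clip_fraction` parameter are hypothetical; none of these details is prescribed by the text above.

```python
def build_histogram(pixels, n_bits=8):
    """Count the occurrence frequency of each digital value (2**n_bits bins)."""
    hist = [0] * (1 << n_bits)
    for value in pixels:
        hist[value] += 1
    return hist


def derive_points(hist, clip_fraction=0.0):
    """Return (blackpoint, whitepoint) from a histogram.

    Assumption: the points are the lowest and highest values that remain
    after ignoring clip_fraction of all pixels at each end of the histogram.
    """
    limit = sum(hist) * clip_fraction
    black, white = 0, len(hist) - 1

    cumulative = 0
    for value, count in enumerate(hist):
        cumulative += count
        if cumulative > limit:
            black = value
            break

    cumulative = 0
    for value in range(len(hist) - 1, -1, -1):
        cumulative += hist[value]
        if cumulative > limit:
            white = value
            break

    return black, white


def stretch(pixels, black, white, n_bits=8):
    """Linearly expand values so that black maps to 0 and white to 2**n_bits - 1."""
    max_value = (1 << n_bits) - 1
    if white <= black:
        # Degenerate range: nothing sensible to expand.
        return list(pixels)
    scale = max_value / (white - black)
    stretched = []
    for value in pixels:
        scaled = round((value - black) * scale)
        stretched.append(min(max_value, max(0, scaled)))
    return stretched


# A low-contrast original occupying only the values 60..180 of an 8-bit range.
pixels = [60, 90, 120, 150, 180]
black, white = derive_points(build_histogram(pixels))
expanded = stretch(pixels, black, white)
```

In this example the original uses only part of the 256 available values; after the expansion the darkest value maps to 0 and the lightest to 255.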
In principle, only pixels originating from the original are to be used for the histogram. The original often has a size smaller than the size of the detection area, which is the complete area from which pixels are acquired. Therefore some pixels that are retrieved in the detection area stem from the original and other pixels stem from an area outside the original. During retrieval of the scanlines, most of the time, a cover that belongs to the scanner system is placed over the detection area with the original between the transparent plate and the cover. If the cover is closed over the original, the pixels that do not come from the original will come from the inside of the cover, which is usually white. Therefore these pixels will have the value “white.” If the cover is open, the pixels that do not come from the original come from non-reflecting parts. These pixels will have the value “black”. In both cases, if these pixels are included in the histogram, the histogram is not a true representation of the reflectivity of pixels of the original. Therefore the dynamic range is estimated incorrectly. This incorrect estimation especially affects the reproduction of originals of the newspaper type.
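The effect of such foreign pixels can be made concrete with a small numerical sketch. The pixel values and counts below are invented for illustration, the helper name `value_range` is hypothetical, and the range is taken simply as the lowest and highest occurring value, with a higher value meaning a lighter pixel.

```python
def value_range(pixels):
    """Return (lowest, highest) occurring digital value."""
    return min(pixels), max(pixels)


# Hypothetical newspaper original: dark ink at 70, grayish paper at 175.
newspaper = [70] * 200 + [175] * 800

# Hypothetical pixels from outside the original.
cover_closed = [255] * 500  # white inside of a closed cover
cover_open = [0] * 500      # non-reflecting parts seen with the cover open

range_original = value_range(newspaper)                    # (70, 175)
range_with_cover = value_range(newspaper + cover_closed)   # (70, 255)
range_with_open = value_range(newspaper + cover_open)      # (0, 175)
```

With the cover closed the apparent whitepoint moves from 175 to 255, so the grayish paper would no longer be mapped to white; with the cover open the apparent blackpoint moves from 70 to 0, so the ink would no longer be mapped to black.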
There is a problem, however, in that the white of the cover is often indiscernible from the white paper of office documents. If the white pixels originating from the white paper are not included in the histogram, the number of white pixels in the histogram is not correct and the histogram is not representative of the document. If the white pixels of the cover are included, the histogram is not representative either. If the histogram is not representative, incorrect whitepoints and blackpoints are derived, resulting either in background toner or ink at the time of printing a copy of the original, or in rendering light gray or colored parts of the original as white, causing these parts to disappear against the material that is printed upon. Even if the white of the cover is different from the white of the original, there is a problem with regard to how to discern pixels from the cover and the original, because the level of the white of the cover is variable due to possible light pollution.
If all values of the pixels that need to be scanned could be used for determining a histogram of the values of the pixels in the original, it would be possible to discriminate the pixels belonging to the original by selecting parts of the detection area and determining for each part if it belongs to the original or not by considering the position of the selected part and the occurring values of the pixels in the selected part. However, for a fast document scanner this technique takes too much time. An important constraint for a method for composing a global histogram, which refers to the histogram that is representative of the complete original, is that it is required to decide during acquisition whether or not to include the obtained values, i.e. at a time when not all values of the pixels are available. Note that during acquisition, the size of the original is also not available. Another constraint is the amount of memory that is needed for making this histogram.
A method according to the preamble is known, e.g. from U.S. Pat. No. 5,696,595. In this patent, a method for composing a global histogram for the original is described in which predetermined areas are defined for which an independent decision is taken whether or not they belong to the original. The decision to include values is made based on the distribution of values in a local histogram, which is made for a part of the original. The criterion is that a sufficient number of values are neither completely black nor completely white. If a defined area belongs to the original, a part of the global histogram for the original is replaced by the local histogram of that area. A disadvantage of this method is that the values of pixels in an original having a white background may not be included in the histogram when these pixels are in an area that is indiscernible from the cover. Therefore the number of white values and the dynamic range of the original are not well reflected in the final histogram. Pixels of the white cover are rightfully not included. However, white pixels in the original are incorrectly not included as well. A problem in the background art is that the decision to include pixels is based on their value only.
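The disadvantage of a purely value-based decision may be illustrated by the following simplified sketch. It is not the method of the cited patent: the criterion is read here as requiring a minimum fraction of values that are neither completely black nor completely white, the threshold `min_fraction` and all function names are hypothetical, and the update of the global histogram is simplified to adding the accepted local histograms.

```python
def local_histogram(pixels, n_levels=256):
    """Histogram of the digital values in one predetermined area."""
    hist = [0] * n_levels
    for value in pixels:
        hist[value] += 1
    return hist


def area_belongs_to_original(hist, min_fraction=0.05):
    """Value-only decision (assumed reading of the criterion).

    The area is accepted when at least min_fraction of its pixels are
    neither completely black (lowest bin) nor completely white (highest bin).
    Requires a histogram with at least two bins.
    """
    total = sum(hist)
    if total == 0:
        return False
    intermediate = total - hist[0] - hist[-1]
    return intermediate / total >= min_fraction


def compose_global(areas, n_levels=256, min_fraction=0.05):
    """Add up the local histograms of all accepted areas (simplification)."""
    global_hist = [0] * n_levels
    for pixels in areas:
        hist = local_histogram(pixels, n_levels)
        if area_belongs_to_original(hist, min_fraction):
            for value, count in enumerate(hist):
                global_hist[value] += count
    return global_hist


areas = [
    [255] * 100,              # white margin of the original
    [255] * 80 + [40] * 20,   # text area of the original
    [255] * 100,              # white cover outside the original
]
global_hist = compose_global(areas)
```

In this example the cover area is correctly rejected, but the white margin of the original is rejected as well, because on the basis of values alone it is indistinguishable from the cover. The global histogram therefore counts only 80 white pixels, although 180 white pixels belong to the original.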