Traditionally, the term copier, in the office equipment context, refers to light lens xerographic copiers in which paper originals are in fact photographed. The images are focused on an area of a photoreceptor, which is subsequently developed with toner. The developed image on the photoreceptor is then transferred to a copy sheet which in turn is used to create a permanent copy of the original.
In recent years, what are known as digital copiers have become available. At the most basic level, a digital copier performs the same functions as a light lens copier, except that the original image to be copied is not directly focused on a photoreceptor. Instead, with a digital copier, the original image is scanned by a device generally known as a raster input scanner (RIS), which is typically in the form of a linear array of small photosensors.
The original image is focused on the photosensors in the RIS. The RIS converts the various light and dark areas of the original image to a set of digital signals. These digital signals are temporarily retained in a memory and then eventually used to operate a digital printing apparatus when it is desired to print copies of the original, or a display screen when it is desired to display the image; i.e., the image is scanned and converted to electrical signals so that the image can be used for other reproduction purposes. The digital signals may also be sent directly to the printing device or display device without being stored in a memory. The digital printing apparatus can be any known type of printing system responsive to digital data, such as a modulating scanning laser which discharges imagewise portions of a photoreceptor, or an ink jet printhead.
With the migration of copying and scanning systems to a digital-based system, these systems face different problems than light lens or analog copying systems. More specifically, a digital scanning system needs to determine the actual location of the document so that any desired image processing routines can be applied to the correct pixels of image data.
In describing the present invention, the term pixel will be utilized. This term may refer to an electrical (or optical, if fiber optics are used) signal which represents the physically measurable optical properties at a physically definable area on a receiving medium. The receiving medium can be any tangible document, photoreceptor, or marking material transfer medium. Moreover, the term pixel may refer to an electrical (or optical, if fiber optics are used) signal which represents the physically measurable optical properties at a physically definable area on a display medium. A plurality of the physically definable areas for both situations represents the physically measurable optical properties of the entire physical image to be rendered by either a material marking device, electrical or magnetic marking device, or optical display device.
Lastly, the term pixel may refer to an electrical (or optical, if fiber optics are used) signal which represents physical optical property data generated from a single photosensor cell when scanning a physical image so as to convert the physical optical properties of the physical image to an electronic or electrical representation. In other words, in this situation, a pixel is an electrical (or optical) representation of the physical optical properties of a physical image measured at a physically definable area on an optical sensor.
In a digital scanning system, it is desirable to perform the image processing routines only upon the image of the document and not upon the image data representing the backing of the platen cover in a platen scanning system or the backing roll in a document feeding system, such as a constant velocity transport ("CVT") system. In this application, the term backing roll or backing will be used to describe the area scanned by the digital scanner which is not the document or desired image to be scanned.
Thus, it is important in a digital scanning system to determine the actual location of the document being scanned; i.e., the document's edges and document's width. This locating of the document is particularly important in an engineering document scanning system.
In an engineering document scanning system, the input document can be of any size ranging from 5 inches to more than 36 inches. Conventionally, in one method of determining the location of the document, a user would manually determine the document size and input the width, through a user interface, to the document scanning system before the document was actually scanned. In this conventional method, the document must be centered in the document scanning system to prevent the document image from being clipped.
This conventional manual method reduces productivity and causes wasted copies since a user cannot always input the correct width or center the document accurately in the document scanning system. Thus, it would be desirable to have an auto-width detection system to determine the document's width and position when the document is being initially staged for scanning.
Various auto-width detection schemes have been proposed so as to determine the document's width and position when the document is staged; however, these detection methods have not been completely successful because the documents to be scanned in an engineering environment can be very similar to the backing, thus making it difficult to distinguish between the paper and the backing (backing roll or the backing of the platen cover).
This problem is accentuated when the scanner CCD sensor output fluctuates. Moreover, it is possible that a document and scanner can interact to produce an integrating cavity effect which will mask the edge location of the document.
One such conventional auto-width detection method captures a portion of the lead edge of a document that is staged, wherein the captured portion of the lead edge includes both image data related to the backing and image data related to the document itself. In this automated process, the width and position of a document are calculated by determining whether each CCD sensor element is covered by the backing or the document.
In other words, the width detection process becomes a classification process. Each CCD sensor element is either covered by the backing or document. To make this determination, the conventional automatic detection method utilizes the mean of each column of pixels of image data to differentiate between the document and the backing. Examples of this mean data are illustrated in FIGS. 3 and 5 of the present application.
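The mean-based classification described above can be sketched as follows. This is only an illustrative sketch, not the conventional method's actual implementation: the function names, the fixed backing_level, and the tolerance comparison are assumptions introduced here, and the captured lead-edge portion is modeled as a list of scanlines of gray values.

```python
from statistics import mean


def column_means(strip):
    """Mean gray value of each pixel column.

    strip is a list of scanlines; each scanline is a list of gray values,
    one per CCD sensor element (a hypothetical data layout).
    """
    return [mean(column) for column in zip(*strip)]


def classify_by_mean(strip, backing_level, tolerance):
    """Flag a column as document (True) when its mean differs from the
    assumed backing level by more than the assumed tolerance."""
    return [abs(m - backing_level) > tolerance for m in column_means(strip)]


def document_extent(is_document):
    """Return (first_column, width) of the flagged span, or None if no
    column was classified as document."""
    covered = [i for i, flag in enumerate(is_document) if flag]
    if not covered:
        return None
    return covered[0], covered[-1] - covered[0] + 1


# A dark document on a bright backing is found.
dark_strip = [[200, 200, 120, 120, 120, 200]] * 4
dark_extent = document_extent(classify_by_mean(dark_strip, 200, 10))

# A document about as bright as the backing is missed entirely.
bright_strip = [[200, 200, 198, 198, 198, 200]] * 4
bright_extent = document_extent(classify_by_mean(bright_strip, 200, 10))
```

With this toy data, the dark document is located at column 2 with a width of 3, whereas the bright document yields no extent at all, because its column means fall within the tolerance of the backing level.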
However, since the document's brightness varies from very dark to very bright and since most bond paper and film documents have about the same brightness as the backing, the conventional auto-width detection process often fails to detect the actual location and width of the document. Moreover, the conventional method relies solely on determining the location and width of the document from mean data, which corresponds to a first order function, and the mean data is very susceptible to electrical noise within the CCD sensors or dust and dirt within the actual scanning system. In other words, any excessive electrical noise, dust, or dirt could readily render the conventional auto-width detection process ineffective.
In addition to being sensitive to electrical noise, dust, and dirt, the conventional auto-width detection process requires a very sensitive filtering routine to detect the transition from the backing roll to the document; i.e., the document's edge. This can be seen in FIGS. 3 and 5 of the present application, wherein the transitions from the backing to the document (at about pixel numbers 510 and 245, respectively) are represented by a very narrow pulse spike. Thus, the conventional process requires a very sensitive filtering routine to detect this transition, which in turn makes the detection process very sensitive to electrical noise, dust, or dirt.
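To illustrate why a filter tuned to a narrow pulse spike is fragile, consider the following sketch. It is a hypothetical detector, not the filtering routine of the conventional process: the function name spike_columns, the neighbor-difference test, and the threshold are assumptions, and the mean values are made-up toy data rather than the data of FIGS. 3 and 5.

```python
def spike_columns(means, threshold):
    """Return indices of columns whose mean deviates from both immediate
    neighbors, in the same direction, by more than threshold.

    This models a filter sensitive enough to catch a one-pixel-wide spike
    in the column-mean trace (an assumed, simplified detector).
    """
    hits = []
    for i in range(1, len(means) - 1):
        left = means[i] - means[i - 1]
        right = means[i] - means[i + 1]
        same_direction = left * right > 0
        if same_direction and abs(left) > threshold and abs(right) > threshold:
            hits.append(i)
    return hits


# Toy mean trace: backing near 200, a one-pixel edge spike at index 5,
# then a document near 198. Index 2 models a dust speck on the backing.
clean_trace = [200, 200, 200, 200, 200, 150, 198, 198]
dusty_trace = [200, 200, 170, 200, 200, 150, 198, 198]

clean_hits = spike_columns(clean_trace, 20)
dusty_hits = spike_columns(dusty_trace, 20)
```

On the clean trace only the edge column is reported, but on the dusty trace the dust speck produces the same signature as the edge, so such a detector cannot tell the two apart from mean data alone.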
Therefore, it is desirable to have an automatic detection routine which can accurately detect the location and width of the document being scanned without being sensitive to electrical noise, dust, and/or dirt.
The present invention proposes a method and system for providing automatic detection of the width and position of a document which is substantially insensitive to dust and dirt as well as electrical noise. The present invention utilizes second order statistics, such as a standard deviation, in addition to the mean information to determine the actual edges of the document. By utilizing second order statistics, the transition between the backing and the document can be more readily found without the use of sensitive filtering routines.
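As a rough illustration of combining first and second order statistics, the sketch below computes a per-column mean and standard deviation and flags a column as document when either statistic departs from the backing's. The decision rule, the function names, the tolerances, and the toy data (in which document columns vary more than backing columns) are all assumptions made for illustration; the text above states only that second order statistics are used in addition to the mean.

```python
from statistics import mean, pstdev


def column_stats(strip):
    """Return (mean, population standard deviation) for each pixel column.

    strip is a list of scanlines, each a list of gray values (a
    hypothetical data layout).
    """
    return [(mean(column), pstdev(column)) for column in zip(*strip)]


def classify_by_mean_and_std(strip, backing_mean, backing_std,
                             mean_tol, std_tol):
    """Flag a column as document (True) when its mean OR its standard
    deviation differs from the assumed backing statistics by more than
    the corresponding assumed tolerance."""
    flags = []
    for m, s in column_stats(strip):
        differs_in_mean = abs(m - backing_mean) > mean_tol
        differs_in_std = abs(s - backing_std) > std_tol
        flags.append(differs_in_mean or differs_in_std)
    return flags


# Toy strip: columns 2 and 3 have the same mean as the backing (200)
# but vary from scanline to scanline; the backing columns are constant.
strip = [
    [200, 200, 190, 210, 200],
    [200, 200, 210, 190, 200],
    [200, 200, 190, 210, 200],
    [200, 200, 210, 190, 200],
]

flags = classify_by_mean_and_std(strip, backing_mean=200, backing_std=0,
                                 mean_tol=10, std_tol=3)
```

In this toy case every column has a mean of 200, so a mean-only test would see no document at all, while the standard deviation separates columns 2 and 3 from the backing.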