Methods are known in which additional sensors installed in the document transport means, for example retroreflective sensors, are used to determine document width. The accuracy of recognition is limited here by the orientation and the number of sensors. Such methods are thus suitable only for recognizing a limited number of formats, for example the various DIN formats.
In the case of a scanner with a central guide and without mechanical aids for centering the document, the position of the document must be determined in addition to its width, so that the user only has to guide the document within certain tolerances. This makes the above-described method impractical even for a limited number of formats, since the accuracy of recognition remains low even when the number of light sensors is high.
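The limitation of the sensor-based approach can be illustrated with a minimal sketch. It assumes, purely for illustration, that the document is aligned to a side edge at position 0 and that each sensor only reports whether it is covered; the sensor positions, the maximum width, and the function names are hypothetical and are not taken from any particular scanner.

```python
# Portrait widths of the DIN A series in millimetres.
DIN_WIDTHS_MM = {"A4": 210, "A3": 297, "A2": 420, "A1": 594, "A0": 841}


def width_bounds(sensor_positions_mm, covered, max_width_mm):
    """Return the (lower, upper) interval in which the document width must lie.

    Assumption: the document starts at position 0, so a covered sensor at
    position p means the width is at least p, and an uncovered sensor at
    position p means the width is below p.
    """
    lower = 0.0
    upper = float(max_width_mm)
    for pos, is_covered in zip(sensor_positions_mm, covered):
        if is_covered:
            lower = max(lower, pos)
        else:
            upper = min(upper, pos)
    return lower, upper


def matching_formats(lower, upper, formats=DIN_WIDTHS_MM):
    """List the known formats whose width falls inside the interval."""
    return [name for name, width in formats.items() if lower < width <= upper]


# Four hypothetical sensors; only the innermost one is covered.
bounds = width_bounds([250, 350, 500, 700], [True, False, False, False], 914)
candidates = matching_formats(*bounds)
```

With four sensors the width is only known to lie within a band of 100 mm, which is enough to tell A3 from A4 but not to measure an arbitrary width. If the document is not edge-aligned, as with a central guide, the same readings no longer fix the width at all.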
In addition, methods are well known that determine the document width by evaluating image information. To do this, the leading end of the document, including the leading edge and the side edges, is captured by the image-detecting elements of the scanner. This can be implemented by a prescan that precedes the actual scan, or by capturing the entire document at the maximum scan width and then extracting the leading end from the image data. The advantage of these methods is that no additional sensors are required to detect document width.
In order to determine the width of a document from this image data, the approach must decide which pixels belong to the document and which ones belong to the background and/or the reflector. In image processing, problems of this sort are called segmentation.
A known segmentation method that can be employed for width recognition is the “threshold value method,” which uses brightness-based detection. Here the pixels are assigned to the document or to the reflector based on their brightness or color. The leftmost and the rightmost pixels classified as document pixels are interpreted as the transitions between reflector and document, thereby enabling the width and the position of the document to be determined. This method exploits the fact that the color or brightness of the reflector, that is, of the reflector roller, is known and generally does not match the color or brightness of the document.
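The threshold value method can be sketched for a single scan line as follows. This is a minimal illustration, assuming 8-bit brightness values and a known reflector brightness; the function name, the tolerance parameter, and the sample values are hypothetical.

```python
def threshold_width(row, reflector_level, tolerance):
    """Classify pixels of one scan line and return (left, right, width).

    A pixel counts as a document pixel when its brightness differs from
    the known reflector brightness by more than `tolerance`.
    Returns None when no pixel is classified as document.
    """
    doc_indices = [
        i for i, value in enumerate(row)
        if abs(value - reflector_level) > tolerance
    ]
    if not doc_indices:
        return None
    left, right = doc_indices[0], doc_indices[-1]
    return left, right, right - left + 1


# Dark reflector (20) with a bright document (230) covering pixels 5 to 14.
clean_row = [20] * 5 + [230] * 10 + [20] * 5
clean_result = threshold_width(clean_row, reflector_level=20, tolerance=30)

# White document on a white reflector: nothing can be separated.
white_row = [250] * 20
white_result = threshold_width(white_row, reflector_level=250, tolerance=30)
```

In this sketch a single pixel on the reflector that deviates from the expected brightness would shift the detected left or right limit, because only the outermost document pixels are evaluated.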
A number of interfering factors occur in practice, however, that result in a very low hit rate for the method when used for width detection in a large-format scanner:
contamination on the reflector (reflector roller(s));
contamination of the optics, for example the glass plate; and
changes in the brightness value of the reflector as the air gap between glass plate and reflector roller changes, which occur in particular with thicker documents.
A further disadvantage of the method is that the color and/or brightness of the reflector must differ from that of the document. Aside from width detection, however, a white reflector has proven to be advantageous, since it appears white in the scanned image, for example where there are holes in the document, and thus matches the document background that occurs most often (paper white). A white reflector is also optimal in the case of transparent originals. The problem that occurs in connection with width recognition by the threshold value method, however, is that the reflector and the document cannot be distinguished, or can only barely be distinguished, by color and/or brightness, in particular at the margin. As a result, the method cannot be used under these conditions.
Other well-known approaches are the edge-oriented methods of segmentation. These methods search the image for edges, which generally represent object transitions. In the application described here, these object transitions occur between document and reflector and are recognized by the method as edges. Such edges appear even when the reflector and the document are of the same brightness or color, because the arrangement of the light source in scanners using CI sensors (contact image sensors) produces a shadow at the transitions between reflector and document. This is a major advantage of this method over the above-referenced threshold value methods. In order to determine the document width, the method must find the transverse edges of the paper on the left and right sides of the original.
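An edge-oriented search on a single scan line can be sketched as follows. This is a simplified illustration, assuming that the shadow shows up as a one-pixel brightness dip next to each side of the document; the function name, the `min_step` parameter, and the sample values are hypothetical.

```python
def outer_edges(row, min_step):
    """Return the outermost strong brightness transitions in one scan line.

    The gradient at index i is the brightness step between pixel i and
    pixel i + 1. A transition counts as an edge when the absolute step is
    at least `min_step`. Returns (left, right) gradient indices, or None
    when fewer than two edges are found.
    """
    gradient = [row[i + 1] - row[i] for i in range(len(row) - 1)]
    strong = [i for i, step in enumerate(gradient) if abs(step) >= min_step]
    if len(strong) < 2:
        return None
    return strong[0], strong[-1]


# White reflector (250) and white document (250); only the shadow pixels
# at index 5 and index 16 are darker (200).
shadow_row = [250] * 5 + [200] + [250] * 10 + [200] + [250] * 5
shadow_edges = outer_edges(shadow_row, min_step=40)
```

Here the edges are found although document and reflector have the same brightness. The sketch takes the outermost strong transitions as the document limits, so any equally strong transition further out on the reflector would be selected instead.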
In practice, however, these methods too very frequently result in faulty detection, since the edges at the transition between reflector and document are often weaker than so-called “spurious edges” that are created by interference factors.
These interference factors are created by
contamination of the reflector (reflector roller(s)), and
contamination of the optics (for example, the glass plate).
In particular, contamination of the reflector cannot be avoided in practice, and in the case of a scanner using nondriven, cascaded reflector rollers it produces transverse “spurious edges,” since individual rollers are not made to rotate during the scan. These “spurious edges” are transverse bands on the reflector roller and/or glass plate that result from rubbing as the original being scanned moves over them.
Modern CI sensors using several light sources furthermore only produce a very weak shadow at the transverse transitions between reflector and document, and the result is therefore only a very weak transverse edge.
Also well known are model-based methods. Here the image is searched on the basis of a model of the objects. For width recognition, the model used is one that searches for corners. This method is more robust against interference factors than the above-described edge-oriented method. The disadvantage of the method, however, is that documents that do not conform to the corner model, because of dog-eared sections or tears, are not recognized correctly. In addition, faulty recognition often occurs with this method in combination with modern CI sensors using multiple light sources, which cast only a very faint shadow at the transverse edges of the document.