Conventional flat bed scanners, such as those used as peripheral devices for personal computers, are dominated in terms of sales volume by scanners designed for scanning A4 European standard size documents and US standard letter size documents.
Referring to FIG. 1 herein, there is illustrated schematically in perspective view a prior art scanner of a size and shape suitable for scanning A4 and US standard letter size documents. The prior art flat bed scanner comprises a casing 100 usually made of a plastics material, and generally of a rectangular shape, the casing having on an upper surface, a flat scanning bed comprising a transparent planar glass scan plate 101 of approximate dimensions of 35 cm×25 cm; a movable lid 102 hingedly attached to one end of the casing 100 which can be swung upwardly away from the scan plate 101 for allowing placement of documents directly on the scan plate 101, and which can then be lowered covering the document and scan plate for exclusion of light; a data transfer cable for transferring data to a main processing device, for example a personal computer; and a scanning mechanism contained within the casing 100 for scanning documents placed on the scan plate 101 and generating digital data signals representing a graphical information content of the document.
For scanning larger sized documents, for example A3 or US legal documents, which do not wholly fit onto the scan plate 101, there are known commercially available flat bed scanners having larger sized scan plates. However, such larger sized flat bed scanners are relatively expensive, because the market for them is far smaller than the market for A4 sized flat bed scanners. Therefore, economies of scale cannot be applied to larger sized scanners to the same extent as for A4/US letter size scanners. Further, the larger sized scanners have the disadvantage of being physically larger, and are generally bulkier and less convenient than A4/US letter size scanners.
There are known prior art flat bed scanners having A4 and US letter sized scan plates which aim to scan oversized documents, that is, documents larger than A4/US letter size, by scanning separate areas of the oversized document in successive scans, and then stitching together the successive sets of image data received in those scans to produce scan image data representing the full sized document. An example of a prior art image processor which aims to scan documents larger than a scan plate is found in U.S. Pat. No. 5,465,163, in which the problem of scanning large size originals such as maps, CAD drawings, A1 size, B1 size, A0 size and the like is addressed.
However, U.S. Pat. No. 5,465,163 does not disclose a practically workable specific algorithm or method for matching first and second image data of a document to produce full image data representing a whole document of a size larger than a scan plate. The algorithm used for matching first and second image data in U.S. Pat. No. 5,465,163 relies on low spatial frequencies in order to stitch together the first and second image data. High spatial frequencies are ignored. Therefore, for images which have regular low spatial frequency information and irregular, variable high spatial frequency information, for example lines of text, the disclosed algorithm cannot reliably differentiate between individual lines of text in an automated manner. Thus, although the method disclosed in U.S. Pat. No. 5,465,163 may operate adequately for images such as photographs, it fails for various classes of images, in particular text. Additionally, the method disclosed in U.S. Pat. No. 5,465,163 requires a good initialization of the first and second image data to be stitched together. That is to say, the first and second image data need to be almost matched initially in order for the algorithm to stitch them together successfully. This involves a manual alignment of the image data.
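The ambiguity that arises when only low spatial frequencies are used to match text can be illustrated with a toy sketch. The following is not the algorithm of U.S. Pat. No. 5,465,163; it is a minimal illustration under stated assumptions: a synthetic page of equally spaced text lines, each line carrying the same amount of ink but a different fine pattern, and a crude low-pass filter that replaces each pixel row by its mean. All function names and parameters are hypothetical.

```python
import random

def make_page(num_lines=8, width=64, ink_rows=6, gap_rows=4, seed=0):
    """Synthetic page: every text line has equal ink but its own fine pattern."""
    rng = random.Random(seed)
    page = []
    for _ in range(num_lines):
        pattern = [1] * (width // 2) + [0] * (width - width // 2)
        rng.shuffle(pattern)  # high spatial frequency detail unique to this line
        page.extend([pattern[:] for _ in range(ink_rows)])
        page.extend([[0] * width for _ in range(gap_rows)])
    return page

def low_pass(image):
    """Crude low-pass filter (an assumption): keep only the mean of each row."""
    return [[sum(row) / len(row)] for row in image]

def mismatch(a, b):
    """Sum of squared differences between two equally sized images."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def best_offsets(page, strip, tol=1e-9):
    """All vertical offsets at which the strip matches the page equally well."""
    span = len(strip)
    scores = [mismatch(page[o:o + span], strip)
              for o in range(len(page) - span + 1)]
    best = min(scores)
    return [o for o, s in enumerate(scores) if s - best <= tol]

page = make_page()
strip = page[30:50]  # two text lines taken from row 30 of the page

full_detail = best_offsets(page, strip)
low_only = best_offsets(low_pass(page), low_pass(strip))

print("full detail:", full_detail)
print("low frequencies only:", low_only)
```

With the fine detail retained, only the true offset gives the minimum mismatch. With the row means alone, every offset that is a multiple of the line pitch matches equally well, so the correct line cannot be selected automatically.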
There exists a further prior art system in which a conventional A4 sized flat bed scanner is used to scan a first part of an oversized document, producing first image data, following which a user moves the document on the scanner and a second part of the document is scanned in a second scan operation, producing second image data. Software provided on a computer device displays both the first image data and the second image data on a visual display unit screen, and a user matches the first image data with the second image data by manipulating a pointing device, e.g. a mouse or track ball, while visually comparing the first and second images displayed on the visual display unit.
However, this prior art system is cumbersome to use since it involves a high degree of user interaction in matching documents on screen visually. The system requires user skill, and is time consuming.
Prior art software is commercially available for matching images in successive scans. Examples of such software, which are available from Panaview and Photovista, can deal with VGA-like image resolutions. However, this software cannot deal with standard 300 dpi A4/letter image sizes, and is aimed primarily at documents having pictures or images other than text. This prior art software is unsuitable for large size documents containing textual matter: the amount of computational power required by the prior art algorithms means that they will not work on such documents, and this cannot be solved by using data processors of higher processing capacity that are feasible for commercial use. The methods used in such software are fundamentally inadequate for image processing of text on large sized documents.
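To give a sense of the scale difference referred to above, the following sketch compares pixel counts. It assumes that "VGA-like" means approximately 640×480 pixels and uses the nominal page sizes of A4 (210 mm × 297 mm) and US letter (8.5 in × 11 in); the function name is hypothetical, and the figures are an illustration only, not a measurement of any particular software.

```python
MM_PER_INCH = 25.4

def pixel_dimensions(width_mm, height_mm, dpi):
    """Return (width_px, height_px) of a page scanned at the given resolution."""
    return (round(width_mm / MM_PER_INCH * dpi),
            round(height_mm / MM_PER_INCH * dpi))

a4 = pixel_dimensions(210, 297, 300)          # A4 at 300 dpi
letter = pixel_dimensions(215.9, 279.4, 300)  # US letter at 300 dpi
vga = (640, 480)                              # assumed "VGA-like" size

a4_pixels = a4[0] * a4[1]
letter_pixels = letter[0] * letter[1]
vga_pixels = vga[0] * vga[1]

print("A4 at 300 dpi:", a4, a4_pixels, "pixels")
print("Letter at 300 dpi:", letter, letter_pixels, "pixels")
print("VGA:", vga, vga_pixels, "pixels")
print("A4 / VGA pixel ratio:", round(a4_pixels / vga_pixels, 1))
```

Under these assumptions a single 300 dpi A4 scan holds roughly 28 times as many pixels as a VGA-sized image, and a stitched oversized document comprises at least two such scans.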