This invention relates to a method and apparatus for merging images, especially images of documents which have been acquired as separate images.
When a document image is captured by a digital camera or a portable scanner, it is often the case that only part of a page of the document is included in each image. This is because the resolution of the camera or scanner is often not high enough to produce a readable document with a single image. In order to increase the resolution of the captured text image, a single page of the document often needs to be captured in several images. Therefore, before optical character recognition can be performed on the complete text, for example to import the text into a word processing application, it is necessary to merge these images back into one single image after capturing. Such merging generally requires the use of image processing techniques, including filtering, recognition, matching and so forth, which can be complicated and lengthy.
Traditional image processing techniques work at a pixel level for image merging. That is, a correlation between pixels is determined to find the best match between two images. While this method works well for natural images, it fails to deal efficiently with text images which contain only black text and white (or light) background.
For many document images, only black-and-white text objects are available. Such bi-level images contain little useful information at pixel level for correlation analysis because the correlation values can only be either 1 (for same colour pixels) or 0 (for different colour pixels). In addition, the distribution of the black pixels in an image is so irregular that there are no obvious patterns that can be used for reliable correlation analysis. As a result, using a correlation analysis technique at pixel level, as is done for natural image merging, often fails to merge document images properly. In addition, correlation analysis using pixels is very time consuming because many pixels have to be used repeatedly in the computation of correlation values.
Furthermore, although portable equipment, such as a camera or scanner, compresses the acquired image, for example using the JPEG technique, in order to reduce the amount of memory required to store the image, the digital processing techniques necessary to merge two or more such images still require relatively high processing capability, which is often not available in portable equipment.
The present invention therefore seeks to provide an efficient solution for implementing image merging which overcomes, or at least reduces, the above-mentioned problems of the prior art.
Accordingly, in a first aspect, the invention provides a method of merging images comprising the steps of providing a compressed representation of at least two images, each compressed representation including a plurality of index numbers, one or more positions in each image associated with the index numbers, and a group symbol for each index number, forming an indexed image for each image, each indexed image comprising a pattern of index numbers at their associated positions in the image, comparing the patterns of index numbers in a first indexed image with the patterns of index numbers in a second indexed image to determine whether there is correlation between the patterns of index numbers in at least parts of the first and second indexed images, merging the first and second indexed images when there is a sufficient correlation, such that the parts that correlate overlap each other to provide a merged indexed image, and replacing the index numbers in the merged indexed image by the corresponding group symbol to provide a merged image.
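By way of illustration only, the final two steps of the method (merging the indexed images so that the correlating parts overlap, and replacing index numbers by group symbols) might be sketched as follows. This is a minimal sketch under stated assumptions and not the claimed implementation: both images are assumed to share one set of index numbers, positions are assumed to be non-negative (row, column) cells, single characters stand in for the stored group symbols, and the names merge_indexed and render are hypothetical.

```python
from typing import Dict, Tuple

Cell = Tuple[int, int]  # (row, column) slot of a symbol; assumed non-negative


def merge_indexed(first: Dict[Cell, int], second: Dict[Cell, int],
                  offset: Cell) -> Dict[Cell, int]:
    """Overlay `second` on `first` after shifting `second` by `offset`.

    `offset` is (d_row, d_col) added to every cell of `second` to bring it
    into the coordinate frame of `first`. Raises ValueError if a cell in the
    overlapping part carries different index numbers in the two images.
    """
    d_row, d_col = offset
    merged = dict(first)
    for (row, col), index in second.items():
        cell = (row + d_row, col + d_col)
        # setdefault keeps the existing index if the cell is already filled
        if merged.setdefault(cell, index) != index:
            raise ValueError(f"index mismatch at {cell}")
    return merged


def render(merged: Dict[Cell, int], symbols: Dict[int, str]) -> str:
    """Replace every index number by its group symbol, row by row."""
    rows = max(r for r, _ in merged) + 1
    cols = max(c for _, c in merged) + 1
    return "\n".join(
        "".join(symbols[merged[(r, c)]] if (r, c) in merged else " "
                for c in range(cols))
        for r in range(rows)
    )
```

In this sketch the overlap is found elsewhere and passed in as offset; the merge itself is then a dictionary union that never inspects pixel data.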
In a preferred embodiment, the compressed representation of at least two images is formed by the steps of, for each image, acquiring the image, segmenting the image into a plurality of symbols, each having a position in the image associated therewith, comparing the plurality of symbols to find groups of symbols that are substantially identical and assigning an index number to each such group, storing the symbol for each group, associating the assigned index number with the respective position in the image for each stored symbol, and utilising the assigned index number, the associated position, and the stored symbol to provide the compressed representation of the image.
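As an illustrative sketch only, the grouping and indexing steps of this embodiment could look like the following. It assumes that segmentation has already produced a list of (bit map, position) pairs, and it uses exact equality of bit maps as a stand-in for "substantially identical". The function name compress_symbols and the data layout are hypothetical.

```python
from typing import Dict, List, Tuple

Bitmap = Tuple[str, ...]    # one string of '0'/'1' characters per pixel row
Position = Tuple[int, int]  # (x, y) of the symbol in the acquired image


def compress_symbols(symbols: List[Tuple[Bitmap, Position]]):
    """Group identical symbol bit maps and return (codebook, placements).

    codebook:   index number -> stored bit map for that group
    placements: list of (index number, position), in input order
    """
    codebook: Dict[int, Bitmap] = {}
    index_of: Dict[Bitmap, int] = {}
    placements: List[Tuple[int, Position]] = []
    for bitmap, position in symbols:
        if bitmap not in index_of:
            # First occurrence of this shape: assign the next index number
            index_of[bitmap] = len(codebook)
            codebook[index_of[bitmap]] = bitmap
        placements.append((index_of[bitmap], position))
    return codebook, placements
```

Each distinct bit map is stored once in the codebook, and the page is described by the placements list alone.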
Preferably, the step of storing the symbol involves storing a bit map of the symbol.
In one embodiment, the step of comparing patterns of index numbers in a first indexed image with the patterns of index numbers in a second indexed image comprises the steps of choosing a first index number at a first position in the first indexed image and determining the positions of at least some substantially identical index numbers in the second indexed image, choosing a subsequent index number at another position in the first indexed image, and determining which of the previously determined index numbers in the second indexed image has a substantially identical subsequent index number at a position corresponding to the position of the subsequent index number in the first indexed image, and repeating the previous step until a sufficient correlation between index numbers in the second indexed image and in the first indexed image is obtained to provide the overlapping parts of the first and second indexed images.
Preferably, the step of repeating takes place until only one subsequent index number in the second indexed image is determined to be in a position corresponding to the position of the subsequent index number in the first indexed image. The images to be merged preferably include at least some text.
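The candidate-narrowing comparison described above may be sketched, purely as an illustration, as follows. The sketch assumes that each indexed image is a mapping from (row, column) cells to index numbers, that exact equality of index numbers stands in for "substantially identical", and that the caller supplies the order in which positions of the first indexed image are probed. The name find_offset is hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]  # (row, column) slot of a symbol


def find_offset(first: Dict[Cell, int], second: Dict[Cell, int],
                probes: List[Cell]) -> Optional[Cell]:
    """Return (d_row, d_col) such that a cell of `first`, shifted by that
    amount, lands on the matching cell of `second`; None if no unique
    candidate is found.

    `probes` lists cells of `first` to test in order; the first one is the
    starting index number, the rest are the subsequent index numbers.
    """
    anchor = probes[0]
    # Every occurrence of the anchor's index number in `second` is a candidate
    candidates = [(r - anchor[0], c - anchor[1])
                  for (r, c), index in second.items()
                  if index == first[anchor]]
    for probe in probes[1:]:
        if len(candidates) <= 1:
            break  # stop once only one candidate remains
        # Keep candidates whose corresponding cell holds the same index number
        candidates = [(dr, dc) for dr, dc in candidates
                      if second.get((probe[0] + dr, probe[1] + dc)) == first[probe]]
    return candidates[0] if len(candidates) == 1 else None
```

Each additional probe can only shrink the candidate list, so the work per step falls quickly once a few index numbers have been compared.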
The method preferably further comprises, between the steps of providing a compressed representation of at least two images and forming an indexed image for each image, the steps of comparing the plurality of symbols a second time to find second groups of symbols that are at least similar and assigning a second index number to each such second group, whereby there are fewer second groups than said groups, storing the symbol for each second group, associating the assigned second index number with the respective position in the image for each stored symbol, storing the assigned second index number and the associated position at least temporarily, and utilising the assigned second index number and the associated position in the step of forming an indexed image for each image, each indexed image comprising a pattern of said second index numbers at their associated positions.
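One way to picture the second, coarser grouping is the sketch below. It is an assumption-laden illustration and not the method as specified: similarity is measured by the number of differing pixels between equally sized bit maps, the threshold max_diff is a hypothetical parameter, and groups are formed greedily against the first member of each second group.

```python
from typing import Dict, List, Tuple

Bitmap = Tuple[str, ...]  # one string of '0'/'1' characters per pixel row


def differing_pixels(a: Bitmap, b: Bitmap) -> int:
    """Count pixels that differ between two equally sized bit maps."""
    return sum(x != y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def second_indices(codebook: Dict[int, Bitmap], max_diff: int) -> Dict[int, int]:
    """Map each first-pass index number to a second index number that is
    shared by all bit maps within `max_diff` pixels of a group's first member."""
    representatives: List[Bitmap] = []
    mapping: Dict[int, int] = {}
    for index in sorted(codebook):
        bitmap = codebook[index]
        for second_index, rep in enumerate(representatives):
            if differing_pixels(bitmap, rep) <= max_diff:
                mapping[index] = second_index
                break
        else:
            # No similar group yet: start a new second group
            mapping[index] = len(representatives)
            representatives.append(bitmap)
    return mapping
```

Because near-identical shapes collapse onto one second index number, small capture differences between two images of the same character are less likely to prevent a match.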
The indexed images are preferably virtual images.
In a preferred embodiment, the method further comprises, between the steps of providing a compressed representation of at least two images and forming an indexed image for each image, the step of correcting the positions in each image associated with the index numbers to substantially compensate for skew in the image.
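As a hedged illustration of the position correction, the sketch below rotates the stored symbol positions by the negative of the skew angle. It assumes that the skew angle has already been estimated by some other means, that rotation is about the coordinate origin, and that angles are counter-clockwise positive in a conventional x, y frame. The name deskew_positions is hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) position of a symbol


def deskew_positions(positions: List[Point], skew_degrees: float) -> List[Point]:
    """Rotate every position by -skew_degrees about the origin."""
    theta = math.radians(-skew_degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return [(x * cos_t - y * sin_t, x * sin_t + y * cos_t)
            for x, y in positions]
```

Only the positions associated with the index numbers are transformed; the stored symbols themselves are left untouched in this sketch.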
Preferably, the step of comparing the patterns of index numbers in a first indexed image with the patterns of index numbers in a second indexed image to determine whether there is correlation between the patterns of index numbers in at least parts of the first and second indexed images comprises the steps of ordering the index numbers of the first indexed image according to their associated positions in the first image, ordering the index numbers of the second indexed image according to their associated positions in the second image, aligning the first and second indexed images in a first direction, and correlating the patterns of index numbers in the first and second indexed images in a second direction, substantially perpendicular to the first direction, to find a position at which the correlation between the patterns of index numbers is at a maximum.
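The ordering step can be illustrated with the following sketch, which arranges index numbers into rows, top to bottom, and sorts each row left to right. The nominal row spacing line_height and the function name order_by_position are assumptions introduced for the example; positions are taken to be (x, y) pairs with y increasing down the page.

```python
from typing import Dict, List, Tuple

Placement = Tuple[int, Tuple[float, float]]  # (index number, (x, y))


def order_by_position(placements: List[Placement],
                      line_height: float) -> List[List[int]]:
    """Return rows of index numbers, rows top to bottom, each left to right."""
    rows: Dict[int, List[Tuple[float, int]]] = {}
    for index, (x, y) in placements:
        # Symbols whose y values round to the same multiple share a row
        rows.setdefault(round(y / line_height), []).append((x, index))
    return [[index for _, index in sorted(rows[key])] for key in sorted(rows)]
```

Once the index numbers are ordered in this way, each row can be treated as a simple sequence for the alignment and correlation steps.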
The step of aligning the first and second indexed images in a first direction preferably comprises moving one of the first or second indexed images relative to the other of the two images in the first direction and determining the relative position at which there is a maximum correlation between them.
The step of correlating the patterns of index numbers in the first and second indexed images preferably comprises the steps of determining the correlation between the index numbers in the first and second indexed images at a first relative position, moving one of the first or second indexed images relative to the other of the two images in the second direction by one index number to a next relative position, determining the correlation between the index numbers in the first and second indexed images at the next relative position, repeating the moving and determining steps for all possible relative positions, and determining at which relative position there is a maximum correlation between the index numbers in the first and second indexed images.
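The moving and determining steps can be sketched, for a single direction, as a sliding comparison of two ordered sequences of index numbers. The sketch assumes non-empty sequences and uses the raw count of equal index numbers as the correlation measure, which is an assumption made for the example; the name best_shift is hypothetical.

```python
from typing import List, Tuple


def best_shift(first: List[int], second: List[int]) -> Tuple[int, int]:
    """Return (shift, score) for the relative position of maximum correlation.

    At a given shift, second[j] is compared with first[j + shift]; the score
    is the number of positions at which the two index numbers are equal.
    """
    best = (0, -1)
    # Try every relative position with at least one overlapping index number
    for shift in range(-(len(second) - 1), len(first)):
        score = sum(
            1
            for j, index in enumerate(second)
            if 0 <= j + shift < len(first) and first[j + shift] == index
        )
        if score > best[1]:
            best = (shift, score)
    return best
```

The same routine could be applied first in one direction to align the images and then in the perpendicular direction to locate the overlap. A raw match count favours longer overlaps, so a normalised score may be preferable in practice.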
According to a second aspect, the invention provides a system for merging images comprising an input terminal for receiving a first set of data regarding a first image and a second set of data regarding a second image, each set of data including a plurality of index numbers and positional data associated with each index number indicating at least one location of a symbol associated with that index number in the image, a storage device coupled to the input terminal for storing first and second virtual images made up of patterns of index numbers positioned at the location(s) of their associated symbols, a comparison module coupled to the storage device for comparing the patterns of index numbers in the first virtual image with the patterns of index numbers in the second virtual image to determine whether there is correlation between the patterns of index numbers in at least parts of the first and second virtual images, a merging module coupled to the comparison module for merging the first and second virtual images when there is a sufficient correlation, such that the parts that correlate overlap each other to provide a merged virtual image, and an output terminal coupled to the merging module for providing amended first and second sets of data where the positional data associated with each index number in the first and second sets of data has been amended for at least those index numbers in the overlapping parts of the first and second virtual images.
In a preferred embodiment, the system further comprises a document decompression system having a decompression module, a database coupled to the decompression module for storing the first set of data regarding the first image and the second set of data regarding the second image, an output coupled to the input terminal and an input coupled to the output terminal for receiving the amended first and second sets of data, and a document image reconstruction module having an input coupled to the database for receiving the amended first and second sets of data and for reconstructing a merged image therefrom.