The present invention relates to the automatic creation of a larger image of an object or scene from a series of smaller source views. Such an image is known as a ‘montage’ and, particularly when using a microscope and camera combination, enables a higher resolution image to be formed of a much larger region of the object or scene than is possible otherwise.
When recording a scene by means of a camera, three particular limitations may pose problems: firstly, there may be a limited field of view (ie. in the x and y coordinate directions); secondly, there may be a limited depth of field (in the z coordinate direction, ie. that of the camera axis); and, thirdly, there may be limited camera resolution. In particular, these can be severe in the case of an image being recorded (eg photographed) through a microscope.
Prior art solutions to one or more of these problems are as follows:
Accurately reposition the camera or the subject and record multiple images, then ‘stitch together’ afterwards. An example of a system using this technique is the MIA package from Soft Imaging System GmbH (SiS) which generally requires direct control of a microscope stage in order to position the object under the objective lens in a series of positions such that overlapping images can be ‘stitched’ together to form a larger image. A disadvantage is that the positions need to be known accurately.
Our Auto-Montage technique in which multiple digitally recorded images are taken at different focus settings of the camera or z-positions of the subject, and then formed into a single image where each pixel is selected from the source image which shows the most contrast (equivalently, is most in focus).
High resolution and hence more expensive cameras. For electronic cameras, cost becomes prohibitive beyond about 2 k pixels square.
Although the present invention is aimed primarily at dealing with the first and third of the three problems outlined above, it may also, in suitable embodiments, deal with the second problem.
According to the present invention there is provided a system, for creating a larger montage image of an object from a series of smaller source images of different views of an object, which includes
imaging means for providing, continually in use, images of an object;
data storage means for storing data representative of the montage image of the object;
means for including the data representative of a first and subsequent source images in that representative of the montage image, in order to increase the data stored representative of the montage image; and
means for providing a measure of the extent of matching of the most recent source image provided by the imaging means and individual ones of a plurality of regions of the montage image which correspond in area;
the arrangement being such that data representative of subsequent source images is included in the data representative of the montage image when the measure of the extent of matching is at a maximum, whereby the source image is matched to the montage image at the correct position.
The present invention also includes a method of creating a larger montage image of an object from a series of smaller source images of different views of an object, which method includes
generating a series of source images of an object;
allocating memory within a data storage means for storing data representative of the montage image of the object;
storing data representative of a first and subsequent source images in the memory, in order to store data representative of the montage image; and
measuring the extent of matching of a most recent source image and individual ones of a plurality of regions of the montage image which correspond in area;
data representative of the subsequent source images being included in the data representative of the montage image, when the measure of the extent of matching is at a maximum, so that the source image is matched to the montage image at the correct position.
Preferably, a threshold value of the measure of the extent of matching is preselected so that source images which will not match with the montage image are discarded.
The montage image is thus built up gradually from the subsequent source image data being added to the first source image data which represents a first data set for the montage image. The source image data can be provided by a camera or other image capture device moved relative to the object.
By comparing the most recent source image with regions of the montage image in turn, the source image can be ‘located’ relative to the montage image and the additional data representative of the source image incorporated into the data set defining the montage image so as to expand the montage image. This process can be repeated until the montage image is complete. The comparison may be carried out in a number of different ways, some of which are detailed later, and may be carried out on a subset of the image data of both the montage image and the source images.
The size of the completed montage image may be predetermined, but may be allowed to increase by allocating more memory to the montage image.
Thus, new data is stored representative of a new image at a position at which the new source image closely matches a region of the montage image. If this position has not changed since the previous measurement, the source is stationary and thus there is determined an appropriate position at which to insert the source image into the montage image.
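The stationary test described above can be sketched as follows. This is a minimal illustration only: the function name `is_stationary` and the `tolerance` parameter are assumptions introduced here, since the text says only that the matched position has not changed since the previous measurement.

```python
def is_stationary(previous_position, current_position, tolerance=0):
    """Return True when the best-match position has not moved between measurements.

    `tolerance` (in pixels) is a hypothetical allowance for jitter; with the
    default of 0 the position must be exactly unchanged.
    """
    if previous_position is None:  # first measurement: nothing to compare against
        return False
    dx = abs(current_position[0] - previous_position[0])
    dy = abs(current_position[1] - previous_position[1])
    return dx <= tolerance and dy <= tolerance


# Matched positions while the operator moves the stage and then pauses.
positions = [(100, 50), (112, 50), (120, 53), (120, 53)]
previous = None
states = []
for position in positions:
    states.append(is_stationary(previous, position))
    previous = position
```

Only the final measurement is reported as stationary, so only that source image would be considered for insertion into the montage image.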
It is preferred to add source image data to the montage image data when the source image data is provided from a stationary image. This provides two benefits:
Any motion blur or interlace distortion of the source images is minimised, so the image pasted into the montage should be free of these artefacts;
If the system waits for the movement to stop before doing the more time-consuming elements of the process (fine position determination, 3D Auto-Montage), movement can be tracked rapidly and the time-consuming elements of the processing can be done when the delay is less significant to the operator.
Advantageously, the system indicates when it has determined the subject to be stationary (and is therefore updating the montage image), and when it is moving. The system may also indicate when it cannot determine the position confidently, either because of poor matching or low overlap. Furthermore, the system may indicate when it requires the user to pause for some reason (such as to save some data).
The system is particularly suitable for use with a microscope and camera combination for capturing highly magnified images of small objects. Preferably, the system includes a computer and software adapted to carry out the steps outlined above, the montage image and the current source image (or a rectangle representing its location relative to the montage image) being displayed on a monitor to provide feedback to a user operating the system (in the case of a microscope, the operator moving the microscope stage in order to re-position the object relative to the camera).
Because of the nature of the acquisition process, it is important that this feedback be given simply and directly. Colours, flashing and audio signals may all be employed so that the user has a ‘quiet cockpit’ when the system is operating normally, and can concentrate on exploring the subject.
The determination of the appropriate position at which a newly captured image should be incorporated into the montage image may be achieved by a variety of means, including the location and matching of specific or significant features in the image, or by finding the location of the source image at which the difference between the source and the montage images is at a minimum.
Preferably, a similarity measure is used, so that the source is, notionally, shifted across the montage image until a maximum is found in the similarity measure. This may conveniently be achieved by means of a type of cross-correlation statistic called ‘covariance’. Cross-correlation and covariance are known to be robust similarity measures in the presence of noise, an important characteristic when dealing with images from a microscope.
For example, therefore, if s is an intensity from the source image and m is the corresponding intensity from the corresponding position of the montage image, then the covariance c can be calculated at the selected relative position by the equation:

$$c = \frac{\sum_{S \cap M} s \, m}{n} - \frac{\sum_{S \cap M} s}{n} \cdot \frac{\sum_{S \cap M} m}{n}$$
where S∩M is the intersection of the valid regions of the (displaced) source and montage images, and n is the number of pixels in that intersection. The initial source image is generally considered to be ‘valid’ throughout, whereas the montage image is only valid where it has already been set. (In the current implementation, the montage image starts out with the value of zero, ie it is notionally ‘empty’, and only non-zero pixels are considered valid). It makes sense, however, to allocate a ‘rectangular’ array of memory of some chosen size (say 2000×2000 pixels) from the outset. When a source image is ‘pasted’ or ‘written’ into the montage image (for which the term ‘set’ can be used), that part of the montage is then regarded as ‘valid’ or ‘set to something’ rather than being ‘empty’. The particular representation of this which has been chosen for one example is to fill the montage initially with zero, then to treat non-zero pixels as ‘valid’ or ‘set’.
It is useful to consider the validity of each pixel, because otherwise the process tends to ‘lock on’ to the edge of the valid montage regions (this edge is effectively the most significant feature in the source data).
If the covariance similarity score is plotted against the offset vector V, a single peak in this surface should be seen corresponding to the true offset between the source image and the montage, V1. If, however, non-valid pixels are not discarded from the calculation, a second peak (quite likely even higher) at another offset, V2, may result. Thus, referring to FIG. 2, if the source image lines up with some edges of the montage image, then these ‘edges’ give rise to a very high similarity score despite the fact that the image detail does not match up. In other words, the edges are much more significant ‘features’ than the details of the image. This can easily result in the algorithm choosing the incorrect (V2) peak in the covariance function, and pasting the source image into the wrong position.
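The covariance calculation restricted to valid pixels might be sketched as below. This is an illustrative sketch rather than the implementation itself: images are assumed to be lists of rows of intensities, zero is assumed to mark an ‘empty’ montage pixel as in the example representation above, and the name `masked_covariance` is introduced here for illustration.

```python
def masked_covariance(source, montage, dx, dy):
    """Covariance of `source` placed at offset (dx, dy) within `montage`.

    Only pixels that fall inside the montage AND are valid (non-zero) there
    contribute. Returns (c, n), where n is the number of pixels in the
    intersection of the valid regions.
    """
    n = 0
    sum_s = sum_m = sum_sm = 0.0
    for y, row in enumerate(source):
        my = y + dy
        if not 0 <= my < len(montage):
            continue  # source row lies outside the montage
        for x, s in enumerate(row):
            mx = x + dx
            if not 0 <= mx < len(montage[my]):
                continue  # source column lies outside the montage
            m = montage[my][mx]
            if m == 0:
                continue  # 'empty' montage pixel: excluded from the statistic
            n += 1
            sum_s += s
            sum_m += m
            sum_sm += s * m
    if n == 0:
        return 0.0, 0
    return sum_sm / n - (sum_s / n) * (sum_m / n), n


montage = [
    [0, 0, 0, 0, 0],
    [0, 1, 5, 2, 0],
    [0, 7, 3, 9, 0],
    [0, 0, 0, 0, 0],
]
source = [[1, 5], [7, 3]]
true_score = masked_covariance(source, montage, 1, 1)   # source matches here
wrong_score = masked_covariance(source, montage, 2, 1)  # shifted one pixel right
```

Returning n alongside the score allows the caller to reject offsets at which too few valid pixels overlap.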
Even with a source image of standard TV resolution (768×576 in Europe), if one tries to calculate this statistic over the whole of the montage image of (let's say) 2000×2000 pixels, in the order of 10^13 calculations may be required. To make the system practical with current personal computers, it is therefore advisable to ‘prune’ the search space as far as possible. First, the search space can be reduced by using previous knowledge of the position of the source image within the montage image, and (potentially) its speed of movement across it. If we are able to reduce the uncertainty in position to (say) 64 pixels either way from the previous (or predicted) position, then this reduces the search process to around 10^10 calculations.
Further, we can adopt a coarse-fine technique. Assuming that the noise level is ‘not too bad’, we can achieve a coarse estimate of the position both by sub-sampling the image in each direction (only processing every 16th pixel in each direction, say), and by shifting the source image over the montage image in larger steps (say 4 pixels). In this manner, we can obtain a coarse estimate with perhaps around 10^6 calculations, which will take around 5 ms on a modern PC. We can then improve the measurement by reducing the sub-sampling and the size of the shift step, but now we need to search over a much smaller area. Thus the total position measurement time can be kept short relative to the camera acquisition time and the speed of the operator. (It may be noted that as the speed of the algorithm improves, the uncertainty in the position diminishes, and the search space can be reduced further.)
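The effect of pruning on the amount of work can be checked with a short calculation. The sketch below assumes one pixel comparison per source pixel per candidate position, which is a simplification; the figures quoted in the text are order-of-magnitude estimates and may count several arithmetic operations per pixel. The function name `match_cost` is introduced here for illustration.

```python
def match_cost(src_w, src_h, search_w, search_h, subsample=1, step=1):
    """Approximate number of pixel comparisons for a position search.

    Assumes one comparison per (sub-sampled) source pixel per candidate
    position; `subsample` and `step` are applied in both directions.
    """
    pixels = (src_w // subsample) * (src_h // subsample)
    positions = (search_w // step) * (search_h // step)
    return pixels * positions


# Whole 2000 x 2000 montage searched at every pixel offset.
exhaustive = match_cost(768, 576, 2000, 2000)             # about 1.8e12
# Search limited to 64 pixels either way from the predicted position.
windowed = match_cost(768, 576, 128, 128)                 # about 7.2e9
# Same window, every 16th pixel, shifting in steps of 4 pixels.
coarse = match_cost(768, 576, 128, 128, subsample=16, step=4)  # about 1.8e6
```

Each pruning step removes roughly three orders of magnitude, which is what brings the coarse search within reach of interactive use.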
An important refinement is that if the coarse measurement indicates that the subject is moving, there is no need for the fine measurement. This means that the system can have the highest update rate when it is most necessary; when the subject stops the more lengthy matching procedure is not objectionable.
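A coarse-fine search of this kind could be organised as in the following sketch. It is written against a caller-supplied `score` function (for example the covariance at a given offset, with any sub-sampling done inside it), so the names `coarse_to_fine`, `score`, `radius` and `steps` are assumptions made for illustration and are not taken from the described system.

```python
def coarse_to_fine(score, centre, radius=64, steps=(4, 1)):
    """Locate the offset that maximises `score`, refining the step size.

    score  -- callable taking an (x, y) offset and returning a similarity value
    centre -- previous or predicted position around which to search
    radius -- search extent either way from the centre, in pixels
    steps  -- shift step for each pass, coarsest first
    """
    best = centre
    for step in steps:
        cx, cy = best
        candidates = [
            (cx + ox, cy + oy)
            for oy in range(-radius, radius + 1, step)
            for ox in range(-radius, radius + 1, step)
        ]
        best = max(candidates, key=score)
        # The next pass only needs to cover the gap left by this step size.
        radius = step
    return best


# Synthetic similarity surface with a single peak at offset (13, -7).
def example_score(offset):
    return -((offset[0] - 13) ** 2 + (offset[1] + 7) ** 2)


found = coarse_to_fine(example_score, centre=(0, 0))
```

Passing `steps=(4,)` gives the coarse-only measurement, which is all that is required while the subject is still moving.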
The system needs to be ‘confident’ of the matching location because in general mistakes cannot be undone once data has been incorporated into the montage image data stored. Two confidence limits may therefore be imposed by the present system:
The covariance measured must lie within some tolerance of the previous accepted covariance measurement;
The number of valid pixels considered in the calculation of the highest covariance score must exceed a threshold. This is effectively the area of overlap of the source with the existing montage pixels, so it can be expressed simply as (for example) “half of the source image must overlap with the valid regions of the montage image”.
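The two confidence limits might be combined as in the sketch below. The relative, two-sided form of the tolerance and the default values are assumptions chosen for illustration; the text specifies only that the covariance must lie within some tolerance of the previous accepted value and that the overlap must exceed a threshold.

```python
def is_confident(covariance, n_valid, previous_covariance, source_pixels,
                 tolerance=0.5, min_overlap=0.5):
    """Apply both confidence limits to a candidate match.

    tolerance   -- assumed relative tolerance on the previous accepted covariance
    min_overlap -- required fraction of the source overlapping valid montage pixels
    """
    # Limit 2: enough valid montage pixels under the source image.
    if n_valid < min_overlap * source_pixels:
        return False
    # Nothing accepted yet, so there is no reference covariance to compare with.
    if previous_covariance is None:
        return True
    # Limit 1: covariance close to the previous accepted measurement.
    return abs(covariance - previous_covariance) <= tolerance * abs(previous_covariance)


accepted = is_confident(90.0, 300, 100.0, 400, tolerance=0.2)
poor_match = is_confident(50.0, 300, 100.0, 400, tolerance=0.2)
low_overlap = is_confident(90.0, 100, 100.0, 400, tolerance=0.2)
```

A match that fails either test would be tracked but not written into the montage image.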
The following describes how the system of the invention may be adapted to work in ‘3D’ mode, which adds consideration of the z (camera axis) direction. These aspects extend the effective depth of field of the optical system.
In this case the motion tracking process works exactly as in the ‘2D’ mode, and the consideration of the third dimension occurs only when the subject is stationary and therefore we have a ‘new’ source image and a position within the montage image where it belongs. At this point the action is somewhat different: each pixel within the source image is considered, and it is written to the montage image only if the contrast in the source image at that point is greater than the best contrast previously recorded for the corresponding point in the montage image. Here ‘contrast’ is deliberately vague: what we want to measure is how well ‘in focus’ that part of the image is. Various measures may be used, including local edge information or the variance of the pixels surrounding; all these may be shown to relate to contrast. Implicit in this description is the fact that a separate ‘best contrast’ value for each point in the montage image needs to be maintained; this adds considerably to the memory requirements.
The ‘3D’ update process can be described as follows:
For each pixel in the source image
{
    If the ‘source contrast’ here is greater than the ‘best contrast’ at the corresponding position in the montage
    {
        Copy the source pixel to the corresponding pixel in the montage image
        Copy the ‘source contrast’ value to the corresponding ‘best contrast’
    }
}
Note that the ‘best contrast’ is initialised to zero throughout, so that the first source image to arrive at a particular position is always inserted in toto.
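The update loop above translates fairly directly into code. The sketch below assumes that a per-pixel contrast value has already been computed for the source image by whatever measure is chosen, and that images are lists of rows; the name `update_3d` and the clipping at the montage edge are illustrative choices.

```python
def update_3d(montage, best_contrast, source, source_contrast, dx, dy):
    """Write source pixels into the montage wherever they are better focused.

    montage, best_contrast  -- arrays of equal size, modified in place
    source, source_contrast -- arrays of equal size for the new source image
    (dx, dy)                -- position of the source within the montage
    """
    for y, row in enumerate(source):
        for x, pixel in enumerate(row):
            mx, my = x + dx, y + dy
            if not (0 <= my < len(montage) and 0 <= mx < len(montage[my])):
                continue  # part of the source lies beyond the montage edge
            # Strict comparison, as in the description: a source pixel whose
            # contrast is exactly zero is therefore never copied.
            if source_contrast[y][x] > best_contrast[my][mx]:
                montage[my][mx] = pixel
                best_contrast[my][mx] = source_contrast[y][x]


montage = [[0, 0, 0], [0, 0, 0]]
best = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
update_3d(montage, best, [[10, 20]], [[1.0, 2.0]], dx=1, dy=0)
update_3d(montage, best, [[30, 40]], [[0.5, 3.0]], dx=1, dy=0)
```

After the second call only the right-hand pixel has been replaced, because the second image was sharper there and less sharp at its neighbour.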
The descriptions above have described the case of a monochrome camera. For colour images, the covariance and contrast can be calculated from a single pre-selected colour plane, a colour plane selected to give the maximum overall contrast, or the sum of the colour planes (which effectively computes the equivalent monochrome image).
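Two of these options for colour images are sketched below: summing the colour planes, and choosing the plane with the greatest overall contrast. Using the variance of each plane as the measure of overall contrast is an assumption made for this illustration, and pixels are assumed to be tuples of plane values.

```python
from statistics import pvariance


def sum_planes(image):
    """Equivalent monochrome image: the sum of the colour planes at each pixel."""
    return [[sum(pixel) for pixel in row] for row in image]


def plane_of_max_contrast(image):
    """Index of the colour plane with the largest variance over the whole image."""
    pixels = [pixel for row in image for pixel in row]
    planes = list(zip(*pixels))  # one tuple of values per colour plane
    return max(range(len(planes)), key=lambda i: pvariance(planes[i]))


image = [
    [(10, 0, 5), (10, 100, 6)],
    [(10, 0, 5), (10, 100, 6)],
]
mono = sum_planes(image)
best_plane = plane_of_max_contrast(image)
```

In this example the second plane carries almost all of the image detail, so it would be the one used for both the covariance and the contrast calculations.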
The system as described uses a montage image of fixed size. However, it is important to react appropriately as the position of the source image approaches an edge of the montage image. Several actions are possible:
The calculations are ‘clipped’ as appropriate, and areas of the source image that extend beyond the edge of the montage image are ignored. This means that the montage image can be ‘filled’ right up to its edge.
If the montage image is only partially used (say, we placed the first source image in the centre and subsequently we have only moved up and right), then the contents of the montage image can be shifted within the existing memory, so that this memory can be used more effectively. To achieve this efficiently, it is probably necessary to calculate and maintain the extremes of the valid montage pixels in x and y.
A new montage image (and in the ‘3D’ mode, corresponding ‘best contrast’ values) can be allocated to further extend the field of view. The existing data can be copied into the new. The user may need to be instructed to pause while this happens.
The montage image might be ‘tiled’, with only those tiles currently required remaining in working store (RAM) and the unused tiles being saved on disk. In this way, extremely large images might be acquired. This mechanism is more complex, but has the potential to avoid any significant delays for the user since the disk data may be written in small amounts as a background task.
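The second of these actions, shifting the montage contents within the existing memory, relies on knowing the extremes of the valid pixels. A minimal sketch follows, again assuming that zero marks an ‘empty’ pixel; the function names are introduced here for illustration, and a practical version would maintain the extent incrementally rather than rescanning the array.

```python
def valid_extent(montage):
    """Return (min_x, min_y, max_x, max_y) of the non-zero pixels, or None if empty."""
    xs = [x for row in montage for x, value in enumerate(row) if value != 0]
    ys = [y for y, row in enumerate(montage) if any(value != 0 for value in row)]
    if not xs:
        return None
    return min(xs), min(ys), max(xs), max(ys)


def shift_contents(montage, sx, sy):
    """Return a copy of the montage with its valid pixels moved by (sx, sy)."""
    height, width = len(montage), len(montage[0])
    shifted = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            value = montage[y][x]
            if value != 0 and 0 <= y + sy < height and 0 <= x + sx < width:
                shifted[y + sy][x + sx] = value
    return shifted


montage = [
    [0, 0, 0, 0],
    [0, 0, 3, 4],
    [0, 0, 0, 5],
    [0, 0, 0, 0],
]
extent = valid_extent(montage)
# Move the used region into the top-left corner to free space below and right.
moved = shift_contents(montage, -extent[0], -extent[1])
```

Any tracked position of the source image must be adjusted by the same shift so that it remains consistent with the moved data.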
Manual override of the automatic operation may sometimes be desirable. Three cases have been identified:
A really good image of an extended field of view has been captured, but the user wishes to add to the extent of the montage in a region remote from the present position. In this case, the user should be able to specify that the system should continue to track the motion of the source image across the montage image, but should not update the montage image. When the region is reached where it is desired to capture more images, the user removes this injunction.
Incorrect or poor images have been recorded in some region of the montage. In this case the user may wish to ‘undo’ recent updates to the montage image (although this will require considerable memory), or more likely to simply ‘rub out’ these regions. Subsequently, the facility described above will allow the position to be re-synchronised with the montage image, before new updates are permitted.
An incorrect, or poor, image has been recorded in ‘3D’ mode and consequently the ‘contrast’ information associated with this region of the montage image is also incorrect, which in turn will mean that subsequent updates in this region will not take place correctly. In this case the user may wish to ‘refresh’ both the montage image for the region and the corresponding ‘best contrast’ record.
Whilst it is a significant benefit of the ‘basic’ invention that it requires no special modifications to the microscope, there are potential benefits from adding control of the ‘z’ position of the sample, particularly in ‘3D’ mode. These are:
Knowledge of the z position of the sample means that the (relative) depth of each point can be measured, by detecting the z position that resulted in the maximum contrast. This requires the retention of this information for each point (more RAM), but means that the result of the process is a data set that provides the z position of each point of the subject. This enables 3D measurements of the surface to be made, and visualisations using computer graphic techniques or stereo pairs (Auto-Montage provides such facilities).
The scan in depth (z) can be made reliably and repeatably, and more quickly than by hand. Potentially, the system could automatically perform a z scan every time it detected that the subject was stationary.
Motion tracking will be most reliable when the subject is generally in focus. The system could measure the overall contrast for each z step (the total variance of the intensity values is suitable), and automatically return the stage to the z position which gave maximum contrast after each scan.
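Two of these uses of z control can be illustrated briefly: recording, for each point, the z position that gave maximum contrast, and choosing the z position of best overall focus from the total variance of the intensities. The data layout (a dictionary from z position to a 2D array) and the function names are assumptions made for this sketch.

```python
from statistics import pvariance


def depth_map(contrast_by_z):
    """For each pixel, the z position at which the recorded contrast was greatest.

    contrast_by_z -- dict mapping z position to a 2D array of contrast values
    """
    zs = list(contrast_by_z)
    first = contrast_by_z[zs[0]]
    return [
        [max(zs, key=lambda z: contrast_by_z[z][y][x]) for x in range(len(first[0]))]
        for y in range(len(first))
    ]


def best_focus_z(frames_by_z):
    """The z position whose frame has the largest total intensity variance."""
    return max(
        frames_by_z,
        key=lambda z: pvariance([p for row in frames_by_z[z] for p in row]),
    )


depths = depth_map({0: [[1, 5]], 1: [[3, 2]]})
frames = {
    0.0: [[5, 5], [5, 5]],    # featureless: badly out of focus
    1.0: [[0, 10], [10, 0]],  # strongest detail
    2.0: [[4, 6], [6, 4]],    # some detail
}
focus = best_focus_z(frames)
```

The depth map is the per-point z record referred to above, and `best_focus_z` gives the position to which the stage could be returned after each scan.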
Again, while it is a significant benefit of the invention that no motorised stage is required, there may be applications where it is appropriate to use one.
Control of the microscope stage and the image acquisition through a single computer interface may be advantageous.
The present invention may be used to perform a rapid scan of the object, to determine its extent in x, y (and z) and to build up a preliminary view. In this case, it will be appropriate to place the stage under the control of the system, so that its movements can be related to the motion of the image (in other words, the system as a whole can be calibrated). Using both the extent and calibration information, the system might perform a much more sophisticated image montaging scan (in x, y, and z) which would take much longer but which would be fully automatic.
Where the source image contains (or is likely to contain) many repeated elements, such as the features on an integrated circuit, the covariance calculation may show many maxima of similar value, and no clear global maximum will be discernible. In these cases, the use of direct stage control will enable a particular local maximum to be identified as corresponding to the ‘true’ position.
There may sometimes be a systematic distortion of the brightness and/or colour values in each source image. Possible causes are camera defects (variations in black level or gain across the camera), uneven illumination or variations in the transparency of the optics. Such defects can generally be modelled, measured, and subsequently corrected. Where such artefacts are significant, they will often produce a noticeable edge in the montage image where (say) the left-hand region of one field of view overlays the right-hand region of another. In this case, it is appropriate to correct each field before inserting it into the montage image. As mentioned above, this may be more useful in an automatic scan, where time is not so critical.
Microscope optics frequently collect dust and debris which is difficult to remove. These may result in high-contrast detail in the source images, which will not only appear in the final result but may also distort both the similarity calculations and the contrast calculations. A separate procedure may be used to identify the defects, which are stationary in each frame and independent of the sample. Two refinements to the basic technique are then possible:
The regions of the source image corresponding to the defects can be considered xe2x80x98invalidxe2x80x99 as far as the similarity measurements are concerned. This prevents them from dominating the measurements as they might otherwise do.
The regions of the source image corresponding to the defects can be masked out when the montage image is updated with the contents of the source image. Since the subject can be moved slightly and the montage updated again, the xe2x80x98missingxe2x80x99 regions can be filled in.
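The second refinement, masking defects when the montage image is updated, is sketched below. The defect mask is assumed to be a boolean array fixed in the camera frame and supplied by the separate identification procedure; the name `paste_with_mask` is introduced for illustration.

```python
def paste_with_mask(montage, source, defect_mask, dx, dy):
    """Write the source into the montage at (dx, dy), skipping defective pixels.

    defect_mask -- same size as the source; True marks dust or debris that is
                   stationary in every frame and must not be copied
    """
    for y, row in enumerate(source):
        for x, pixel in enumerate(row):
            if defect_mask[y][x]:
                continue  # leave the montage untouched under the defect
            mx, my = x + dx, y + dy
            if 0 <= my < len(montage) and 0 <= mx < len(montage[my]):
                montage[my][mx] = pixel


montage = [[0, 0, 0, 0]]
mask = [[False, True, False]]          # middle camera pixel is obscured
paste_with_mask(montage, [[7, 8, 9]], mask, dx=0, dy=0)
after_first = [row[:] for row in montage]
# The subject is moved by one pixel, so the gap now falls under a clean pixel.
paste_with_mask(montage, [[8, 9, 6]], mask, dx=1, dy=0)
```

The same mask could be consulted in the similarity calculation so that defective source pixels are treated as invalid there as well.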
It may be desirable to adapt certain algorithms to the characteristics of the source images. For example, the shape of the peak in covariance will be very broad if the image is defocussed, or very narrow if there is a great deal of sharp detail. In such cases it may be necessary to adapt the search technique to suit. In the case of a sharp peak, it may be necessary to consider more possible positions of the source image relative to the montage image, since otherwise the true peak may fall between the steps in position.
Similarly, by measuring the statistics of the source images, it is easy to derive the expected value of the covariance function. This might be used to determine the confidence threshold, rather than simply working from the most recent accepted maximum covariance value.
If the expected shape of the ‘covariance vs. relative position’ function is known (and it can easily be estimated by computing this function using the source image and a displaced version of itself), then from a few points near the peak one can interpolate the exact position of the peak. In the case of a coarse search, this may enable the estimation of the peak location to be achieved more accurately than the basic step size would imply. In the case of a fine search, it can be used to determine the matching position of the source within the montage image to a precision better than one pixel; the source image can be re-sampled to reflect this more accurate position before it is inserted into the montage image.
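As one simple illustration of interpolating the peak, the sketch below fits a parabola through the best sample and its two neighbours along one axis. The parabolic shape is an assumption substituted here for the estimated covariance function described above; it is a common approximation near a smooth maximum but is not stated in the text.

```python
def parabolic_peak(c_minus, c_zero, c_plus, step=1.0):
    """Offset of the peak from the central sample, by a three-point parabola fit.

    c_minus, c_zero, c_plus -- covariance at positions -step, 0 and +step,
                               where c_zero is the largest sampled value
    Returns the estimated peak position relative to the central sample.
    """
    curvature = c_minus - 2.0 * c_zero + c_plus
    if curvature >= 0.0:
        return 0.0  # flat or not a maximum: keep the sampled position
    return 0.5 * step * (c_minus - c_plus) / curvature


# Samples of -(x - 0.25)^2 taken at x = -1, 0 and 1: the true peak is at 0.25.
offset = parabolic_peak(-1.5625, -0.0625, -0.5625)
```

Applying the same fit independently in x and y gives a sub-pixel position; with `step` set to the coarse shift size it refines a coarse estimate in the same way.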
Some cameras have a poor optical performance which results in an effective resolution well below the pixel resolution. In such cases, it may be possible to obtain better results by reducing the pixel resolution of the source image before it is placed in the montage image. The magnification of the microscope can be altered to compensate, and the invention allows the pixel resolution of the final montage image to be far higher than the original resolution of the source.