1. Technical Field
The present invention relates to methods of identifying objects in an image, and in particular, to identifying the orientation and position of a grid in an image, and using the resulting grid for measurement.
2. Description of the Related Art
Many machine vision and image analysis applications work with images that contain information arranged in a grid. For example, WO 99/08233 describes a large number of different types of analyses which can be conducted on assay well plates, gels and blots used in chemical, biochemical and physiological analyses. In that use, the positions of elements on a grid are identified and the elements are probed using a wide variety of assays.
WO 98/47006 takes this a step further, by using the image to identify locations of points on a grid, and injecting reagents onto the grid, e.g., with an ink jet sprayer.
In both cases, however, the position and orientation of the grid are generally known. WO 99/08233 suggests allowing for some slight variation in the grid through the use of fuzzy logic, but the nature of the fuzzy logic to be applied is not described in the application. WO 98/47006 simply assumes one knows the orientation and size of the grid.
Moreover, WO 99/53319 has recently suggested using shrinkable plastic. In this structure, the initial components of the grid are applied to plastic when it is large and the grid elements can be applied easily. The plastic then is shrunk, to reduce the size of the grid elements and resultant density of the elements being tested. This potentially makes manufacturing simpler and less expensive. However, such plastics may not shrink uniformly, so that the grids may be significantly distorted after shrinkage compared to grids on conventional glass or nylon substrates. Current image analysis systems are incapable of automatically identifying the positions of elements in such distorted arrays.
In addition to these chemical applications, there are many manufactured products which include grid-like arrays or cross hatching, such as printed or micro-replicated materials, and chip sockets. Machine vision is a useful way of inspecting such products, but, again, there must be some ability to determine the orientation and position of each element in the grid for this to be useful.
In addition, most electronic analytical systems conduct measurements for every pixel in the field of view. In non-electronic viewing systems, there has been some suggestion to study only interesting features. For example, to save cytologists' time, GB 2,318,367 suggests having a system automatically identify interesting features and move a slide under a microscope to jump from one interesting feature to the next, so that the cytologist need only look at the interesting features. However, past conventional electronic scanning systems generally have scanned and analyzed every pixel in the field of view. Particularly if several measurements are being made, e.g., at different frequencies or under different excitation conditions, this collection of data from every pixel can be quite time consuming, since, for example, even a typical 512×512 pixel image has 262,144 pixels, and the scanning system must be physically re-adjusted on each pixel in turn.
The present invention identifies the orientation and position in a field of view of the elements or features of a grid. The grid may have a considerable degree of variance from a purely rectilinear grid.
To achieve this according to the present invention, an image of a field of view containing a grid is first obtained and digitized into pixels. Using any suitable algorithm, individual features in the image are identified. The centroid position of each feature, the feature area in pixels and the integrated intensity of the feature all are determined. These are used to produce a “collapsed image,” where each feature is represented by a point object at the centroid location, with the point object having two associated values (area and integrated intensity).
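By way of illustration only, the collapsing step might be sketched as follows in Python. The sketch assumes that feature identification has already produced a label map (0 for background, a positive integer per feature), and that the centroid is the unweighted geometric centroid of the feature's pixels. The names `PointObject` and `collapse` are hypothetical and do not come from the specification.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PointObject:
    x: float          # centroid column coordinate
    y: float          # centroid row coordinate
    area: int         # number of pixels in the feature
    intensity: float  # integrated (summed) intensity of the feature


def collapse(image: List[List[float]], labels: List[List[int]]) -> List[PointObject]:
    """Reduce each labelled feature to a point object (label 0 = background)."""
    sums: Dict[int, list] = {}
    for r, (img_row, lab_row) in enumerate(zip(image, labels)):
        for c, (value, label) in enumerate(zip(img_row, lab_row)):
            if label == 0:
                continue
            acc = sums.setdefault(label, [0.0, 0.0, 0, 0.0])
            acc[0] += c      # running sum of column coordinates
            acc[1] += r      # running sum of row coordinates
            acc[2] += 1      # pixel count (area)
            acc[3] += value  # integrated intensity
    return [
        PointObject(sx / n, sy / n, n, total)
        for _, (sx, sy, n, total) in sorted(sums.items())
    ]
```

An intensity-weighted centroid could be substituted without changing the rest of the method.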
According to a first embodiment of the invention, a search line is created at one side of the image at a base angle θ to the side of the image, and swept across the image in steps. At each step, the integrated intensity of each centroid within a predetermined region on either side of the line is determined and recorded. The result is a two-dimensional plot with a series of peaks, each peak corresponding to a column of the grid.
Note that this process is different from simply integrating the image intensity along the search line. Due to the use of collapsed images, each feature effectively has its image intensity and area concentrated at its centroid. This will be referred to herein as “centroid integration”, and is discussed in more detail in a co-assigned U.S. patent application Ser. No. 09/422,584, filed on Oct. 21, 1999, entitled “Centroid Integration”, which is incorporated herein by reference.
Centroid integration results in a set of very well defined peaks in the resulting line profile. Regular integration would result in a set of smeared out peaks and, in the case of a grid with some variation in the positions of the individual features, the result would often be unusable. As a result, centroid integration is much more tolerant of local variation in feature positions than conventional integration.
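A minimal sketch of centroid integration, offered only as an illustration, is given below. It assumes that the search line starts at the left side of the image, that θ is measured from that side in radians, and that the sweep position is the perpendicular offset of the line. The function name and all parameter names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Centroid = Tuple[float, float, float]  # (x, y, integrated intensity)


def centroid_integration(
    centroids: Sequence[Centroid],
    theta: float,
    start: float,
    stop: float,
    step: float,
    half_width: float,
) -> List[Tuple[float, float]]:
    """Sweep a line at angle theta and sum centroid intensities near it."""
    # Unit normal of a line that is parallel to the left side when theta == 0.
    nx, ny = math.cos(theta), -math.sin(theta)
    # Perpendicular offset of every centroid, computed once.
    offsets = [(x * nx + y * ny, w) for x, y, w in centroids]
    profile = []
    n_steps = int((stop - start) / step) + 1
    for i in range(n_steps):
        position = start + i * step
        # Only centroids inside the band on either side of the line count.
        total = sum(w for d, w in offsets if abs(d - position) <= half_width)
        profile.append((position, total))
    return profile
```

Because each feature contributes at a single point, a column whose centroids all fall inside the band at the same step produces a sharp peak instead of a smeared one.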
Centroid integration is repeated with two additional search lines starting at a slight variance angle (+δ and −δ) to the original search line and swept across the image.
The slope of the first peak on each of the three search lines is determined. The first peak represents the first column, and the steeper the slope on that peak, the closer that search line was to being parallel to the column of the grid. If the difference between the slopes of the three lines is above a predetermined threshold (i.e., outside a tolerance limit), the line with the steepest first peak slope is identified. If it is the middle line, the process is iterated using a smaller variance angle ±δ. If it is not the middle line, the base angle θ is reset to match the angle of the line with the steepest slope, and the process is iterated using the same variance angle ±δ around the new base angle θ. The entire process is iterated until the difference is within the tolerance limit. The angle of the resulting line with the steepest first peak slope will match the angle of the first column of the grid (within the predetermined tolerance limit).
This process is repeated for each peak. For example, the second peak will correspond to the second column, so the search line with the steepest slope on the second peak is the closest to the angle of the second column of the grid. Repeating centroid integration using sweep lines at a base angle θ and variance angles (±δ) to find the slope of the second peak thus defines the angle of the second column.
After the first peak, it is not necessary to start the sweep lines at the side of the image; each sweep line can be started from the position of the prior column. In addition, the angle of each column will probably be reasonably close to the angle of the prior column, so the number of sweeps usually can be minimized by starting with a base angle θ matching the angle of the prior column.
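The iteration just described might be sketched as follows. The sketch is illustrative only: `slope_at` is a hypothetical callable that performs centroid integration at a given angle and returns the slope of the peak of interest, halving δ is merely one way of choosing "a smaller variance angle", and `max_iter` is an added safeguard not mentioned in the text.

```python
from typing import Callable


def refine_angle(
    slope_at: Callable[[float], float],
    theta: float,
    delta: float,
    tolerance: float,
    max_iter: int = 100,
) -> float:
    """Iterate base angle theta and variance angle delta until slopes agree."""
    for _ in range(max_iter):
        angles = (theta - delta, theta, theta + delta)
        slopes = [slope_at(a) for a in angles]
        if max(slopes) - min(slopes) <= tolerance:
            break  # the three lines agree within the tolerance limit
        best = slopes.index(max(slopes))
        if best == 1:
            delta /= 2.0          # middle line is steepest: narrow the search
        else:
            theta = angles[best]  # re-centre on the steepest outer line
    # Report the angle of the line with the steepest peak slope.
    return angles[slopes.index(max(slopes))]
```

For example, with a synthetic slope function that peaks at 0.05 radians, such as `lambda a: 10.0 - 100.0 * abs(a - 0.05)`, the returned angle lies close to 0.05.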
Once the position and orientation of each of the columns is identified, the next step is identifying the rows generally orthogonal to the columns just identified. The rows can be identified using substantially the same process as used to identify the columns, but starting from a side of the image adjacent to the side used to find the columns. The intersections of the best fit columns and the best fit rows are determined, and used to define the nominal points in the grid.
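As an illustration, the nominal grid points can be computed by intersecting every best fit column with every best fit row. The sketch below assumes each best fit line is straight and is represented by a point on the line plus a direction angle in radians; this representation and the function names are assumptions made for the example.

```python
import math
from typing import List, Sequence, Tuple

Line = Tuple[float, float, float]  # (x0, y0, direction angle in radians)
Point = Tuple[float, float]


def intersect(a: Line, b: Line) -> Point:
    """Return the intersection of two non-parallel lines."""
    ax, ay, a_theta = a
    bx, by, b_theta = b
    dax, day = math.cos(a_theta), math.sin(a_theta)
    dbx, dby = math.cos(b_theta), math.sin(b_theta)
    denom = dax * dby - day * dbx  # cross product of the two directions
    if abs(denom) < 1e-12:
        raise ValueError("lines are parallel")
    # Parameter along line a at which it meets line b.
    t = ((bx - ax) * dby - (by - ay) * dbx) / denom
    return ax + t * dax, ay + t * day


def nominal_grid(columns: Sequence[Line], rows: Sequence[Line]) -> List[List[Point]]:
    """One nominal grid point per (row, column) pair, indexed [row][column]."""
    return [[intersect(col, row) for col in columns] for row in rows]
```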
The grid of nominal points preferably is “flexed” to define a final grid. To do this, a local search is performed at each nominal grid point for the local centroid, that is, the centroid nearest in position to the nominal grid point. The position of the local centroid is defined as the final grid point.
In some situations, only portions of a single object on a sample will appear in an image, with the result that multiple features are identified instead of a single feature. This might happen, for example, due to the characteristics of the object on the sample, or due to the image capture technique. An alternate aspect of the invention therefore is to find a centroid of the centroids within a predetermined distance of the nominal grid point, and to define this centroid of centroids as the final grid point. As will be apparent, if a single object is represented by multiple features, this will resolve the multiple features into an effective single feature, while if a single object is represented by a single feature, the centroid of centroids will be in the same position as the centroid of the single feature, so this will have no significant effect.
If no centroid or centroid of centroids is found within some predetermined maximum distance of a nominal grid point, it normally will be assumed that the feature (and centroid) expected at that point is missing, and the nominal grid point is retained as the final grid point.
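The flexing step, including the centroid-of-centroids alternative and the fallback for missing features, might be sketched as follows. The sketch assumes a single `max_distance` serves as both the search radius and the merge radius, and that the centroid of centroids is an unweighted mean; the name `flex_grid` and the `merge` flag are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def flex_grid(
    nominal: Sequence[Point],
    centroids: Sequence[Point],
    max_distance: float,
    merge: bool = False,
) -> List[Point]:
    """Move each nominal grid point onto the local centroid, if one exists."""
    final = []
    for gx, gy in nominal:
        near = [
            (cx, cy)
            for cx, cy in centroids
            if math.hypot(cx - gx, cy - gy) <= max_distance
        ]
        if not near:
            # Feature assumed missing: retain the nominal grid point.
            final.append((gx, gy))
        elif merge:
            # Centroid of centroids (unweighted mean of nearby centroids).
            final.append(
                (sum(p[0] for p in near) / len(near),
                 sum(p[1] for p in near) / len(near))
            )
        else:
            # Nearest single centroid.
            final.append(
                min(near, key=lambda p: math.hypot(p[0] - gx, p[1] - gy))
            )
    return final
```

Weighting the mean by area or integrated intensity would be a natural variation where fragments of one object differ greatly in size.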
The locations of all of the final grid points in the flexed grid provide an addressable array of points for the elements in the grid.
According to a second embodiment of the invention, the corners of a grid array are identified first. To do this, a search line is started at a corner of the image, oriented with its normal bisector pointing towards the center of the image. The search line is moved towards the center of the image. The first centroid having an area and/or integrated intensity above a certain threshold limit which is encountered by the line is identified as the corner of the grid for that line. The remaining corners of the grid are identified in similar fashion with lines starting at the other corners of the image. The nominal outline of the grid array is defined by fitting a line between the four identified corners.
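For illustration, moving a search line inward from an image corner is equivalent to finding the qualifying centroid with the smallest projection onto the direction from that corner to the image center. The sketch below relies on that equivalence and thresholds on integrated intensity only; the function name and parameters are assumptions.

```python
import math
from typing import List, Sequence, Tuple

Centroid = Tuple[float, float, float]  # (x, y, integrated intensity)
Point = Tuple[float, float]


def find_grid_corners(
    centroids: Sequence[Centroid],
    width: float,
    height: float,
    min_intensity: float,
) -> List[Point]:
    """Return the first strong centroid met by a line from each image corner."""
    strong = [c for c in centroids if c[2] >= min_intensity]
    if not strong:
        raise ValueError("no centroid exceeds the threshold")
    corners = []
    for cx, cy in ((0.0, 0.0), (width, 0.0), (width, height), (0.0, height)):
        # Direction of travel: from this image corner towards the center.
        nx, ny = width / 2 - cx, height / 2 - cy
        norm = math.hypot(nx, ny)

        def travel(c, cx=cx, cy=cy, nx=nx, ny=ny, norm=norm):
            # Distance the search line has moved when it reaches centroid c.
            return ((c[0] - cx) * nx + (c[1] - cy) * ny) / norm

        first = min(strong, key=travel)
        corners.append((first[0], first[1]))
    return corners
```

The corners are returned in the order top-left, top-right, bottom-right, bottom-left of the image, so joining consecutive entries gives the nominal outline.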
Next, a search is conducted along each of the lines in the nominal outline to identify additional centroids forming the outline. Preferably, this identification includes centroids within a predetermined distance of the nominal outline, to allow for a small amount of variability in the grid vertex positions.
If there are no features outside the nominal box plus a margin, then the grid is assumed to be (a) generally rectilinear. A search line is set up parallel to one of the lines forming the nominal outline of the grid. This search line is moved across the image in a series of steps to perform centroid integration, and identify the columns in the grid. A similar search line is set up parallel to an adjacent side of the nominal outline of the grid, and moved across the image to identify the rows in the grid.
If there are features outside one side of the nominal outline of the array, then the grid is assumed to be (b) curved toward that side. In the case of a grid which is distorted by curvature along one axis, a procedure similar to that for a rectilinear grid can be used to identify the grid points, but the procedure must take the curve into account. This is done by using a curved search line which matches the curvature of the grid. Any suitable method of identifying the curve can be used, and, once the curve is identified, the lines are moved in essentially the same fashion as with the rectilinear case (a) above.
In either the rectilinear or curved case, the intersections of the resulting best fit column and rows are used to define the nominal grid points. The grid is flexed to identify the local centroids and final grid positions.
The second embodiment generally requires the entire grid to be in the image, since it starts by finding the corners of the grid. In many situations, e.g., quality control of a large, moving web in production, the grid is not entirely within the image, but the orientation of the grid in the image is known, e.g., because the placement of the field of view relative to the web is known. If so, the grid spacing and position can be identified by using a variation of the second embodiment. In this variation, no attempt is made to identify the corners of the grid, since that is not possible. Instead of using search lines oriented parallel to the sides of the nominal outline defined by the corners of the grid, the same process as in the second embodiment is undertaken using search lines oriented roughly parallel to the expected columns and rows.
The third embodiment of the present invention can handle the most difficult situations, where the grid may be distorted in multiple directions. In this embodiment, the corners of the grid are found and a nominal outline for the grid first defined as in the second embodiment. The second embodiment assumes that the sides are generally a straight line, or at most curved in a single direction, but the third embodiment makes no such assumptions.
According to the third embodiment, after establishing the nominal outline, a search is conducted within a window on either side of one of the sides of the outline to find the centroids near that line. The distances between consecutive centroids along the line are calculated, and the histogram of distances is determined. The first large peak in the histogram corresponds to the nominal spacing along that edge.
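One possible reading of the spacing estimate is sketched below. It assumes the centroids near the side have already been collected, orders them by projection onto the side's direction, and treats the "first large peak" as the lowest-distance histogram bin holding at least a given fraction of the tallest bin's count. That peak criterion, the bin width, and the names used are assumptions.

```python
import math
from collections import Counter
from typing import Sequence, Tuple

Point = Tuple[float, float]


def nominal_spacing(
    points: Sequence[Point],
    direction: Point,
    bin_width: float,
    min_fraction: float = 0.5,
) -> float:
    """Estimate nominal spacing from gaps between consecutive edge centroids."""
    dx, dy = direction
    # Order the centroids along the side of the nominal outline.
    ordered = sorted(points, key=lambda p: p[0] * dx + p[1] * dy)
    gaps = [math.dist(a, b) for a, b in zip(ordered, ordered[1:])]
    bins = [int(g // bin_width) for g in gaps]
    histogram = Counter(bins)
    # "Large" here means at least min_fraction of the tallest bin.
    threshold = min_fraction * max(histogram.values())
    first_large = min(b for b, count in histogram.items() if count >= threshold)
    in_peak = [g for g, b in zip(gaps, bins) if b == first_large]
    return sum(in_peak) / len(in_peak)
```

Gaps of roughly twice the nominal spacing, caused by a missing feature, fall in a later bin and so do not affect the estimate.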
The centroids making up the side (the first row or column) then are determined by an iterative process. Starting at one corner along the side, a search is made generally in the direction of the nominal side and a distance corresponding to the nominal spacing along that side. The closest centroid is found and identified as the next edge centroid. A new search then is made starting from the just-located centroid, generally in the direction of the line between the prior centroid and the just-located centroid and out a distance corresponding to the distance between the prior centroid point and the just-located centroid. This process iterates until the centroid at the other end of the side is reached.
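The iterative walk along a side might be sketched as follows. Each step predicts the next position from the previous step vector and accepts the closest unused centroid. The sketch assumes that no edge centroid is missing and that the end corner is itself among the centroids; `walk_edge` is a hypothetical name.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def walk_edge(
    centroids: Sequence[Point], start: Point, end: Point, spacing: float
) -> List[Point]:
    """Follow a possibly curved side from one corner centroid to the other."""
    remaining = [c for c in centroids if c != start]
    length = math.dist(start, end)
    # First step: along the nominal side, one nominal spacing long.
    step = (
        (end[0] - start[0]) / length * spacing,
        (end[1] - start[1]) / length * spacing,
    )
    edge = [start]
    current = start
    while current != end and remaining:
        target = (current[0] + step[0], current[1] + step[1])
        nxt = min(remaining, key=lambda c: math.dist(c, target))
        remaining.remove(nxt)
        # Next search uses the direction and distance of the step just taken.
        step = (nxt[0] - current[0], nxt[1] - current[1])
        edge.append(nxt)
        current = nxt
    return edge
```

Because the step vector is updated at every centroid, the walk can follow a side that curves gradually away from the straight nominal outline.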
Having found the centroids forming the side, a best fit curve then is determined for those centroids. This curve is moved across the image to conduct centroid integration, and locate the centroids corresponding to the next row or column in the grid. If necessary, the centroids in this next row or column may be determined using the same process as described for the first row or column. However, it may be sufficient to simply generate a best fit curve of the centroids forming the centroid integration peak for the row or column. In either case, the new curve is used to find the next row or column.
The process is repeated in the other direction, to find the columns or rows, and the grid is flexed, to find the actual grid centroids.
A fourth embodiment of the invention is useful for grids with other than four sides. In this embodiment, a first search line is moved in from the periphery of the image until it intersects a centroid. This centroid is initially assumed to be a corner of the grid. The search line then is pivoted around this corner centroid to perform centroid integration. The first peak found in the centroid integration defines the side ending at that corner. A search along that side identifies the centroid at the next corner. A new search line then is pivoted about the next corner to find the third corner. The process is repeated at each subsequent corner to find each side and corner until the original centroid is reached again.
It may be that a centroid identified by this technique is not actually a corner of the grid, but merely a centroid in a side significantly bowed outwards. One remedy to this situation is to assume that any corner for which the angle between the adjacent sides is greater than some pre-determined number, e.g., 65° for a hexagon which should have a nominal 60° corner angle, is not correctly identified as a corner. This remedy is useful if one knows the nominal shape of the grid. Another remedy requiring less prior information is to determine the angles formed at each corner, and graph them on a histogram. The peak in the histogram representing the smallest angle then is assumed to be the actual angle of the sides of the grid, and angles at least some amount (e.g., a few degrees or two standard deviations of the angles making up the largest angle peak) different than that are assumed to result from centroids which were initially, but erroneously, identified as corners. In either case, a nominal outline then is defined using only the correctly identified corners, and the grid points found through centroid integration as with any of the prior embodiments.
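The first remedy might be sketched as follows. The angle convention is an assumption: the sketch measures the interior angle, in degrees, between the two segments joining a candidate corner to its neighbours in the outline, so that a square has a nominal 90° corner and a centroid on a bowed side gives an angle approaching 180°. The threshold must be chosen to suit whichever convention is used. The function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def interior_angle(prev: Point, corner: Point, nxt: Point) -> float:
    """Angle in degrees between the segments corner->prev and corner->nxt."""
    a1 = math.atan2(prev[1] - corner[1], prev[0] - corner[0])
    a2 = math.atan2(nxt[1] - corner[1], nxt[0] - corner[0])
    angle = abs(math.degrees(a1 - a2)) % 360.0
    return min(angle, 360.0 - angle)


def filter_corners(candidates: Sequence[Point], max_angle: float) -> List[Point]:
    """Drop candidate corners whose angle exceeds max_angle degrees."""
    n = len(candidates)
    return [
        c
        for i, c in enumerate(candidates)
        if interior_angle(candidates[i - 1], c, candidates[(i + 1) % n]) <= max_angle
    ]
```

The candidates are expected in the order in which they were found around the outline, so that each entry's neighbours in the list are its neighbours on the outline.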
Note that very little initial information is required about the number or spacing of the rows and columns in the grid, or about their orientation or position within the field of view. In the case of the first embodiment, all that is needed is to know that the grid is generally rectilinear. In the case of the second embodiment, all that is needed is to know that the grid is significantly curved in at most one direction and either that the entire grid is in the image, or that the orientation of the grid relative to the image is known. In the case of the third embodiment, all that is needed is to know that the image contains something that approximates a grid, even if heavily distorted. The fourth embodiment does not even require knowing how many sides the grid has. The method of the invention will identify all other information directly.
This low need for initial information has the distinct advantage of minimizing the care that need be taken in placing a grid in the field of view prior to imaging. For example, in the case of a grid used for combinatorial chemistry assays, the grid can simply be placed in the field of view for imaging, with no particular orientation or placement within the field required. Since accurate placement and orientation is time consuming, this can significantly accelerate review of large numbers of samples.
The ability to handle even highly distorted grids allows for processing of materials with highly variable shapes, e.g., because the substrate they are made on is distorted during manufacture or use.
The foregoing analysis can be made from a single image of the field of view, without having to re-focus the system on each pixel. Further reductions in processing time can be achieved by using the information about the grid positions thus obtained to analyze just the significant locations in the field of view, not every pixel in the image. This can be done by re-adjusting the system to study just the pixels surrounding each centroid. Alternatively, the system can be re-adjusted so that the field of view substantially matches the feature around each centroid, and each feature can be analyzed in a single step. The system is re-adjusted from one feature to the next to analyze all of the features.
Typically, the number of features in a grid (e.g., 96 in a DNA grid array) is significantly lower than the total number of pixels in the image (e.g., 262,144 in a 512×512 pixel image). Even if a moderate number of pixels (say 30 to 50) around each centroid must be analyzed, this dramatically reduces the total number of measurements (262,144 down to 2,880 to 4,800) which must be made and analyzed. If the system is re-adjusted to match each feature, the total number of measurements needed is even lower, matching the number of features at 96. In either case, total processing time can be reduced markedly from the time required to measure every pixel in the image.
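The arithmetic behind these figures can be checked with a short calculation. The function name and dictionary keys below are illustrative only.

```python
from typing import Dict


def measurement_counts(
    width: int, height: int, features: int, pixels_per_feature: int
) -> Dict[str, int]:
    """Number of measurements under each of the three scanning strategies."""
    return {
        "every_pixel": width * height,                      # scan whole image
        "around_centroids": features * pixels_per_feature,  # pixels near centroids
        "per_feature": features,                            # one step per feature
    }


low = measurement_counts(512, 512, 96, 30)
high = measurement_counts(512, 512, 96, 50)
print(low["every_pixel"], low["around_centroids"], high["around_centroids"], low["per_feature"])
# prints: 262144 2880 4800 96
```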