1. Field of the Invention
The present invention relates generally to analysis of digitized images of arrays.
2. Discussion of the Related Art
Array technology has dramatically expanded the ability of researchers to perform a variety of biotechnological operations. It has proven especially valuable in applications for DNA genotyping and gene expression. By supporting simultaneous examination of up to thousands of samples, the method enables researchers to gain a better overall picture than what may be obtained from a traditional “one gene in one experiment” approach. Array technology also reduces the time and the volume of materials needed for an experiment. Where previously a researcher might screen an array of unknowns with a defined probe, array technology facilitates a reverse approach. Defined DNA fragments can be affixed to known positions in an array and probed with an unknown sample.
Often, a distinction is made in array technology between microarrays and macroarrays. Objects on a microarray typically have diameters of less than 200 microns, while those on a macroarray are usually larger than 300 microns. Arrays may use glass slides, micromirrors, or semiconductor chips as substrate materials. The shape, alignment, and density of objects on an array may be controlled by a variety of methods. Robotic systems that precisely deposit small quantities of solution in an ordered configuration on the substrate material are common. Photolithographic techniques, similar to those used in the semiconductor electronics industry, are also used. Masks define activation sites where chemical coupling may occur. Repeating this process to build upon previously exposed sites allows for the formation of various gene sequences at different positions throughout the array. Alternatively, a solution of biomaterial may be spread across the substrate while specific sites on the chip are electrically activated to induce chemical bonding at those locations. Often a high density array is partitioned into sub-arrays to distinguish various aspects of an experiment, to separate different tests, or due to mechanical limitations in the means by which samples are applied to the substrate.
In a typical experiment, the array is exposed to a chemical sample to induce hybridization between array object and sample compounds. A variety of detection strategies may be used to locate the specific sites on the array where hybridization has occurred. These include use of fluorescent dyes, autoradiography, radiolabels, bioelectronic detection (observance of electron transfer reactions between samples and the substrate), and laser desorption mass spectrometry. Signals emitted from objects via detection media provide information about the reactions. Signals may be measured for absolute intensities or, in the case where different colored dyes are used to detect the degree of presence of different compounds, for ratios of intensities within specific frequencies.
Measurement and analysis of signals emitted from an array of objects usually involves examination of an image of the array. The labeled samples are excited and a detector system captures an image of the emitted energy. Accuracy of subsequent signal analysis is heavily dependent on the parameters of the detector system and its ability to reproduce faithfully the image of the array. Detector pixels need to be sufficiently small so that representation of each object includes enough pixels to support statistical analysis of the intensity of the object.
In addition to issues with image accuracy, signal measurement and analysis may be further complicated by the introduction of noise onto either the array or the image. Noise may arise from variations in sources used to excite labeled samples, fluorescence scattered by adjacent samples (“blooming” effect), dust and other impurities on the array, and other sources. Signal strength may be limited by the size of the objects, irregularities in preparing sites on the array, shortfalls of sample compounds deposited at certain locations, uneven application of detection media, and other causes. Fluctuations in the intensity of energy emitted by the background of the image can cloud the distinction between signal and noise objects and add to the difficulty of image analysis.
Image analysis is often aided by use of a grid overlay to assist the researcher in locating signal objects and identifying noise. However, alignment of the grid overlay often must be adjusted for the particular array, especially where a standard grid overlay must be modified to create sub-grids that correspond to underlying sub-arrays. When done manually, grid alignment can take days. Additionally, errors in human judgment in performing this process can substantially diminish the likelihood that other researchers will be able to reproduce the results of an experiment.
Digital image processing software has helped to mitigate some of the concerns with matching the grid overlay to the array. Orientation markers included on array substrates and captured in the image can be used to align grid overlays on the image. A typical algorithm to distinguish sub-grids detects signals from objects, determines the “center of mass” for each object, identifies corner coordinates for each sub-array, and aligns sub-grids accordingly.
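The “center of mass” step in such an algorithm can be illustrated with a minimal sketch. It assumes the image is a list of rows of non-negative intensities and that the pixels belonging to one object have already been identified; the function name and data layout are hypothetical and are not taken from any particular prior art system.

```python
def center_of_mass(image, pixels):
    """Return the intensity-weighted centroid (row, col) of one object.

    image  -- list of rows of non-negative intensities (assumed layout)
    pixels -- iterable of (row, col) coordinates belonging to the object
    """
    pixels = list(pixels)
    total = sum(image[r][c] for r, c in pixels)
    if total == 0:
        raise ValueError("object has zero total intensity")
    # Each coordinate is weighted by the intensity found at that pixel.
    row = sum(r * image[r][c] for r, c in pixels) / total
    col = sum(c * image[r][c] for r, c in pixels) / total
    return row, col
```

For an object of two equally bright pixels at (1, 1) and (1, 2), the centroid falls midway between them at (1.0, 1.5).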
Measurement of background intensity usually involves distinguishing background pixels from object pixels. Often it is assumed that those pixels nearest the edge of the image represent the background of the image. However, this is not always the case, particularly for high density arrays. Where an image consists of relatively low intensity objects against a high intensity background, a histogram of the image will likely be bimodal in character and provide useful information for distinguishing background pixels from object pixels. With an accurate characterization of background intensity, objects on the image may be identified. Often, objects are defined simply as those pixels with an intensity value greater than the mean of the distribution of background pixel intensities plus a multiple of the standard deviation of that distribution.
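The thresholding rule described above (mean of the background distribution plus a multiple of its standard deviation) can be sketched as follows. The multiplier k, its default of 3.0, and the function name are illustrative assumptions; the text does not specify a particular multiple.

```python
from statistics import mean, pstdev


def object_mask(image, background_values, k=3.0):
    """Mark pixels brighter than mean + k * std of the background as objects.

    image             -- list of rows of intensities (assumed layout)
    background_values -- intensities of pixels already judged to be background
    k                 -- multiple of the standard deviation (assumed value)
    """
    threshold = mean(background_values) + k * pstdev(background_values)
    # True marks an object pixel, False marks a background pixel.
    return [[value > threshold for value in row] for row in image]
```

With background samples 10, 12, 8, 10 the mean is 10 and the population standard deviation is about 1.41, so with k = 3 a pixel must exceed roughly 14.2 to be counted as part of an object.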
Each of these methods for processing the digital image of an array has benefits and disadvantages. While the processes described in the current art have admirably employed statistical tools in the course of their analyses, they have not included other mathematical methods, such as evaluations of Fourier transform spectra, in their overall schemes. Also, current methods have taken a “background first” approach in which objects are defined in terms of “not being background”. On the whole, development within the art has tended to focus on identifying and correcting individual impediments to accurate image analysis. What has been needed is a comprehensive scheme to integrate a variety of approaches into a single optimal method that can be used with any array regardless of object size or density, corrects the orientation of the array on the image, flattens fluctuations in background intensity, removes noise, and adjusts its efficacy to variations in background and noise intensity.
The present invention provides a method, system, and product for analyzing a digitized image of an array to create an image of a grid overlay. It is designed to function on images of arrays of various sizes and densities and with different levels of noise intensity. In general, the invention locates the center of each object on the array, determines a standard shape and size for the objects, and creates a final image of the grid overlay. In a preferred embodiment, preliminary procedures normalize the image so that optimal results can be obtained, sub-grids are identified and objects are repositioned within their corresponding sub-grids, and noise is removed by filtering processes based on object size, intensity, and location. Unlike prior art approaches, the present invention takes an “objects first” approach by initially identifying objects on the image and then defining the background in terms of “not being objects”.
Accurately ascertaining the location, size, and shape of objects on an array is critical for precise measurement of array data. The present invention advances the state of the art by eliminating the need for orientation markers, by incorporating full array and frequency analysis operations into the process in addition to local measurement techniques, and by reducing the requirement for manual intervention and the errors it introduces. The present invention enables objects on an array to be characterized more quickly, more consistently, and more accurately than the prior art.
In a preferred embodiment, preliminary procedures normalize the image so that optimal results can be obtained. First, a determination is made as to whether the image is negative or positive. A histogram is prepared of the original image and a third order moment of the histogram is calculated. If the third order moment is negative, the image is converted to positive. After establishing a positive image, the background of the image is corrected to reduce fluctuations in its intensity. Object pixels are distinguished from background pixels and the values of the background pixels are set equal to a common value. On a background-corrected image, the orientation of the array is adjusted so that the array aligns with the borders of the image. A desired angle of orientation between the rows and columns of the array and the boundaries of the image is selected. A fast Fourier transform spectrum is created for the image. The maximum frequency is determined from the fast Fourier transform spectrum and used to calculate the actual angle of orientation between the rows and columns of the array and the boundaries of the image. Thereafter, the image is rotated to the selected desired angle of orientation. This same fast Fourier transform spectrum can be used to measure the level of noise on the image. (Therefore, these two procedures can be performed in tandem.) In measuring the level of noise on the image, an average of the pixel values is determined for a middle portion of one of the four quadrants of the fast Fourier transform spectrum. With this average a threshold noise level is set for use in subsequent filtering processes.
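The negative/positive determination can be illustrated with a minimal sketch. It assumes an 8-bit image stored as a list of rows and computes the third central moment directly from the pixel values, which is equivalent to computing it from the histogram of those values. The function names and the inversion by subtraction from a maximum value of 255 are assumptions made for illustration only.

```python
def third_central_moment(image):
    """Third central moment of the pixel intensity distribution."""
    pixels = [value for row in image for value in row]
    avg = sum(pixels) / len(pixels)
    return sum((value - avg) ** 3 for value in pixels) / len(pixels)


def ensure_positive(image, max_value=255):
    """Return a positive image (bright objects on a dark background).

    A negative third moment means the intensity distribution has its long
    tail toward dark values, i.e. a few dark objects on a bright background,
    so the image is inverted. max_value is an assumed 8-bit ceiling.
    """
    if third_central_moment(image) < 0:
        return [[max_value - value for value in row] for row in image]
    return [row[:] for row in image]
```

An image that is mostly bright with one dark spot yields a negative moment and is inverted, while an image that is mostly dark with one bright spot is returned unchanged.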
These preliminary procedures create an intermediate image that will optimize the results of subsequent filtering processes. Because the present invention is designed to function on images of arrays of various sizes and densities, filter size parameters, based on the size of objects on the array, are calculated. The image is lowpass filtered in one dimension to create an image of linear bands. A line of pixels is formed substantially perpendicular to these linear bands. A profile is developed of the line of pixels and a fast Fourier transform spectrum is prepared for the profile. The maximum frequency is determined from the fast Fourier transform spectrum and used to determine a filter size parameter appropriate for the size of the objects on the array. The same process is repeated for the other dimension. In conjunction with the procedure for calculating filter size parameters, boundaries for the array are defined to prevent the inclusion of background pixels located on the outer edges of the image in subsequent filtering processes.
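The idea of deriving a size parameter from the spectrum of a profile can be sketched as follows. Several assumptions are made here: “maximum frequency” is read as the frequency whose spectral magnitude is largest, a naive discrete Fourier transform stands in for a fast Fourier transform to keep the example self-contained, and the period in pixels is returned as a stand-in for the filter size parameter, whose exact derivation is not given above. The function name is hypothetical.

```python
import cmath


def dominant_period(profile):
    """Return the period, in pixels, of the strongest periodic component.

    profile -- intensities sampled along a line crossing the bands
    """
    n = len(profile)
    avg = sum(profile) / n
    centered = [value - avg for value in profile]  # remove the DC term
    best_k, best_mag = 0, 0.0
    # Naive DFT over the positive frequencies (O(n^2); fine for a sketch).
    for k in range(1, n // 2 + 1):
        coeff = sum(
            centered[t] * cmath.exp(-2j * cmath.pi * k * t / n)
            for t in range(n)
        )
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    if best_k == 0:
        raise ValueError("profile has no periodic component")
    return n / best_k
```

For a profile of 64 samples in which bands repeat every 8 pixels, the strongest component lies at frequency index 8 and the function returns a period of 8.0 pixels, which approximates the center-to-center spacing of objects along that dimension.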
Once preliminary procedures have been completed, single pixel centers of objects are located through a series of filtering processes that also serves to remove noise based on object size and intensity. Using these single pixel centers, horizontal and vertical polynomial segments are created that approximate the lines that pass through the single pixel centers. Polynomial segments are used in recognition of the possibility that the single pixel centers may not lie on the same line. Therefore, defining the positions of the single pixel centers as points on a polynomial function increases the accuracy of the location of objects on the final grid overlay. Additionally, the spacing between polynomial segments is used to identify sub-grids within the array. Once sub-grids are identified, first, single pixel centers are aligned and, then, objects are repositioned with respect to their corresponding sub-grids. By comparing single pixel centers with objects, a final effort is made to remove noise based on object location on the image. Using the remaining objects, a standard shape and size is determined for objects and the final image of the grid overlay is created in which each object has said standard shape and size and is positioned in relation to its corresponding said center.
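Fitting a polynomial segment through a row of single pixel centers can be illustrated with an ordinary least-squares fit. This is a minimal sketch under stated assumptions: the degree of 2, the function name, and the use of normal equations solved by Gaussian elimination are choices made for illustration, and the text above does not specify the fitting procedure.

```python
def fit_polynomial(points, degree=2):
    """Least-squares coefficients [c0, c1, ...] of y = c0 + c1*x + ...

    points -- (x, y) positions of single pixel centers along one row;
              at least degree + 1 distinct x values are required
    """
    n = degree + 1
    if len(points) < n:
        raise ValueError("not enough points for the requested degree")
    # Build the normal equations A c = b.
    a = [[sum(x ** (i + j) for x, _ in points) for j in range(n)]
         for i in range(n)]
    b = [sum(y * x ** i for x, y in points) for i in range(n)]
    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs
```

Given centers that drift slightly along a row, the fitted curve supplies a row position for every column, so each object can be placed on the grid overlay even when the centers do not lie on a single straight line.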