Microarrays of biomolecules, such as DNA, RNA, cDNA, polynucleotides, oligonucleotides (“oligomers”), proteins, and the like, are state-of-the-art biological tools used in the investigation and evaluation of biological processes, including gene expression, for analytical, diagnostic, and therapeutic purposes. Microarrays typically comprise a plurality of polymers, e.g., oligomers, synthesized in situ or presynthesized and deposited on a substrate in an array pattern. Microarrays of oligomers manufactured by solid-phase DNA synthesis can have oligomer densities approaching 10⁶/micron². As used herein, the support-bound oligomers are called “probes”, which function to bind or hybridize with a sample of DNA or RNA material under test, called a “target” in hybridization experiments. However, some investigators also use the reverse definitions, referring to the surface-bound oligonucleotides as targets and the solution sample of nucleic acids as probes. Further, some investigators bind the target sample under test to the microarray substrate and put the oligomer probes in solution for hybridization. Either the “target” or the “probes” may be the one that is to be evaluated by the other (thus, either one could be an unknown mixture of polynucleotides to be evaluated by binding with the other). All of these variations are within the scope of this discussion. For the purpose of simplicity only, herein the probe is the surface-bound oligonucleotide of known sequence and the target is the moiety in a mobile phase (typically fluid), to be detected by the surface-bound probes. The plurality of probes and/or targets in each location in the array is known in the art as a “nucleic acid feature” or “feature”. A feature is defined as a locus onto which a large number of probes and/or targets all having the same nucleotide sequence are immobilized.
In use, the array surface is contacted with one or more targets under conditions that promote specific, high-affinity binding (i.e., hybridization) of the target to one or more of the probes. The target nucleic acids will hybridize with complementary nucleic acids of the known oligonucleotide probe sequences, and thus information about the target samples can be obtained. The targets are typically labeled with an optically detectable label, such as a fluorescent tag or fluorophore, so that the targets are detectable with scanning equipment after a hybridization assay. The targets can be labeled prior to, during, or even after the hybridization protocol, depending on the labeling system chosen, such that the fluorophore will associate only with probe-bound hybridized targets.
After hybridization of the targets with the probe features, the array is analyzed by well-known methods. Hybridized arrays are often interrogated using optical methods, such as with a scanning fluorometer. A focused light source (usually a laser) is scanned across the hybridized array causing the hybridized areas to emit an optical signal, such as fluorescence. The fluorophore-specific fluorescence data is collected and measured during the scanning operation, and then an image of the array is reconstructed via appropriate algorithms, software and computer hardware. The expected or intended locations of probe nucleic acid features can then be combined with the fluorescence intensities measured at those locations, to yield the data that is then used to determine gene expression levels or nucleic acid sequence of the target samples. The process of collecting data from expected probe locations is referred to as “feature extraction”. The conventional equipment and methods of feature extraction are limited by their dependence upon the expected or intended location of the probe features on the substrate array, which is subject to the accuracy of the microarray manufacturing equipment.
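The dependence of conventional feature extraction on the expected feature location can be illustrated with a minimal Python sketch. It is not the method of any particular scanner or analysis package. The function names, the `(cx, cy, radius, intensity)` spot format, the flat circular spot model, and all numeric values are assumptions chosen only for illustration.

```python
def make_scan(width, height, spots, background=10.0):
    """Build a synthetic scan: flat background plus flat circular spots.

    `spots` is a list of (cx, cy, radius, intensity) tuples (hypothetical format).
    """
    image = [[background] * width for _ in range(height)]
    for cx, cy, radius, intensity in spots:
        for y in range(height):
            for x in range(width):
                if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                    image[y][x] = intensity
    return image


def extract_feature(image, cx, cy, radius):
    """Mean intensity of the pixels within `radius` of the expected center."""
    total, count = 0.0, 0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                total += value
                count += 1
    return total / count if count else 0.0


# Expected (design) location of one feature: center (10, 10), radius 3 pixels.
well_placed = make_scan(24, 24, [(10, 10, 3, 100.0)])
misplaced = make_scan(24, 24, [(13, 10, 3, 100.0)])  # deposited 3 pixels off

print(extract_feature(well_placed, 10, 10, 3))  # region covers only the spot
print(extract_feature(misplaced, 10, 10, 3))    # region mixes spot and background
```

In this toy model the extraction region is fixed by the design coordinates, so a feature deposited a few pixels away from its intended position is sampled together with background and its measured intensity falls.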
Depending on the make-up of the target sample, hybridization of probe features may or may not occur at all probe feature locations and will occur to varying degrees at the different probe feature locations. A general problem in the feature extraction process described above is the extraction of features having weak or low fluorescence intensities, called “dim features” (i.e., features that display poor intensity contrast relative to a background), due to little or no hybridization to the target sample at those locations.
One aspect of the feature extraction problem is the location of the dim feature. If the dim feature is not accurately located on the array substrate by the manufacturing process (i.e., the probe feature is misplaced or mislocated due to the manufacturing process), the computer will not count the dim feature and inaccurate data will result. Although it is conventional practice to provide fiduciary markings on the array substrate, for example, to which the manufacturing equipment aligns each manufacturing step, errors in the location of the features still occur. The fiduciary markings are also used during feature extraction. The optical scanning equipment aligns the light source with the array fiduciary markings and the computer aligns its predefined region for detection and analysis with the fiduciary markings on the substrate surface.
Another aspect of the feature extraction problem occurs when the probe feature that produces a weak signal after hybridization is misshapen for some reason and the computer cannot detect the irregular shape, which results in inaccurate assessment of the degree of hybridization of target to the probe feature. The common source of misshapen features is the manufacturing process. Common misshapen feature morphologies are annular features and football-shaped features. Other, more complex morphologies, such as crescents, and defects due to scratches on the substrate surface are also observed.
Still another aspect of the feature extraction problem occurs when there are variations in the diameter of the probe feature. Variations in the diameter of a feature may result from surface chemistry problems on the surface of the substrate, such as changes in hydrophobicity of the surface. A higher than expected surface hydrophobicity will result in the feature having a smaller footprint, since the feature tends to bead up more on the more hydrophobic substrate surface. Therefore, the feature might be located in the correct place, but have only one half to three quarters of the expected diameter (i.e., the error is greater than 10 percent of the diameter). When the computer samples the predefined region of interest, it collects non-probe feature data in addition to the feature signal. The feature signal is degraded by the additional data.
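The degradation caused by an undersized feature can be estimated with simple area arithmetic. The sketch below assumes a flat circular feature centered in a circular region of interest, uniform background, and a per-pixel signal that does not change when the feature shrinks. These are simplifying assumptions, and the function name and intensity values are hypothetical.

```python
def diluted_mean(signal, background, diameter_fraction):
    """Mean intensity over the predefined region when the feature covers only
    `diameter_fraction` of the expected diameter.

    Area scales with the square of the diameter, so the feature fills
    diameter_fraction**2 of the region and background fills the rest.
    """
    area_fraction = diameter_fraction ** 2
    return area_fraction * signal + (1.0 - area_fraction) * background


# Hypothetical intensities: feature pixels at 100, background pixels at 10.
for fraction in (1.0, 0.75, 0.5):
    covered = fraction ** 2
    print(fraction, covered, diluted_mean(100.0, 10.0, fraction))
```

Under these assumptions, a feature with three quarters of the expected diameter fills about 56 percent of the sampled region, and a feature with half the expected diameter fills only 25 percent, so most of the sampled pixels are background.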
The primary difficulty lies in the ability to determine with a level of certainty the actual position of the probe feature that gives rise to the weak signal to ensure its detection by the optical scanning equipment. A dim feature that is not located on the substrate consistently within the array pattern may be missed during the feature extraction process if the analysis equipment or the operator does not know the likely locations of inconsistently placed features. Therefore, this limitation in the conventional equipment and method yields less accurate results when analyzing the fluorescence data for the composition of the target sample.
As mentioned above, the density of probes on a microarray chip is ever increasing so that more genes can be analyzed at one time, which saves sample and reduces costs. Achieving smaller and more compact arrays will depend heavily on the manufacturing equipment and processing. It should be appreciated that as probe arrays for gene analysis become more densely packed, very small errors in probe placement more severely impact the accuracy of the analysis of the hybridization results.
The problem of locating inaccurately placed probe features that result in weak signals after hybridization becomes particularly difficult as feature size decreases, because the relative importance of location errors increases at the same time that the total number of pixels in the digital array image that contain relevant data is decreasing. Extracting signal data from microarray features requires various schemes of spot finding and detrending to compensate for both position defects during manufacturing and variations which occur during hybridization (bubbles, focus artifacts, surface variations, etc.) and scanning (auto focus, glass flatness, etc.).
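The growing relative importance of a fixed location error can be sketched with the standard formula for the overlap area of two equal circles. The sketch assumes a circular feature and a circular extraction region of the same radius, displaced from one another by a fixed number of pixels. The function names, radii, and offset are hypothetical values chosen for illustration.

```python
import math


def overlap_fraction(radius, offset):
    """Fraction of a circular extraction region that still covers a circular
    feature of the same radius whose center is displaced by `offset`."""
    if offset <= 0:
        return 1.0
    if offset >= 2 * radius:
        return 0.0
    # Area of the lens formed by two intersecting circles of equal radius.
    lens = (2 * radius ** 2 * math.acos(offset / (2 * radius))
            - (offset / 2) * math.sqrt(4 * radius ** 2 - offset ** 2))
    return lens / (math.pi * radius ** 2)


def pixels_in_feature(radius):
    """Approximate number of image pixels covered by the feature."""
    return math.pi * radius ** 2


# The same 2-pixel placement error applied to progressively smaller features.
for radius in (20, 10, 5):
    print(radius, round(pixels_in_feature(radius)),
          round(overlap_fraction(radius, 2), 3))
```

With the placement error held constant, the overlap fraction falls as the radius shrinks, while the number of pixels carrying feature data falls with the square of the radius.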
Methods to generally locate features on a substrate are disclosed in U.S. Pat. No. 5,721,435, issued to Troll and assigned to the assignee of the present invention, which is incorporated herein by this reference. The methods of Troll include a plurality of reference markings and test spots on an array, all of which produce signals when optically scanned that are detected and evaluated to determine the location of the test spots. The reference markings have optically unique signatures to distinguish them from the signals from the test spots. The reference markings are spaced apart at known distances and serve to provide a constant calibration for the scanning equipment. The reference markings are typically laser-etched or metal-plated alignment marks that are written to the substrate surface. This method of feature location is commonly referred to as “dead-reckoning” from a mixture of design parameters and physical landmarks.
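As a generic illustration of dead reckoning, and not a description of the method of Troll, the sketch below predicts where a feature should appear in a scanned image from two reference markings whose design coordinates are known. It assumes the scan differs from the design only by a scale, rotation, and translation; the function names and all coordinates are hypothetical.

```python
def fit_similarity(design_a, design_b, observed_a, observed_b):
    """Fit z' = a*z + b (scale, rotation, translation) from two reference marks.

    Points are (x, y) tuples; complex numbers carry the 2-D arithmetic.
    """
    da, db = complex(*design_a), complex(*design_b)
    oa, ob = complex(*observed_a), complex(*observed_b)
    if da == db:
        raise ValueError("reference marks must have distinct design positions")
    a = (ob - oa) / (db - da)  # combined scale and rotation
    b = oa - a * da            # translation
    return a, b


def predict(transform, design_xy):
    """Dead-reckoned image position of a feature from its design coordinates."""
    a, b = transform
    z = a * complex(*design_xy) + b
    return (z.real, z.imag)


# Marks designed at (0, 0) and (100, 0), found in the scan at (5, 5) and (5, 105).
transform = fit_similarity((0, 0), (100, 0), (5, 5), (5, 105))
print(predict(transform, (50, 0)))  # expected image position of a feature
```

A prediction of this kind is only as good as the design coordinates: any deposition error in an individual feature is invisible to the transform, which is why mislocated dim features remain a problem for dead reckoning.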
Another method to generally locate features that can be used to locate dim features is user-assisted feature extraction (“by hand”). Although these methods work well to generally locate features on a substrate, without further intervention, they are not much better at locating dim features that are mislocated (i.e., not properly placed) on the substrate by the manufacturing equipment. Dead reckoning is degraded by both uncompensated systematic location errors and random location errors. Finally, user-assisted extraction is, by definition, subjective and not automated; it is also slow, tedious and subject to errors caused by user fatigue.
Thus, it would be advantageous to have an apparatus and method to accurately locate probe features on a microarray regardless of the quantity and/or quality of the target-specific or hybridized probe signal. Further, it would be advantageous if the apparatus and method could be used with conventional scanning and analysis equipment. Such an apparatus and method would be particularly useful as microarrays become more densely packed.