Oligonucleotide arrays contain probes of known nucleic acid sequence on specific regions of a substrate, each region containing probes of a different nucleic acid sequence. A composition comprising target nucleic acid molecules (e.g., mRNA or cDNA from a cell) is allowed to hybridize with the probes of the array under conditions favoring hybridization of probes to target nucleic acid molecules that are complementary without mismatches. Unhybridized target nucleic acids are washed away, and hybridization is detected.
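The notion of a probe and target being complementary without mismatches can be illustrated with a minimal sketch. It treats sequences as plain DNA strings and ignores hybridization kinetics and thermodynamics entirely; the function names `reverse_complement` and `perfectly_hybridizes` are hypothetical and chosen only for illustration.

```python
# Illustrative string model only: real hybridization depends on temperature,
# salt concentration, and sequence-dependent thermodynamics.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A<->T, C<->G)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def perfectly_hybridizes(probe: str, target: str) -> bool:
    """True if the target contains a stretch exactly complementary to the probe."""
    return reverse_complement(probe) in target.upper()
```

Under this simplified model, a single base change in the target region is enough for `perfectly_hybridizes` to return `False`, which mirrors the stringent conditions described above.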
The target molecule typically is labeled with a detectable molecule, such as a fluorophore. The presence of the target nucleic acid (and therefore hybridization) may be determined by the detection of the detectable molecule. Because the nucleic acid sequence of the probe in each region of the array is known, detection of hybridization within a region indicates that the composition contained a target nucleic acid having the complement to the probe. Furthermore, in certain situations, the level of detection correlates with the concentration of the target nucleic acid.
Gene expression profiling is a powerful tool for target discovery, gene function elucidation, drug target identification, and toxicity profiling. Oligonucleotide arrays enable one to address each of these applications with high specificity and in an expeditious manner, obviating the need for clone tracking and handling and for upfront PCR preparation and purification. To accurately perform gene expression profiling with an oligonucleotide array, one or more probes complementary to the gene of interest must be present in the array.
Because the probes typically are too short to contain the entire nucleic acid sequence of the gene, probes must be chosen that are complementary to only a portion of the gene. Preferably, the probes chosen accurately indicate the expression level of the gene. Because oligonucleotide sequences differ, and those differences affect hybridization kinetics and thermodynamics, not all probes give equivalent hybridization signals even when the target nucleic acid concentrations are equal. Often multiple probes to a single gene are included in the array to provide greater specificity and accuracy.
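One simple way to picture the pool of possible probes for a gene is to slide a fixed-length window along the gene sequence and take the reverse complement of each window. The sketch below does only that; the function name `candidate_probes` and the `probe_len` parameter are hypothetical, and the sketch makes no attempt to rank candidates by expected hybridization behavior.

```python
from typing import List, Tuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def candidate_probes(gene: str, probe_len: int) -> List[Tuple[int, str]]:
    """List (start position, probe sequence) for every window of the gene.

    Each probe is the reverse complement of one gene window, so it is
    complementary to only that portion of the gene.
    """
    gene = gene.upper()
    probes = []
    for start in range(len(gene) - probe_len + 1):
        window = gene[start:start + probe_len]
        probes.append((start, window.translate(COMPLEMENT)[::-1]))
    return probes
```

A gene of length n yields n - probe_len + 1 candidates under this scheme, which is why some further selection step is needed before probes are placed on an array.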
However, with post-synthetic covalent attachment schemes, it is important to arrive at a limited number of probes to be dispensed per gene in order to keep costs down and gene density up, allowing more genes to be analyzed on a single array. Most companies arrive at this limited number of probes by a process called rapid prototyping, in which a superset of probes is generated and hybridized to the intended target, and the probe that gives the highest hybridization signal is chosen. Lockhart et al., in U.S. Pat. No. 6,040,138, describe such a method. In that patent, a number of candidate probes to a target sequence are tested to determine which probe provides the strongest signal. In an attempt to account for probes that show a high background signal even in the absence of the target, Lockhart et al. compare the probe signal to a signal obtained from a second probe constructed to contain a single mismatch with the target sequence. Only those probes having a signal that is a certain percentage over the signal obtained with the mismatch probe are used. Lockhart et al. describe using multiple probes for a given target sequence in an array to accurately determine the expression level of a gene over a range of concentrations.
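The selection logic described above can be sketched as a two-step filter: discard probes whose signal does not sufficiently exceed that of their mismatch control, then keep the remaining probe with the highest signal. This is an illustrative combination of the two ideas, not the procedure of the cited patent; the function name `select_probe`, the input layout, and the default `min_ratio` of 1.2 are assumptions, since the text says only "a certain percentage".

```python
from typing import Dict, Optional, Tuple


def select_probe(
    signals: Dict[str, Tuple[float, float]],
    min_ratio: float = 1.2,  # hypothetical threshold, not taken from the source
) -> Optional[str]:
    """Pick one probe from measured (perfect-match, mismatch) signal pairs.

    signals maps a probe identifier to its perfect-match signal and the
    signal of its single-mismatch control probe.
    """
    # Step 1: keep probes whose signal exceeds the mismatch control by the margin.
    passing = {
        probe_id: match
        for probe_id, (match, mismatch) in signals.items()
        if match > min_ratio * mismatch
    }
    if not passing:
        return None
    # Step 2: among the survivors, choose the highest hybridization signal.
    return max(passing, key=passing.get)
```

In this sketch, a probe with the strongest raw signal is still rejected when its mismatch control is nearly as bright, which is the background-signal concern the mismatch comparison is meant to address.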
Ideally, an array would contain only one probe for each gene yet still would be able to provide accurate differential gene expression profiles. Because a probe giving the highest hybridization signal at a given concentration of intended target (chosen by rapid prototyping) may not always provide accurate gene expression profiles when different samples contain varying amounts or varying structures of the intended target, there is a need for arrays that contain only a single probe for each gene yet are still able to indicate variation in the expression level of the gene.