Within the past decade, several technologies have made it possible to monitor the expression level of a large number of genetic transcripts at any one time (see, e.g., Schena et al., 1995, Science 270:467-470; Lockhart et al., 1996, Nature Biotechnology 14:1675-1680; Blanchard et al., 1996, Nature Biotechnology 14:1649; Ashby et al., U.S. Pat. No. 5,569,588, issued Oct. 29, 1996). For example, techniques are known for preparing microarrays of cDNA transcripts (see, e.g., DeRisi et al., 1996, Nature Genetics 14:457-460; Shalon et al., 1996, Genome Res. 6:639-645; and Schena et al., 1995, Proc. Natl. Acad. Sci. U.S.A. 93:10539-11286). Alternatively, high-density arrays containing thousands of oligonucleotides complementary to defined sequences, at defined locations on a surface, synthesized in situ using photolithographic techniques, have been described (see, e.g., Fodor et al., 1991, Science 251:767-773; Pease et al., 1994, Proc. Natl. Acad. Sci. U.S.A. 91:5022-5026; Lockhart et al., 1996, Nature Biotechnology 14:1675; U.S. Pat. Nos. 5,578,832; 5,556,752; and 5,510,270). Methods for generating arrays using inkjet technology for oligonucleotide synthesis are also known in the art (see, e.g., Blanchard, International Patent Publication WO 98/41531, published Sep. 24, 1998; Blanchard et al., 1996, Biosensors and Bioelectronics 11:687-690; Blanchard, 1998, in Synthetic DNA Arrays in Genetic Engineering, Vol. 20, J. K. Setlow, Ed., Plenum Press, New York at pages 111-123).
Applications of this technology include, for example, identification of genes that are up-regulated or down-regulated in various physiological states, particularly diseased states. Additional exemplary uses for transcript arrays include the analysis of members of signaling pathways and the identification of targets for various drugs. See, e.g., Friend and Hartwell, International Publication No. WO 98/38329 (published Sep. 3, 1998); Stoughton, U.S. patent application Ser. No. 09/099,722 (filed Jun. 19, 1998); Stoughton and Friend, U.S. patent application Ser. No. 09/074,983 (filed May 8, 1998); Friend and Stoughton, U.S. Provisional Application Ser. Nos. 60/084,742 (filed May 8, 1998), 60/090,004 (filed Jun. 19, 1998), and 60/090,046 (filed Jun. 19, 1998).
Oligonucleotide sequences are particularly useful as probes on microarrays and in other applications that involve nucleic acid hybridization. The oligonucleotides can be custom synthesized, by techniques known in the art (see, e.g., Froehler et al., 1986, Nucleic Acids Res. 14:5399-5407; McBride et al., 1983, Tetrahedron Lett. 24:246-248), with any desired DNA sequence. Further, oligonucleotides are small enough that their thermodynamic properties (e.g., their free binding energies to complementary and/or partially complementary sequences) can be at least partially predicted. However, because of their small size, oligonucleotide probes frequently correspond to genomic sequences that are non-unique and, as a result, may hybridize to more than one polynucleotide sequence in a sample. For example, a particular oligonucleotide probe may hybridize not only to a particular mRNA transcript of interest in a sample, but also to homologs, analogs, splice variants or even marginally related sequences of that transcript that are also present in the sample, oftentimes in greater abundance. As a result of such “cross-hybridization,” many oligonucleotide probes can produce false positive measurements, reflecting a lack of specificity. Conversely, an oligonucleotide probe may also hybridize to a target polynucleotide sequence of interest more weakly than predicted, e.g., from predicted hybridization binding energies. Such probes can produce false negative hybridization measurements, reflecting a lack of sensitivity.
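The cross-hybridization problem described above can be illustrated with a minimal sketch. It is not a method taken from this document: it uses a simple count of mismatched bases as a crude stand-in for hybridization binding energy, and the function names, the example sequences, and the threshold of two mismatches are all hypothetical choices made only for illustration.

```python
def min_mismatches(site: str, transcript: str) -> int:
    """Fewest mismatched bases between `site` and any same-length window of `transcript`.

    `site` is the stretch of the intended target that the probe is complementary to.
    Assumption: mismatch count approximates binding strength (real designs use
    thermodynamic models instead).
    """
    k = len(site)
    if len(transcript) < k:
        return k  # no full-length window exists; treat as a complete mismatch
    return min(
        sum(a != b for a, b in zip(site, transcript[i:i + k]))
        for i in range(len(transcript) - k + 1)
    )


def cross_hybridization_candidates(site, off_targets, max_mismatches=2):
    """Names of non-target sequences containing a near-match to the probe site."""
    return [
        name
        for name, seq in off_targets.items()
        if min_mismatches(site, seq) <= max_mismatches
    ]


# Hypothetical sequences: a close homolog differing by one base, and an unrelated one.
site = "ACGTACGTAC"
off_targets = {
    "homolog": "TTTACGTACGAACGG",
    "unrelated": "GGGGGGGGGGGGGGG",
}
flagged = cross_hybridization_candidates(site, off_targets)
```

In this toy case only the homolog is flagged, which mirrors the point made above: a short probe is most at risk from closely related sequences, especially when they are more abundant than the intended target.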
As a result of these limitations, current microarrays require a plurality of probe pairs, which are both matched to and intentionally mismatched to a target sequence, in order to empirically distinguish signal arising from a target polynucleotide sequence of interest (e.g., a particular mRNA sequence of interest) from signal arising from cross-hybridization with other polynucleotide sequences. Currently, in situ synthesized microarray chips require more than 20 oligonucleotide probe pairs per gene or gene region reported (Lockhart et al., supra). However, unless a large number of probes is employed, such a match-mismatch scheme can only screen out cross-hybridization from distantly related sequences. In particular, the ability of such a match-mismatch scheme to distinguish between true hybridization and cross-hybridization to closely related sequences (e.g., closely related homologs and splice variants) is typically limited or even very poor. Furthermore, the “reporting density” (i.e., the number of genes detected per unit of surface area) for a microarray is limited, e.g., by the density with which polynucleotide probes may be laid down as well as by the number of polynucleotide probes required per gene. The number of polynucleotide probes that may be laid down on a microarray chip is, in turn, limited by the technology used to produce the microarray. Oligonucleotide microarrays having a high spatial density of probes, produced by the photolithographic techniques discussed above, are expensive to synthesize and require a large capital investment. Oligonucleotide microarrays produced using the inkjet technology methods discussed above are, by contrast, much cheaper and faster to produce, both per chip design and per chip. Thus, inkjet-produced microarrays are generally preferred for detecting genetic transcripts in cells.
However, microarray chips produced by such inkjet technology have a limited probe density that is only a fraction of the probe density of chips produced by photolithography methods. Thus, because microarrays currently known in the art must use a number of redundant probes (e.g., 20) and have limited probe density, the number of genetic transcripts that may be effectively detected on a single microarray chip is limited to about 10,000 gene transcripts using expensive, photolithographic arrays, and only about 750 to 2,500 gene transcripts using less expensive, inkjet arrays.
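The relationship between probe density, probe redundancy and the number of reportable transcripts can be sketched as simple arithmetic. The sketch below assumes that each probe pair occupies two features (one matched and one mismatched probe) and uses 20 probe pairs per gene as in the example above. The feature counts are hypothetical values chosen only so that the arithmetic reproduces the transcript figures quoted in the text; they are not measured chip capacities.

```python
def reportable_genes(probe_features: int, probe_pairs_per_gene: int) -> int:
    """Whole number of genes one chip can report.

    Assumption: every probe pair consumes two features on the chip
    (a matched probe and its intentionally mismatched partner).
    """
    if probe_features < 0 or probe_pairs_per_gene <= 0:
        raise ValueError("feature count must be >= 0 and pairs per gene must be > 0")
    features_per_gene = 2 * probe_pairs_per_gene
    return probe_features // features_per_gene


# Hypothetical feature counts, back-calculated from the figures in the text.
photolithographic = reportable_genes(400_000, 20)   # 10,000 genes
inkjet_low = reportable_genes(30_000, 20)           # 750 genes
inkjet_high = reportable_genes(100_000, 20)         # 2,500 genes

# With a single specific probe pair per gene, the same inkjet chip reports far more genes.
inkjet_single_pair = reportable_genes(100_000, 1)   # 50,000 genes
```

The last line shows why reducing the number of probes required per gene matters: for a fixed probe density, the number of reportable genes scales inversely with the redundancy.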
There exists, therefore, a need for methods which identify particular oligonucleotide sequences that may be used as both sensitive and specific probes for target polynucleotide sequences. In particular, there is a need for methods that can identify particular sequences that hybridize to a particular sequence of interest, such as the sequence of a particular gene or gene transcript, with little or no cross-hybridization to other polynucleotide sequences in a sample. There is also a need for methods to design nucleic acid arrays which require fewer polynucleotide probe sequences to detect individual genes of interest, and which can therefore contain polynucleotide probe sequences to detect more genes of interest than do microarrays that are currently available in the art.
Discussion or citation of a reference herein shall not be construed as an admission that such reference is prior art to the present invention.