Numerous analytical methods in molecular biology, biochemistry and biophysics rely on the hybridization of a nucleic acid molecule to a complementary nucleic acid molecule. These methods include Southern and Northern blot hybridization, fluorescence in situ hybridization (FISH), gene-chip array technologies, and polymerase chain reactions (PCR). One goal of these methods is to determine the presence and/or amounts of nucleic acid molecules containing one or more particular nucleotide sequences of interest. In general, the nucleic acid molecules are labeled with a detectable marker such as a radioactive or a fluorescent marker. After sequence-specific hybridization of two nucleic acid molecules, the presence and/or level of the hybridized molecules is measured by means of the marker.
The development of DNA “chips”, which are arrays of nucleic acid molecules on a solid support such as a glass surface, has greatly increased the number of nucleotide sequences that can be assayed simultaneously within a given sample (Chee, M., et al., (1996) Science 274, 610-614). In particular, array-based hybridization assays allow simultaneous monitoring of thousands of hybridization reactions from one sample, so that researchers can obtain information on the presence and quantities of numerous nucleic acids having numerous sequences of interest (Solas et al. Proc. Natl. Acad. Sci. USA 91:5022-5026, 1994; Blanchard and Hood. Nature Biotechnology. 14:1649, 1996; and DeRisi et al. Science. 278:680-686, 1997). These assays may be applied to a wide range of applications such as single-nucleotide polymorphism (SNP) genotyping, gene expression profiling, and DNA resequencing.
A typical DNA array consists of a set of oligonucleotides bound to a solid support surface such as silicon or glass. A fluorescently labeled target sample mixture of DNA or RNA fragments is brought into contact with the array and allowed to hybridize with the synthesized oligonucleotides. In theory, the conditions allow hybridization only between a region of a target molecule and a complementary oligonucleotide on the surface. Therefore, detecting areas of fluorescence on the surface provides information about the nucleic acid content of the sample. In practice, however, cross-hybridization is a significant source of error in a standard array-based hybridization assay; it reduces the sensitivity and selectivity of the assay and/or introduces background signals.
In general, cross-hybridization is the undesired binding of two or more nucleic acid molecules, and its form depends on the specific protocol of the method employed. For DNA array-based assays, one form of cross-hybridization that produces experimental error and limits the power of the assay occurs when regions of target nucleic acids in a sample hybridize to an improper site on the DNA array. Such cross-hybridization may result in false-positive and false-negative signals in the assay. Additional cross-hybridization occurs when nucleic acid molecules in a sample (target) hybridize to themselves or to other nucleic acid molecules in the sample.
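The distinction between intended hybridization and hybridization to an improper site can be illustrated with a toy model. The following Python sketch is not part of any assay protocol; it assumes, purely for illustration, that a probe "binds" a target whenever some window of the target differs from the probe's perfect complement by at most a chosen number of mismatches. The function names, the example sequences, and the mismatch threshold are all hypothetical, and real hybridization depends on thermodynamic factors (temperature, salt concentration, base composition) that this count ignores.

```python
# Toy model only: mismatch count stands in for hybridization stringency.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def min_mismatches(probe, target):
    """Fewest mismatches between the probe's perfect binding site and
    any same-length window of the target (None if the target is shorter)."""
    site = reverse_complement(probe)  # sequence that pairs perfectly with the probe
    n = len(site)
    if len(target) < n:
        return None
    return min(
        sum(a != b for a, b in zip(site, target[i:i + n]))
        for i in range(len(target) - n + 1)
    )


def classify_hits(probes, target, intended, max_mismatches=1):
    """Label every probe the target would bind under the assumed threshold."""
    hits = {}
    for name, probe in probes.items():
        mm = min_mismatches(probe, target)
        if mm is not None and mm <= max_mismatches:
            hits[name] = "intended" if name == intended else "cross-hybridization"
    return hits


# Hypothetical 8-mer probes; P2 differs from P1 at a single position.
probes = {"P1": "AGCTTGAC", "P2": "AGCTAGAC", "P3": "CCCCGGTT"}
target = "TTGTCAAGCTAA"  # contains the perfect complement of P1

hits = classify_hits(probes, target, intended="P1")
print(hits)
```

In this sketch the target binds its intended probe P1 and also the near-identical probe P2, which corresponds to a false-positive signal at the P2 site; the unrelated probe P3 is not bound. Lowering `max_mismatches` to 0 plays the role of more stringent conditions and removes the P2 hit.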
The control of cross-hybridization is particularly important for methods that employ massively parallel arrays of hybridization probes, such as DNA chips. Such arrays depend solely upon hybridization for specificity, since there is no enzyme-based proofreading of duplexes as in methods based upon Sanger dideoxy sequencing or the polymerase chain reaction. In addition, the large number of probes reduces the ability to verify the specificity of all probe-target interactions that are detected in a given assay. Thus, the accuracy of data obtained using DNA microarrays is greatly improved by minimizing cross-hybridization. Hybridization of a nucleic acid in a sample to the wrong probe on the DNA chip is therefore a particularly problematic form of cross-hybridization in DNA gene chip technology.
Fragmentation of the target reduces cross-hybridization to some degree, since shorter targets offer fewer secondary sites for binding by another target molecule. However, fragmentation also decreases the signal available for detection of target molecules (e.g., if the target has been randomly labeled with a reporter molecule), and increases the complexity of sample preparation protocols. Fragmentation is particularly problematic in differential expression studies, where two samples labeled with different reporter molecules must be reproducibly and identically fragmented.
Therefore, there is a need for methods of analyzing nucleic acid molecules that reduce undesired hybridization between two or more nucleic acid molecules.