This invention relates to methods for improving the discrimination of hybridization of target nucleic acids to probes on substrate-bound oligonucleotide arrays. Therefore, it relates to the fields of molecular biology and biophysics.
Substrate-bound oligonucleotide arrays, such as the Affymetrix DNA Chip, enable one to test hybridization of a target nucleic acid molecule to many thousands of differently sequenced oligonucleotide probes at feature densities greater than five hundred per 1 cm². Because hybridization between two nucleic acids is a function of their sequences, analysis of the pattern of hybridization provides information about the sequence of the target molecule. The technology is useful for de novo sequencing and re-sequencing of nucleic acid molecules. The technology also has important diagnostic uses in discriminating genetic variants that may differ in sequence by one or a few nucleotides. For example, substrate-bound oligonucleotide arrays are useful for identifying genetic variants of infectious agents, such as HIV, or variants associated with genetic diseases, such as cystic fibrosis.
In one version of the substrate-bound oligonucleotide array, the target nucleic acid is labeled with a detectable marker, such as a fluorescent molecule. Hybridization between a target and a probe is determined by detecting the fluorescent signal at the various locations on the substrate. The amount of signal is a function of the thermal stability of the hybrids. The thermal stability is, in turn, a function of the sequences of the target-probe pair: AT-rich regions of DNA melt at lower temperatures than GC-rich regions of DNA. This differential in thermal stabilities is the primary determinant of the breadth of DNA melting transitions, even for oligonucleotides.
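The dependence of hybrid stability on base composition can be illustrated with a minimal sketch. The example below assumes the Wallace rule, a widely used rule of thumb for short oligonucleotides that assigns roughly 2 °C per A or T and 4 °C per G or C. It is a rough approximation chosen only for illustration and is not part of the methods described here; the function name `wallace_tm` and the probe sequences are hypothetical.

```python
def wallace_tm(seq: str) -> int:
    """Rough melting temperature in deg C by the Wallace rule.

    Assumption for illustration only: Tm ~ 2*(A+T) + 4*(G+C),
    a rule of thumb for short oligonucleotides.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("sequence must contain only A, C, G, T")
    return 2 * at + 4 * gc


# Hypothetical 8-mer probes of equal length but different base composition.
probes = ["ATATATAT", "ATGCATGC", "GCGCGCGC"]
for probe in probes:
    print(probe, wallace_tm(probe))
```

Under this approximation, probes of the same length differ in estimated melting temperature solely because of their GC content, so at a single assay temperature an AT-rich hybrid would be expected to give a weaker signal than a GC-rich hybrid.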
Depending upon the length of the oligonucleotide probes, the number of different probes on a substrate, the length of the target nucleic acid, and the degree of hybridization between sequences containing mismatches, among other things, a hybridization assay carried out on a substrate-bound oligonucleotide array can generate thousands of data points of different signal strengths that reflect the sequences of the probes to which the target nucleic acid hybridized. This information can require a computer for efficient analysis. Differences in fluorescent signal that arise from differences in the thermal stability of hybrids complicate the analysis of hybridization results, especially from combinatorial oligonucleotide arrays for de novo sequencing and custom oligonucleotide arrays for specific re-sequencing applications. Modifications in custom array designs have helped to simplify this problem. However, certain modifications, such as variation of probe length, are not an option in combinatorial arrays. Therefore, methods of normalizing the signal between hybrids of different sequences would be very useful in applications of high density substrate-bound oligonucleotide arrays that generate large amounts of data in hybridization assays.