Arrays of biopolymers, such as arrays of peptides or polynucleotides (such as DNA or RNA), are known and are used, for example, as diagnostic or screening tools. Such arrays include regions (sometimes referenced as features or spots) of usually different sequence biopolymers arranged in a predetermined configuration on a substrate. The arrays, when exposed to a sample, will exhibit a pattern of binding which is indicative of the presence and/or concentration of one or more components of the sample, such as an antigen in the case of a peptide array or a polynucleotide of particular sequence in the case of a polynucleotide array. The binding pattern can be detected, for example, by labeling all potential targets (for example, DNA) in the sample with a suitable label (such as a fluorescent compound), and accurately observing the fluorescence pattern on the array.
In one application, arrays of oligonucleotide probes provide useful tools for simultaneous evaluation of the levels of expression of large sets of genes (“expression profiling”). The probe arrays used in expression profiling can be produced in two ways: (i) oligonucleotide probes can be synthesized in situ on the array surface, using location-addressable adaptations of phosphoramidite chemistry (for example, photo-deprotection, or printing of phosphoramidites using an inkjet-type printer); (ii) whole oligonucleotide probes synthesized either by phosphoramidite chemistry or by enzymatic methods (for example, PCR) can be deposited on a surface designed to form either a strong non-covalent attachment to DNA (for example, poly-L-lysine) or a covalent attachment to a chemically unique group added to the oligonucleotide during synthesis (for example, a modified base containing a primary aliphatic amine). Non-covalent attachment may subsequently be converted into covalent attachment by methods such as UV photo-crosslinking. Chemical synthesis is used to produce probes shorter than 50 nucleotides, while enzymatic methods are used to produce longer probes (100-1000 nucleotides).
Synthetic nucleotide probes (either synthesized in situ or deposited whole) can potentially discriminate between closely related mRNAs, because they can be designed to probe the most different portions of the target sequences, and because the effects of these differences are proportionately greater for shorter probes. This ability is important, because many genes in higher organisms are members of families of related genes. However, shorter probes do not associate with their targets as strongly as longer probes (that is, they have a lower binding constant or binding affinity). This weaker association makes it very difficult to produce oligonucleotide probes that can unequivocally detect concentrations lower than about 0.1 pM in the best cases, and typical detection limits are in the range of 1 pM-10 pM. This results in a sensitivity gap. For example, if all of the mRNA in a sample of 10⁶ cells (a typical size for sampling a precious specimen, such as a biopsy sample) is converted into labeled cDNA, and the resulting material is resuspended in a volume of 100 μl, then the final concentration of the cDNA derived from a message present at 1 copy per cell is
$$\frac{\left(10^{6}\ \text{cells}\right)\left(1\ \frac{\text{copy}}{\text{cell}}\right)}{\left(6.02\times10^{23}\ \frac{\text{copies}}{\text{mole}}\right)\left(10^{-4}\ \text{liters}\right)} = 1.66\times10^{-14}\ \text{M} = 0.017\ \text{pM}$$
This is a factor of 5 lower than the lowest limit of detection achieved with current oligonucleotide probe-polynucleotide target combinations, and a factor of 50-500 below more typical detection limits.
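The arithmetic above can be checked with a minimal Python sketch. The function and variable names are illustrative assumptions, not part of the original text. The inputs (10⁶ cells, 1 copy per cell, 100 μl, and 6.02 × 10²³ copies per mole) and the detection limits of 0.1, 1 and 10 pM are the values quoted above.

```python
# Value of Avogadro's number as used in the text (copies per mole).
AVOGADRO = 6.02e23


def target_concentration_molar(cells, copies_per_cell, volume_liters):
    """Molar concentration of a target present at a given copy number per cell."""
    moles = cells * copies_per_cell / AVOGADRO
    return moles / volume_liters


# Inputs quoted in the text: 10^6 cells, 1 copy per cell, 100 microliters.
conc_molar = target_concentration_molar(1e6, 1, 100e-6)
conc_pm = conc_molar * 1e12  # convert M to pM

print(f"target concentration: {conc_molar:.3g} M = {conc_pm:.3g} pM")

# Compare with the best-case (0.1 pM) and typical (1-10 pM) detection limits.
for limit_pm in (0.1, 1.0, 10.0):
    print(f"detection limit {limit_pm} pM -> gap factor {limit_pm / conc_pm:.0f}")
```

Computed without rounding, the gap factors come out at roughly 6, 60 and 600, which is consistent in order of magnitude with the approximate factors of 5 and 50-500 stated above.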
The sensitivity gap can be closed by employing a target amplification scheme, such as linear amplification by RNA transcription or asymmetric PCR. However, this adds both complication and cost to the assay. In addition, losses of mRNA during sample preparation, amplification, inhibition by sample-derived impurities and problems with probe specificity (which necessitate more stringent conditions and lower signal levels) can together use up most or all of the sensitivity margin provided by target amplification. Finally, lowering the number of cells required per sample would greatly improve the applicability of arrays, since it would then be possible to perform entire array analyses on samples provided by microsampling methods, such as needle biopsy and laser-assisted micro-dissection.
Arrays which utilize longer probes can exhibit binding constants high enough to yield detection limits in the 10⁻¹⁵ M range. However, this improved performance comes at the cost of lost specificity within gene families and loss of the ability to design probes that hybridize to the most unique target subsequences.
Solutions to the sensitivity gap that arises from using synthetic oligonucleotide probes include target amplification, signal amplification, the use of high sensitivity labels and the use of modified probe nucleotide chemistries. Target amplification, described above, is a well established method for overcoming an intrinsic binding constant that is too low. It solves the problem directly, by increasing the amount of target by a well-controlled factor that is relatively independent of the target sequence. The disadvantages of target amplification are the complication and cost added to sample preparation. Another solution is signal amplification, which is achieved by multiplying the number of detectable labels attached to a given target molecule that binds to an array feature. Many sample labeling schemes incorporate a basic form of signal amplification by the simple expedient of attaching the label (for example, a fluorophore) to one or more of the nucleotide triphosphates used by the transcription-based system that produces labeled target oligonucleotide. More elaborate schemes, such as binding of labeled biotin-streptavidin complexes and the formation of sandwiches between surface-bound probes, unlabeled targets and highly labeled second probes (for example, branched DNA probes), have also been employed. These methods, like target amplification, are relatively costly and complicated, and they further rely on the binding of a very small number of molecules. This can result in an added source of noise derived from the probabilistic binding of small numbers of target molecules. High sensitivity labels (for example, radioisotopes or chemiluminescent labels) are a special case of signal amplification. The main advantage of such labels is that they generate signal against a very low intrinsic background. The disadvantage is that these labels are not as convenient or safe as fluorescent labels. In addition, radioisotopes provide lower spatial resolution than optical labels.
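The noise that arises when signal amplification relies on very few bound molecules can be illustrated with a minimal sketch. It assumes, as a simplification not stated in the text, that the number of target molecules bound to a feature follows Poisson statistics, so that the relative fluctuation (standard deviation divided by mean) is 1/√N for a mean of N bound molecules. The function name and the example molecule counts are hypothetical.

```python
import math


def relative_counting_noise(mean_bound_molecules):
    """Relative fluctuation (std / mean) of a Poisson-distributed count.

    Assumption: binding events are independent, so the number of bound
    target molecules is Poisson distributed with the given mean.
    """
    if mean_bound_molecules <= 0:
        raise ValueError("mean_bound_molecules must be positive")
    return 1.0 / math.sqrt(mean_bound_molecules)


# Illustrative (hypothetical) mean numbers of bound molecules per feature.
for n in (10, 100, 10_000):
    print(f"{n} bound molecules -> relative noise {relative_counting_noise(n):.1%}")
```

Under this assumption, multiplying the labels on each bound molecule raises the signal but leaves this relative fluctuation unchanged, since the fluctuation is set by the number of binding events.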
Probes that incorporate modified bases or backbones into polynucleotides may be capable of providing much higher per base binding free energies than conventional DNA probes. The main disadvantages of this approach are the relatively poor state of development of synthetic schemes for producing probes that incorporate nucleotide analogues and the relatively poor state of characterization of the benefits derived from the use of such alternate chemistries. At present most of the performance enhancement available from modified polynucleotide chemistries is theoretical.
U.S. Pat. No. 4,731,325 describes an arrangement using two or three identifying nucleic acid fragments homologous to a nucleic acid to be identified. The patent states that if simultaneous identification of several different nucleic acids is desired, it is necessary to use separate filters to which are attached the required fragments. A paper by Gentalen et al., “A novel method for determining linkage between DNA sequences: hybridization to paired probe arrays,” Nucleic Acids Research, 1999, Vol. 27, No. 6, 1485-1491, describes co-operative hybridization to establish physical linkage between two loci on a DNA strand. These references, and all other references cited in this application, are incorporated in this application by reference. However, cited references or art are not admitted to be prior art to this application.
It would be desirable, then, to provide a means for detecting a target using probes, particularly in the form of an addressable array, which can provide good binding affinity for the target. It would also be desirable that any such means be relatively simple to fabricate. It would further be desirable that a means be provided for aiding in the selection of such probes.