The determination of the DNA base sequence of the human genome will have a major impact on biomedical science in the next century. The completion of the first complete human DNA sequence enhances a range of applications from genetic mapping of disease-associated genes to diagnostic tests for disease susceptibility and drug response. The detection of the base composition at specific, variable DNA sites, such as single nucleotide polymorphisms (SNPs), insertion/deletion events, repeats, and the like, is especially important. The current generation of sequence detection methods is too slow and costly to meet large-scale DNA analysis requirements. Thus, there is a need for faster, more efficient methods for detecting specific genetic sequences, for example SNPs and other variable sites, i.e., for “scoring” the actual base identities at specific sites.
SNPs have a number of uses in mapping, disease gene identification, and diagnostic assays. All of these applications involve the determination of the specific base composition at the SNP site. Detection strategies for biological agents will increasingly make use of sequence information. Conventional sequencing can provide this information, but is impractical for screening a large number of sites in a large number of individuals. Several alternative methods have been developed to increase throughput.
Two techniques have been developed to determine base composition at a single site: minisequencing (see, e.g., “Minisequencing: A Specific Tool For DNA Analysis And Diagnostics On Oligonucleotide Arrays,” by Tomi Pastinen et al., Genome Research 7, 606 (1997)), and oligo-ligation (see, e.g., “Single-Well Genotyping Of Diallelic Sequence Variations By A Two-Color ELISA-Based Oligonucleotide Ligation Assay,” by Vincent O. Tobe et al., Nucleic Acids Res. 24, 3728 (1996)). In minisequencing, an oligonucleotide primer is designed to interrogate a specific site on a sample template, and polymerase is used to extend the primer with a labeled dideoxynucleotide. In oligo-ligation, a similar oligonucleotide primer is designed, and ligase is used to covalently attach a downstream oligo that is variable at the site of interest. In each case, the preference of an enzyme for correctly base-paired substrates is used to discriminate the base identity, which is revealed by the covalent attachment of a label to the oligonucleotide. In most applications, these assays are configured with the oligonucleotide immobilized on a solid substrate, including microplates, magnetic beads, and, more recently, microscope slides on which oligonucleotides are microarrayed. Detection strategies known to those skilled in the art include direct labeling with fluorescence detection or indirect labeling using biotin and a labeled streptavidin with fluorescent, chemiluminescent, or absorbance detection.
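The base-pairing logic of minisequencing described above can be illustrated with a toy model. The following Python sketch is an idealization, not the assay of the cited references: it assumes perfect primer annealing and error-free polymerase, and it ignores labels, kinetics, and mismatches. The function names and the example sequences are hypothetical and chosen only for illustration.

```python
# Toy model of single-base primer extension (minisequencing).
# Assumption: the template is written 3'->5' so that index i of the
# primer (written 5'->3') pairs with index i of the annealed region.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq):
    """Return the base-by-base complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in seq)


def minisequence(template_3to5, primer_5to3):
    """Return the dideoxynucleotide base added to the primer's 3' end.

    Returns None if the primer does not anneal or if no template base
    remains beyond the primer's 3' end.
    """
    # The primer anneals where it matches the complement of the template.
    start = complement(template_3to5).find(primer_5to3)
    if start < 0:
        return None
    site = start + len(primer_5to3)  # template position just past the primer
    if site >= len(template_3to5):
        return None
    # The polymerase adds the base complementary to the interrogated site.
    return COMPLEMENT[template_3to5[site]]


if __name__ == "__main__":
    primer = "ATGCC"
    print(minisequence("TACGGATCA", primer))  # allele with A at the site
    print(minisequence("TACGGGTCA", primer))  # allele with G at the site
```

In this idealized model the identity of the incorporated base alone distinguishes the two alleles, which is the property the labeled dideoxynucleotide reports in the actual assay.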
Oligonucleotide microarrays or “DNA chips” have generated much attention for their potential for massively parallel analysis. The prospect of sequencing tens of thousands of bases of a small sample in just a few minutes is exciting. At present, this technology has limited availability because arrays to sequence only a handful of genes are currently available, and they carry substantial hardware and consumable costs. In addition, the general approach of sequencing by hybridization is not particularly robust, because it requires significant sequence-dependent optimization of hybridization conditions. Nonetheless, the parallelism of an “array” technology is very powerful, and multiplexed sequence determination is an important element of the new flow cytometry method.
U.S. Pat. No. 6,287,766, issued Sep. 11, 2001, and incorporated herein by reference, teaches minisequencing by flow cytometry, a technology that has the potential to meet the current and future demands for low cost, high throughput assays for genetic variation assessment. However, because it is so efficient, the technology creates its own limiting problem, which is a severe bottleneck upstream of the assay. The bottleneck is created by the need to amplify by polymerase chain reaction (PCR) individual DNA fragments that contain nucleotide regions or sites known to be variable among individuals. Other current technologies have severe limitations that make the upstream bottleneck less noticeable, but the bottleneck will eventually limit all large-scale efforts that are simply more parallel versions of less efficient strategies.
Most SNP scoring (i.e., identification) technologies are “one at a time” assays that are performed either as parallel, singleplex assays (such as the oligo-ligation assay in 96 or 384 wells of a microtiter plate) or as single-to-moderately parallel assays with high serial throughput (e.g., a mass spectrometry platform). Technologies that would permit scoring SNPs located at many different sites simultaneously (high-throughput multiplexing) are needed to meet current and projected throughput and cost requirements. The flow cytometry platform of the '766 patent has such potential, and represents a substantial advance toward meeting these throughput needs.
For example, a SNP scoring project that will score 100 SNPs on 96 individuals would require two runs of a 50-plex minisequencing assay for each individual. At the slowest rates of the assay, this requires approximately 3 to 6 hours to perform. In contrast, the PCR amplification of the individual fragments containing the sites to be scored would require 100 separate PCR reactions per individual, followed by a pooling of these products for the scoring assay. The amplification steps alone would occupy one 96-well PCR machine for over 200 hours. Assuming around-the-clock operation to change out the samples every two hours, more than 8 days of operation of a PCR machine would be needed to feed the flow cytometer for 3 to 6 hours.
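The arithmetic behind this comparison can be reproduced with a short calculation. The following Python sketch is offered only as an illustration: the default values (100 SNPs, 96 individuals, 50-plex assay, 96-well PCR machine, two hours per PCR run) are taken from the example above, while the function name and the returned field names are hypothetical. The 3 to 6 hour flow cytometry time is stated in the text and is not derived here.

```python
import math


def pcr_bottleneck(snps=100, individuals=96, plex=50,
                   wells=96, hours_per_pcr_run=2.0):
    """Estimate the PCR workload needed to feed a multiplexed scoring assay.

    Assumes one PCR reaction per SNP per individual and a single PCR
    machine running back to back, as in the example in the text.
    """
    flow_runs_per_individual = math.ceil(snps / plex)  # multiplexed assay runs
    pcr_reactions = snps * individuals                 # singleplex amplifications
    pcr_runs = math.ceil(pcr_reactions / wells)        # full plates on one machine
    pcr_hours = pcr_runs * hours_per_pcr_run
    return {
        "flow_runs_per_individual": flow_runs_per_individual,
        "flow_runs_total": flow_runs_per_individual * individuals,
        "pcr_reactions": pcr_reactions,
        "pcr_runs": pcr_runs,
        "pcr_hours": pcr_hours,
        "pcr_days": pcr_hours / 24.0,
    }


if __name__ == "__main__":
    for name, value in pcr_bottleneck().items():
        print(f"{name}: {value}")
```

With the default values the sketch gives 9,600 PCR reactions, 100 plate runs, and 200 hours (about 8.3 days) of continuous PCR machine time, before any sample changeover time is added.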
This disparity will become even more apparent as larger multiplexed microsphere bead sets become available; 100 bead sets are currently commercially available, while an alternative labeling strategy (Q-dots) that projects over 1 million beads within a multiplexed set should be available within 1–2 years, or sooner. Such bead sets allow individual beads to be uniquely identified. Clearly a better means of providing template DNA for these assays or a better assay needs to be developed. As mentioned previously, the PCR bottleneck is common to nearly all SNP scoring technologies, but other technologies do not have the throughput of the flow cytometry platform that would make this a practical limitation. Accordingly, there is a need to eliminate or improve the throughput of processes upstream of the flow cytometry platform.
Among the many SNP scoring technologies, the only one with the sensitivity to use unamplified, genomic DNA as the substrate is the Invader assay (Third Wave Technologies, Madison, Wis.). This technology uses a two-stage signal amplification strategy that provides a level of sensitivity that allows for analyses to be performed on genomic DNA without PCR amplification of the region around the SNP. Unfortunately, the ability to multiplex the Invader assay is limited to only a few SNPs per assay. In addition, Invader assays are currently technically difficult, and require careful design of oligonucleotides to ensure that multiple assays all occur optimally at the same temperature. These constraints severely limit the use of Invader assays in high throughput drug discovery strategies, and the assays have not gained widespread use. Another technology, the TaqMan assay (PE ABI, Foster City, Calif.), allows simultaneous PCR amplification and SNP scoring in real time (kinetic PCR, as opposed to end-point detection). Unfortunately, as with the Invader assay, the ability to multiplex TaqMan assays is limited to only a few sites, and the assay is typically performed only on single SNPs or other targets. None of the existing technologies has the throughput capacity that makes the PCR bottleneck as evident as the flow cytometry-based SNP assays of the '766 patent.
U.S. Pat. No. 4,988,617, issued January 1991 to Landegren et al. (“Landegren”), and incorporated herein by reference, describes a method for determining a nucleic acid sequence in a region with a known polymorphism. This method comprises the steps of performing oligonucleotide ligation with one labeled and one tagged oligonucleotide, followed by capture onto a solid support. This solid support can be a membrane (e.g., nitrocellulose or nylon), or a well of a microtiter plate, or a microsphere. Landegren et al. propose that PCR amplification of the target DNA be done prior to annealing the probe elements to the target sequence.
Various advantages and novel features of the invention will be set forth in part in the description which follows, and in part will become apparent to those skilled in the art upon examination of the following or may be learned by practice of the invention. The objects and advantages of the invention may be realized and attained by means of the instrumentalities and combinations particularly pointed out in the appended claims.