The invention relates generally to methods for detecting nucleic acid mutations in biological samples, and more specifically to methods for detecting nucleic acid deletions or insertions using primer extension reactions.
Numerous diseases are thought to be initiated by disruptions in genomic stability. For example, sickle cell anemia, phenylketonuria, hemophilia, cystic fibrosis, and various cancers have been associated with one or more genetic mutations. Increased knowledge of the molecular basis for disease has led to a proliferation of screening assays capable of detecting disease-associated nucleic acid mutations.
One such method identifies a genomic region thought to be associated with a disease and compares the wild-type sequence in that region with the sequence in a patient sample. Differences in the sequences constitute a positive screen. See, e.g., Engelke et al., Proc. Natl. Acad. Sci., 85: 544-548 (1988). Such methods are time-consuming and costly, and often result in an inability to identify the mutation of interest. Thus, sequencing is not practical for large-scale screening assays.
A variety of detection methods have been developed which exploit sequence variations in DNA using enzymatic and chemical cleavage techniques. A commonly used screen for DNA polymorphisms consists of digesting DNA with restriction endonucleases and analyzing the resulting fragments by means of Southern blots, as reported by Botstein et al., Am. J. Hum. Genet., 32: 314-331 (1980) and White et al., Sci. Am., 258: 40-48 (1988). Mutations that affect the recognition sequence of the endonuclease will preclude enzymatic cleavage at that site, thereby altering the cleavage pattern of the DNA. Sequences are compared by looking for differences in restriction fragment lengths. A problem with this method (known as restriction fragment length polymorphism mapping or RFLP mapping) is its inability to detect mutations that do not affect cleavage with a restriction endonuclease. One study reported that only 0.7% of the mutational variants estimated to be present in a 40,000 base pair region of human DNA were detected using RFLP analysis. Jeffreys, Cell, 18: 1-18 (1979).
Single-base mutations have been detected by differential hybridization techniques using allele-specific oligonucleotide probes. Saiki et al., Proc. Natl. Acad. Sci., 86: 6230-6234 (1989). Mutations are identified on the basis of the higher thermal stability of the perfectly matched probes as compared to mismatched probes. Disadvantages of this approach for mutation analysis include: (1) the requirement for optimization of hybridization for each probe, and (2) the limitations that the nature of the mismatch and the local sequence impose on the degree of discrimination of the probes. In practice, tests based only on parameters of nucleic acid hybridization function poorly when the sequence complexity of the test sample is high (e.g., in a heterogeneous biological sample). This is partly due to the small thermodynamic differences in hybrid stability generated by single nucleotide changes. Therefore, nucleic acid hybridization is generally combined with some other selection or enrichment procedure for analytical and diagnostic purposes.
A number of detection methods have been developed which are based on template-dependent, primer extension. Those methods can be placed into one of two categories: (1) methods using primers which span the region to be interrogated for the mutation, and (2) methods using primers which hybridize upstream of the region to be interrogated for the mutation.
In the first category, U.S. Pat. No. 5,578,458 reports a method in which single base mutations are detected by competitive oligonucleotide priming under hybridization conditions that favor the binding of a perfectly-matched primer as compared to one with a mismatch. U.S. Pat. No. 4,851,331 reports a similar method in which the 3′ terminal nucleotide of the primer corresponds to the variant nucleotide of interest. Since mismatching of the primer and the template at the 3′ terminal nucleotide of the primer inhibits elongation, significant differences in the amount of incorporation of a tracer nucleotide result under normal primer extension conditions.
Methods in the second category are based on incorporation of detectable, chain-terminating nucleotides in the extending primer. Such single nucleotide primer-guided extension assays have been used to detect aspartylglucosaminuria, hemophilia B, and cystic fibrosis, and to quantify point mutations associated with Leber Hereditary Optic Neuropathy. See, e.g., Kuppuswamy et al., Proc. Natl. Acad. Sci. USA, 88: 1143-1147 (1991); Syvanen et al., Genomics, 8: 684-692 (1990); Juvonen et al., Human Genetics, 93: 16-20 (1994); Ikonen et al., PCR Meth. Applications, 1: 234-240 (1992); Ikonen et al., Proc. Natl. Acad. Sci. USA, 88: 11222-11226 (1991); Nikiforov et al., Nucleic Acids Research, 22: 4167-4175 (1994). An alternative primer extension method involving the addition of several nucleotides prior to the chain terminating nucleotide has also been proposed in order to enhance resolution of the extended primers based on their molecular weights. See, e.g., Fahy et al., WO/96130545 (1996).
Strategies based on primer extension require considerable optimization to ensure that only the perfectly annealed oligonucleotide functions as a primer for the extension reaction. The advantage conferred by the high fidelity of the polymerases can be compromised by the tolerance of nucleotide mismatches in the hybridization of the primer to the template. Any “false” priming will be difficult to distinguish from a true positive signal. The reaction conditions of a primer extension reaction can be optimized to reduce “false” priming due to a mismatched oligonucleotide. However, optimization is labor intensive and expensive, and often results in lower sensitivity due to a reduced yield of extended primer.
A number of mutations leading to various forms of cancer involve the deletion of multiple nucleotides from a genomic sequence. An example is the BAT26 segment of the MSH2 mismatch repair gene. The BAT26 segment contains a long poly-A tract. In certain cancers, a characteristic 5 base pair deletion occurs in the poly-A tract. Detection of that deletion may provide diagnostic information. Accordingly, the invention provides methods for detecting deletions in genomic regions, such as BAT26 and others, which may be associated with disease.
Methods of the invention provide assays for identification of a deletion in a genomic region suspected to be indicative of disease. In general, methods of the invention comprise annealing a primer upstream of a region in which a deletion is suspected to occur, extending the primer through the region, terminating extension at a known end-point, and comparing the length and/or weight of the extended primer with that of an extended primer from the corresponding wild-type (non-affected) region or a molecular weight standard (either known or run in parallel). In preferred embodiments, the extended primer is labeled downstream of the region suspected to be deleted. In a highly-preferred embodiment, the comparative length and/or molecular weight of the extended primer is determined by gel electrophoresis or mass spectroscopy. Also in a highly-preferred embodiment, the region suspected to contain the deletion comprises a polynucleotide tract in which the deletion is suspected to occur, and the sequence immediately downstream of the region is known and does not repeat a nucleotide species present in the polynucleotide tract. Preferably, the polynucleotide tract comprises three, two, or, most preferably, one species of nucleotide, as explained in detail below. Methods of the invention retain the specificity of primer extension assays while increasing their sensitivity by reducing background due to premature termination of the extension reaction. Therefore, methods of the invention provide a highly sensitive and highly specific assay for detecting a small amount of mutant nucleic acid in a heterogeneous sample of predominantly wild-type nucleic acid.
Methods of the invention provide screening assays for the detection of a deletion in a region of the genome that comprises at least one, but no more than three, species of nucleotide, and that is characterized by having a sequence for primer hybridization immediately upstream, and a sequence immediately downstream that does not contain a nucleotide present in the region suspected to be deleted. In a preferred embodiment, methods of the invention comprise selecting a nucleic acid having a known wild-type sequence and having a region (the deletion of which is suspected in disease) comprising at most three different types of nucleotides; hybridizing an oligonucleotide primer, or pair of oligonucleotide primers, immediately upstream of the target region; extending the primer by using a polymerase in the presence of the nucleotide bases that are complementary to the nucleotide bases of the target region, thereby to form a primer extension product; further extending the primer extension product in the presence of a labeled nucleotide that is complementary to a nucleotide base downstream from the target region, but not complementary to a nucleotide base within the target region; and determining the size of the extension product compared to a standard (e.g., a wild-type product or a molecular weight standard).
In a preferred embodiment, the target region in which the deletion is suspected to occur is greater than five nucleotides long, and/or the deletion is greater than three nucleotides long. In a preferred embodiment, the primer extension reactions are cycled by varying the reaction temperature through successive annealing, extending and denaturing temperatures. Preferably, the molecular weight standard is the wild-type extension product, or one that corresponds to the expected size for the extension product from the wild-type nucleic acid template. The presence of an extension product smaller than the molecular weight standard is indicative of the presence of a deletion in the target region of the nucleic acid template. In a preferred embodiment, the primer extension product is terminated by incorporating a terminator nucleotide that is complementary to a nucleotide downstream from the target region in a wild-type nucleic acid, but not complementary to any of the nucleotides of the target region. In a more preferred embodiment, the labeled nucleotide and the terminator nucleotide are the same. In an alternative embodiment, more than one labeled nucleotide base is incorporated into the extension product prior to incorporation of the terminator nucleotide. Preferably, the nucleotides incorporated during extension through the region suspected of containing a deletion are unlabeled. However, if those nucleotides are labeled, they are preferably distinguishable from the labeled nucleotide that is incorporated at the 3′ end of the extension product.
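By way of illustration only, the extension, termination, and size comparison described above can be modeled with the following minimal Python sketch. It is not part of the disclosed method. The function name `extend_primer`, the 20-nucleotide primer length, and the toy sequences (a 12-base poly-A tract and a copy carrying a 5-base deletion) are hypothetical, and for simplicity each sequence is written as the newly synthesized strand read 5′ to 3′ from the primer end rather than as the template strand.

```python
def extend_primer(primer_len, downstream, tract_bases, terminator):
    """Model extension of a primer through a target region.

    downstream  -- bases the polymerase would add, in order (toy input,
                   written as the synthesized strand, not the template)
    tract_bases -- unlabeled nucleotide species supplied for the tract
    terminator  -- labeled chain-terminating species, absent from the tract
    Returns (product_length, terminated).
    """
    if terminator in tract_bases:
        raise ValueError("terminator must not occur in the target region")
    length = primer_len
    for base in downstream:
        if base == terminator:
            # Labeled terminator is incorporated and extension stops.
            return length + 1, True
        if base not in tract_bases:
            # No matching nucleotide in the mixture: extension stalls.
            break
        length += 1
    return length, False


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
PRIMER_LEN = 20
wild_type = "A" * 12 + "GTC"   # 12-base tract, then G downstream
deleted = "A" * 7 + "GTC"      # same locus with a 5-base deletion

wt_len, wt_done = extend_primer(PRIMER_LEN, wild_type, {"A"}, "G")
mut_len, mut_done = extend_primer(PRIMER_LEN, deleted, {"A"}, "G")

# A product shorter than the wild-type standard indicates a deletion.
deletion_detected = mut_done and mut_len < wt_len
print(wt_len, mut_len, wt_len - mut_len, deletion_detected)
```

In this model the two products differ in length by exactly the size of the deletion, which is the difference that gel electrophoresis or mass spectroscopy would be used to resolve in practice.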
In a preferred embodiment, methods of the invention comprise detecting a nucleic acid mutation in a biological sample, such as stool, urine, semen, blood, sputum, cerebrospinal fluid, pus, or aspirate, that contains a heterogeneous mixture of nucleic acid having a deletion in the target region and wild-type nucleic acid. Such a deletion in the target region may be present in only about 1-5% of the nucleic acid molecules having the target region. To increase the sensitivity of the assay, the sample may comprise a polymerase chain reaction product. Methods of the invention are particularly useful in analyzing a deletion in the target region that is indicative of the presence of cancerous or precancerous tissue in such a biological sample, including colorectal cancer or precancer detection in stool.
In another embodiment, methods of the invention comprise further extending the primer extension product in the presence of labeled and unlabeled nucleotides, the nucleotides being of the same type (i.e., A, T, C, or G) and being complementary to one or more nucleotides downstream from the target region but not complementary to a nucleotide within the target region. In one embodiment, the ratio of the labeled nucleotide to unlabeled nucleotide is 1:1. Methods of the invention may also include incorporating more than one monomer of the labeled nucleotide or unlabeled nucleotide into the extension product.
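As a rough illustration of why the ratio of labeled to unlabeled nucleotide and the number of monomers incorporated both matter, the following Python sketch estimates the fraction of extension products that carry at least one label. It rests on simplifying assumptions that are not stated in this disclosure: each incorporation is treated as independent, and the polymerase is assumed not to discriminate between the labeled and unlabeled forms. The function name `fraction_labeled` is hypothetical.

```python
def fraction_labeled(n_sites, labeled_share=0.5):
    """Estimate the fraction of products with at least one label.

    n_sites       -- number of downstream positions filled by this
                     nucleotide type (one or more monomers)
    labeled_share -- labeled / (labeled + unlabeled); 0.5 models 1:1
    Assumes independent, unbiased incorporation (a simplification).
    """
    if n_sites < 1:
        raise ValueError("n_sites must be at least 1")
    if not 0.0 <= labeled_share <= 1.0:
        raise ValueError("labeled_share must lie between 0 and 1")
    return 1.0 - (1.0 - labeled_share) ** n_sites


for n in (1, 2, 3):
    print(n, fraction_labeled(n))
```

Under these assumptions, a 1:1 ratio labels about half of the products when a single monomer is incorporated, and a larger share when several monomers are incorporated.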
In another embodiment, methods of the invention comprise detecting a deletion in a sample by selecting a nucleic acid with a known wild-type sequence and having a target region suspected of containing a deletion, wherein the target region contains at most three different types of nucleotide bases selected from the group consisting of dGTP, dATP, dTTP, and dCTP; hybridizing an oligonucleotide primer to a region upstream of said target region, in a nucleic acid sample; contacting said hybridized oligonucleotide primer with an extension reaction mixture comprising: i) nucleotides which are complementary to the nucleotides in the target region, ii) a labeled nucleotide which is complementary to a nucleotide found downstream from the target region, but which is not complementary to any nucleotide base found within the target region, and iii) a terminator nucleotide which is complementary to a nucleotide found downstream from the target region, but which is not complementary to any nucleotide found in the target region; extending the hybridized oligonucleotide primer to generate a labeled extension product; and comparing the size of the labeled extension product to a molecular weight standard, wherein a labeled extension product smaller than the molecular weight standard is indicative of the presence of a deletion in the target region.
Methods of the invention are especially useful to detect indicia of cancer or precancer in a heterogeneous sample. Stool is a good example of a heterogeneous sample in which methods of the invention are useful. A typical stool sample contains patient nucleic acids, but also contains heterologous nucleic acids, proteins, and other cellular debris consistent with the lytic function of the various nucleases, proteinases and the like found in the colon. Under normal circumstances, stool solidifies as it proceeds from the proximal colon to the distal colon. As the solidifying stool passes through the colon, colonic epithelial cells are sloughed onto the stool. If a patient has a developing tumor or adenoma, cells from the tumor or adenoma will also be sloughed onto stool. Those cells, and/or their debris, will contain molecular indicia of disease (e.g., mutations or loss of heterozygosity). In the early stages of development, nucleic acid indicative of an adenoma or tumor comprises only about 1% of the nucleic acid in a voided stool. If the disease is left untreated, proportionately more disease-related nucleic acids are found in stool. Methods of the invention are useful for detecting early-stage lesions in heterogeneous samples such as stool. Methods of the invention result in a high degree of sensitivity and specificity for the detection of early-stage disease. Methods of the invention are especially useful in detecting, for example, adenomas in the colon. Adenomas are non-metastatic lesions that frequently have the potential for metastasis. If all adenomas in a patient are detected and removed, a complete cure is virtually certain.
Deletions in the BAT26 locus of the MSH2 mismatch repair gene have been associated with colorectal cancer. Thus, in a highly-preferred embodiment, the region in which a deletion is suspected to occur is the BAT26 locus. That locus contains a polyA tract in which deletions have been associated with cancer or precancer. Use of methods of the invention on the BAT26 locus identifies the characteristic deletions by producing an extension product in affected DNA that is shorter than the expected wild-type extension product. Methods of the invention will be exemplified below using the BAT26 locus. However, methods of the invention are appreciated to be useful on any genetic locus in which a deletion occurs. Especially useful loci are those indicative of disease, and especially cancer.
A detailed description of certain preferred embodiments of the invention is provided below. Other embodiments of the invention are apparent upon review of the detailed description that follows.