The invention relates generally to methods of detecting cancer, precancer, or other diseases or disorders using nucleic acid markers.
Numerous diseases are associated with disruptions in genomic stability. For example, sickle cell anemia, phenylketonuria, hemophilia, cystic fibrosis, and various cancers have been associated with one or more genetic mutation(s). Cancer is thought to arise from a multi-step process that typically involves multiple genetic mutations leading to uncontrolled cell growth. Many cancers are curable if detected early in their development. For example, colorectal cancers typically originate in the colonic epithelium, and are not extensively vascularized (and therefore not invasive) during early stages of development. The transition to a highly-vascularized, invasive and ultimately metastatic cancer commonly takes ten years or longer. If the presence of cancer is detected prior to extensive vascularization, surgical removal typically is an effective cure. However, colorectal cancer is often detected only upon manifestation of clinical symptoms, such as pain and bloody stool. Generally, such symptoms are present only when the disease is well established, and often after metastasis has occurred. Similarly, with the exception of the Pap smear for detection of pre-malignant cervical lesions, diagnostic screening methods for other types of cancer are best at detecting established disease. Increased knowledge of the molecular basis for disease has led to a proliferation of screening assays capable of detecting disease-associated nucleic acid mutations.
A variety of detection methods have been developed which exploit sequence variations in DNA using enzymatic and chemical cleavage techniques. A commonly-used screen for DNA polymorphisms consists of digesting DNA with restriction endonucleases and analyzing the resulting fragments by means of Southern blots, as reported by Botstein et al., Am. J. Hum. Genet., 32: 314-331 (1980) and White et al., Sci. Am., 258: 40-48 (1988). Mutations that affect the recognition sequence of the endonuclease will preclude enzymatic cleavage at that site, thereby altering the cleavage pattern of the DNA. Thus, a difference in restriction fragment lengths is indicative of the presence of a mutation in the recognition sequence. A problem with this method (known as restriction fragment length polymorphism mapping or RFLP mapping) is its inability to detect a mutation that lies outside of the recognition sequence and consequently does not affect cleavage with a restriction endonuclease. One study reported that only 0.7% of the mutational variants estimated to be present in a 40,000 base pair region of human DNA were detected using RFLP mapping. Jeffreys, Cell, 18: 1-18 (1979).
Single-base mutations have been detected by differential hybridization techniques using allele-specific oligonucleotide probes. Saiki et al., Proc. Natl. Acad. Sci., 86: 6230-6234 (1989). Mutations are identified on the basis of the higher thermal stability of the perfectly-matched probes as compared to mismatched probes. Disadvantages of this approach for mutation analysis include the requirement for optimization of hybridization for each probe, and the limitations imposed by the nature of the mismatch and the local sequence on the degree of discrimination of the probes. In practice, tests based only on parameters of nucleic acid hybridization function poorly when the sequence complexity of the test sample is high (e.g., in a heterogeneous biological sample). This is due partly to the small thermodynamic differences in hybrid stability generated by single nucleotide changes. Therefore, nucleic acid hybridization is generally combined with some other selection or enrichment procedure for analytical and diagnostic purposes.
Recently, a number of genetic mutations, including alterations in the BAT-26 segment of the MSH2 mismatch repair gene, the p53 gene, the Kras oncogene, and the APC tumor suppressor gene have been associated with the multi-step pathway leading to cancer. The BAT-26 segment contains a long poly-A tract. In certain cancers, a characteristic 5 base pair deletion occurs in the poly-A tract. Detection of that deletion may provide diagnostic information. For example, it has been suggested that mutations in those genes might be a basis for molecular screening assays for the early stages of certain types of cancer. See, e.g., Sidransky, et al., Science, 256: 102-105 (1992). Attempts have been made to identify and use nucleic acid markers that are indicative of cancer. However, even when such markers are found, using them to screen patient samples, especially heterogeneous samples, has proven unsuccessful either due to an inability to obtain sufficient sample material, or due to the low sensitivity that results from measuring only a single marker. For example, simply obtaining adequate human DNA from one type of heterogeneous sample (stool) has proven difficult. See Villa, et al., Gastroenterol., 110: 1346-1353 (1996) (reporting that only 44.7% of all stool specimens, and only 32.6% of stools from healthy individuals produced sufficient DNA for mutation analysis). Other reports in which adequate DNA has been obtained have reported low sensitivity in identifying a patient's disease status based upon a single cancer-associated mutation. See Eguchi, et al., Cancer, 77: 1707-1710 (1996) (using a p53 mutation as a marker for cancer).
Therefore, there is a need in the art for high-sensitivity, high-specificity assays for the detection of molecular indicia of cancer, pre-cancer, and other diseases or disorders, especially in heterogeneous samples. Accordingly, the invention provides methods for detecting deletions in genomic regions, such as BAT-26 and others, which may be associated with disease.
Methods of the invention provide assays for identification of a mutation in a genomic region suspected to be indicative of disease. In general, methods of the invention comprise annealing a primer upstream of a region in which, for example, a deletion is suspected to occur, extending the primer through the region, terminating extension at a known end-point, and comparing the length and/or weight of the extended primer with that of an extended primer from the corresponding wild-type (non-affected) region or a molecular weight standard (either known or run in parallel). Also according to the invention, assays described herein are combined with invasive detection methods to increase sensitivity of detection.
Methods of the invention further provide for the determination of whether a target point mutation is present at a genetic locus of interest. In one embodiment, the invention comprises contacting a nucleic acid in a biological sample with a primer that is complementary to a portion of a genetic locus, and extending the primer in the presence of a labeled nucleotide that is complementary to a target nucleotide suspected to be present at the target position. The primer is further extended in the presence of a terminator nucleotide that is complementary to a nucleotide downstream from the target nucleotide, but is not complementary to the target nucleotide, thereby generating an extension product. The presence of a labeled nucleotide in the extension product is indicative of the presence of the target point mutation at the genetic locus.
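As an illustration only, the extension logic of this embodiment can be sketched in Python. The sketch is a simplified model and not part of the disclosed method: the template is written as the bases the polymerase reads, in order, starting at the target position; the reaction mix is assumed to hold exactly one labeled nucleotide and one terminator; and the function names and example sequences are hypothetical.

```python
# Watson-Crick pairing used to decide which nucleotide the polymerase needs.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def extend_primer(template_ahead, labeled_nt, terminator_nt):
    """Simulate extension over the template bases ahead of the primer.

    template_ahead: template bases in the order the polymerase reads them
                    (assumption: the first base is the target position).
    Returns a list of (nucleotide, kind) tuples that were incorporated.
    """
    product = []
    for base in template_ahead:
        needed = COMPLEMENT[base]
        if needed == labeled_nt:
            product.append((needed, "labeled"))
        elif needed == terminator_nt:
            product.append((needed, "terminator"))
            break  # a terminator ends extension
        else:
            break  # required nucleotide is not in the mix; extension stalls
    return product


def mutation_present(product):
    """The target mutation is inferred from a labeled nucleotide in the product."""
    return any(kind == "labeled" for _, kind in product)


# Hypothetical sequences: the mutant carries A at the target position, wild type carries C.
mutant_product = extend_primer("AG", labeled_nt="T", terminator_nt="C")
wild_type_product = extend_primer("CG", labeled_nt="T", terminator_nt="C")
print(mutation_present(mutant_product), mutation_present(wild_type_product))
```

In this model a wild-type template never yields a labeled product, because the only labeled nucleotide in the mix pairs with the mutant base alone.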
In addition, methods of the invention provide for the identification of a target single nucleotide polymorphic variant present at a genetic locus of interest. In one embodiment, the method comprises contacting a nucleic acid in a biological sample with a primer, extending the primer in the presence of at least a first and a second differentially labeled nucleotide, the first labeled nucleotide being complementary to a first nucleotide suspected to be present at said target position, the second labeled nucleotide being complementary to a second nucleotide alternatively suspected to be present at the target position. The primer is further extended in the presence of a terminator nucleotide that is complementary to a nucleotide downstream from the target position, wherein the terminator nucleotide is not complementary to the first or second nucleotides, thereby generating an extension product. The identity of the labeled nucleotide present in the extension product is indicative of the identity of the target single nucleotide polymorphic variant present at the genetic locus.
In yet another embodiment, the first labeled nucleotide comprises a first acceptor molecule and the second labeled nucleotide comprises a second acceptor molecule with the first acceptor molecule being different from the second acceptor molecule. Also, the primer comprises a donor molecule being capable of activating the first and second acceptor molecules so as to produce a first and a second detectable signal.
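The two-label readout described in the preceding two paragraphs can be sketched as follows. This is a minimal model under stated assumptions: each candidate nucleotide carries its own acceptor, the donor on the primer is not modeled, and the names `identify_variant`, `acceptor_1`, and `acceptor_2`, as well as the example sequences, are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def identify_variant(template_ahead, acceptor_by_nt, terminator_nt):
    """Return (acceptor labels incorporated, whether a terminator was added).

    template_ahead: template bases in the order the polymerase reads them,
                    starting at the polymorphic position (assumption).
    acceptor_by_nt: maps each labeled nucleotide to its acceptor label.
    """
    incorporated = []
    terminated = False
    for base in template_ahead:
        needed = COMPLEMENT[base]
        if needed in acceptor_by_nt:
            incorporated.append(acceptor_by_nt[needed])
        elif needed == terminator_nt:
            terminated = True
            break  # terminator incorporated, extension ends
        else:
            break  # required nucleotide absent from the mix
    return incorporated, terminated


# Hypothetical polymorphism: A or G at the target position, followed by C.
acceptors = {"T": "acceptor_1", "C": "acceptor_2"}
for allele in ("AC", "GC"):
    print(allele, identify_variant(allele, acceptors, terminator_nt="G"))
```

The terminator in the example (G) pairs with the downstream template base C but with neither candidate base at the target position, matching the constraint stated above.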
Furthermore, the methods of the invention provide the determination of whether a target single nucleotide polymorphic variant is present at a genetic locus of interest. For example, the method comprises contacting a nucleic acid in a biological sample with a primer, extending the primer in the presence of a labeled nucleotide that is complementary to a nucleotide suspected to be present at the target position, further extending the primer in the presence of a terminator nucleotide that is complementary to a nucleotide downstream from the target nucleotide, wherein the terminator nucleotide is not complementary to the target nucleotide, thereby to produce an extension product; and determining whether the labeled nucleotide is present in the extension product, thereby determining whether the target single nucleotide polymorphic variant is present at the genetic locus.
Moreover, the methods of the invention provide for the quantification of the number of nucleic acids having a target nucleotide present at a genetic locus of interest. In general, the method comprises contacting a nucleic acid in a biological sample with a primer, extending the primer in the presence of a labeled nucleotide that is complementary to the target nucleotide, further extending the primer in the presence of a terminator nucleotide that is complementary to a nucleotide downstream from the target nucleotide, wherein the terminator nucleotide is not complementary to the target nucleotide, thereby to form an extension product, and enumerating the number of extension products that comprise the labeled nucleotide, thereby determining the number of nucleic acids having the target nucleotide at the genetic locus.
In preferred embodiments, an extended primer produced in methods of the invention is labeled downstream of the region suspected to contain a mutation. In a preferred embodiment, the comparative length and/or molecular weight of the extended primer is determined by gel electrophoresis or mass spectroscopy. Also in a preferred embodiment, the region suspected to contain the mutation comprises a polynucleotide tract in which a deletion is suspected to occur, and the sequence immediately downstream of the region is known and does not repeat a nucleotide species present in the polynucleotide tract. Preferably, the polynucleotide tract comprises three, two, or preferably one, species of nucleotide as explained in detail below. Methods of the invention retain the specificity of primer extension assays while increasing their sensitivity by reducing background due to premature termination of the extension reaction. Therefore, methods of the invention provide a highly sensitive and highly specific assay for detecting a small amount of mutant nucleic acid in a heterogeneous sample of predominantly wild-type nucleic acid.
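The enumeration step can be illustrated with a short Python sketch. It assumes an idealized reaction in which every template molecule is extended exactly once and every labeled product is counted; the function names and the example population are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def product_is_labeled(template_ahead, labeled_nt, terminator_nt):
    """Return True if extension over this template incorporates the label."""
    labeled = False
    for base in template_ahead:
        needed = COMPLEMENT[base]
        if needed == labeled_nt:
            labeled = True
        elif needed == terminator_nt:
            break  # terminator incorporated, extension ends
        else:
            break  # required nucleotide not in the mix, extension stalls
    return labeled


def count_target_molecules(templates, labeled_nt, terminator_nt):
    """Enumerate extension products that carry the labeled nucleotide."""
    return sum(product_is_labeled(t, labeled_nt, terminator_nt) for t in templates)


# Hypothetical sample: 3 molecules with the target base (A) among 97 without it.
sample = ["AG"] * 3 + ["CG"] * 97
n_target = count_target_molecules(sample, labeled_nt="T", terminator_nt="C")
print(n_target, n_target / len(sample))
```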
Methods of the invention provide screening assays for the detection of a deletion in a region of the genome comprising at least one, but no more than three, species of nucleotide, and that is characterized by having a sequence for primer hybridization immediately upstream, and a sequence immediately downstream that does not contain a nucleotide present in the region suspected to be deleted. In a preferred embodiment, methods of the invention comprise selecting a nucleic acid having a known wild-type sequence and having a region (the deletion of which is suspected in disease) comprising at most three different types of nucleotides; hybridizing an oligonucleotide primer, or pair of oligonucleotide primers, immediately upstream of the target region; extending the primer by using a polymerase in the presence of the nucleotide bases that are complementary to the nucleotide bases of the target region, thereby to form a primer extension product; further extending the primer extension product in the presence of a labeled nucleotide that is complementary to a nucleotide base downstream from the target region, but not complementary to a nucleotide base within the target region; and determining the size of the extension product compared to a standard (e.g., a wild-type product or a molecular weight standard).
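The constraints on the target region lend themselves to a simple check, sketched below in Python. The sketch makes two assumptions that the text does not spell out: only the first base immediately downstream of the region is tested, and the region and downstream base are given as template bases. The function name, the tract length, and the downstream base in the example are hypothetical and do not represent the actual BAT-26 flanking sequence.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def plan_extension_mix(target_region, downstream_base):
    """Validate a target region and derive the nucleotides for the reaction.

    target_region:   template bases of the region suspected to carry a deletion.
    downstream_base: first template base immediately after the region.
    """
    species = set(target_region)
    if not 1 <= len(species) <= 3:
        raise ValueError("target region must contain one to three nucleotide species")
    if downstream_base in species:
        raise ValueError("downstream base must not occur in the target region")
    return {
        # Unlabeled nucleotides carry the primer through the target region.
        "unlabeled": sorted(COMPLEMENT[b] for b in species),
        # The labeled nucleotide pairs only with the downstream base.
        "labeled": COMPLEMENT[downstream_base],
    }


# Hypothetical poly-A tract followed by G.
print(plan_extension_mix("A" * 26, "G"))
```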
For purposes of the present invention a “mutation” includes a deletion, addition, substitution, transition, transversion, rearrangement, and translocation in a nucleic acid, as well as a loss of heterozygosity. A loss of heterozygosity is a form of mutation in which all or a portion of one allele is deleted. Also for purposes of the present invention, the terms “markers”, “targets”, and “mutations” include nucleic acid (especially DNA) mutations, as well as other nucleic acid indicia useful in methods of the invention, such as specific alleles and single nucleotide polymorphism variants. Such indicia also include the amount of amplifiable nucleic acid in a sample, the integrity and/or length of nucleic acids in a sample, the ratio of high integrity nucleic acids (greater than about 200 base pairs) to low integrity nucleic acids (less than about 200 base pairs), and any other nucleic acid variations that differ between patients with cancer and disease-free patients.
In a preferred embodiment, the target region in which a deletion is suspected to occur is greater than five nucleotides long, and/or the deletion is greater than three nucleotides long. In a preferred embodiment, the primer extension reactions are cycled by varying the reaction temperature through successive annealing, extending and denaturing temperatures. Preferably, the molecular weight standard is the wild-type extension product, or one that corresponds to the expected size for the extension product from the wild-type nucleic acid template. The presence of an extension product smaller than the molecular weight standard is indicative of the presence of a deletion in the target region of the nucleic acid template. In a preferred embodiment, the primer extension product is terminated by incorporating a terminator nucleotide that is complementary to a nucleotide downstream from the target region in a wild type nucleic acid, but not complementary to any of the nucleotides of the target region. In a more preferred embodiment, the labeled nucleotide and the terminator nucleotide are the same. In an alternative embodiment, more than one labeled nucleotide base is incorporated into the extension product prior to incorporation of the terminator nucleotide. Preferably, the nucleotides incorporated during extension through the region suspected of containing a deletion are unlabeled. However, if those nucleotides are labeled, they are preferably distinguishable from the labeled nucleotide that is incorporated at the 3′ end of the extension product.
In a preferred embodiment, methods of the invention comprise detecting a nucleic acid mutation in a biological sample, such as stool, urine, semen, blood, sputum, cerebrospinal fluid, pus, or aspirate, that contains a heterogeneous mixture of nucleic acid having a deletion in the target region and wild type nucleic acid. Such a mutation in the target region may be present in only about 1-5% of the nucleic acid molecules having the target region. To increase the sensitivity of the assay, the sample may comprise a polymerase chain reaction product. Methods of the invention are particularly useful in analyzing a deletion in the target region that is indicative of the presence of cancerous or precancerous tissue in such a biological sample, including colorectal cancer or precancer detection in stool. In another embodiment, methods of the invention comprise further extending the primer extension product in the presence of labeled and unlabeled nucleotides, the nucleotides being of the same type (i.e., A, T, C, or G) and being complementary to one or more nucleotides downstream from the target region but not complementary to a nucleotide within the target region. In one embodiment, the ratio of the labeled nucleotide to unlabeled nucleotide is 1:1. Methods of the invention may also include incorporating more than one monomer of the labeled nucleotide or unlabeled nucleotide into the extension product.
In another embodiment, methods of the invention comprise detecting a deletion in a sample by selecting a nucleic acid with a known wild-type sequence and having a target region suspected of containing a deletion, wherein the target region contains at most three different types of nucleotide bases selected from the group consisting of dGTP, dATP, dTTP, and dCTP; hybridizing an oligonucleotide primer to a region upstream of said target region in a nucleic acid sample; contacting said hybridized oligonucleotide primer with an extension reaction mixture comprising: i) nucleotides which are complementary to the nucleotides in the target region, ii) a labeled nucleotide which is complementary to a nucleotide found downstream from the target region, but which is not complementary to any nucleotide base found within the target region, and iii) a terminator nucleotide which is complementary to a nucleotide found downstream from the target region, but which is not complementary to any nucleotide found in the target region; extending the hybridized oligonucleotide primer to generate a labeled extension product; and comparing the size of the labeled extension product to a molecular weight standard, wherein a labeled extension product smaller than the molecular weight standard is indicative of the presence of a deletion in the target region.
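The size comparison at the end of this embodiment can be sketched in Python as follows. The sketch assumes the more preferred case in which the labeled nucleotide and the terminator are the same nucleotide, and it measures size as a nucleotide count rather than as a gel position or mass. The primer length, the sequences, and the function names are hypothetical; the 5-base deletion mirrors the characteristic poly-A deletion mentioned earlier.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def extended_length(primer_len, template_ahead, region_nts, terminator_nt):
    """Length of the labeled extension product, or None if none is formed.

    template_ahead: template bases read by the polymerase, starting
                    immediately downstream of the primer (assumption).
    region_nts:     unlabeled nucleotides complementary to the target region.
    terminator_nt:  labeled terminator complementary to the downstream base.
    """
    length = primer_len
    for base in template_ahead:
        needed = COMPLEMENT[base]
        if needed in region_nts:
            length += 1  # unlabeled extension through the target region
        elif needed == terminator_nt:
            return length + 1  # labeled terminator added, extension stops
        else:
            break  # required nucleotide absent from the mix
    return None


def has_deletion(sample_length, standard_length):
    """A labeled product shorter than the standard indicates a deletion."""
    return sample_length is not None and sample_length < standard_length


# Hypothetical 20-nucleotide primer, poly-A target region, downstream bases "GC".
wild_type = extended_length(20, "A" * 26 + "GC", {"T"}, "C")
deleted = extended_length(20, "A" * 21 + "GC", {"T"}, "C")
print(wild_type, deleted, has_deletion(deleted, wild_type))
```

Because extension cannot proceed past the first downstream base, the product length depends only on the length of the target region, which is what makes a deletion visible as a shorter product.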
In another embodiment, methods of the invention comprise single base extension assays that detect low-frequency molecular events in a biological sample. Methods for detecting low-frequency molecular events in a biological sample are provided in U.S. Pat. No. 4,683,202, the disclosure of which is incorporated by reference herein. Specific nucleic acids may be detected in a biological sample with both high sensitivity and high specificity. In general, methods of the invention comprise performing a single-base extension reaction utilizing donor and acceptor molecules which interact to produce a detectable signal.
The nucleotides comprise an acceptor molecule which interacts with a donor molecule on the primer when in close proximity and thus facilitates detection of the extended primers, or extended short first probes in an extension reaction. The donor and acceptor molecules may comprise a fluorophore. In preferred embodiments, the donor and acceptor molecules comprise a fluorescent dye such as 6-carboxyfluorescein (FAM, Amersham), 6-carboxy-X-rhodamine (REG, Amersham), N,N,N′,N′-tetramethyl-6-carboxyrhodamine (TAMRA, Amersham), 6-carboxy-X-rhodamine (ROX, Amersham), fluorescein, Cy5® (Amersham) and LightCycler-Red 640 (Roche Molecular Biochemicals). In a preferred embodiment, the donor molecules comprise FAM and the acceptor molecules comprise REG, TAMRA or ROX. In an alternate embodiment, the donor is fluorescein and the acceptor is Cy5® or LightCycler-Red 640 (Roche Molecular Biochemicals). Alternatively, fluorescent labels such as the dansyl group, substituted fluorescein derivatives, acridine derivatives, coumarin derivatives, phthalocyanines, tetramethylrhodamine, Texas Red®, 9-(carboxyethyl)-3-hydroxy-6-oxo-6H-xanthenes, DABCYL®, and BODIPY® (Molecular Probes, Eugene, Oreg.) can be utilized as the donor and acceptor molecules. Such labels are routinely used with automated instrumentation for simultaneous high throughput analysis of multiple samples.
Fluorescence monitoring of amplification is based on the concept that a fluorescence resonance energy transfer occurs between two adjacent fluorophores and a measurable signal is produced. When an external light source, such as a laser or lamp-based system, is applied, the donor molecule is excited and it emits light of a wavelength that in turn excites an acceptor molecule that is in close proximity to the donor molecule. The acceptor molecule then emits an identifiable signal (i.e., a fluorescent emission at a distinct wavelength) that can be measured and quantified. The donor molecule does not transmit a signal to acceptor molecules that are not in close proximity. Thus, when the ddNTP incorporates into the primer, the donor and acceptor molecules are brought close together and a fluorescence energy transfer occurs between the two fluorophores causing the acceptor molecule to emit a detectable signal. Acceptor molecules that are in close proximity to the donor molecule emit a signal that is distinctly different from that of the acceptor molecules alone (i.e., an acceptor molecule that is not in proximity with the donor). In addition, multiple different acceptor molecules may be used, in which each acceptor “combines” with the same donor molecule to produce distinct signals, each being characteristic of a specific donor-acceptor combination. Monitoring the fluorescence emission from the acceptor fluorophore after excitation of the donor fluorophore allows highly sensitive product analysis.
Methods of the invention are especially useful to detect indicia of cancer or precancer in a heterogeneous sample. Stool is a good example of a heterogeneous sample in which methods of the invention are useful. A typical stool sample contains patient nucleic acids, but also contains heterologous nucleic acids, proteins, and other cellular debris consistent with the lytic function of the various nucleases, proteinases and the like found in the colon. Under normal circumstances, stool solidifies as it proceeds from the proximal colon to the distal colon. As the solidifying stool passes through the colon, colonic epithelial cells are sloughed onto the stool. If a patient has a developing tumor or adenoma, cells from the tumor or adenoma will also be sloughed onto stool. Those cells, and/or their debris, will contain molecular indicia of disease (e.g., mutations or loss of heterozygosity). In the early stages of development, nucleic acid indicative of an adenoma or tumor comprises only about 1% of the nucleic acid in a voided stool. If left untreated, proportionately more disease-related nucleic acids are found in stool. Methods of the invention are useful for detecting early-stage lesions in heterogeneous samples such as stool. Methods of the invention result in a high degree of sensitivity and specificity for the detection of early-stage disease. Methods of the invention are especially useful in detecting, for example, adenomas in the colon. Adenomas are non-metastatic lesions that frequently have the potential for metastasis. If all adenomas in a patient are detected and removed, the probability of complete cure is virtually certain.
The methods of the present invention also exploit the discovery that mutations in the BAT-26 segment of the MSH2 mismatch repair gene are closely associated with inherited cancers (and pre-cancerous lesions). In particular, BAT-26 mutations are highly-associated with Hereditary Non-Polyposis Colorectal Cancer (“HNPCC”) (i.e., in greater than 90% of patients), making BAT-26 an ideal marker for screening assays to detect this colorectal cancer, or colorectal adenoma that may or may not develop into cancer. Use of methods of the invention on the BAT-26 locus identifies the characteristic deletions by producing an extension product in affected DNA that is shorter than the expected wild-type extension product. Methods of the invention will be exemplified below using the BAT-26 locus. However, methods of the invention are appreciated to be useful on any genetic locus in which a deletion occurs. Especially useful loci are those correlated with disease, and especially cancer.
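The text above gives no quantitative model for the proximity dependence. As a hedged illustration, the standard Förster relation, which is general FRET theory and not something stated in this disclosure, gives the transfer efficiency as E = 1 / (1 + (r / R0)^6), where r is the donor-acceptor distance and R0 is the Förster radius of the dye pair. The Python sketch below evaluates that relation; the 5 nm radius is a hypothetical value and not a property of any dye pair named above.

```python
def fret_efficiency(distance_nm, forster_radius_nm):
    """Förster transfer efficiency for a donor-acceptor pair.

    distance_nm:       donor-acceptor separation r.
    forster_radius_nm: R0, the separation at which efficiency is 50%.
    """
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


# Hypothetical Förster radius of 5 nm.
for r in (1.0, 5.0, 10.0):
    print(r, round(fret_efficiency(r, 5.0), 4))
```

The steep sixth-power falloff is why an acceptor that is not incorporated next to the donor contributes essentially no transfer signal.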
Furthermore, BAT-26 mutations have been found to be associated with cancers located in the right-hand (proximal) side of the colon. Thus, the methods of the present invention contemplate utilizing a combinatorial testing approach to screen patients, wherein BAT-26 testing is used to screen the right side of the colon, and flexible sigmoidoscopy is utilized to screen the left hand (distal/lower) side of the colon. Such a testing methodology permits a far more thorough screen for cancerous and/or precancerous lesions than was previously possible using tests practiced in the art. Thus, in another embodiment, the present invention provides methods for detecting the presence of colorectal cancerous or precancerous lesions comprising (i) conducting in a sample obtained non-invasively or minimally-invasively from a patient an assay to identify a BAT-26 marker in the sample, and (ii) performing a flexible sigmoidoscopy on the patient.
The methods of the invention are useful for detecting diseases or disorders related to the colon including, but not limited to, cancer, pre-cancer and other diseases or disorders such as adenoma, polyp, inflammatory bowel disorder, inflammatory bowel syndrome, regional enteritis, granulomatous ileitis, granulomatous ileocolitis, Crohn's Disease, ileitis, ileocolitis, jejunoileitis, granulomatous colitis, Yersinia enterocolitica enteritis, ulcerative colitis, pseudomembranous colitis, irritable bowel syndrome, diverticulosis, diverticulitis, intestinal parasites, infectious gastroenteritis, toxic gastroenteritis, and bacterial gastroenteritis.
The methods of the present invention also provide for the use of BAT-26 as a marker for detection of cancerous and precancerous lesions by analysis of heterogeneous samples (e.g., stool). Such methods comprise obtaining a representative sample of a stool voided by a patient and performing an assay on the sample to identify a BAT-26 marker in the sample.
In another preferred embodiment, methods of the invention comprise selecting one or more mutational events that are indicative of cancer, precancer, or other diseases or disorders, such that the combined informativeness of the one or more events meets or exceeds a predetermined or desired level of informativeness. The informativeness of any mutation or combination of mutations may be validated by an accepted invasive screening technique. For example, in methods to detect colorectal cancer, the informativeness of a molecular assay may be determined by identification of a lesion using colonoscopy.
A detailed description of certain preferred embodiments of the invention is provided below. Other embodiments of the invention are apparent upon review of the detailed description that follows.