1. Field of the Invention
The present invention relates to genomic DNA sequences that exhibit altered expression patterns in disease states relative to normal. Particular embodiments provide, inter alia, novel methods, nucleic acids, nucleic acid arrays and kits useful for detecting, or for detecting and differentiating between or among cell proliferative disorders. Preferably, the methods, nucleic acids, nucleic acid arrays and kits for the detection and diagnosis of cell proliferative disorders are used for the diagnosis of cancer and in particular colorectal and/or liver cancer.
2. Background Information
Incidence and Diagnosis of Cancer.
Cancer is the second leading cause of death in the United States. Mortality rates could be significantly improved if current screening methods were improved in terms of patient compliance, sensitivity and ease of screening. Current recommended methods for diagnosis of cancer are often expensive and are not suitable for application as population wide screening tests.
Hepatocellular cancer (HCC, also called primary liver cancer) is the fourth most common cancer in the world; its incidence varies from 2.1 per 100,000 in North America to 80 per 100,000 in China. In the United States, it is estimated that there will be 17,550 new cases diagnosed in 2005 and 15,420 deaths due to this disease. Ultrasound of the liver, alpha fetoprotein levels and conventional CT scan are regularly obtained in the diagnostic evaluation of HCC, but they are often too insensitive to detect multi-focal small lesions and for treatment planning.
In the United States the annual incidence of colorectal cancer is approximately 150,000, with 56,600 individuals dying from colorectal cancer each year. The lifetime risk of colorectal cancer in the general population is about 5 to 6 percent. Despite intensive efforts in recent years in screening and early detection of colon cancer, until today most cases are diagnosed in an advanced stage with regional or distant metastasis. While the therapeutic options include surgery and adjuvant or palliative chemotherapy, most patients die from progression of their cancer within a few months. Identifying the molecular changes that underlie the development of colon cancer may help to develop new monitoring, screening, diagnostic and therapeutic options that could improve the overall poor prognosis of these patients.
The current guidelines for colorectal screening according to the American Cancer Society utilize one of five different options for screening in average risk individuals 50 years of age or older. These options include 1) fecal occult blood test (FOBT) annually, 2) flexible sigmoidoscopy every five years, 3) annual FOBT plus flexible sigmoidoscopy every five years, 4) double contrast barium enema (DCBE) every five years or 5) colonoscopy every ten years. Even though these testing procedures are well accepted by the medical community, the implementation of widespread screening for colorectal cancer has not been realized. Patient compliance is a major factor for limited use due to the discomfort or inconvenience associated with the procedures. FOBT testing, although a non-invasive procedure, requires dietary and other restrictions 3-5 days prior to testing. Sensitivity levels for this test are also very low for colorectal adenocarcinoma with wide variability depending on the trial. Sensitivity for detection of adenomas is even lower since most adenomas do not bleed. In contrast, sensitivity for more invasive procedures such as sigmoidoscopy and colonoscopy is quite high because of direct visualization of the lumen of the colon. No randomized trials have evaluated the efficacy of these techniques; however, using data from case-control studies and data from the National Polyp Study (U.S.) it has been shown that removal of adenomatous polyps results in a 76-90% reduction in CRC incidence. Sigmoidoscopy has the limitation of only visualizing the left side of the colon, leaving lesions in the right colon undetected. Both scoping procedures are expensive, require cathartic preparation and have increased risk of morbidity and mortality. Improved tests with increased sensitivity, specificity, ease of use and decreased costs are clearly needed before general widespread screening for colorectal cancer becomes routine.
Early colorectal cancer detection is generally based on the fecal occult blood test (FOBT) performed annually on asymptomatic individuals. Current recommendations adopted by several healthcare organizations, including the American Cancer Society, call for fecal occult blood testing beginning at age 50, repeated annually until such time as the patient would no longer benefit from screening. A positive FOBT leads to colonoscopic examination of the bowel, an expensive and invasive procedure with a serious complication rate of one per 5,000 examinations. Only 12% of patients with heme positive stool are diagnosed with cancer or large polyps at the time of colonoscopy. A number of studies show that FOBT screening does not improve cancer-related mortality or overall survival. Compliance with occult blood testing has been poor; less than 20 percent of the population is offered or completes FOBT as recommended. If FOBT is properly done, the patient collects a fecal sample from three consecutive bowel movements. Samples are obtained while the patient adheres to dietary guidelines and avoids medications known to induce occult gastrointestinal bleeding. In reality, physicians frequently fail to instruct patients properly, patients frequently fail to adhere to protocol, and some patients find the task of collecting fecal samples difficult or unpleasant; hence compliance with annual occult blood testing is poor. If testing sensitivity and specificity can be improved over current methods, the frequency of testing could be reduced, collection of consecutive samples would be eliminated, dietary and medication schedule modifications would be eliminated, and patient compliance would be enhanced. Compounding the problem of compliance, the sensitivity and specificity of FOBT to detect colon cancer are poor. Poor test specificity leads to unnecessary colonoscopy, adding considerable expense to colon cancer screening.
Specificity of the FOBT has been calculated at best to be 96%, with a sensitivity of 43% (adenomas) and 50% (colorectal carcinoma). Sensitivity can be improved using an immunoassay FOBT such as that produced under the tradename InSure®, with an improved sensitivity of 77% (adenomas) and 88.9% (colorectal carcinoma).
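How such figures translate into the fraction of positive results that are true positives depends on how common the disease is in the screened population. The following Python sketch is illustrative only: the function name `predictive_value_from_rates` and the prevalence of 0.5% are hypothetical assumptions, not values taken from this document; only the 50% sensitivity and 96% specificity are the FOBT figures quoted above.

```python
def predictive_value_from_rates(sensitivity, specificity, prevalence):
    """Fraction of positive results that are true positives, TP / (TP + FP).

    All three arguments are fractions between 0 and 1.
    """
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    if true_positive_rate + false_positive_rate == 0.0:
        raise ValueError("no positive results are possible with these inputs")
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# FOBT figures quoted in the text; the prevalence is a hypothetical assumption.
ppv = predictive_value_from_rates(sensitivity=0.50,
                                  specificity=0.96,
                                  prevalence=0.005)
print(f"predictive value: {ppv:.3f}")
```

Under the assumed prevalence, most positive results would be false positives even at 96% specificity, which is consistent with the concern about unnecessary colonoscopy.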
Molecular Disease Markers.
Molecular disease markers offer several advantages over other types of markers, one advantage being that even samples of very small sizes and/or samples whose tissue architecture has not been maintained can be analysed quite efficiently. Within the last decade a number of genes have been shown to be differentially expressed between normal tissue and colon carcinomas. However, no single marker or combination of markers has been shown to be sufficient for the diagnosis of colon carcinomas. High-dimensional mRNA based approaches have recently been shown to be able to provide a better means to distinguish between different tumor types and benign and malignant lesions. However, their application as a routine diagnostic tool in a clinical environment is impeded by the extreme instability of mRNA, the rapidly occurring expression changes following certain triggers (e.g., sample collection), and, most importantly, the large amount of mRNA needed for analysis (Lipshutz, R. J. et al., Nature Genetics 21:20-24, 1999; Bowtell, D. D. L. Nature genetics suppl. 21:25-32, 1999), which often cannot be obtained from a routine biopsy.
The use of biological markers to further improve sensitivity and specificity of FOBT has been suggested; examples of such tests include the PreGen-Plus™ stool analysis assay available from EXACT Sciences, which has a sensitivity of 20% (adenoma) and 52% (colorectal carcinoma) and a specificity of 95% in both cases. This test assays for the presence of 23 DNA mutations associated with the development of colon neoplasms. The use of DNA methylation as a colon cancer marker is known. For example, Sabbioni et al. (Molecular Diagnosis 7:201-207, 2003) detected hypermethylation of a panel of genes consisting of TPEF, HIC1, DAPK and MGMT in peripheral blood in 98% of colon carcinoma patients. However, this does not provide a suitable basis for a commercially marketable test, as the specificity of such a test must also be sufficiently high.
The current model of colorectal pathogenesis favours a stepwise progression of adenomas, which includes the development of dysplasia and finally signs of invasive cancer. The molecular changes underlying this adenoma-carcinoma sequence include genetic and epigenetic alterations of tumor suppressor genes (APC, p53, DCC), the activation of oncogenes (K-ras) and the inactivation of DNA mismatch repair genes. Recently, further molecular changes and genetic defects have been revealed. Thus, activation of the Wnt signalling pathway not only includes mutations of the APC gene, but may also result from β-catenin mutations. Furthermore, alterations in the TGF-β signalling pathway together with its signal transducers SMAD4 and SMAD2 have been linked to the development of colon cancer.
Despite recent progress in the understanding of the pathogenesis of adenomas and carcinomas of the colon and their genetic and molecular changes, the genetic and epigenetic changes underlying the development of metastasis are less well understood. It is, however, generally well accepted that the process of invasion and proteolysis of the extracellular matrix, as well as infiltration of the vascular basement membrane involve adhesive proteins, such as members of the family of integrin receptors, the cadherins, the immunoglobulin superfamily, the laminin binding protein and the CD44 receptor. Apart from adhesion, the process of metastasis formation also includes the induction and regulation of angiogenesis (VEGF, bFGF), the induction of cell proliferation (EGF, HGF, IGF) and the activation of proteolytic enzymes (MMPs, TIMPs, uPAR), as well as the inhibition of apoptosis (Bcl-2, Bcl-X). More recently other groups have compared the genetic and molecular changes in metastatic lesions to the changes found in primary colorectal cancers. Thus, Kleeff et al. reported the loss of DOC-2, a candidate tumor suppressor gene, both in primary and metastatic colorectal cancer. Furthermore, Zauber et al. reported that in their series of 42 colorectal cancers Ki-ras mutations in the primary cancers were identical in all of the 42 paired primary and synchronous metastatic lesions. Similarly loss of heterozygosity at the APC locus was identical for 39 paired carcinomas and synchronous metastasis. The authors concluded that for Ki-ras and APC genes the genetic changes in metastasis are identical to the primary colorectal cancer. However, other groups have found genetic and molecular changes in metastatic colon cancers, that are not present in the primary cancers. Thus, the development of LOH of chromosome 3p in colorectal metastasis has been reported. 
In addition, using comparative genomic hybridization, several alterations were found in liver metastasis that were unique to metastatic lesions (−9q, −11q, and −17q).
CpG Island Methylation.
Apart from mutations, aberrant methylation of CpG islands has been shown to lead to the transcriptional silencing of certain genes that have been previously linked to the pathogenesis of various cancers. CpG islands are short sequences which are rich in CpG dinucleotides and can usually be found in the 5′ region of approximately 50% of all human genes. Methylation of the cytosines in these islands leads to the loss of gene expression and has been reported in the inactivation of the X chromosome and genomic imprinting.
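As a rough illustration of what "rich in CpG dinucleotides" can mean in practice, the following Python sketch computes the GC content of a sequence and the ratio of observed to expected CpG dinucleotides. The function name `cpg_stats` and the toy sequence are hypothetical, and the observed/expected formulation is a commonly used convention assumed here; it is not defined in this document and no threshold is implied.

```python
def cpg_stats(seq):
    """Return (GC content, observed/expected CpG ratio) for a DNA sequence.

    Expected CpG count is taken as (#C * #G) / length, an assumed convention.
    """
    seq = seq.upper()
    length = len(seq)
    if length == 0:
        raise ValueError("sequence must not be empty")
    c_count = seq.count("C")
    g_count = seq.count("G")
    # "CG" cannot overlap itself, so str.count finds every CpG dinucleotide.
    cpg_count = seq.count("CG")
    gc_content = (c_count + g_count) / length
    expected_cpg = (c_count * g_count) / length
    obs_exp_ratio = cpg_count / expected_cpg if expected_cpg > 0 else 0.0
    return gc_content, obs_exp_ratio


# Toy sequence for illustration only; not a sequence from this document.
gc, ratio = cpg_stats("CGCGCGCG")
print(gc, ratio)
```

A sequence with no cytosine or guanine returns a ratio of 0.0 rather than raising, so the sketch can be applied to arbitrary windows of a longer sequence.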
Recently several groups have also analysed the methylation of various genes in colorectal cancer and reported the transcriptional silencing by promoter methylation for p16INK4, p14ARF, p15INK4b, MGMT, hMLH1, GSTP1, DAPK, CDH1, TIMP-3 and APC among others. Thus apart from mutational inactivation of certain genes, the hypermethylation of these genes also contributes significantly to the pathogenesis of this disease.
In recent years several genes that are methylated in colon cancer have been identified by MS-APPCR. This group of genes, among others, includes TPEF/HPP1 which is frequently methylated in colon cancers and which was independently identified by two different groups using the MS-APPCR method (see, e.g., Young J, Biden K G, Simms L A, Huggard P, Karamatic R, Eyre H J, Sutherland G R, Herath N, Barker M, Anderson G J, Fitzpatrick D R, Ramm G A, Jass J R, Leggett B A. HPP1: a transmembrane protein-encoding gene commonly methylated in colorectal polyps and cancers. Proc Natl Acad Sci USA 98:265-270, 2001).
Multifactorial Approach.
Cancer diagnostics has traditionally relied upon the detection of single molecular markers (e.g., gene mutations, elevated PSA levels). Unfortunately, cancer is a disease state in which single markers have typically failed to detect or differentiate many forms of the disease. Thus, assays that recognize only a single marker have been shown to be of limited predictive value. A fundamental aspect of this invention is that methylation-based cancer diagnostics and the screening, diagnosis, and therapeutic monitoring of such diseases will provide significant improvements over the state-of-the-art that uses single marker analyses by the use of a selection of multiple markers. The multiplexed analytical approach is particularly well suited for cancer diagnostics since cancer is not a simple disease; this multi-factorial “panel” approach is consistent with the heterogeneous nature of cancer, both cytologically and clinically.
Key to the successful implementation of a panel approach to methylation based diagnostic tests is the design and development of optimized panels of markers that can characterize and distinguish disease states. The present invention describes a plurality of particularly efficient and unique panels of genes, the methylation analysis of one or a combination of the members of the panel enabling the detection of colon cell proliferative disorders with a particularly high sensitivity, specificity and/or predictive value.
Development of Medical Tests.
Two key evaluative measures of any medical screening or diagnostic test are its sensitivity and specificity, which measure how well the test performs to accurately detect all affected individuals without exception, and without falsely including individuals who do not have the target disease (predictive value). Historically, many diagnostic tests have been criticized due to poor sensitivity and specificity.
A true positive (TP) result is where the test is positive and the condition is present. A false positive (FP) result is where the test is positive but the condition is not present. A true negative (TN) result is where the test is negative and the condition is not present. A false negative (FN) result is where the test is negative but the condition is present. In this context: Sensitivity=TP/(TP+FN); Specificity=TN/(FP+TN); and Predictive value=TP/(TP+FP).
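By way of illustration only, the three measures defined above can be computed directly from the four result counts. The following Python sketch is not part of the disclosed method; the function name `diagnostic_measures` and the example counts are hypothetical.

```python
def diagnostic_measures(tp, fp, tn, fn):
    """Return (sensitivity, specificity, predictive value) from result counts.

    tp, fp, tn, fn are the numbers of true positive, false positive,
    true negative and false negative results, respectively.
    """
    if min(tp, fp, tn, fn) < 0:
        raise ValueError("counts must be non-negative")
    if tp + fn == 0 or fp + tn == 0 or tp + fp == 0:
        raise ValueError("each denominator must be non-zero")
    sensitivity = tp / (tp + fn)        # Sensitivity = TP/(TP+FN)
    specificity = tn / (fp + tn)        # Specificity = TN/(FP+TN)
    predictive_value = tp / (tp + fp)   # Predictive value = TP/(TP+FP)
    return sensitivity, specificity, predictive_value


# Hypothetical counts: 100 affected and 1,000 unaffected individuals tested.
sens, spec, ppv = diagnostic_measures(tp=50, fp=40, tn=960, fn=50)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} predictive value={ppv:.2f}")
```

With these hypothetical counts the test misses half of the affected individuals (50 false negatives out of 100), and 40 of its 90 positive results are false positives.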
Sensitivity is a measure of a test's ability to correctly detect the target disease in an individual being tested. A test having poor sensitivity produces a high rate of false negatives, i.e., individuals who have the disease but are falsely identified as being free of that particular disease. The potential danger of a false negative is that the diseased individual will remain undiagnosed and untreated for some period of time, during which the disease may progress to a later stage wherein treatments, if any, may be less effective. An example of a test that has low sensitivity is a protein-based blood test for HIV. This type of test exhibits poor sensitivity because it fails to detect the presence of the virus until the disease is well established and the virus has invaded the bloodstream in substantial numbers. In contrast, an example of a test that has high sensitivity is viral-load detection using the polymerase chain reaction (PCR). High sensitivity is achieved because this type of test can detect very small quantities of the virus. High sensitivity is particularly important when the consequences of missing a diagnosis are high.
Specificity, on the other hand, is a measure of a test's ability to identify accurately patients who are free of the disease state. A test having poor specificity produces a high rate of false positives, i.e., individuals who are falsely identified as having the disease. A drawback of false positives is that they force patients to undergo unnecessary medical procedures or treatments with their attendant risks and emotional and financial stresses, which could have adverse effects on the patient's health. A feature of diseases which makes it difficult to develop diagnostic tests with high specificity is that disease mechanisms, particularly in cancer, often involve a plurality of genes and proteins. Additionally, certain proteins may be elevated for reasons unrelated to a disease state. An example of a test that has high specificity is a gene-based test that can detect a p53 mutation. Specificity is important when the cost or risk associated with further diagnostic procedures or further medical intervention is very high.
Pronounced Need in the Art.
It is generally accepted that there is a pronounced need in the art for improved screening and early detection of cancers. As an example, if colon cancer screening specificity can be increased, the problem of false positive test results leading to unnecessary colonoscopic examination would be reduced leading to cost savings and improved safety.
In view of the incidence of cancers in general and more particularly the disadvantages associated with current colorectal and hepatocellular cell proliferative disorder screening methods, there is a substantial need in the art for improved methods for the early detection of cancer, in particular colon cancer, to be used in addition to or as a substitute for currently available tests.
Background of the Genes of the Present Invention.
The human Septin 9 gene (also known as MLL septin-like fusion protein, MLL septin-like fusion protein MSF-A, Slpa, Eseptin, Msf, septin-like protein Ovarian/Breast septin (Ov/Br septin) and Septin D1) is located on chromosome 17q25 within contig AC068594.15.1.168501 and is a member of the Septin gene family. FIG. 1 provides the Ensembl annotation of the Septin 9 gene, and shows 4 transcript variants, the Septin 9 variants and the Q9HC74 variants (which are truncated versions of the Septin 9 transcripts). SEQ ID NO:1 provides the sequence of said gene, comprising regions of both the Septin 9 and Q9HC74 transcripts and promoter regions. SEQ ID NO:2 and SEQ ID NO:3 are sub-regions thereof that provide the sequence of CpG rich promoter regions of Septin 9 and Q9HC74 transcripts respectively.
It has been postulated that members of the Septin gene family are associated with multiple cellular functions ranging from vesicle transport to cytokinesis. Disruption of the action of Septin 9 results in incomplete cell division, see Surka, M. C., Tsang, C. W., and Trimble, W. S. Mol Biol Cell, 13: 3532-45 (2002). Septin 9 and other proteins have been shown to be fusion partners of the proto-oncogene MLL, suggesting a role in tumorigenesis, see Osaka, M, Rowley, J. D. and Zeleznik-Le, N. J. PNAS, 96:6428-6433 (1999). Burrows et al. reported an in depth study of expression of the multiple isoforms of the Septin 9 gene in ovarian cancer and showed tissue specific expression of various transcripts, see Burrows, J. F., Chanduloy, et al. S.E.H. Journal of Pathology, 201:581-588 (2003).
A recent study (post-priority date published prior art) of over 7000 normal and tumor tissues indicates that there is consistent over-expression of Septin 9 isoforms in a number of tumor tissues, see Scott, M., Hyland, P. L., et al. Oncogene, 24: 4688-4700 (2005). The authors speculate that the gene is likely a type II cancer gene where changes in RNA transcript processing control regulation of different protein products, and the levels of these altered protein isoforms may provide answers to the gene's role in malignancy.
The MSF (migration stimulating factor) protein transcribed from the FN1 gene has also been implicated in carcinogenesis (see WO99/31233); however, it should be noted that this protein is not the subject of the present application, and is currently not known to be associated with the Septin 9/MSF gene and transcribed products thereof.
From the references cited above it can be seen that the biological mechanisms linking said gene to tumorigenesis remain unclear. In WO 2004/074441 it is claimed that increased copy number and over-expression of the gene is a marker of cancer, and further provides means for diagnosis and treatment thereof according to said observation. WO 2004/074441 is accordingly the closest prior art as it has the greatest number of features in common with the method and nucleic acids of the present invention, and because it relates to the same field (cancer diagnosis). A major difference between the present invention and that of WO 2004/074441 is that the present invention shows for the first time that under-expression of the gene Septin 9 is associated with cancer. More particularly this is illustrated by means of methylation analysis. The correlation between expression and DNA methylation, and methods for determining DNA methylation are known in the art (see WO 99/28498). Nonetheless, it would not be obvious to the person skilled in the art that under-expression would be also associated with the development of cancer, in particular as WO 2004/074441 describes the modulation of said expression to lower levels as a potential therapy for cancer.
SEQ ID NO:28 provides a CpG rich sequence located on chromosome 17q in the overlapping promoter regions of the Vitronectin (VTN, OMIM 193190, Accession number NM_000638) and SARM genes (Sterile Alpha And Heat/Armadillo Motifs-Containing Protein, OMIM 607732).
The VTN gene encodes a 75-kD glycoprotein (also called serum spreading factor or complement S-protein) that promotes attachment and spreading of animal cells in vitro, inhibits cytolysis by the complement C5b-9 complex, and modulates antithrombin III-thrombin action in blood coagulation. Higher expression of Vitronectin has been observed in colon cancer cells (Exp Cell Res. 1994 September; 214(1):303-12.). Furthermore, expression of this gene has been linked to progression and invasiveness of cancer cells. It is suggested that VTN is activated in tumours and that blocking of the vitronectin receptor by a specific peptide reduces tumour size (Bloemendal H J, de Boer H C, Koop E A, van Dongen A J, Goldschmeding R, Landman W J, Logtenberg T, Gebbink M F, Voest E E. Cancer Immunol Immunother. 2004 September; 53(9):799-808.; Haier J, Goldmann U, Hotz B, Runkel N, Keilholz U. Clin Exp Metastasis. 2002; 19(8):665-72.).
The SARM protein consists of 690 amino acids and contains a 65-amino acid sterile alpha (SAM) domain surrounded by short HEAT/armadillo repeat sequences. Northern blot analysis has shown that the SARM antisense RNA is detectable at elevated levels in cancer cell lines irrespective of the tissue origin or metastatic potential (Mink, M.; Fogelgren, B.; Olszewski, K.; Maroy, P.; Csiszar, K. Genomics 74: 234-244, 2001.). Said study further demonstrated that the protein-encoding SARM transcript was only expressed in one prostate carcinoma cell line of those studied.
SEQ ID NO:24 provides a CpG rich sequence located on chromosome 3q23, downstream of the Forkhead Transcription Factor L2 gene (FOXL2, Pituitary Forkhead Factor, OMIM 605597). SEQ ID NO:24 has heretofore not been associated with cancer of any type. The FOXL2 coding region is highly conserved among mammals. Immunohistochemical evidence indicates that FOXL2 is a nuclear protein specifically expressed in eyelids and in fetal and adult ovarian follicular cells. It is suggested that FOXL2 may play a role in ovarian somatic cell differentiation and in further follicle development and/or maintenance (J. Med. Genet. 39: 916-922, 2002.).
Moreover, mutations in the FOXL2 gene are associated with the blepharophimosis/ptosis/epicanthus inversus syndrome (BPES), a syndrome which affects the eyelids and the ovary (Am. J. Hum. Genet. 72: 478-487, 2003.; Hum. Mutat. 24: 189-193, 2004). The FOXL2 gene has heretofore not been associated with cancer; however, other members of the FOX family have been implicated in oncogenesis. The FOXA1 gene is amplified and overexpressed in esophageal and lung cancer, and the FOXM1 gene is up-regulated in pancreatic cancer and basal cell carcinoma due to transcriptional regulation by the Sonic Hedgehog (SHH) pathway.
SEQ ID NO:27 presents part of the Six6 (Homeobox protein SIX6) gene, located on Chromosome 14q. Synonyms include Homeodomain protein OPTX2, Optic homeobox 2, OPTX2, Sine oculis homeobox homolog 6, sine oculis homeobox homolog 6 (Drosophila), Six9, SIX9. The Six6 gene is involved in developmental pathways, and its aberrant expression has been linked to T-cell acute lymphoblastic leukemia oncogenesis.
SEQ ID NO:25 is located on chromosome 17q21.31 and comprises the promoter region of the NGFR gene as well as parts of the NGFR gene itself (Nerve Growth Factor Receptor also known as p75, OMIM 162010). NGFR binds alone or in combination with other receptors to neurotrophins and neurite outgrowth inhibitory factors. It has been shown that NGFR plays a role in apoptosis, in myelination of peripheral nerves and in inhibition of central nervous system regeneration after axon lesion (Dobrowsky, R. T.; Werner, M. H.; Castellino, A. M.; Chao, M. V.; Hannun, Y. A. Science 265: 1596-1599, 1994; Cosgaya, J. M.; Chan, J. R.; Shooter, E. M. Science 298: 1245-1248, 2002; Wang, K. C.; Kim, J. A.; Sivasankaran, R.; Segal, R.; He, Z. Nature 420: 74-78, 2002.). Methylation of the NGFR gene has previously been associated with the development of colon cancer (PCT/US04/20336).
SEQ ID NO:26 is located on chromosome 17q21.31 and comprises the promoter region of the gene TMEFF2. Methylation of TMEFF2 has been linked to colon cancer (Cancer Res. 2000 Sep. 1; 60(17):4907-12).