Ranked as the third most commonly diagnosed cancer and the second leading cause of cancer deaths in the United States (American Cancer Society, “Cancer facts and figures,” Washington, D.C.: American Cancer Society (2000)), colon cancer is a deadly disease afflicting nearly 130,000 new patients yearly in the United States. Colon cancer is the only cancer that occurs with approximately equal frequency in men and women. There are several potential risk factors for the development of colon and/or rectal cancer. Known risk factors for the disease include older age, excessive alcohol consumption, sedentary lifestyle (Reddy, Cancer Res., 41:3700-3705 (1981)), and genetic predisposition (Potter, J Natl Cancer Institute, 91:916-932 (1999)).
Several molecular pathways have been linked to the development of colon cancer (see, for example, Leeman et al., J Pathol., 201(4):528-34 (2003); Kanazawa et al., Tumori., 89(4):408-11 (2003); and Notarnicola et al., Oncol Rep., 10(6): 1987-91 (2003)), and the expression of key genes in any of these pathways may be affected by inherited or acquired mutation or by hypermethylation. A great deal of research has been performed with regard to identifying genes for which changes in expression may provide an early indicator of colon cancer or a predisposition for the development of colon cancer. Unfortunately, no research has yet been conducted on identifying specific genes associated with colorectal cancer and specific outcomes to provide an accurate prediction of prognosis.
Survival of patients with colon and/or rectal cancer depends to a large extent on the stage of the disease at diagnosis. Devised nearly seventy years ago (Dukes, 1932, J Pathol Bacteriol 35:323), the modified Dukes' staging system for colon cancer discriminates four stages (A, B, C, and D), primarily based on clinicopathologic features such as the presence or absence of lymph node or distant metastases. Specifically, colonic tumors are classified by four Dukes' stages: A, tumor within the intestinal mucosa; B, tumor into muscularis mucosa; C, metastasis to lymph nodes; and D, metastasis to other tissues. Of the systems available, the Dukes' staging system, based on the pathological spread of disease through the bowel wall, to lymph nodes, and to distant organ sites such as the liver, has remained the most popular. Despite providing only a relative estimate of cure for any individual patient, the Dukes' staging system remains the standard for predicting colon cancer prognosis, and is the primary means for directing adjuvant therapy.
The Dukes' staging system, however, has been found useful only in predicting the behavior of a population of patients, rather than that of an individual. For this reason, any patient with a Dukes' A, B, or C lesion would be predicted to be alive at 36 months, while a patient staged as Dukes' D would be predicted to be dead. Unfortunately, application of this staging system results in the potential over-treatment or under-treatment of a significant number of patients. Further, Dukes' staging can only be applied after complete surgical resection rather than after a pre-surgical biopsy.
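The stage-only prediction described above can be written out as a minimal sketch. The function name and the string encoding of the stages are illustrative assumptions, not part of any cited system; the point is simply that every patient within a stage receives the same prediction, whatever the individual tumor biology.

```python
# Illustrative sketch only: the function name and the stage encoding are
# assumptions made for this example, not part of any published staging tool.

VALID_STAGES = {"A", "B", "C", "D"}


def predict_alive_at_36_months(dukes_stage: str) -> bool:
    """Return the population-level prediction implied by Dukes' stage alone.

    Stages A, B, and C are predicted alive at 36 months; stage D is not.
    No patient-specific information is used.
    """
    stage = dukes_stage.strip().upper()
    if stage not in VALID_STAGES:
        raise ValueError(f"unknown Dukes' stage: {dukes_stage!r}")
    return stage != "D"
```

Because the rule has a single input, two Dukes' B patients with very different outcomes are indistinguishable to it, which is the source of the over-treatment and under-treatment noted above.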
DNA array technologies have made it possible to monitor the expression level of a large number of genetic transcripts at any one time (see, e.g., Schena et al., 1995, Science 270:467-470; Lockhart et al., 1996, Nature Biotechnology 14:1675-1680; Blanchard et al., 1996, Nature Biotechnology 14:1649; Ashby et al., U.S. Pat. No. 5,569,588, issued Oct. 29, 1996). Of the two main formats of DNA arrays, spotted cDNA arrays are prepared by depositing PCR products of cDNA fragments with sizes ranging from about 0.6 to 2.4 kb, from full length cDNAs, ESTs, etc., onto a suitable surface (see, e.g., DeRisi et al., 1996, Nature Genetics 14:457-460; Shalon et al., 1996, Genome Res. 6:639-645; Schena et al., 1995, Proc. Natl. Acad. Sci. U.S.A. 93:10539-11286; and Duggan et al., Nature Genetics Supplement 21:10-14). Alternatively, high-density oligonucleotide arrays, containing thousands of oligonucleotides complementary to defined sequences at defined locations on a surface, are synthesized in situ on the surface by, for example, photolithographic techniques (see, e.g., Fodor et al., 1991, Science 251:767-773; Pease et al., 1994, Proc. Natl. Acad. Sci. U.S.A. 91:5022-5026; Lockhart et al., 1996, Nature Biotechnology 14:1675; McGall et al., 1996, Proc. Natl. Acad. Sci. U.S.A. 93:13555-13560; U.S. Pat. Nos. 5,578,832; 5,556,752; 5,510,270; and 6,040,138). Methods for generating arrays using inkjet technology for in situ oligonucleotide synthesis are also known in the art (see, e.g., Blanchard, International Patent Publication WO 98/41531, published Sep. 24, 1998; Blanchard et al., 1996, Biosensors and Bioelectronics 11:687-690; Blanchard, 1998, in Synthetic DNA Arrays in Genetic Engineering, Vol. 20, J. K. Setlow, Ed., Plenum Press, New York at pages 111-123).
By simultaneously monitoring tens of thousands of genes, microarrays have permitted identification of biomarkers of cancer (Welsh et al., PNAS, 100(6):3410-3415 (March 2003)); creation of gene expression-based classifications of cancers (Alizadeh et al., Nature, 403:503-11 (2000); and Garber et al., Proc Natl Acad Sci USA, 98:13784-9 (2001)); development of gene-based multi-organ cancer classifiers (Bloom et al., Am J Pathol 164:9-16, 2004; Giordano et al., Am J Pathol, 159:1231-8 (2001); Ramaswamy et al., Proc Natl Acad Sci USA, 98:15149-54 (2001); and Su et al., Cancer Res, 61:7388-93 (2001)); identification of tumor subclasses (Dyrskjot et al., Nat Genet, 33:90-6 (2003); Bhattacharjee et al., Proc Natl Acad Sci USA, 98:13790-5 (2001); Garber et al., Proc Natl Acad Sci USA, 98:13784-9 (2001); and Sorlie et al., Proc Natl Acad Sci USA, 98:10869-74 (2001)); discovery of progression markers (Sanchez-Carbayo et al., Am J Pathol, 163:505-16 (2003); and Frederiksen et al., J Cancer Res Clin Oncol, 129:263-71 (2003)); prediction of disease outcome (Henshall et al., Cancer Res, 63:4196-203 (2003); Shipp et al., Nat Med, 8:68-74 (2002); Beer et al., Nat Med, 8:816-24 (2002); Pomeroy et al., Nature, 415:436-42 (2002); van't Veer et al., Nature, 415:530-6 (2002); Vasselli et al., Proc Natl Acad Sci USA, 100:6958-63 (2003); Takahashi et al., Proc Natl Acad Sci USA, 98:9754-9 (2001); WO 2004/065545 A2; WO 02/103320 A2); and applications in drug discovery (Marton et al., Nat Med, 4(11):1293-301 (1998); and Gray et al., Science, 281:533-538 (1998)).
One tool that has been applied to microarrays to decipher and compare genome expression patterns in biological systems is Significance Analysis of Microarrays, or SAM (Tusher et al., 2001, Proc. Natl. Acad. Sci. 98:5116-5121). This statistical method was developed to identify genes with statistically significant changes in expression. SAM has been used for a variety of purposes, including identifying potential drugs that would be effective in treating various conditions associated with specific gene expression patterns (Bunney et al., Am J Psychiatry, 160(4):657-66 (April 2003)).
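The per-gene statistic on which SAM is built, the "relative difference" of Tusher et al., can be sketched as below. This is a simplified illustration under stated assumptions: the function name is hypothetical, the constant s0 is supplied by the caller rather than estimated from the data as in the published method, and the permutation step that SAM uses to estimate the false discovery rate is omitted.

```python
from math import sqrt
from statistics import mean


def relative_difference(group1, group2, s0):
    """Relative difference d = (mean1 - mean2) / (s + s0) for one gene.

    group1, group2: expression values of the gene in the two conditions.
    s0: small positive constant added to the denominator so that genes with
        very low variance do not dominate (assumed to be given here).
    """
    n1, n2 = len(group1), len(group2)
    if n1 == 0 or n2 == 0 or n1 + n2 <= 2:
        raise ValueError("need non-empty groups with more than 2 values in total")
    m1, m2 = mean(group1), mean(group2)
    # Pooled sum of squared deviations from each group's own mean.
    ss = sum((x - m1) ** 2 for x in group1) + sum((x - m2) ** 2 for x in group2)
    # Gene-specific scatter, as in a pooled two-sample t statistic.
    s = sqrt((1.0 / n1 + 1.0 / n2) * ss / (n1 + n2 - 2))
    # Note: raises ZeroDivisionError if both s and s0 are zero.
    return (m1 - m2) / (s + s0)
```

With s0 set to zero the statistic reduces to an ordinary pooled-variance t statistic; a positive s0 shrinks the score of genes whose apparent significance rests on a very small measured variance.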
Sophisticated and powerful machine learning algorithms have been applied to transcriptional profiling analysis. For example, a modified “Fisher classification” approach has been applied to distinguish patients with good prognosis from those with poor prognosis, based on their expression profiles (van't Veer et al., 2002, Nature 415: 530-6). A similar study has been reported using an artificial neural network (Bloom et al., Am J Pathol 164:9-16, 2004; Khan et al., 2001, Nat Med 7: 673-9). Support Vector Machine (SVM) (see, e.g., Brown et al., Proc. Natl. Acad. Sci. 97(1):262-67 (2000); Zien et al., Bioinformatics, 16(9):799-807 (2000); Furey et al., Bioinformatics, 16(10):906-914 (2000)) is a supervised classification method shown to perform well in multiple areas of biological analysis, including evaluating microarray expression data (Brown et al., Proc Natl Acad Sci USA, 97:262-267 (2000)), detecting remote protein homologies (Jaakkola et al., Proceedings of the 7th International Conference on Intelligent Systems for Molecular Biology, AAAI Press, Menlo Park, Calif. (1999)), and classification of cancer tissues (Furey et al., Bioinformatics, 16(10):906-914 (2000)). Furey describes using SVM to classify colon cancer tissues based on expression levels of a set of 2000 genes or a set of 1000 genes having the highest minimal intensity across 62 colon tissue samples (40 tumors and 22 normal tissues) on an Affymetrix® oligonucleotide microarray.
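The gene pre-selection criterion mentioned for the colon tissue data, keeping the genes with the highest minimal intensity across samples, can be illustrated with a short sketch. The function name, the dictionary layout, and the alphabetical tie-breaking rule are assumptions made for this example; the classifier that would be trained on the selected genes is not shown.

```python
def top_genes_by_min_intensity(expression, k):
    """Return the k genes whose minimum intensity across samples is highest.

    expression: dict mapping a gene identifier to a list of intensities,
                one value per tissue sample (hypothetical layout).
    k: number of genes to keep.
    Ties are broken alphabetically by gene identifier (an arbitrary choice).
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    # Minimal intensity of each gene over all samples.
    min_by_gene = {gene: min(values) for gene, values in expression.items()}
    # Sort by descending minimum, then by gene identifier for determinism.
    ranked = sorted(min_by_gene, key=lambda gene: (-min_by_gene[gene], gene))
    return ranked[:k]
```

For example, given four genes with minimal intensities 10, 40, 5, and 20, asking for the top two returns the genes with minima 40 and 20. Ranking on the minimum rather than the mean favors genes that are reliably detected in every sample.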
Wang et al. (Wang et al., 2004, J. Clinical Oncology 22:1564-1571) reported identification of a 60-gene and a 23-gene signature for prediction of cancer recurrence in Dukes' B patients using an Affymetrix® U133a GeneChip. This signature was validated in 36 independent patients. Two supervised class prediction approaches were used to identify gene markers that could best discriminate between patients who would experience relapse and patients who would remain disease-free. A multivariate Cox model was built to predict recurrence. The overall performance accuracy was reported as 78%.
Resnick et al. (Resnick et al., 2004, Clin. Can. Res. 10:3069-3075) reported a study of the prognostic value of epidermal growth factor receptor, c-MET, β-catenin, and p53 protein expression in TNM stage II colon cancer using tissue microarray technology.
Muro et al. (Muro et al., 2003, Genome Biology 4:R21) describe identification and analysis of the expression levels of 1,536 genes in colorectal cancer and normal tissues using a parametric clustering method. Three groups of genes were discovered. Some of the genes were shown to correlate not only with the differences between tumor and normal tissues but also with the presence or absence of distant metastasis.
U.S. Patent Application Publication No. 2005/0048542A1, published on Mar. 3, 2005, describes a noninvasive, quantitative test for prognosis determination in cancer patients. The test makes use of measurements of the tumor levels of certain messenger RNAs (mRNAs). These mRNA levels are inserted into a polynomial formula (algorithm) that yields a numerical recurrence score, which indicates recurrence risk.
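The general form of such a score, a polynomial in measured mRNA levels, might be sketched as follows. Everything specific here is hypothetical: the gene names, coefficients, powers, and intercept are placeholders chosen for illustration and are not taken from the cited publication, whose actual formula is not reproduced in this section.

```python
def recurrence_score(mrna_levels, terms, intercept=0.0):
    """Evaluate a polynomial score from per-gene mRNA levels.

    mrna_levels: dict mapping a gene name to its measured level.
    terms: iterable of (gene, coefficient, power) tuples; each contributes
           coefficient * level ** power to the score.
    intercept: constant term of the polynomial.
    All names and numbers passed in are placeholders, not published values.
    """
    score = intercept
    for gene, coefficient, power in terms:
        score += coefficient * mrna_levels[gene] ** power
    return score


# Hypothetical usage with made-up genes and coefficients.
example_levels = {"GENE_A": 2.0, "GENE_B": 3.0}
example_terms = [("GENE_A", 0.5, 1), ("GENE_B", -0.1, 2)]
example_score = recurrence_score(example_levels, example_terms, intercept=1.0)
```

In a real test the numerical score would then be mapped to a recurrence risk category; the thresholds for that mapping are likewise not given in this section and are left out of the sketch.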
Discussion or citation of a reference herein shall not be construed as an admission that such reference is prior art to the present invention.