One of the greatest challenges in the management of Barrett's esophagus (BE), the precursor lesion of esophageal adenocarcinoma (EAC), is to expeditiously identify patients who have early EAC and to predict those who will develop EAC. The rate of progression to cancer (0.4 to 0.5% per year in some studies, 0.5 to 1% per year in other studies) is very low, making this challenge particularly difficult (Reynolds et al., Gastroenterol Clin North Am 28(4):917-45 (1999); Cameron, Gastroenterol Clin North Am 26(3):487-94 (1997)). Moreover, in the surveillance of BE, a meticulous endoscopic search is often performed to identify grossly normal-appearing dysplastic or cancerous lesions. However, the value of this type of systematic surveillance has been questioned, due to its low sensitivity and specificity (Conio et al., Am J Gastroenterol 98(9):1931-9 (2003)). Thus, from a purely practical standpoint, it would be advantageous to be able to identify patients with malignant esophageal lesions simply by biopsying their normal squamous esophagus.
The presence and degree of dysplasia constitute the most widely accepted measure of neoplastic risk in Barrett's esophagus. However, significant problems have emerged demonstrating the need for improved progression risk biomarkers. These problems include poor interobserver reproducibility of dysplasia interpretation and inconsistent rates of progression as well as regression of dysplasia, both of which have made it difficult to develop national surveillance guidelines (Conio et al., Am J Gastroenterol 98(9):1931-9 (2003); Rana et al., Dis Esophagus 13(1):28-31 (2000); Reid et al., Am J Gastroenterol 95(7):1669-76 (2000)). Flow cytometry has shown promise in detecting a subset of patients who do not have high-grade dysplasia (HGD) but do have an increased risk of progression (Reid et al., Am J Gastroenterol 95(7):1669-76 (2000)).
The Human Genome Project has yielded high-throughput methodologies for the computer analysis of data, which provide the volume and quality control required to select clinically useful biomarkers (Taramelli et al., Eur J Cancer 40(17):2537-43 (2004); Varmus et al., Science 310(5754):1615 (2005); Yoshida, Jpn J Clin Oncol 29(10):457-9 (1999)). 17p (p53)-loss of heterozygosity (LOH) has also shown potential as a molecular biomarker (Reid et al., Gastrointest Endosc Clin N Am 13(2):369-97 (2003)). In addition, methylation of p16 and HPP1 has been shown to predict progression to HGD and EAC (Hardie et al., Cancer Lett 217(2):221-30 (2005); Geddert et al., Int J Cancer 110(2):208-11 (2004); Schulmann et al., Oncogene 24(25):4138-48 (2005)). Molecular alterations have been found in Barrett's metaplasia that reveal a field effect in premalignant metaplastic mucosa, but not in normal epithelium. For example, aneuploidy and loss of heterozygosity have been observed in metaplastic mucosa from Barrett's patients with dysplasia or adenocarcinoma (Blount et al., Proc Natl Acad Sci USA 90(8):3221-5 (1993); Boynton et al., Cancer Res 51(20):5766-9 (1991); Raskind et al., Cancer Res 52(10):2946-50 (1992); Reid et al., Gastroenterology 93(1):1-11 (1987)). Similarly, p53 tumor suppressor gene point mutations have been reported in Barrett's metaplasia (Casson et al., Am J Surg 167(1):52-7 (1994); Huang et al., Cancer Res 53(8):1889-94 (1993); Meltzer et al., Proc Natl Acad Sci USA 88(11):4976-80 (1991)), and altered promoter DNA methylation has also been described for some tumor suppressor genes in Barrett's esophagus (Eads et al., Cancer Res 60(18):5021-6 (2000); Kawakami et al., J Natl Cancer Inst 92(22):1805-11 (2000); Klump et al., Gastroenterology 115:1381-6 (1998); Wong et al., Cancer Res 57(13):2619-22 (1997)).
In contrast, most published studies to date report no DNA alterations (e.g., point mutations, methylation, or loss of heterozygosity) in normal squamous esophageal epithelium from patients with esophageal cancer. Corn et al. (Clin Cancer Res 7(9):2765-9 (2001)) reported E-cadherin methylation in Barrett's esophagus specimens and esophageal adenocarcinoma, but not in normal esophageal epithelium. Another study showed that the expression of a panel of 23 genes capable of differentiating between Barrett's esophagus and esophageal adenocarcinoma was unable to distinguish between the normal epithelia of Barrett's metaplasia and adenocarcinoma patients (Brabender et al., Oncogene 23(27):4780-8 (2004)). One notable exception was the study by Eads et al., which found methylation of the CALCA, MGMT, and TIMP3 genes in the normal esophagus of a subset of patients with Barrett's-associated esophageal dysplasia and adenocarcinoma (Eads et al., Cancer Res 61(8):3410-8 (2001)).
cDNA microarrays promise more accurate prediction than do classical clinical diagnostic tools (such as histologic categorization). However, the main challenge posed by microarrays is to construct meaningful classifiers based on gene expression profiles, using appropriate bioinformatics tools. A number of bioinformatics tools have been proposed, including artificial neural networks (Selaru et al., Gastroenterology 122(3):606-13 (2002)), hierarchical clustering (Selaru et al., Oncogene 21(3):475-8 (2002); Zou et al., Oncogene 21(31):4855-62 (2002)) and principal components analysis (Mori et al., Cancer Res 63(15):4577-82 (2003); Selaru et al., Cancer Res 64:1584-88 (2004)). Shrunken nearest centroid predictors (SNCPs) were adapted from classical nearest centroid predictors to gene microarray analysis (Tibshirani et al., Proc Natl Acad Sci USA 99(10):6567-72 (2002)). When large numbers of genes are measured, it is difficult to distinguish true biologic changes in expression from variations due to chance. However, chance variations tend to be of small amplitude. Thus, if small variations are ignored and only consistently large changes in expression are accepted, biologic changes prevail over variations due to chance. Among the mathematical means used to ignore small variations, one method, SNCPs, is particularly valuable. Prediction Analysis of Microarrays (PAM) is a software package developed at Stanford University that utilizes SNCPs and performs internal validation (cross-validation) simultaneously. Samples are divided at random into K roughly equal-sized parts. For each part in turn, the classifier is built on the remaining K-1 parts, then tested on the held-out part. This procedure is performed over a range of threshold values, and the cross-validated misclassification error rate is reported for each threshold value. Typically, the user chooses the threshold value giving the minimum cross-validated misclassification error rate.
This method has been utilized successfully by investigators studying leukemia and breast cancer to find subsets of genes that accurately predicted classifications of these diseases (Tibshirani et al., Proc Natl Acad Sci USA 99(10):6567-72 (2002); Korkola et al., Cancer Res 63(21):7167-75 (2003); Sorlie et al., Proc Natl Acad Sci USA 100(14):8418-23 (2003)).
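The threshold-selection procedure described above for PAM (K-fold cross-validation over a range of shrinkage thresholds) can be illustrated with a minimal Python sketch. It is not the PAM software itself. It is a simplified shrunken-centroid classifier that soft-thresholds each class centroid toward the overall centroid in raw expression units, and it omits the per-gene standardization and class priors that PAM uses. All function names (`soft_threshold`, `fit_shrunken_centroids`, `predict`, `cross_validated_errors`, `choose_threshold`, `make_toy_data`) and the toy data are hypothetical, introduced only for illustration. Breaking ties in favor of the largest threshold is likewise an assumption.

```python
import random

def soft_threshold(value, delta):
    """Move value toward zero by delta; anything within delta of zero becomes 0."""
    if value > delta:
        return value - delta
    if value < -delta:
        return value + delta
    return 0.0

def fit_shrunken_centroids(samples, labels, delta):
    """Return {class: centroid}, each centroid shrunk toward the overall centroid.
    Simplification: no per-gene standardization, unlike PAM."""
    n_genes = len(samples[0])
    overall = [sum(s[g] for s in samples) / len(samples) for g in range(n_genes)]
    centroids = {}
    for cls in sorted(set(labels)):
        members = [s for s, y in zip(samples, labels) if y == cls]
        mean = [sum(m[g] for m in members) / len(members) for g in range(n_genes)]
        centroids[cls] = [overall[g] + soft_threshold(mean[g] - overall[g], delta)
                          for g in range(n_genes)]
    return centroids

def predict(centroids, sample):
    """Assign the class whose shrunken centroid is nearest (squared Euclidean)."""
    def dist(cls):
        return sum((x - m) ** 2 for x, m in zip(sample, centroids[cls]))
    return min(sorted(centroids), key=dist)

def cross_validated_errors(samples, labels, thresholds, k=5, seed=0):
    """Split samples at random into k parts; for each threshold, train on k-1
    parts, test on the held-out part, and return {threshold: error rate}."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    errors = {}
    for delta in thresholds:
        wrong = 0
        for fold in folds:
            held_out = set(fold)
            train_x = [samples[i] for i in indices if i not in held_out]
            train_y = [labels[i] for i in indices if i not in held_out]
            centroids = fit_shrunken_centroids(train_x, train_y, delta)
            wrong += sum(predict(centroids, samples[i]) != labels[i] for i in fold)
        errors[delta] = wrong / len(samples)
    return errors

def choose_threshold(errors):
    """Pick the threshold with minimum error; ties go to the largest threshold
    (an assumption made here, since it keeps the fewest genes)."""
    best = min(errors.values())
    return max(d for d, e in errors.items() if e == best)

def make_toy_data():
    """Synthetic data: 2 informative genes followed by 3 low-amplitude genes."""
    samples, labels = [], []
    for j in range(10):
        noise = [0.1 * ((j + g) % 3) for g in range(3)]
        samples.append([2.0, 1.5] + noise)
        labels.append("A")
        samples.append([-2.0, -1.5] + noise)
        labels.append("B")
    return samples, labels

if __name__ == "__main__":
    x, y = make_toy_data()
    cv = cross_validated_errors(x, y, [0.0, 0.5, 1.0, 100.0])
    for delta in sorted(cv):
        print(f"threshold {delta}: cross-validated error {cv[delta]:.2f}")
    print("chosen threshold:", choose_threshold(cv))
```

In this sketch, a very large threshold collapses every class centroid onto the overall centroid, so the classifier can no longer separate the classes. A moderate threshold zeroes out only the low-amplitude genes and leaves the informative ones in place, which is the behavior the text attributes to ignoring small variations.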