Carcinoma of unknown primary (CUP) is estimated to account for about 3-5% of all metastatic cancers, with the American Cancer Society estimating in 2010 that there were 30,680 new cases of CUP and 44,030 deaths resulting from CUP. The diagnosis of CUP requires a biopsy-proven metastatic malignancy and no identifiable primary tumor after a thorough clinical evaluation. For cases designated as CUP after this evaluation, the source of the tumor is identified ante mortem only about 20% to about 30% of the time. The prognosis for patients in whom a primary site has not been identified is poor, with median survival ranging from about 2 months to about 10 months. (Monzon F A, et al. Diagnosis of Metastatic Neoplasms. Arch Pathol Lab Med. 2010, 134:216-224).
Identifying the site of primary origin for CUP remains a challenge for the pathologist, even with modern pathological techniques. This carries serious implications for cancer therapy, as current oncological therapeutic regimens are targeted to the site of origin. Microarray-based gene expression studies are one potential technological solution to this problem, and the feasibility of this methodology for broad-based tumor classification has been established by a number of studies. (Bloom, et al., Multi-platform, multi-site, microarray-based human tumor classification, Am J Pathol 2004, 164:9-16; Bridgewater, et al., Gene expression profiling may improve diagnosis in patients with carcinoma of unknown primary, Br J Cancer 2008, 98:1425-1430; Buckhaults, et al., Identifying tumor origin using a gene expression-based classification map, Cancer Res 2003, 63:4144-4149; Giordano, et al., Organ-specific molecular classification of primary lung, colon, and ovarian adenocarcinomas using gene expression profiles, Am J Pathol 2001, 159:1231-1238; Ma, et al., Molecular classification of human cancers using a 92-gene real-time quantitative polymerase chain reaction assay, Arch Pathol Lab Med 2006, 130:465-473; Ramaswamy, et al., Multiclass cancer diagnosis using tumor gene expression signatures, Proc Natl Acad Sci USA 2001, 98:15149-15154; Su, et al., Molecular classification of human carcinomas by use of gene expression signatures, Cancer Res 2001, 61:7388-7393) These studies are limited, however, because they rely entirely on gene expression data and do not take into account well-understood differences in morphology and biological differentiation. Pathologists recognize and exploit these differences in their daily practice.
The prior art in the area of diagnostic tests for determining the site of primary origin of CUP fails to take into account differences in morphology and biological differentiation. Two tests are commercially available in the United States, the Pathwork Tissue of Origin Test (Pathwork Diagnostics, Sunnyvale, Calif.) and the THEROS CancerTYPE ID (bioTheranostics, San Diego, Calif.). Both of these are mRNA-based products. The Pathwork Tissue of Origin Test issues a similarity score for each of 15 tumor types, using the expression levels of 1550 transcripts to perform a pair-wise comparison between the test sample and each of the 15 tissues on the test panel. A validation study of this test was performed using 547 frozen specimens submitted from four institutions. The tissues were derived from either metastatic cancers or poorly differentiated or undifferentiated primary cancers. The test showed a sensitivity of 87.8% and a specificity of 99.4%. (Monzon F A et al. Multicenter validation of a 1,550-gene expression profile for identification of tumor tissue of origin. J Clin Oncol. 2009, 27:2503-2508) This validation study is significant because it focused on poorly differentiated or undifferentiated primary carcinomas and metastatic carcinomas, which are the real challenges in tumor diagnosis. A limitation of the study is that it was performed using frozen tissues. The Pathwork Tissue of Origin Test has now been developed for use in formalin-fixed, paraffin-embedded (FFPE) tissues as the PathChip. In a study of 462 FFPE specimens, the test demonstrated 89% positive percent agreement with available diagnoses, and greater than 99% negative percent agreement in specimens that had previously been identified with existing methods as being among the 15 tumor types on the panel. (Pillai R. et al. A microarray based gene expression test as an aid to tumor diagnosis using formalin-fixed paraffin-embedded (FFPE) specimens. Pathwork Diagnostics. Abstracts and Case Studies from the College of American Pathologists, 2009 Annual Meeting. Arch Pathol Lab Med 2009, 133:1608-1716). Although the test identifies up to 15 tumor types, most of these may be distinguished with the application of simple ancillary studies, such as flow cytometry and gene rearrangement studies to diagnose non-Hodgkin lymphoma and immunohistochemistry to diagnose melanomas. Some of the recognized primaries, such as colorectal and breast primaries, have established immunohistochemical patterns. While this test may be helpful for the tumor types that do not have a well-defined immunohistochemical pattern or are poorly differentiated or undifferentiated, it does not report on differences in tumor morphology, such as squamous cell carcinoma versus adenocarcinoma versus neuroendocrine carcinoma. These features are more important in selecting cancer therapy and predicting prognosis.
The THEROS CancerTYPE ID is designed to focus on those cases that are indeterminate and distinguishes among 39 tumor types. Included in these 39 tumor types are epithelial malignancies, lymphomas, mesotheliomas, meningiomas, stromal neoplasms, and pheochromocytoma. This test provides information regarding tumor subtype and separates squamous cell carcinomas from adenocarcinomas for certain primary sites; however, the test uses an “all-encompassing” approach to tumor classification. Many of these separations are coarse distinctions that may be accomplished with the use of widely available immunohistochemistry. For example, lymphomas may be distinguished from carcinomas with the use of immunohistochemical antibodies for cytokeratins and leukocyte common antigen (LCA), and even finer distinctions may routinely be made with additional ancillary testing. For example, current practice is to use flow cytometry and gene rearrangement studies to subclassify non-Hodgkin's lymphoma. Mutations in the CKIT gene or PDGFR gene are diagnostic for gastrointestinal stromal tumors. This approach is useful for the undifferentiated neoplasms, in which a primary line of differentiation cannot be determined. It is noteworthy that while the test was evaluated on an independent sample set, this set had only 119 tumors to represent 30 tumor classes. Representation from each tumor type ranged from 1 to 10 specimens, with 18 tissue types being represented by 3 or fewer samples; thus the reported sensitivity and specificity for a specific tumor type may reflect the correct classification of only 1 specimen. (Monzon F A, et al. Diagnosis of Metastatic Neoplasms. Arch Pathol Lab Med. 2010, 134:216-224)
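The weakness of per-type estimates drawn from 1 to 10 specimens can be illustrated with a short calculation. The following is a minimal sketch, not part of any cited study: it assumes a Wilson score interval as the measure of uncertainty (an assumption made here for illustration) and computes the approximate 95% interval for a sensitivity estimate in which every specimen of a tumor type was classified correctly.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    centre = (p_hat + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the boundaries.
    return max(0.0, centre - half), min(1.0, centre + half)


# Illustrative sample sizes only: every specimen classified correctly.
for n in (1, 3, 10):
    low, high = wilson_interval(n, n)
    print(f"{n}/{n} correct: interval for sensitivity = [{low:.2f}, {high:.2f}]")
```

Under this assumed interval, a perfect result on a single specimen is compatible with a true sensitivity anywhere from roughly 0.21 to 1.00, which is why a reported 100% sensitivity for a tumor type represented by one sample carries little weight.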
The Veridex CUP assay (Raritan, N.J.) uses 10 genes tested by RT-PCR to distinguish among six different primary sites of carcinoma: lung, breast, colon, ovary, pancreas, and prostate. (Varadhachary G R, et al. Molecular profiling of carcinoma of unknown primary and correlation with clinical evaluation. J Clin Oncol 2008, 26:4442-4448; Talantov D, et al. A quantitative reverse transcriptase-polymerase chain reaction assay to identify metastatic carcinoma tissue of origin. J Mol Diagn 2006, 8:320-329) Although these studies demonstrate the feasibility of this assay, the assay itself left 48% of patients unassigned to an origin.
The CupPrint classifier, being developed by Agendia (Amsterdam, Netherlands), focuses on a finer distinction for adenocarcinoma of unknown primary. (Horlings H M, et al. Gene expression profiling to identify the histogenic origin of metastatic adenocarcinomas of unknown primary. J Clin Oncol 2008, 26:4435-4441; van Laar R K, et al. Implementation of a novel microarray-based diagnostic test for cancer of unknown primary. Int J Cancer 2009, 125:1390-1397). The CupPrint classifier was developed using the databases from another published classifier. (Ma X J, et al. Molecular classification of human cancers using a 92-gene real-time quantitative polymerase chain reaction assay. Arch Pathol Lab Med. 2006, 130:465-473). That published classifier is an RT-PCR-based test applicable to formalin-fixed, paraffin-embedded tissue. The CupPrint classifier is a customized eight-pack microarray containing 495 genes that were selected as highly differentially expressed among 48 tumor types. A weighted five-nearest neighbor algorithm was used to determine the five most molecularly similar tumors in the database. The classifier achieved an accuracy of 83% for carcinomas with a known primary and 94% for carcinomas of unknown primary. This study focused mostly on adenocarcinomas, although urothelial carcinomas were also part of the scheme. The classifier had a systematic problem in classifying lung and pancreatic carcinomas, misclassifying 63% and 100% of these carcinomas, respectively. No satisfactory explanation for this problem is provided. This limitation is important because these two primary sites most often give rise to adenocarcinoma of unknown primary.
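The general idea of a weighted five-nearest neighbor classification can be sketched as follows. This is an illustrative sketch only and does not reproduce the CupPrint implementation: the Euclidean distance, the inverse-distance weighting, the three-gene profiles, and the reference tumors are all assumptions chosen for the example.

```python
import math
from collections import defaultdict


def weighted_knn(query, references, k=5):
    """Return (predicted_label, k nearest neighbours).

    references: list of (expression_vector, tumor_label) pairs.
    Distance metric and weighting are assumptions: Euclidean distance,
    with each neighbour voting with weight 1 / distance.
    """
    neighbours = sorted(
        ((math.dist(query, vector), label) for vector, label in references),
        key=lambda pair: pair[0],
    )[:k]
    votes = defaultdict(float)
    for distance, label in neighbours:
        votes[label] += 1.0 / (distance + 1e-9)  # small term avoids division by zero
    return max(votes, key=votes.get), neighbours


# Hypothetical reference database of three-gene expression profiles.
reference_tumors = [
    ([1.0, 0.1, 0.0], "colon"),
    ([0.9, 0.2, 0.1], "colon"),
    ([0.1, 1.0, 0.0], "lung"),
    ([0.2, 0.9, 0.1], "lung"),
    ([0.0, 0.1, 1.0], "breast"),
    ([0.1, 0.0, 0.9], "breast"),
]

predicted, nearest = weighted_knn([0.95, 0.15, 0.05], reference_tumors)
print(predicted)
for distance, label in nearest:
    print(f"{label}: distance {distance:.3f}")
```

In this toy example the five nearest reference tumors include lung and breast profiles, but the two much closer colon profiles dominate the weighted vote. The same mechanism suggests one way a class could be systematically misclassified when its reference profiles lie close to those of other classes.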
Another previous microarray-based gene expression study proposed a tumor classifier based on a pathological tree-based framework using a schema in which neoplasms were separated in a sequential coarse to fine approach, beginning with the separation of solid malignancies from hematolymphoid malignancies. (Shedden, et al., Accurate molecular classification of human cancers based on gene expression using a simple classifier with a pathological tree-based framework, Am J Pathol 2003, 163:1985-1995) The authors further refined the epithelial malignancies into those of Mullerian (ovarian, endometrial) and non-Mullerian origin (breast, prostate, lung, colon, bladder, renal, pancreas). This approach more realistically organizes tumor classification to fit within a pathologist-based diagnostic algorithm. However, the test leaves out the first step typically performed by pathologists, the recognition of morphological subtypes of carcinomas, which include squamous cell carcinomas, urothelial carcinomas, adenocarcinomas, and neuroendocrine carcinomas.
Previous studies have focused solely on identifying the site of primary origin for adenocarcinoma, proving the effectiveness of using gene expression to classify tumors within specific pathological carcinoma subtypes. (Buckhaults, et al., Identifying tumor origin using a gene expression-based classification map, Cancer Res 2003, 63:4144-4149; Giordano, et al., Organ-specific molecular classification of primary lung, colon, and ovarian adenocarcinomas using gene expression profiles, Am J Pathol 2001, 159:1231-1238; Dennis et al., Identification from public data of molecular markers of adenocarcinoma characteristic of the site of origin, Cancer Res 2002, 62:5999-6005) Molecular classifiers for site of primary origin for squamous cell carcinoma and neuroendocrine carcinomas have not been developed. One study mentioned an attempt at classifying squamous cell carcinoma of unknown primary and reported no success. (Tothill, et al., An expression-based site of origin diagnostic method designed for clinical application to cancer of unknown origin, Cancer Res 2005, 65:4031-4040) Two studies have focused on a very specific differential diagnosis: distinguishing pulmonary from head and neck primary squamous cell carcinomas. One study developed a classifier based on the Affymetrix HG_U95Av2 oligonucleotide microarray, which focused specifically on separating lung from tongue squamous cell carcinomas (Talbot, et al., Gene expression profiling allows distinction between primary and metastatic squamous cell carcinomas in the lung, Cancer Res 2005, 65:3063-3071). Another study developed a 10-gene classifier derived from Affymetrix U133 and HG_U95Av2 data with 96% accuracy (Vachani, et al., A 10-gene classifier for distinguishing head and neck squamous cell carcinoma and lung squamous cell carcinoma, Clin Cancer Res 2007, 13:2905-2915). Neither of these studies presented a molecular classifier for neuroendocrine carcinoma of unknown primary.
The prior art also includes a miRNA classifier developed for carcinoma tissue origin by Rosetta Genomics (Rehovot, Israel). (Rosenfeld N, et al. MicroRNAs accurately identify cancer tissue origin. Nat Biotechnol 2008, 26:462-469). This classifier uses a binary tree method of classification proceeding from coarse to fine distinctions. The decision at each node is a simple binary decision that can be performed using the expression levels of a few miRNAs. This classifier was tested on 400 paraffin-embedded and frozen samples from 22 different primary and metastatic tumor tissues. Overall accuracy was >90%. Accuracy for the test reached 89% in an independent data set. The approach described in this article is based on tumor cell differentiation, similar to the approach used by Shedden. (Shedden K A, et al. Accurate molecular classification of human cancers based on gene expression using a simple classifier with a pathological tree-based framework. Am J Pathol 2003, 163:1985-1995) The approach starts with the distinction of neuroendocrine from squamous and glandular carcinomas. This study validates the approach of the present inventors in that separate miRNAs distinguish between squamous cell carcinoma and adenocarcinoma of the lung. Carcinoid of the lung is recognized as distinct from other malignancies of the lung.
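A binary tree classifier of this general kind, in which each node makes a simple yes/no decision from the expression of a few miRNAs, might be sketched as below. The miRNA names (miR-A, miR-B, miR-C), the thresholds, and the two-node tree are hypothetical placeholders introduced only for illustration; they are not taken from the cited article.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union

Expression = Dict[str, float]


@dataclass
class Node:
    """One binary decision based on the expression levels of a few miRNAs."""
    decide: Callable[[Expression], bool]
    if_true: Union["Node", str]   # a further node, or a leaf label
    if_false: Union["Node", str]


def classify(node: Union[Node, str], expression: Expression) -> str:
    """Walk from the coarse decision at the root to a leaf label."""
    while isinstance(node, Node):
        node = node.if_true if node.decide(expression) else node.if_false
    return node


# Hypothetical fine-grained node: compares two placeholder miRNAs.
squamous_vs_glandular = Node(
    decide=lambda e: e["miR-B"] - e["miR-C"] > 0.0,
    if_true="squamous cell carcinoma",
    if_false="adenocarcinoma",
)

# Hypothetical coarse root node: a single placeholder miRNA and threshold.
root = Node(
    decide=lambda e: e["miR-A"] > 6.0,
    if_true="neuroendocrine carcinoma",
    if_false=squamous_vs_glandular,
)

sample = {"miR-A": 2.1, "miR-B": 7.4, "miR-C": 5.0}
print(classify(root, sample))
```

Because each node consults only a few markers, a structure like this keeps every individual decision simple and easy to compare with the corresponding step in a pathologist's workup.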
The present invention overcomes the shortcomings of the prior art by utilizing a pathology-based approach to tumor classification. The approach follows the algorithmic hierarchy used by pathologists and can be directly compared to or integrated with the results of immunohistochemical (IHC) staining. In use, the tumor is identified as a cytokeratin-positive carcinoma and subsequently subclassified into one of four basic types: adenocarcinoma, squamous cell carcinoma, neuroendocrine carcinoma, and urothelial carcinoma. This subclassification is followed by the prediction of site of origin based on second-tier gene expression classifiers.
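The two-tier flow described above, subtype first and site of origin second, can be sketched in code. This sketch is an assumption-laden illustration and not the classifier of the invention: the nearest-centroid rule, the gene names g1 through g4, the centroid values, and the candidate sites are all hypothetical.

```python
import math


def nearest_centroid(profile, centroids):
    """Pick the label whose centroid is closest to the profile.

    centroids: {label: {gene: mean_expression}}. Each classifier uses only
    the genes named in its own centroids (an assumption of this sketch).
    """
    def distance(label):
        return math.sqrt(
            sum((profile[gene] - mean) ** 2 for gene, mean in centroids[label].items())
        )
    return min(centroids, key=distance)


# Tier 1 (hypothetical values): the four basic carcinoma types.
SUBTYPE_CENTROIDS = {
    "adenocarcinoma": {"g1": 1.0, "g2": 0.0},
    "squamous cell carcinoma": {"g1": 0.0, "g2": 1.0},
    "neuroendocrine carcinoma": {"g1": 1.0, "g2": 1.0},
    "urothelial carcinoma": {"g1": 0.0, "g2": 0.0},
}

# Tier 2 (hypothetical values): one site-of-origin classifier per subtype.
SITE_CENTROIDS = {
    "adenocarcinoma": {"lung": {"g3": 1.0}, "colon": {"g3": 0.0}},
    "squamous cell carcinoma": {"lung": {"g4": 1.0}, "head and neck": {"g4": 0.0}},
}


def classify_tumor(profile):
    """Return (subtype, site); site is None if no tier-2 classifier is defined."""
    subtype = nearest_centroid(profile, SUBTYPE_CENTROIDS)
    site_centroids = SITE_CENTROIDS.get(subtype)
    site = nearest_centroid(profile, site_centroids) if site_centroids else None
    return subtype, site


print(classify_tumor({"g1": 0.9, "g2": 0.1, "g3": 0.8, "g4": 0.2}))
```

Keeping a separate second-tier classifier for each subtype means the site-of-origin genes can differ between, for example, adenocarcinomas and squamous cell carcinomas, and the first-tier result remains directly comparable with morphology and staining results.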