The development of clinical diagnostic tests that apply microarray gene expression profiling to formalin-fixed, paraffin-embedded (FFPE) biological samples has been hampered by the need to acquire large training datasets of FFPE biological samples for developing optimal diagnostic models. To date, very few microarray hybridization experiments have been performed using FFPE biological samples, because formalin fixation damages RNA. Instead, microarray-based diagnostics have been developed for and applied to frozen biological samples, significantly restricting their adoption. Classification of frozen and FFPE specimens is disclosed in, e.g., Ismael et al., New Engl. J. Med. 355:1071-1072 (2006); Erlander et al., J. Clin. Oncolog. 22:14S (2004); Horlings et al., J. Clin. Oncolog. 26:4435-4441 (2008); and Ma et al., International Publication WO2006/10212, published Oct. 19, 2006.
Generally, attempts made to build classifiers for FFPE biological samples have used genes that were identified using only frozen biological samples. See, e.g., Rimsza et al., 2007 ASH Annual Meeting Abstracts 110:23a (2007); Giordano et al., Am. J. Pathology 159:1231-1238 (2001).
Other groups have sought to build classifiers on other platforms. For example, Ma et al. developed a tissue-of-origin classifier based on a PCR platform, but selected the genes based on microarray data from frozen biological samples, choosing only a certain number of top-performing genes for use in an RT-PCR classifier. Ma et al., Arch. Pathol. Lab. Med. 130:465-473 (2006). Also, Tothill et al. disclose a support vector machine trained on frozen biological samples, which classifier is used for classifying both frozen and FFPE biological samples. Tothill et al., Cancer Res. 65:4031-4040 (2005).
Other groups sought to build a classifier for both frozen and FFPE biological samples using microRNA. See, e.g., Xi et al., RNA 13:1668-1674 (2007); Rosenfeld et al., Nature Biotechnology 26:462-469.
In order to expand the scope of microarray expression diagnostics to fixed biological samples, there is a need for a method of building optimal diagnostic classifiers using a database of expression profiles of frozen biological samples, which method nonetheless provides a classifier that can be optimally applied to fixed biological samples. The methods disclosed in this application provide for identifying genes whose expression levels are highly correlated between frozen and fixed biological samples. The expression levels of these highly correlated genes can be used for building a classifier for classifying both frozen and fixed biological samples. Methods for training classifiers using the expression levels of these highly correlated genes also are provided in this application, as well as methods for classifying a frozen or fixed biological sample as to a phenotypic characterization using these classifiers.
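By way of illustration only, the general approach described above (selecting genes whose expression is highly correlated between frozen and fixed samples, training a classifier on frozen samples using those genes, and applying it to a fixed sample) might be sketched as follows. This sketch is not the disclosed method. It assumes paired frozen and fixed profiles from the same specimens, a per-gene Pearson correlation, an arbitrary threshold of 0.8, and a simple nearest-centroid classifier; all function names are hypothetical.

```python
import numpy as np


def per_gene_correlation(frozen, fixed):
    """Pearson correlation of each gene (column) across paired samples (rows).

    Assumption: row i of `frozen` and row i of `fixed` come from the same specimen.
    """
    f = frozen - frozen.mean(axis=0)
    x = fixed - fixed.mean(axis=0)
    denom = np.sqrt((f ** 2).sum(axis=0) * (x ** 2).sum(axis=0))
    # Genes with zero variance in either set get correlation 0 instead of NaN.
    return np.divide((f * x).sum(axis=0), denom,
                     out=np.zeros_like(denom), where=denom > 0)


def select_correlated_genes(frozen, fixed, threshold=0.8):
    """Indices of genes whose frozen/fixed correlation meets the threshold.

    The threshold value is an illustrative assumption.
    """
    return np.flatnonzero(per_gene_correlation(frozen, fixed) >= threshold)


def train_centroids(train_expr, labels, gene_idx):
    """Mean expression per class over frozen training samples, selected genes only."""
    labels = np.asarray(labels)
    classes = sorted(set(labels.tolist()))
    return {c: train_expr[labels == c][:, gene_idx].mean(axis=0) for c in classes}


def classify(sample, centroids, gene_idx):
    """Assign the class whose centroid is nearest (Euclidean) on the selected genes."""
    x = np.asarray(sample, dtype=float)[gene_idx]
    return min(centroids, key=lambda c: np.linalg.norm(x - centroids[c]))
```

In use, `select_correlated_genes` would be run once on the paired frozen/fixed profiles, `train_centroids` on the larger frozen database, and `classify` on any new frozen or fixed sample. Any other correlation measure or classifier could be substituted for the ones assumed here.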
Discussion or citation of a reference herein should not be construed as an admission that such reference is prior art to the present invention.