Of the 50,000 to 100,000 genes present in the genome of higher vertebrates, approximately 10,000 are expressed in any tissue at a particular developmental stage. The level of expression (mRNA concentration or abundance) of each of these varies widely, from 1 copy per cell (rare) to 50-300 (middle abundant) to 1,000 (highly abundant) (Hastie and Bishop, 1976, Cell 9: 761). The large number of biochemical, enzymatic and antigenic differences between normal cells and their transformed counterparts indicates that changes in the level of expression of many of these genes (aside from mutations, deletions, amplifications or other structural alterations) are involved in development of the fully malignant cell. U.S. Pat. No. 4,981,783 teaches methods to analyze the expression of a large representation of these genes in order to characterize tissue or cells as normal, benignly or malignantly transformed, or at risk for transformation, or as having other phenotypes of clinical importance (e.g., responsiveness to various drug or biological therapeutic agents; potential for metastasis and likely site).
The initial work utilized a dimethylhydrazine-induced mouse colon tumor as a model (Augenlicht and Kobrin, 1982, Cancer Res. 42: 1088). A cDNA library of the expressed genes of this tumor was constructed and 400 random selections were made. Standard methodology was used to determine relative levels of expression of each of these 400 sequences in a number of normal and neoplastic tissues. A semi-quantitative scale was used, and analyses were repeated a number of times. Several general conclusions were drawn. First, approximately 15% of the sequences changed in expression in the colon tumor as compared to the normal mouse colonic mucosa. Most of these (12%) were modest quantitative shifts. This extent of change is similar to that documented in a number of other systems of transformation, including rat hepatomas (Capetanaki and Alonso, 1980, Nuc. Acids Res. 8: 3193; Jacobs and Birnie, 1980, Nuc. Acids Res. 8: 3087), human lymphoid neoplasia (Hanania, et al., 1981, Proc. Natl. Acad. Sci., USA, 78: 6504), and, most important, even the relatively well understood transformation of primary chick embryo fibroblasts by the Rous sarcoma virus (RSV) (Groudine and Weintraub, 1980, Proc. Natl. Acad. Sci., USA, 77: 5351). Hence, even when the etiology of transformation is well understood (e.g., the introduction of the src gene by RSV and the expression of its product, pp60src), the cell rapidly exhibits a large number of changes in gene expression which may include alterations in as many as 1,000 sequences. Among those sequences whose normal expression was relatively restricted to the colon, there were many decreases (nineteen) in expression in colon tumors as well as modest increases (twenty-three). Fewer changes (nine) were seen in the tumors among those sequences expressed in other normal tissues, but the alterations were of much larger magnitude (Augenlicht and Kobrin, supra).
In moving to the human, several significant changes were made. First, the number of sequences from a reference cDNA library made from the HT-29 human colon carcinoma cell line was increased to 4,000 (Augenlicht, et al., 1987, Cancer Res., 47: 6017). This provided an 80% probability that every abundant and middle abundant sequence in this colon carcinoma cell line was represented in the data set. Second, methods were developed to accomplish the analysis of expression of each of these 4,000 sequences in very small human biopsies which yield 50-100 ng of poly A+ RNA. Utilizing a computerized scanning and image processing system (Augenlicht, et al., supra; U.S. Pat. No. 4,981,783), the relative level of expression of each of the 4,000 sequences was quantitated in each biopsy. The number of sequences screened was then reduced from 4,000 in a series of experiments. First, all 4,000 clones were evaluated in two biopsies of normal mucosa from individuals at low genetic risk for colon cancer; in two biopsies of benign adenoma from patients with the autosomal dominant disease familial polyposis; and in two biopsies of two different colon carcinomas. Sequences which were expressed at near background levels (low abundance) in all six of these biopsies, or which were modestly above background and showed no evidence of alteration in level of expression among the tissues, were eliminated. This left 379 clones for additional screening with other biopsies (Augenlicht, et al., supra).
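As a rough illustration of how a representation probability, such as the 80% figure cited above, depends on the number of clones selected, the following sketch uses the standard random-sampling relation P = 1 - (1 - f)^N, where f is the fraction of the mRNA population contributed by a given sequence and N is the number of clones picked at random. This relation, the function names, and the example abundance are assumptions introduced here for illustration only; the cited work may have used a different calculation.

```python
import math


def prob_represented(fraction: float, n_clones: int) -> float:
    """Probability that a sequence making up `fraction` of the mRNA
    population appears at least once among `n_clones` random selections.

    Assumes independent random sampling from a large library.
    """
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    return 1.0 - (1.0 - fraction) ** n_clones


def clones_needed(fraction: float, target_prob: float) -> int:
    """Smallest number of random selections giving at least `target_prob`
    of including a sequence of the given fractional abundance."""
    if not 0.0 < target_prob < 1.0:
        raise ValueError("target_prob must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target_prob) / math.log(1.0 - fraction))


if __name__ == "__main__":
    # Hypothetical abundance, chosen only to exercise the functions;
    # it is not a value taken from the cited studies.
    example_fraction = 1.0 / 2500.0
    for n in (400, 4000):
        p = prob_represented(example_fraction, n)
        print(f"N = {n}: P(represented) = {p:.3f}")
    print("clones for P >= 0.80:", clones_needed(example_fraction, 0.80))
```

The sketch treats a single sequence; the probability that every sequence of a given abundance class is represented would be lower and depends on how many such sequences exist.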
Several facts emerged from the large data base generated. First, the overall number of sequences which changed in expression between human colon carcinomas and normal human colonic mucosa was approximately 7%, which is the same order of magnitude as the extent of change seen in other systems cited above. Furthermore, the change was progressive, in that fewer alterations were seen in benign tumors (adenomatous polyps) than in malignant carcinomas when both were compared to normal tissue--the flat mucosa from patients at low genetic risk for colon cancer. Finally, the flat mucosa from patients with familial polyposis who are at very high risk for development of colon cancer showed much greater changes in gene expression when compared to low-risk normal mucosa than either the benign polyps that arise in this disease or the carcinomas. Hence, the high-risk tissue, having undergone many constitutive alterations in association with the inherited gene defect, may be primed for progression along any of many pathways to malignancy.
This same methodology and reference library were then used to analyze changes in gene expression in HT-29 and SW-480 colon carcinoma cell lines induced to differentiate in vitro with sodium butyrate. Again, many alterations in gene expression were found, but a comparison of the in vivo and in vitro data bases allowed selection of eight sequences whose relative levels of expression characterized colonic cells as either differentiated or fully transformed. Further, the quantitative extent of change in these sequences in vivo and in vitro was similar, with a linear correlation coefficient between the in vivo and in vitro data that was significant at the p<0.01 level. The in vivo data establish that the in vitro results are not tissue culture artifacts but do in fact bear a relationship to the human disease. Conversely, the in vitro data could be confirmed by standard Northern blot analysis, thus validating the scanning and image processing methodology, and the in vitro system reduces the complexity of cell types and human tissue variability for further analysis of these sequences.
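The comparison of in vivo and in vitro changes described above can be sketched as a Pearson linear correlation over the eight selected sequences, followed by a t-test on the coefficient. The source does not state which significance test was applied, so the t-test, the function names, and the eight pairs of values below are assumptions made purely for illustration; the values are hypothetical and are not data from the cited studies.

```python
import math


def pearson_r(x, y):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length, at least 3 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Two-tailed critical t value for p = 0.01 with 6 degrees of freedom
# (eight sequences, n - 2 = 6), taken from standard t tables.
T_CRIT_P01_DF6 = 3.707

if __name__ == "__main__":
    # Hypothetical changes in expression for eight sequences (arbitrary units).
    in_vivo = [-2.1, -1.4, -0.8, 0.5, 0.9, 1.6, 2.2, 3.0]
    in_vitro = [-1.8, -1.6, -0.5, 0.2, 1.1, 1.3, 2.5, 2.7]
    r = pearson_r(in_vivo, in_vitro)
    t = t_statistic(r, len(in_vivo))
    print(f"r = {r:.3f}, t = {t:.2f}")
    print("significant at p < 0.01:", abs(t) > T_CRIT_P01_DF6)
```

With only eight sequences, the coefficient must be large in magnitude (roughly 0.83 or more under this test) to reach the p<0.01 level.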
In accordance with the present invention, it has been discovered that expression of one of the aforementioned sequences can be used to determine the state of benign and malignant colon tumors as compared to the normal colonic mucosa, thus fulfilling an important need to monitor such tumors.