Much of biological regulation occurs at the level of transcription initiation. Genes contain promoter sequences which are bound by transcriptional activators and repressors (Struhl, K. (1995) Annu Rev Genet 29, 651-74; Ptashne, M. and Gann, A. (1997) Nature 386, 569-77). Activators recruit the transcriptional initiation machinery, which for protein-coding genes consists of RNA polymerase II and at least 50 additional components (Orphanides et al. (1996) Genes Dev 10, 2657-83; Roeder, R. G., (1996) Trends Biochem Sci 21, 327-35; Greenblatt, J. (1997) Curr Opin Cell Biol 9, 310-9; Hampsey, M. (1998) Microbiology and Molecular Biology Reviews 62, 465-503; Myer, V. and Young, R. A. (1998) J. Biol. Chem. 273, 27757-27760). The transcriptional initiation machinery includes factors which bind to DNA, cyclin-dependent kinases which regulate polymerase activity, and acetylases and other enzymes which modify chromatin (Burley, S. K., and Roeder, R. G. (1996) Annu Rev Biochem 65, 769-99; Kingston, R. E. et al., Genes and Development 10, 905-20; Roth, S. Y. and Allis, C. D. (1996) Cell 87, 5-8; Steger, D. J. and Workman, J. L. (1996) Bioessays 18, 875-84; Tsukiyama, T. and Wu, C. (1997) Curr. Opin. Genet. Dev. 7, 182-91; Hengartner C. J. et al., (1998) Genes and Development 9, 897-910).
The understanding of eukaryotic gene expression remains limited in several ways. The complete set of transcriptional regulators has yet to be identified. How these regulators interact with and regulate components of the transcriptional machinery is not yet clear. The functions of just a fraction of the components of the transcriptional machinery are understood, and then only with respect to a small set of genes. Cells must adjust genome expression to accommodate changes in their environment and in their programs of growth control and development, but precisely how coordinate remodeling of genome expression is accomplished by signal transduction pathways or by the cell cycle clock has yet to be learned.
Described herein are results of genome-wide expression analysis, which was carried out to identify the key components of the transcription initiation machinery in a eukaryote, in order to dissect the regulatory circuitry of the genome. Key components of the transcription initiation machinery (that is, of the RNA polymerase II transcriptional machinery) were identified in yeast. The requirement for each key component was assessed using high density oligonucleotide arrays (HDAs) (Wodicka, L. et al. (1997) Nat. Biotech., 15, 1359-67) to determine the genome-wide effects of mutations in components of the transcriptional machinery. At any given promoter, the transcriptional machinery might include any or all of the following, among others: the RNA polymerase II core enzyme, the general transcription factors (GTFs), the core Srb/mediator complex, the Srb10 CDK complex, the Swi/Snf complex and the SAGA complex. The components of the transcription apparatus which were the focus of this study were selected because they are among the key subunits of the major multiprotein complexes which have roles in transcription of protein-coding genes. One or more subunits of each of these components were investigated for their roles in genome-wide gene expression through the use of mutations which affect either the function or the physical presence of the subunit.
Results showed that components of the RNA polymerase II holoenzyme, the general transcription factor TFIID and the SAGA chromatin modification complex have roles in expression of distinct sets of genes. They further showed that the Rpb1 subunit of core RNA polymerase II, the Srb4 subunit of the Srb/mediator complex and the Kin28 subunit of TFIIH are generally required for transcription of protein-coding genes. Two components (Tfa1, Taf17) were found to be required for more than half, but not all, genes. Most components investigated thus far (Srb5, Med6, Srb10, Swi2, Taf145, Gcn5) were necessary for transcription of less than a fifth of the genome. In this latter group, the evidence indicates that Srb5, Med6, and Taf145 have predominantly positive roles, Srb10 has an almost exclusively negative role, and Swi2 and Gcn5 can have either a positive or a negative role in gene expression.
Work described herein shows that distinct sets of genes require the function of distinct components of the transcription machinery. Thus, coordinate regulation of large sets of genes can be accomplished by affecting the function of specific components of the transcription machinery. It follows that functional relationships exist among some genes within the sets of genes whose regulation is accomplished in this manner. Results described herein also revealed an unanticipated level of regulation that is available to the cell in addition to that provided by gene-specific regulators: the expression of specific sets of genes can be regulated by affecting the availability or function of a specific component of the general machinery. Results also showed a novel mechanism for coordinate regulation of specific sets of genes when cells encounter nutrient deprivation or limitation, and provided evidence that the ultimate targets of signal transduction pathways can be identified within the initiation apparatus.
In one embodiment, the present invention is a method of determining regulatory interrelationships among genes in a cell. The method comprises the steps of:
(a) hybridizing a transcription indicator of a test cell to a set of nucleic acid probes;
(b) hybridizing a transcription indicator of a control cell to the set of nucleic acid probes,
wherein the transcription indicators are selected from the group consisting of mRNA, cDNA and cRNA, wherein the test cell contains a mutant component of the general transcription machinery and the control cell is the wild-type isogenic counterpart of the test cell;
(c) detecting amounts of the transcription indicators which hybridize to each of said set of nucleic acid probes; and
(d) identifying a gene as a member of the regulatory pathway of the general transcription factor if hybridization of the transcription indicator of the test cell to a probe comprising a portion of the gene is higher or lower than hybridization using a transcription indicator from the control cell.
In various embodiments of the method, the difference in hybridization between the control and the test cell varies. There can be, for example, at least a 2-fold difference, at least a 3-fold difference, at least a 5-fold difference or at least a 10-fold difference in hybridization between the control and the test cell. In various embodiments of the method, the mutant component of the general transcription machinery is a mutant general transcription factor, such as a temperature sensitive mutant, a point mutant or a deletion mutant. The mutant component of the general transcription machinery can be, for example, a component of RNA polymerase II holoenzyme. The mutant component of the general transcription machinery can be a component necessary to reconstitute promoter-dependent transcription in vitro with core RNA polymerase II.
Also the subject of this invention is a pair of isogenic eukaryotic cells which comprises a test cell which contains a mutant component of the general transcription machinery and a control cell which is the wild-type isogenic counterpart of the test cell. Such pairs can include a test cell in which the mutant component of the general transcription machinery is a mutant general transcription factor. They also can include a test cell in which the mutant component of the general transcription machinery is a temperature sensitive mutant, a point mutant or a deletion mutant. In such pairs, the mutant component of the general transcription machinery can be a component of RNA polymerase II holoenzyme; the mutant component of the general transcription machinery can be one which is necessary to reconstitute promoter-dependent transcription in vitro with core RNA polymerase II.
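Steps (c) and (d) of the hybridization method, together with the fold-difference thresholds listed above, can be illustrated with a minimal Python sketch. It is not the analysis procedure actually used in the work described herein. The function name `call_dependent_genes`, the dictionary inputs, and the `floor` value used to avoid division by zero are all assumptions introduced for illustration, and the gene names in the usage note are placeholders.

```python
from typing import Dict, Tuple


def call_dependent_genes(
    test: Dict[str, float],
    control: Dict[str, float],
    min_fold: float = 2.0,
    floor: float = 1.0,
) -> Dict[str, Tuple[str, float]]:
    """Flag genes whose hybridization signal differs between test and control.

    test / control map a gene name to its detected hybridization signal
    (step (c)). A gene is reported (step (d)) when the signals differ by at
    least `min_fold` in either direction. `floor` is a hypothetical minimum
    signal that keeps the ratio defined for very low or zero readings.
    """
    calls: Dict[str, Tuple[str, float]] = {}
    for gene, test_signal in test.items():
        if gene not in control:
            continue  # no control measurement, so no comparison is possible
        t = max(test_signal, floor)
        c = max(control[gene], floor)
        ratio = t / c
        if ratio >= min_fold:
            calls[gene] = ("higher", ratio)   # signal higher in the mutant
        elif ratio <= 1.0 / min_fold:
            calls[gene] = ("lower", c / t)    # signal lower in the mutant
    return calls
```

With placeholder signals such as `test = {"GENE_A": 100.0, "GENE_B": 400.0}` and `control = {"GENE_A": 400.0, "GENE_B": 100.0}`, the default 2-fold threshold reports GENE_A as lower and GENE_B as higher in the test cell. Passing `min_fold=3.0`, `5.0` or `10.0` corresponds to the stricter embodiments.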
The invention further relates to a method of studying the effects of drugs on cells. The method comprises:
(a) contacting a cell with a drug; and
(b) determining the effect of the drug on the cell by assessing expression of one or more of the genes which are determined to be members of the regulatory pathway of the general transcription factor according to methods described herein.
A further embodiment of the invention is a method of identifying a cellular regulatory circuit which employs a component of a subcomplex of regulatory proteins within the RNA polymerase II holoenzyme, referred to as the transcription initiation apparatus.
The method comprises:
(a) comparing the genome expression signature observed during cellular responses to environmental or other stimuli with the genome expression signature produced by a defect in the transcription initiation apparatus; and
(b) determining differences between the two genome expression signatures and relating the differences to the defect in the transcription initiation apparatus, thereby identifying a component of the transcription initiation apparatus which is responsible for regulation of genes in the cells.
In various embodiments the cellular regulatory circuit is a yeast cell regulatory circuit, a primate (e.g., human) or other vertebrate cell regulatory circuit or a non-vertebrate cell regulatory circuit.
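The signature comparison in steps (a) and (b) can be sketched in Python as follows. The sketch assumes, purely for illustration, that each genome expression signature is reduced to a mapping from gene name to direction of change (+1 for increased, -1 for decreased expression) and that similarity is scored as the fraction of changed genes that move in the same direction in both signatures. The source does not specify either choice, and the names `compare_signatures`, `rank_components`, `component_X` and `component_Y` are hypothetical.

```python
from typing import Dict, List, Tuple

Signature = Dict[str, int]  # gene name -> +1 (increased) or -1 (decreased)


def compare_signatures(stimulus: Signature, defect: Signature) -> Tuple[float, List[str]]:
    """Return (similarity score, genes on which the two signatures differ)."""
    shared = set(stimulus) & set(defect)
    # Genes that change in the same direction in both signatures.
    concordant = {g for g in shared if stimulus[g] == defect[g]}
    union = set(stimulus) | set(defect)
    # Genes changed in only one signature, or changed in opposite directions.
    differing = sorted(union - concordant)
    score = len(concordant) / len(union) if union else 0.0
    return score, differing


def rank_components(stimulus: Signature, defects: Dict[str, Signature]) -> List[str]:
    """Order components by how closely their defect signature matches the stimulus."""
    scored = [(compare_signatures(stimulus, sig)[0], name) for name, sig in defects.items()]
    # Highest score first; ties are broken alphabetically by component name.
    return [name for _, name in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]
```

Under these assumptions, the component whose defect signature ranks first is a candidate for the component of the transcription initiation apparatus involved in the response, and the list of differing genes indicates where the two signatures diverge. Other similarity measures, such as a correlation over expression ratios, could be substituted.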
Thus, genome-wide expression analysis provides insights into the transcriptional regulatory circuitry of eukaryotic cells, as well as the foundation and context for interpreting mechanistic studies of the control of gene expression.