Transcription factors play a major role in cellular function by inducing the transcription of specific mRNAs. Transcription factors, in turn, are controlled by distinct signaling molecules. The STATs (Signal Transducers and Activators of Transcription) constitute a family of transcription factors necessary to activate distinct sets of target genes in response to cytokines and growth factors [Darnell et al., WO 95/08629 (1995)]. The STAT proteins are activated in the cytoplasm by phosphorylation on a single tyrosine residue [Darnell et al., Science 264:1415 (1994)]. The responsible kinases are either ligand-activated transmembrane receptors with intrinsic tyrosine kinase activity, such as EGF- or PDGF-receptors, or cytokine receptors that lack intrinsic kinase activity but have associated JAK kinases, such as those for interferons and interleukins [Ihle, Nature 377:591-594 (1995)]. One distinctive characteristic of the STAT proteins is their apparent lack of requirement for changes in second messenger concentrations, e.g., cAMP or Ca²⁺. Presently, there are seven known mammalian STAT family members. The recent discovery of a Drosophila STAT protein suggests that these proteins have played an important role in signal transduction since the early stages of our evolution [Darnell, PNAS 94:11767-11769 (1997)]. Each STAT protein contains a Src homology 2 domain (SH2 domain). When activated, the STAT proteins are phosphorylated and form homo- or heterodimeric structures in which the phosphotyrosine of one partner binds to the SH2 domain of the other. The reciprocal SH2-phosphotyrosine interactions between two STAT proteins result in the formation of an active dimer that translocates to the nucleus and activates specific gene expression [Darnell et al., Science 264:1415 (1994)] by binding to a canonical recognition site for the STAT dimer. This canonical recognition site encompasses 9-10 base pairs (TTCN₃₋₄GAA) of DNA [Horvath et al., Genes & Devel. 9:984 (1995); Seidel et al., Proc. Natl. Acad. Sci. USA 92:3041 (1995); Ihle, Cell 84:331 (1996); Mikita et al., Mol. Cell. Biol. 16:5811 (1996)]. Analysis of the binding of activated STATs to DNA targets has revealed that the STAT binding sites can extend over two or more adjacent canonical sites [Xu et al., Science 273:750 (1996); Meier and Groner, Mol. Cell. Biol. 14:128 (1994); Symes et al., Molecular Endocrinology 8:1750 (1994); Dajee et al., Molecular Endocrinology 10:171 (1996); John et al., EMBO J. 15:5627 (1996)].
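By way of illustration only, the canonical recognition site described above (TTC, followed by three or four arbitrary bases, followed by GAA) can be located in a nucleotide sequence with a simple pattern search. The following is a minimal sketch under stated assumptions: the function name `find_stat_sites` and the example sequence are hypothetical and do not derive from any cited reference, and the search considers only the strand supplied and only unambiguous A/C/G/T bases.

```python
import re

# Canonical STAT site: TTC, then 3 or 4 arbitrary bases, then GAA (9-10 bp).
# The pattern is wrapped in a lookahead so that overlapping sites are reported.
STAT_SITE = re.compile(r"(?=(TTC[ACGT]{3,4}GAA))")


def find_stat_sites(sequence):
    """Return (start_index, site) for every canonical site in `sequence`.

    Hypothetical helper: indices are 0-based and refer to the given strand.
    """
    sequence = sequence.upper()
    return [(m.start(), m.group(1)) for m in STAT_SITE.finditer(sequence)]


# Made-up example sequence containing one 9 bp site and one 10 bp site.
example = "AA" + "TTCCCGGAA" + "GG" + "TTCTCAGGAA" + "TT"
for start, site in find_stat_sites(example):
    print(start, site, len(site))
```

Two or more hits lying close together in the output would correspond to the adjacent canonical sites noted above; any spacing threshold used to call sites "adjacent" would be a further assumption.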
STAT proteins serve as direct messengers between the cytokine or growth factor receptor present on the cell surface and the cell nucleus. However, since each cytokine and growth factor produces a specific cellular effect by activating a distinct set of genes, the means by which such a limited number of STAT proteins mediate this result remains a mystery. Indeed, at least twenty-five different ligand-receptor complexes signal the nucleus through the seven known mammalian STAT proteins [Yan et al., Cell 84:421-430 (1996)].
There is increasing evidence that mammalian transcription factors activate transcription and achieve biological specificity by interactions with other transcription factors, trans-activators or the general transcription machinery [McKnight, Genes & Development 10:367 (1996); Roeder, Trends in Biochemical Sciences 21:327 (1996)]. Although the molecular basis for these phenomena is poorly understood, direct protein:protein interactions among multiple promoter-bound proteins appear to mediate this synergistic activation [Tjian and Maniatis, Cell 77:5 (1994)].
In the case of the STATs, a small N-terminal domain has been shown to mediate a number of important protein:protein interactions that influence transcriptional outcome [Leung et al., Science 273:750 (1996); Vinkemeier et al., EMBO J. 15:5616 (1996)]. This domain allows cooperative interactions between STAT dimers bound to adjacent target sites on DNA, leading to a drastically prolonged half-life of the protein-DNA complex [Vinkemeier et al., EMBO J. 15:5616 (1996)]. Functional assays exploring the induction of the hepatic Spi 2.1 gene revealed the necessity for cooperative STAT binding to two adjacent recognition sites for a full growth hormone response [Bergad et al., J. Biol. Chem. 270:24903 (1995)]. In addition, it was observed that these cooperative contacts affect the binding site selection of different STATs on a natural promoter that contains multiple potential STAT recognition sites [Xu et al., Science 273:750 (1996)]. Each of the oligomerized STAT-1, -4, and -5 dimers was shown to bind to a different combination of canonical sites [Xu et al., Science 273:750 (1996)]. Deletion of the N-terminal ~100 residues of STAT-1 and STAT-4 abolishes cooperative binding to DNA [Xu et al., Science 273:750 (1996); Vinkemeier et al., EMBO J. 15:5616 (1996)]. The truncated protein fully retains binding to a single target site as a dimer, suggesting that the N-terminal domain is dispensable for dimer formation and DNA binding [Xu et al., Science 273:750 (1996); Vinkemeier et al., EMBO J. 15:5616 (1996)], but is necessary for interaction between STAT dimers and binding site discrimination [Xu et al., Science 273:750 (1996)]. Also, the N-domain of STAT-1 is required for interaction between STAT-1 and the transcriptional co-activator protein CBP, a large (~2500 amino acids) polypeptide with transacetylase activity [Zhang et al., Proc. Natl. Acad. Sci. USA 93:15092 (1996)]. Additionally, the amino-terminal region of STAT-2 is involved in binding to the intracellular region of the interferon-α receptor [Leung et al., Mol. Cell. Biol. 15:1312 (1995)].
Therefore, there is a need to obtain agonists and antagonists that can modulate the effect of STAT proteins during specific gene activation. In particular, there is a need to obtain drugs that will directly interact with the important N-terminal domain of STAT proteins. Unfortunately, identification of such drugs has heretofore relied on serendipity and/or systematic screening of large numbers of natural and synthetic compounds. A far superior method of drug screening relies on structure-based drug design. In this case, the three-dimensional structure of a protein or protein fragment is determined and potential agonists and/or potential antagonists are designed with the aid of computer modeling [Bugg et al., Scientific American, December:92-98 (1993); West et al., TIPS 16:67-74 (1995)]. However, heretofore the three-dimensional structure of a STAT protein or fragment thereof has remained unknown, essentially because no such protein crystals had been produced of sufficient quality to allow the required X-ray crystallographic data to be obtained.
Therefore, there is presently a need for obtaining an N-terminal STAT domain fragment that can be crystallized to form a crystal of sufficient quality to allow such crystallographic data to be obtained. Further, there is a need for such crystals. Furthermore, there is a need for the determination of the three-dimensional structure of such crystals. Finally, there is a need for procedures for related structure-based drug design based on such crystallographic data.
The citation of any reference herein should not be construed as an admission that such reference is available as "Prior Art" to the instant application.