2.1. Functional Domains in Proteins
Many biological processes involve the specific binding of proteins to one another. Examples of such processes include signal transduction, transcription, DNA replication, cytoskeletal organization, and membrane transport. In many cases it has been shown that specific binding is mediated by small portions of the proteins involved and that these portions can function to a large extent independently of the rest of the proteins. Such independent portions of proteins, mediating specific recognition or binding of one protein by another, have come to be called “functional domains”. A variety of functional domains have been characterized, to varying levels of understanding. Some of these are described below.
Src homology 2 (SH2) domains are short (about 100 residues) amino acid sequences that were originally found in the non-membrane bound tyrosine kinase Src. Since then they have been shown to occur in about 20 other proteins. SH2 domains recognize certain phosphotyrosine-containing sites on proteins. Proteins containing SH2 domains participate in a variety of signalling pathways. For reviews discussing SH2 domains see Pawson, 1995, Nature 373:573–580; Cohen et al., 1995, Cell 80:237–248; Pawson and Gish, 1992, Cell 71:359–362; Koch et al., 1991, Science 252:668–674.
Src homology 3 (SH3) domains are another class of short amino acid sequences that were originally found by comparing the amino acid sequence of the Src protein with the sequences of Crk, Phospholipase C-γ, α-Spectrin, Myosin IB, Cdc25, and Fus1 (Lehto et al., 1988, Nature 334:388; Mayer et al., 1988, Nature 332:272–275; Stahl et al., 1988, Nature 332:269–272; Rodaway et al., 1989, Nature 342:624). In addition to Src, almost 30 proteins are known to contain SH3 domains, and these proteins perform a wide range of functions.
For reviews discussing SH3 domains see Pawson, 1995, Nature 373:573–580; Cohen et al., 1995, Cell 80:237–248; Pawson and Gish, 1992, Cell 71:359–362; Koch et al., 1991, Science 252:668–674.
SH3 domains have been shown to specifically bind certain proline-rich amino acid sequences (Chen et al., 1993, J. Am. Chem. Soc. 115:12591–12592; Ren et al., 1993, Science 259:1157–1161; Feng et al., 1994, Science 266:1241–1247; Yu et al., 1994, Cell 76:933–945; Sparks et al., 1994, J. Biol. Chem. 269:23853–23856; Sparks et al., 1996, Proc. Natl. Acad. Sci. USA 93:1540–1544). However, in general, the homology between different sequences that bind SH3 domains tends to be low.
This low homology would explain the specificity that has usually been observed for the interactions between SH3 domains and their natural ligands. Generally, a sequence that is identified by screening for binders to a particular SH3 domain will bind to that particular SH3 domain much more strongly than it binds to other SH3 domains. For example, Cicchetti et al., 1992, Science 257:803–806 probed a λgt11 cDNA expression library with a glutathione S-transferase fusion protein containing the 55 amino acid SH3 region of Abl and isolated two clones that produced proteins capable of specifically binding the Abl SH3 domain. Analysis of one of the clones uncovered the region of the encoded protein responsible for binding to the SH3 domain. This region, as part of a glutathione S-transferase fusion protein, bound the SH3 domain from Abl very strongly, the SH3 domain from Src less well, and the SH3 domains from Crk and neural Src very weakly.
Pleckstrin is the major substrate for Protein Kinase C in platelets. Two domains of about 100 amino acids in Pleckstrin have been found to have counterparts in a number of signal transduction and cytoskeletal proteins. These domains are known as Pleckstrin homology, or PH, domains (Haslam et al., 1993, Nature 363:309–310; Mayer et al., 1993, Cell 73:629–630). Although the sequence homology between PH domains from various proteins is low, structural studies have shown that PH domains fold into a similar conformation containing two antiparallel β sheets and a long C-terminal α helix (Gibson et al., 1994, Trends Biochem. Sci. 19:349–353). Among the proteins that have been found to have PH domains are a number of proteins with important roles in signal transduction or cytoskeletal architecture, e.g., Spectrin, Dynamin, Phospholipase C-γ, Btk, RasGAP, mSOS-1, Rac, Akt.
Leucine zippers consist of alpha helical regions of proteins in which a leucine residue appears at every seventh position along the helix. The leucines interdigitate with leucines from the leucine zipper of a different protein or another molecule of the same protein, leading to dimerization of the proteins containing the leucine zippers. Leucine zippers have been found in a number of proteins that are believed to function as transcription factors, e.g., C/EBP, Myc, Fos, Jun, GCN4. In many of these proteins, dimerization through leucine zippers has been shown to be necessary for the DNA binding activity of the transcription factor.
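The seven-residue spacing described above can be illustrated with a short Python sketch. This is only a minimal illustration under stated assumptions: the function name, the min_repeats threshold of four, and the example sequence are hypothetical and do not come from any cited reference, and the check does not test whether a region is actually alpha helical.

```python
def heptad_leucine_starts(seq, min_repeats=4):
    """Return 0-based start positions where a leucine (L) occurs at
    every seventh residue for at least `min_repeats` consecutive heptads.

    Hypothetical helper for illustration only; it checks spacing,
    not helical structure.
    """
    seq = seq.upper()
    # Distance from the first leucine to the last one in the run.
    span = 7 * (min_repeats - 1)
    return [
        i
        for i in range(len(seq) - span)
        if all(seq[i + 7 * k] == "L" for k in range(min_repeats))
    ]


# Synthetic sequence (not a real protein): L followed by six alanines, four times.
synthetic = ("L" + "A" * 6) * 4
starts = heptad_leucine_starts(synthetic)
```

For the synthetic sequence, the only run of four leucines spaced seven residues apart begins at the first residue.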
The binding of leucine zippers exhibits specificity in that some leucine zippers preferentially bind to certain other leucine zippers. For example, the Jun-Fos heterodimer formed by the binding of the leucine zippers of Fos and Jun forms in preference to a Jun-Jun homodimer formed by the binding of the leucine zippers of two Jun proteins.
Fas/APO-1(CD95) is a member of a class of transmembrane receptors that have been shown to be involved in the phenomenon of programmed cell death or apoptosis (Itoh et al., 1991, Cell 66:233–243). The tumor necrosis factor receptor 1 (TNFR-1) is also a member of this class (Baglioni, C., 1992, “The Molecules and Their Emerging Roles in Medicine,” in Tumor Necrosis Factors, B. Beutler, ed. (New York: Raven Press)). Itoh, N. and Nagata, S., 1993, J. Biol. Chem. 268:10932–10937 have shown that certain amino acid sequences in the cytoplasmic domain of Fas/APO-1(CD95) are required for the programmed cell death response mediated by this receptor. Tartaglia et al., 1993, Cell 74:845–853 proposed that a similar region in TNFR-1 also is responsible for programmed cell death. This region of similarity between Fas/APO-1(CD95) and TNFR-1 has come to be called the cell death domain.
Three groups have used the yeast two-hybrid system to clone genes whose products specifically bind to the cell death domains of Fas/APO-1(CD95) and TNFR-1 (Hsu et al., 1995, Cell 81:495–504; Chinnaiyan et al., 1995, Cell 81:505–512; Stanger et al., 1995, Cell 81:513–523). These genes were shown to induce apoptosis when overexpressed in certain cell types, a result that argues that they are intracellular transducers of death signals from Fas/APO-1(CD95) and TNFR-1.
2.1.1. WW Domains
The WW domain is a small functional domain found in a large number of proteins from a variety of species including humans, nematodes, and yeast. Its name is derived from the observation that two tryptophan residues, one in the amino terminal portion of the WW domain and one in the carboxyl terminal portion, are almost invariably conserved. At about 30 to 40 amino acids in length, it is quite small for a functional domain, most of which tend to be from 50 to 150 residues long. Often a WW domain will be flanked by stretches of amino acids rich in histidine or cysteine; these stretches might be metal-binding sites. The center of WW domains is quite hydrophobic; however, sprinkled throughout the rest of the domain are a high number of charged residues. These features are characteristic of functional domains involved in protein-protein interactions (Bork and Sudol, 1994, Trends in Biochem. Sci. 19:531–533).
Based upon their study of various WW domains, André and Springael, 1994, Biochem. Biophys. Res. Comm. 205:1201–1205 (“André and Springael”) proposed the following consensus sequence for WW domains:
WX7G(K/R)X1(Y/F)(Y/F)X1(N/D)X2(T/S)(K/R)X1(T/S)(T/Q/S)WX2P (SEQ ID NO:2)
where X represents any amino acid and bold letters represent highly conserved amino acids. André and Springael's analysis of WW domains led them to conclude that WW domains lack α-helical content, instead possessing a central β-strand region flanked by unstructured regions. Other studies predict a structure of β-strands containing charged residues flanking a hydrophobic core composed of four aromatic residues (Chen and Sudol, 1995, Proc. Natl. Acad. Sci. USA 92:7819–7823, and references cited therein).
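As an illustration, the consensus sequence above can be transcribed into a regular expression and used to scan a protein sequence. The following Python sketch assumes that the number after each X gives the count of consecutive unspecified residues (so that X7 means seven residues of any kind) and that X is limited to the 20 standard amino acids. The names WW_CONSENSUS and find_ww_consensus and the test sequence are hypothetical and are not taken from André and Springael.

```python
import re

# Any one of the 20 standard amino acids (assumed meaning of "X").
X = "[ACDEFGHIKLMNPQRSTVWY]"

# Transcription of the consensus, reading "Xn" as n consecutive residues.
WW_CONSENSUS = re.compile(
    "W" + X + "{7}"          # W X7
    + "G[KR]" + X            # G (K/R) X1
    + "[YF][YF]" + X         # (Y/F)(Y/F) X1
    + "[ND]" + X + "{2}"     # (N/D) X2
    + "[TS][KR]" + X         # (T/S)(K/R) X1
    + "[TS][TQS]"            # (T/S)(T/Q/S)
    + "W" + X + "{2}P"       # W X2 P
)


def find_ww_consensus(seq):
    """Return (0-based start, matched text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in WW_CONSENSUS.finditer(seq.upper())]


# Synthetic sequence built to satisfy the pattern; it is not a real WW domain.
core = "WAAAAAAAGKAYYANAATKATTWAAP"
hits = find_ww_consensus("MM" + core + "MM")
```

Read this way, the consensus spans 26 residues, shorter than the 30 to 40 residues stated for the full domain. A strict match of this kind would miss WW domains that deviate from the consensus at any specified position, so it is a rough filter rather than a definitive test.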
The WW domain has been found in a wide variety of proteins of varying function. Despite this diversity of function, it appears that most proteins containing WW domains for which a function is known have something to do with either cell signalling and growth regulation or the organization of the cytoskeleton.
For example, the WW domain-containing protein dystrophin belongs to a family of cytoskeletal proteins that includes α-actinin and β-spectrin. Mutations in dystrophin are responsible for Duchenne and Becker muscular dystrophies. The dystrophin gene gives rise to a family of alternatively spliced transcripts. The longest of these encodes a protein having four domains: (1) a globular, actin-binding region; (2) 24 spectrin-like repeats; (3) a cysteine-rich Ca2+ binding region; and (4) a carboxyl terminal globular region. A short stretch of the dystrophin protein, after the spectrin-like repeats and before the Ca2+ binding region, contains a WW domain. This WW domain is in an area that has been shown to bind β-dystroglycan. This suggests that WW domains may be involved in protein-protein interactions (Bork and Sudol, 1994, Trends in Biochem. Sci. 19:531–533).
Utrophin, a protein having 70% sequence homology to dystrophin, and, like dystrophin, capable of forming tetramers via its spectrin-like repeats, also possesses a WW domain. Utrophin and dystrophin are believed to be involved in membrane stability and the transmission of contractile forces to the extracellular environment (Bork and Sudol, 1994, Trends in Biochem. Sci. 19:531–533).
YAP is a protein that was discovered by virtue of its binding to the SH3 domain of the product of the proto-oncogene Yes (Sudol, 1994, Oncogene 9:2145–2152). Murine YAP was found to have two WW domains; interestingly, chicken and human YAP each have only a single WW domain (Sudol et al., 1995, J. Biol. Chem. 270:14733–14741). Chen and Sudol, 1995, Proc. Natl. Acad. Sci. USA 92:7819–7823 screened a cDNA expression library with bacterially produced glutathione S-transferase fusion proteins of the WW domain from YAP. They identified and isolated two proteins from the library (WBP-1 and WBP-2) that specifically bound the YAP WW domain. Comparison of the amino acid sequences of WBP-1 and WBP-2 revealed a homologous proline-rich region in each protein. The proline-rich regions contained the shared motif PPPPY (SEQ ID NO:3). Chen and Sudol then showed that as few as ten residues containing this motif conferred upon a fusion protein the ability to specifically bind the YAP WW domain. This binding was highly specific; the motif bound to the YAP WW domain but not to the WW domain from dystrophin or to a panel of SH3 domains.
Nedd-4 is a protein that possesses three WW domains. In mouse, Nedd-4 seems to play a role in embryonic development and the differentiation of the central nervous system (Kumar et al., 1992, Biochem. Biophys. Res. Comm. 185:1155–1161).
RSP5 is a protein of yeast that is involved in the phenomenon of nitrogen catabolite inactivation whereby a number of permeases that import nitrogenous compounds into the cell are inactivated when yeast are exposed to a good nitrogen source such as NH4+. RSP5 probably interacts with the transcription factor SPT3 since certain alleles of RSP5 can complement mutations in SPT3 (Eisenmann et al., 1992, Genes Dev. 6:1319–1331).
RSP5 contains three WW domains in its amino terminus. RSP5 appears to be a homolog of the vertebrate protein Nedd-4. The six WW domains of RSP5 and Nedd-4, taken together, share 30% amino acid sequence identity and 50% similarity. The carboxyl terminal domains of both RSP5 and Nedd-4 are homologous to the carboxyl terminal domain of E6-AP, a human ubiquitin-protein ligase (André and Springael). A region of RSP5 known as HECT can form a high energy thioester bond with ubiquitin, arguing that RSP5 is a ubiquitin-protein ligase (Scheffner et al., 1995, Cell 75:495–505; Huibregtse et al., 1995, Proc. Natl. Acad. Sci. USA 92:2563–2567).
Another yeast protein, ess1, contains a WW domain and is thought to be involved in cytokinesis and/or cell separation (Hanes et al., 1989, Yeast 5:55–72).
A search of protein databases, using the WW domains of Nedd-4 and RSP5, identified two proteins of unknown function, YKL012W from Saccharomyces cerevisiae and Z22176 from Caenorhabditis elegans, each containing two WW domains at its amino terminus (André and Springael).
Among other proteins having WW domains, the rat transcription factor FE65 possesses an amino terminal activation region that includes a WW domain (Bork and Sudol, 1994, Trends in Biochem. Sci. 19:531–533). The human protein kiaa93 has four WW domains, shares other regions of sequence similarity with RSP5, and may be the human version of mouse Nedd-4 (Hofmann and Bucher, 1995, FEBS Lett. 358:153–157). The human protein HUMORF1, although of unknown function, has a roughly 350 amino acid region that is homologous to GTPase-activating proteins (André and Springael).
Citation of a reference hereinabove shall not be construed as an admission that such is prior art to the present invention.