Specific protein-protein interactions are fundamental to most cellular functions. Polypeptide interactions are involved in, inter alia, formation of functional transcription complexes, signal transduction pathways, cytoskeletal organization (e.g., microtubule polymerization), polypeptide hormone receptor-ligand binding, organization of multi-subunit enzyme complexes, and the like.
Investigation of protein-protein interactions under physiological conditions has been problematic. Considerable effort has been made to identify proteins that bind to proteins of interest. Typically, these interactions have been detected by using co-precipitation experiments in which an antibody to a known protein is mixed with a cell extract and used to precipitate the known protein and any proteins which are stably associated with it. This method has several disadvantages, such as: (1) it only detects proteins which are associated in cell extract conditions rather than under physiological, intracellular conditions, (2) it only detects proteins which bind to the known protein with sufficient strength and stability for efficient co-immunoprecipitation, (3) it may not be able to detect oligomers of the target, and (4) it fails to detect associated proteins which are displaced from the known protein upon antibody binding. Additionally, the precipitation techniques at best provide a molecular weight as the sole identifying characteristic. For these reasons and others, improved methods for identifying proteins which interact with a known protein have been developed.
One approach has been to use a so-called interaction trap system (also referred to as the "two-hybrid assay") based in yeast to identify polypeptide sequences which bind to a predetermined polypeptide sequence present in a fusion protein (Fields and Song (1989) Nature 340:245). This approach identifies protein-protein interactions in vivo through reconstitution of a eukaryotic transcriptional activator.
The interaction trap systems of the prior art are based on the finding that most eukaryotic transcription activators are modular. Brent and Ptashne showed that the activation domain of yeast GAL4, a yeast transcription factor, could be fused to the DNA binding domain of E. coli LexA to create a functional transcription activator in yeast (Brent et al. (1985) Cell 43:729-736). There is evidence that transcription can be activated through the use of two functional domains of a transcription factor: a domain that recognizes and binds to a specific site on the DNA and a domain that is necessary for activation. The transcriptional activation domain is thought to function by contacting other proteins involved in transcription. The DNA-binding domain appears to function to position the transcriptional activation domain on the target gene that is to be transcribed. These and similar experiments (Keegan et al. (1986) Science 231:699-704) formally define activation domains as portions of proteins that activate transcription when brought to DNA by DNA binding domains. Moreover, it was discovered that the DNA binding domain does not have to be physically on the same polypeptide as the activation domain, so long as the two separate polypeptides interact with each other (Ma et al. (1988) Cell 55:443-446).
Fields and his coworkers made the seminal suggestion that protein interactions could be detected if two potentially interacting proteins were expressed as chimeras. They devised a method based on the properties of the yeast Gal4 protein, which consists of separable domains responsible for DNA-binding and transcriptional activation. Polynucleotides encoding two hybrid proteins, one consisting of the yeast Gal4 DNA-binding domain fused to a polypeptide sequence of a known protein and the other consisting of the Gal4 activation domain fused to a polypeptide sequence of a second protein, are constructed and introduced into a yeast host cell. Intermolecular binding between the two fusion proteins reconstitutes the Gal4 DNA-binding domain with the Gal4 activation domain, which leads to the transcriptional activation of a reporter gene (e.g., lacZ, HIS3) which is operably linked to a Gal4 binding site.
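The logic of the two-hybrid readout described above can be illustrated with a minimal sketch. It is a toy model only, not an implementation of any published system: the names `Fusion`, `reporter_active`, and the proteins "X", "Y", and "Z" are hypothetical, and binding is treated as a simple yes/no property rather than a matter of affinity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fusion:
    """A hybrid protein: one Gal4 domain fused to a partner polypeptide."""
    domain: str   # "DBD" (DNA-binding domain) or "AD" (activation domain)
    partner: str  # hypothetical name of the fused polypeptide


def reporter_active(bait: Fusion, prey: Fusion, interactions: set) -> bool:
    """Return True when the reporter would be transcribed in this toy model.

    The reporter is on only if the bait carries the DNA-binding domain,
    the prey carries the activation domain, and the two fused partners
    bind each other (assumed here to be a known, binary fact).
    """
    if bait.domain != "DBD" or prey.domain != "AD":
        return False
    return frozenset((bait.partner, prey.partner)) in interactions


# Hypothetical interaction table: protein X binds protein Y.
known_interactions = {frozenset(("X", "Y"))}

bait = Fusion("DBD", "X")
print(reporter_active(bait, Fusion("AD", "Y"), known_interactions))
print(reporter_active(bait, Fusion("AD", "Z"), known_interactions))
```

In this sketch neither fusion activates the reporter alone; only the pairing of a DNA-binding fusion with an interacting activation-domain fusion does, which is the property the assay exploits.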
All yeast-based interaction trap systems in the art share common elements (Chien et al. (1991) PNAS 88:9578-82; Durfee et al. (1993) Genes & Development 7:555-69; Gyuris et al. (1993) Cell 75:791-803; and Vojtek et al. (1993) Cell 74:205-14). All use (1) a plasmid that directs the synthesis of a "bait": a known protein which is brought to DNA by being fused to a DNA binding domain, (2) one or more reporter genes ("reporters") with upstream binding sites for the bait, and (3) a plasmid that directs the synthesis of proteins fused to activation domains and other useful moieties ("prey"). All current systems direct the synthesis of proteins that carry the activation domain at the amino terminus of the fusion, facilitating the expression of open reading frames encoded by, for example, cDNAs.
The prior art systems differ in their specifics. These details are typically relevant to their successful use. Baits differ in their DNA binding domains. For example, some systems use baits that contain native E. coli LexA repressor protein (Durfee et al. (1993) Genes & Development 7:555-69; Gyuris et al. (1993) Cell 75:791-803). LexA binds tightly to appropriate operators (Golemis et al. (1992) Mol. Cell. Biol. 12:3006-3014; Ebina et al. (1983) J. Biol. Chem. 258:13258-13261), and carries a dimerization domain at its C terminus (Brent R. (1982) Biochimie 64:565-569; Little J et al. (1982) Cell 29:11-22; and Thliveris et al. (1991) Biochimie 73:449-455). In yeast, LexA and most LexA derivatives enter the nucleus, but are not necessarily nuclear localized. Others use baits that contain a portion of the yeast GAL4 protein (Chien et al. (1991) PNAS 88:9578-82; Durfee et al. (1993) Genes & Development 7:555-69; and Harper et al. (1993) Cell 75:805-16). This portion, comprising residues 1-147, is sufficient to bind tightly to appropriate DNA binding sites, localize fused proteins to the nucleus, and direct dimerization; it also contains a domain that weakly activates transcription in mammalian cell extracts in vitro, and it is thus conceivable that this domain may increase transcription resulting from weakly interacting proteins.
Reporter genes differ in the phenotypes they confer. The products of some reporter genes (e.g., HIS3, LEU2) allow cells expressing them to be selected by growth on appropriate media, while the products of others (e.g., lacZ) allow cells expressing them to be visually screened. Reporters also differ in the number and affinity of upstream binding sites (e.g., LexA operators) for the bait, and in the position of these sites relative to the transcription startpoint (Gyuris et al., supra). Finally, they differ in the number of molecules of the reporter gene product necessary to score the phenotype. These differences affect the strength of the protein interactions the reporters can detect.
Preys differ in the activation domains they carry, and in whether they contain other useful moieties such as nuclear localization sequences and epitope tags. Some activation domains are stronger than others. Although strong activation domains should allow detection of weaker interactions, their expression can also harm the cell due to poorly understood transcriptional effects, either by titration of cofactors necessary for transcription of other genes ("squelching") (Gill et al. (1988) Nature 334:721-724) or by toxic effects that result when strong activation domains are brought to DNA (Berger et al. (1990) Cell 61:1199-208). Thus, it is possible that strong activation domains may prevent detection of some interactions. Prey proteins also differ in whether they are expressed constitutively, or conditionally. Conditional expression allows the transcription phenotypes obtained in selections (or "hunts") for interactors to be ascribed to the synthesis of the tagged protein, thus reducing the number of false positive cells that grow because their reporters are aberrantly transcribed.
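The way conditional prey expression helps separate genuine interactors from false positives can be sketched as a simple decision rule. This is an illustrative assumption about how such a comparison might be scored, not a protocol taken from the cited systems; the function name `classify_colony` and its labels are hypothetical.

```python
def classify_colony(reporter_on_induced: bool, reporter_on_uninduced: bool) -> str:
    """Classify a colony from its reporter phenotype under two conditions.

    reporter_on_induced:   reporter phenotype scored while prey synthesis is on
    reporter_on_uninduced: reporter phenotype scored while prey synthesis is off
    """
    if reporter_on_uninduced:
        # The reporter is transcribed without the prey, so the phenotype
        # cannot be ascribed to the activation-tagged protein.
        return "likely false positive"
    if reporter_on_induced:
        # The phenotype appears only when the prey is synthesized.
        return "candidate interactor"
    return "no interaction detected"


for induced, uninduced in [(True, False), (True, True), (False, False)]:
    print(induced, uninduced, "->", classify_colony(induced, uninduced))
```

With a constitutively expressed prey the second measurement is unavailable, so aberrantly transcribed reporters cannot be distinguished from true interactions by this comparison.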
Although most two-hybrid systems use yeast, there are also mammalian variants. In one, interaction of VP16 derivatives with a Gal4-derived bait drives expression of reporters that direct the synthesis of Hygromycin B phosphotransferase, Chloramphenicol acetyltransferase, or CD4 cell surface antigen (Fearon et al. (1992) PNAS 89:7958-62). In the other, interaction of VP16-tagged derivatives with Gal4-derived baits drives the synthesis of SV40 T antigen, which in turn promotes the replication of the prey plasmid, which carries an SV40 origin (Vasavada et al. (1991) PNAS 88:10686-90).
Several industrially significant uses of two-hybrid systems have emerged. One use is to identify new protein targets for pharmaceutical intervention. Typically, the two-hybrid method is used to identify novel polypeptide sequences which interact with a known protein (Silver et al. (1993) Mol. Biol. Rep. 17:155; Durfee et al. (1993) Genes Devel. 7:555; Yang et al. (1992) Science 257:680; Luban et al. (1993) Cell 73:1067; Hardy et al. (1992) Genes Devel. 6:801; Bartel et al. (1993) Biotechniques 14:920; and Vojtek et al. (1993) Cell 74:205). Variations of the two-hybrid method have been used to identify mutations of a known protein that affect its binding to a second known protein (Li B and Fields S (1993) FASEB J 7:957; Lalo et al. (1993) PNAS 90:5524; Jackson et al. (1993) Mol. Cell. Biol. 13:2899; and Madura et al. (1993) J. Biol. Chem. 268:12046). Two-hybrid systems have also been used to identify interacting structural domains of two known proteins (Bardwell et al. (1993) Med. Microbiol. 8:1177; Chakraborty et al. (1992) J. Biol. Chem. 267:17498; Staudinger et al. (1993) J. Biol. Chem. 268:4608; and Milne et al. (1993) Genes Devel. 7:1755) or domains responsible for oligomerization of a single protein (Iwabuchi et al. (1993) Oncogene 8:1693; Bogerd et al. (1993) J. Virol. 67:5030). Variations of two-hybrid systems have been used to study the in vivo activity of a proteolytic enzyme (Dasmahapatra et al. (1992) PNAS 89:4159).