1. Field of the Invention
The present invention relates to a method for detecting the interaction of three or more proteins in an in vivo system through the use of fused genes encoding hybrid proteins.
2. Description of the Related Art
A fundamental area of inquiry in biology is the analysis of interactions between proteins. Proteins are complex macromolecules made up of covalently linked chains of amino acids. Each protein assumes a unique three-dimensional shape determined principally by its sequence of amino acids. Many proteins consist of smaller units termed domains, which are continuous stretches of amino acids able to fold independently from the rest of the protein. Important classes of proteins include enzymes, polypeptide hormones, nutrient transporters, structural components of the cell, hemoglobins, antibodies, nucleoproteins, and components of viruses.
Multiple-protein interactions require three or more proteins to associate. A large number of non-covalent bonds form between the proteins when three or more protein surfaces are precisely matched, and these bonds account for the specificity of recognition. Multiple-protein interactions are involved, for example, in the assembly of enzyme subunits; in antigen-antibody reactions; in forming the supramolecular structures of ribosomes, filaments, and viruses; in transport; and in the interaction of receptors on a cell with growth factors and hormones. Products of oncogenes can give rise to neoplastic transformation through multiple-protein interactions. For example, some oncogenes encode protein kinases whose enzymatic activity on cellular target proteins leads to the cancerous state. Another example of a protein-protein interaction occurs when a virus infects a cell by recognizing a polypeptide receptor on the cell surface, and this interaction has been used to design antiviral agents.
Protein-protein interactions have generally been studied in the past using biochemical techniques such as cross-linking, co-immunoprecipitation and co-fractionation by chromatography. These techniques have several disadvantages. First, interacting proteins often exist in very low abundance and are, therefore, difficult to detect. Second, these biochemical techniques involve only the proteins, not the genes encoding them: when an interaction is detected using biochemical methods, the newly identified protein often must be painstakingly isolated and then sequenced to enable the gene encoding it to be obtained. Third, these methods do not immediately provide information about which domains of the interacting proteins are involved in the interaction. Fourth, small changes in the composition of the interacting proteins cannot be tested easily for their effect on the interaction.
There is evidence that transcription can be activated through the use of two functional domains of a transcription factor: a domain that recognizes and binds to a specific site on the DNA and a domain that is necessary for activation, as reported by Keegan et al., Science, 231, 699-704 (1986) and Ma and Ptashne, Cell, 48, 847-853 (1987). The transcriptional activation domain is thought to function by contacting other proteins involved in transcription. The DNA-binding domain appears to function to position the transcriptional activation domain on the target gene which is to be transcribed. In a few cases now known, these two functions (DNA-binding and activation) reside on separate proteins. One protein binds to the DNA, and the other protein, which activates transcription, binds to the DNA-bound protein, as reported by McKnight et al., Proc. Natl. Acad. Sci. USA, 84, 7061-7065 (1987); another example is reviewed by Curran et al., Cell, 55, 395-397 (1988).
Transcriptional activation has been studied using the GAL4 protein of the yeast Saccharomyces cerevisiae. The GAL4 protein is a transcriptional activator required for the expression of genes encoding enzymes of galactose utilization; see Johnston, Microbiol. Rev., 51, 458-476 (1987). It consists of an N-terminal domain which binds to specific DNA sequences designated UAS.sub.G (UAS stands for upstream activation site, G indicates the galactose genes) and a C-terminal domain containing acidic regions, which is necessary to activate transcription; see Keegan et al. (1986), supra, and Ma and Ptashne (1987), supra. As discussed by Keegan et al., the N-terminal domain alone binds to DNA in a sequence-specific manner but fails to activate transcription. The C-terminal domain alone cannot activate transcription because it fails to localize to the UAS.sub.G; see, for example, Brent and Ptashne, Cell, 43, 729-736 (1985). However, Ma and Ptashne have reported (Cell, 51, 113-119 (1987); Cell, 55, 443-446 (1988)) that when the GAL4 N-terminal domain and C-terminal domain are fused together in the same protein, transcriptional activity is induced. Other proteins also function as transcriptional activators via the same mechanism. For example, the GCN4 protein of Saccharomyces cerevisiae, as reported by Hope and Struhl, Cell, 46, 885-894 (1986), the ADR1 protein of Saccharomyces cerevisiae, as reported by Thukral et al., Molecular and Cellular Biology, 9, 2360-2369 (1989), and the human estrogen receptor, as discussed by Kumar et al., Cell, 51, 941-951 (1987), all contain separable domains for DNA binding and for maximal transcriptional activation.
Recently, protein-protein interactions have been studied using the widely used yeast two-hybrid system of Fields et al., Nature 340, 245-246 (1989), also disclosed in U.S. Pat. No. 5,283,173 by Fields et al. The yeast two-hybrid system detects binary (X/Y) interactions between proteins through functional reconstitution of the transcription factor GAL4 by the association of two fusion proteins: a GAL4 DNA-binding domain fusion, (BD)-X, and a GAL4 activation domain fusion, (AD)-Y. Thus, the yeast two-hybrid system offers a sensitive genetic selection method to detect and clone physically interacting proteins. See Fields et al., Nature 340, 245-246 (1989); Fields et al., Proc. Natl. Acad. Sci. 88, 9578-9582.
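The reconstitution logic of the two-hybrid assay can be summarized as a simple truth table. The following Python sketch is illustrative only and is not part of any cited reference; the function and parameter names are hypothetical, and the model reduces reporter transcription to a yes/no outcome, ignoring expression levels and binding affinities.

```python
from itertools import product


def reporter_active(has_bd_x: bool, has_ad_y: bool, x_binds_y: bool) -> bool:
    """Toy model of the two-hybrid readout (hypothetical helper).

    The reporter is transcribed only when the (BD)-X fusion is present to
    bind the DNA, the (AD)-Y fusion is present to supply the activation
    domain, and X and Y physically associate so that the activation domain
    is brought to the promoter.
    """
    return has_bd_x and has_ad_y and x_binds_y


def truth_table() -> dict:
    """Enumerate every combination of the three conditions."""
    return {
        combo: reporter_active(*combo)
        for combo in product([False, True], repeat=3)
    }


if __name__ == "__main__":
    for (bd_x, ad_y, binds), active in truth_table().items():
        print(f"BD-X={bd_x!s:5} AD-Y={ad_y!s:5} X/Y contact={binds!s:5} -> reporter={active}")
```

Under these assumptions, only one of the eight combinations yields reporter activity, which reflects why either fusion alone, or two non-interacting fusions, gives no signal.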
However, the two-hybrid system has so far been limited to detecting protein interactions involving only two components. The two-hybrid system cannot detect interactions between proteins (X/Y) that are mediated by a third protein (Z), where the proteins interact only indirectly, e.g., X does not contact Y. The two-hybrid system also cannot detect protein interactions in which the X or Y protein must be modified by the Z protein in order to interact, or interactions that have complex conformational requirements. None of the aforementioned articles suggests a genetic system to detect interactions among three or more proteins in vivo using transcriptional activation as an assay.
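The limitation for Z-mediated interactions can be illustrated with a small contact model. The Python sketch below is a simplified illustration under stated assumptions, not a description of any published assay: protein contacts are represented as a hypothetical set of unordered pairs, the function names are invented for this example, and modification-dependent or conformation-dependent interactions are not modeled.

```python
def contact(contacts: set, a: str, b: str) -> bool:
    """Return True if proteins a and b touch directly (unordered pair lookup)."""
    return frozenset((a, b)) in contacts


def two_hybrid_signal(contacts: set, x: str, y: str) -> bool:
    """Binary assay: a signal requires a direct X/Y contact."""
    return contact(contacts, x, y)


def bridged_signal(contacts: set, x: str, y: str, z: str) -> bool:
    """Hypothetical assay with a third protein Z present.

    A signal arises from a direct X/Y contact, or from Z contacting both
    X and Y and thereby bringing them together.
    """
    return contact(contacts, x, y) or (
        contact(contacts, x, z) and contact(contacts, z, y)
    )


if __name__ == "__main__":
    # Assumed example: X and Y each bind Z, but X does not contact Y.
    contacts = {frozenset(("X", "Z")), frozenset(("Z", "Y"))}
    print("two-hybrid signal:", two_hybrid_signal(contacts, "X", "Y"))
    print("signal with Z present:", bridged_signal(contacts, "X", "Y", "Z"))
```

In this assumed example, the binary assay reports no signal because X and Y never touch, while the model that includes Z reports a signal through the X/Z and Z/Y contacts.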
A genetic system that is capable of rapidly detecting which of multiple proteins interact with a known protein, determining which of multiple domains of the proteins interact, and providing the genes for the newly identified interacting proteins has not been available prior to the present invention.
Accordingly, to avoid the disadvantages inherent in the biochemical techniques for detecting multiple-protein interactions, it would be desirable to have a method for detecting interactions among three or more proteins using a genetic system. The genetic system described here is based on transcriptional activation. Transcription is the process by which RNA molecules are synthesized using a DNA template. Transcription is regulated by specific sequences in the DNA which indicate when and where RNA synthesis should begin. These sequences correspond to binding sites for proteins, designated transcription factors, which interact with the enzymatic machinery used for the RNA polymerization reaction.