The present method relates to the identification of protein-protein interactions and inhibitors of these interactions that, preferably, are specific to a cell type, tissue type, stage of development, or disease state or stage.
Proteins and protein-protein interactions play a central role in the various essential biochemical processes. For example, these interactions are evident in the interaction of hormones with their respective receptors, in the intracellular and extracellular signalling events mediated by proteins, in enzyme substrate interactions, in intracellular protein trafficking, in the formation of complex structures like ribosomes, viral coat proteins, and filaments, and in antigen-antibody interactions. These interactions are usually facilitated by the interaction of small regions within the proteins that can fold independently of the rest of the protein. These independent units are called protein domains. Abnormal or disease states can be the direct result of aberrant protein-protein interactions. For example, oncoproteins can cause cancer by interacting with and activating proteins responsible for cell division. Protein-protein interactions are also central to the mechanism of a virus recognizing its receptor on the cell surface as a prelude to infection. Identification of domains that interact with each other not only leads to a broader understanding of protein-protein interactions, but also aids in the design of inhibitors of these interactions.
Protein-protein interactions have been studied by both biochemical and genetic methods. The biochemical methods are laborious and slow, often involving painstaking isolation, purification, sequencing and further biochemical characterization of the proteins being tested for interaction. As an alternative to the biochemical approaches, genetic approaches to detect protein-protein interactions have gained in popularity as these methods allow the rapid detection of the domains involved in protein-protein interactions.
An example of a genetic system to detect protein-protein interactions is the “Two-Hybrid” system to detect protein-protein interactions in the yeast Saccharomyces cerevisiae (Fields and Song, 1989, Nature 340:245-246; U.S. Pat. No. 5,283,173 by Fields and Song). This assay utilizes the reconstitution of a transcriptional activator like GAL4 (Johnston, 1987, Microbiol. Rev. 51:458-476) through the interaction of two protein domains that have been fused to the two functional units of the transcriptional activator: the DNA-binding domain and the activation domain. This is possible due to the bipartite nature of certain transcription factors like GAL4. Being characterized as bipartite signifies that the DNA-binding and activation functions reside in separate domains and can function in trans (Keegan et al., 1986, Science 231:699-704). The reconstitution of the transcriptional activator is monitored by the activation of a reporter gene like the lacZ gene that is under the influence of a promoter that contains a binding site (Upstream Activating Sequence or UAS) for the DNA-binding domain of the transcriptional activator. This method is most commonly used either to detect an interaction between two known proteins (Fields and Song, 1989, Nature 340:245-246) or to identify interacting proteins from a population that would bind to a known protein (Durfee et al., 1993, Genes Dev. 7:555-569; Gyuris et al., 1993, Cell 75:791-803; Harper et al., 1993, Cell 75:805-816; Vojtek et al., 1993, Cell 74:205-214).
Another system that is similar to the Two-Hybrid system is the “Interaction-Trap system” devised by Brent and colleagues (Gyuris et al., 1993, Cell 75:791-803). This system is similar to the Two-Hybrid system except that it uses a LEU2 reporter gene and a lacZ reporter gene. Thus protein-protein interactions leading to the reconstitution of the transcriptional activator also allow cells to grow in media lacking leucine and enable them to express β-galactosidase. The DNA-binding domain used in this system is the LexA DNA-binding domain, while the activator sequence is obtained from the B42 transcriptional activation domain (Ma and Ptashne, 1987, Cell 51:113-119). The promoters of the reporter genes contain LexA binding sequences and hence will be activated by the reconstitution of the transcriptional activator. Another feature of this system is that the gene encoding the DNA-binding domain fusion protein is under the influence of an inducible GAL promoter so that confirmatory tests can be performed under inducing and non-inducing conditions.
In yet another version of this system developed by Elledge and colleagues, the reporter genes HIS3 and lacZ (Durfee et al., 1993, Genes Dev. 7:555-569) are used. The transcriptional activator that is reconstituted in this case is GAL4 and protein-protein interactions allow cells to grow in media lacking histidine and containing 3-aminotriazole (3-AT) and to express β-galactosidase. 3-AT inhibits the growth of his3 auxotrophs in media lacking histidine (Kishore and Shah, 1988, Annu. Rev. Biochem. 57:627-663).
In a different two-hybrid assay, a URA3 reporter gene under the control of Estrogen Response Elements (ERE) has been used to monitor protein-protein interactions. Here, the DNA-binding domain is derived from the human estrogen receptor. The authors of the ERE assay propose that inhibition of the protein-protein interactions can be identified by negative selection on 5-FOA medium (Le Douarin et al., 1995, Nucleic Acids Res. 23:876-878), but do not provide any details.
A version of the two-hybrid approach called the “Contingent Replication Assay” that is applicable in mammalian cells has also been reported (Nallur et al., 1993, Nucleic Acids Res. 21:3867-3873; Vasavada et al., 1991, Proc. Natl. Acad. Sci. USA 88:10686-10690). In this case, the reconstitution of the transcription factor in mammalian cells due to the interaction of the two fusion proteins leads to the activation of the SV40 T antigen. This antigen allows the replication of the activation domain fusion plasmids. Another modification of the two-hybrid approach using mammalian cells is the “Karyoplasmic Interaction Selection Strategy” that also uses the reconstitution of a transcriptional activator (Fearon et al., 1992, Proc. Natl. Acad. Sci. USA 89:7958-7962). Reporter genes used in this case have included the gene encoding the bacterial chloramphenicol acetyl transferase, the gene for cell-surface antigen CD4, and the gene encoding resistance to Hygromycin B. In both of the mammalian systems, the transcription factor that is reconstituted is a hybrid transcriptional activator in which the DNA-binding domain is from GAL4 and the activation domain is from VP16.
In all of the assays described above, the identity of one (or both) of the proteins being tested for interaction is known. All of the assays mentioned above can be used to identify novel proteins that interact with a known protein of interest. In a variation of the “Interaction Trap” system, a “mating-grid” strategy has been used to characterize interactions between proteins that are thought to be involved in the Drosophila cell cycle (Finley and Brent, 1994, Proc. Natl. Acad. Sci. USA 91:12980-12984). This strategy is based on a technique first established by Rothstein and colleagues (Bendixen et al., 1994, Nucleic Acids Res. 22:1778-1779) who used a yeast-mating assay to detect protein-protein interactions. Here, the DNA-binding and activation domain fusion proteins were expressed in two different haploid yeast strains, a and α, and the two were brought together by mating. Thus, interactions between proteins can be studied in this method. However, even in this method, the identity of at least one of the proteins in each interacting pair was known prior to analyzing the interactions between pairs of proteins.
Stanley Fields and coworkers have recently performed an analysis of all possible protein-protein interactions that can take place in the E. coli bacteriophage T7 (Bartel et al., 1996, Nature Genet. 12:72-77). Randomly sheared fragments of T7 DNA were used to make libraries in both the DNA-binding domain and the activation domain plasmids and a genome-wide two-hybrid assay was performed by use of a mating strategy. The DNA-binding and the activation domain fusions were transformed into separate yeast strains of opposite mating type. The yeast transformants containing DNA-binding domain hybrids were then divided into groups of 10. The groups were screened (by the mating strategy outlined above) against a library of activation domain hybrids numbering around 10^5 transformants. By this method, 25 interactions were characterized among the proteins of T7. While this study provides a method to screen more than one DNA-binding domain hybrid against more than one activation domain hybrid, it does not address the issues involved in screening complex libraries against each other. This is an important limitation due to the value of enabling the detection and isolation of interactants from cDNA libraries prepared from complex organisms like human beings. Indeed, the prior art has taught away from using complex populations of proteins as hybrids to the DNA-binding domain, since random hybrids to the DNA-binding domain produce a large percentage of false positives (hybrids that have transcriptional activity in the absence of an interacting protein) (Bartel et al., 1993, “Using the two hybrid system to detect protein-protein interactions,” in Cellular Transduction in Development, Ch. 7, Hartley, D. A. (ed.), Practical Approach Series xviii, IRL Press at Oxford University Press, New York, N.Y., pp. 154-179 at 171; Ma and Ptashne, 1987, Cell 51:113).
None of the prior art systems provides a method that not only isolates and catalogues all possible protein-protein interactions within a population, be it a tissue/cell-type, disease state, or stage of development, but also allows the comparison of such interactions between two such populations thereby allowing the identification of protein-protein interactions unique to any particular tissue/cell-type, disease state, or stage of development. In contrast, such a method is provided by the present invention.
Accordingly, it is one of the objectives of this invention to devise a genetic method to identify and isolate preferably all possible protein-protein interactions within a population of proteins, or between two different populations of proteins, be it a tissue/cell-type, disease state or stage of development.
It is another objective of the present invention to perform a comparative analysis of the protein-protein interactions that occur in two or more different tissue/cell-types, disease states, or stages of development.
It is also an objective of this invention to identify and isolate in a rapid manner the genes encoding the proteins involved in interactions that are specific to a tissue/cell-type, disease state, or stage of development.
It is yet another objective of this invention to provide a method for the concurrent identification of inhibitors of the protein-protein interactions that characterize a given population, be it a tissue/cell type, disease state, or stage of development. These inhibitors may have therapeutic value.
Citation of a reference herein shall not be construed as an admission that such is prior art to the present invention.
The present invention provides methods and means to detect and isolate the genes encoding the proteins that interact with each other between two populations of proteins, using the reconstitution of a selectable event. This selectable event is the formation of a transcription factor. In contrast to the prior art, in which problems with false positives and low throughput limited the complexity of the populations that could be analyzed, each of the two populations of proteins has a complexity of greater than 10, and preferably has a complexity of at least 1,000. The reconstitution of a transcription factor occurs by interaction of fusion proteins expressed by chimeric genes. In a preferred embodiment, the types of fusion proteins used are DNA-binding domain hybrids and activation domain hybrids of transcriptional activators. Libraries of genes encoding hybrid proteins are preferably constructed in both a DNA-binding domain hybrid plasmid vector and in an activation domain hybrid plasmid vector. In a preferred embodiment, two types of haploid yeast strains, a and α, respectively, are each transformed with a different one of the two libraries to create two yeast libraries. The two yeast libraries are then mated together to create a diploid yeast strain that contains both kinds of fusion genes encoding the hybrid proteins. If the two hybrid proteins can interact (bind) with each other, the transcriptional activator is reconstituted due to the proximity of the DNA-binding and the activation domains of the transcriptional activator. This reconstitution causes transcription of reporter genes that, by way of example, enable the yeast to grow in selective media. In a preferred aspect, the activity of a reporter gene is monitored enzymatically. The isolation of the plasmids that encode these fusion genes leads to the identification of the genes that encode proteins that interact with each other.
Thus, in a specific embodiment, the invention is directed to a method of detecting one or more protein-protein interactions comprising (a) recombinantly expressing within a population of host cells (i) a first population of first fusion proteins, each said first fusion protein comprising a first protein sequence and a DNA binding domain in which the DNA binding domain is the same in each said first fusion protein, and in which said first population of first fusion proteins has a complexity of at least 1,000; and (ii) a second population of second fusion proteins, each said second fusion protein comprising a second protein sequence and a transcriptional regulatory domain of a transcriptional regulator, in which the transcriptional regulatory domain is the same in each said second fusion protein, such that a first fusion protein is co-expressed with a second fusion protein in host cells, and wherein said host cells contain at least one nucleotide sequence operably linked to a promoter driven by one or more DNA binding sites recognized by said DNA binding domain such that interaction of a first fusion protein with a second fusion protein results in regulation of transcription of said at least one nucleotide sequence by said regulatory domain, and in which said second population of second fusion proteins has a complexity of at least 1,000; and (b) detecting said regulation of transcription of said at least one nucleotide sequence, thereby detecting an interaction between a first fusion protein and a second fusion protein.
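By way of illustration only, the logic of screening one library of DNA-binding domain hybrids against one library of activation domain hybrids can be sketched as a toy computational model. The sketch below is not part of the disclosed method; the function names (`pair_count`, `mate_and_select`, `toy_interacts`), the protein labels, and the set of interacting pairs are all hypothetical. It assumes an idealized reporter that is activated only when the two hybrid proteins bind, and it ignores false positives.

```python
from itertools import product

def pair_count(complexity_first, complexity_second):
    """Number of distinct hybrid pairings when two libraries are mated."""
    return complexity_first * complexity_second

def mate_and_select(dbd_library, ad_library, interacts):
    """Return the (DBD hybrid, AD hybrid) pairs whose diploid would survive selection.

    `interacts` is a caller-supplied predicate standing in for the biology:
    it models reporter activation as occurring only when the two hybrids bind.
    """
    positives = []
    for dbd_hybrid, ad_hybrid in product(dbd_library, ad_library):
        if interacts(dbd_hybrid, ad_hybrid):
            positives.append((dbd_hybrid, ad_hybrid))
    return positives

# Hypothetical toy data: two invented interacting pairs.
TOY_PAIRS = {frozenset(("proteinA", "proteinB")), frozenset(("proteinC", "proteinD"))}

def toy_interacts(first, second):
    return frozenset((first, second)) in TOY_PAIRS

dbd_library = ["proteinA", "proteinC", "proteinE"]
ad_library = ["proteinB", "proteinD", "proteinF"]
hits = mate_and_select(dbd_library, ad_library, toy_interacts)
print(hits)
print(pair_count(1000, 1000))
```

In this model, two populations that each have a complexity of 1,000 give rise to 1,000,000 possible pairings, which indicates why a selection step, rather than pairwise testing, is relied upon.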
The present invention also provides a method to isolate concurrently inhibitors of such protein-protein interactions that occur in, are characteristic of, or are specific to a given population of proteins. By way of example, preferably all the yeast diploids that harbor fusion proteins that interact with each other are pooled together and exposed to candidate inhibitors. Exemplary candidate inhibitors include chemically synthesized molecules and genetically encoded peptides. After treatment with candidate inhibitors, the yeast cells harboring interacting hybrid proteins are selected for the inactivation of the reporter gene, preferably by transfer to appropriate selective media. Preferably, the same media also selects for the presence of the plasmids that encode the interacting proteins, and the peptide-encoding plasmids in the case of the screening for peptide inhibitors expressed from expression plasmids. Successful inhibition events are thus monitored by the inactivation of the reporter gene.
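The pooled inhibitor screen described above can likewise be illustrated with a toy model. The following sketch is an assumption-laden illustration and not the disclosed procedure: `screen_inhibitors`, `toy_blocks`, the compound names, and the protein labels are hypothetical, and the predicate `blocks` simply stands in for the experimental observation that the reporter gene is inactivated in the presence of a candidate inhibitor.

```python
def screen_inhibitors(interacting_pairs, candidates, blocks):
    """Map each candidate inhibitor to the pooled interactions it disrupts.

    `blocks(candidate, pair)` models reporter inactivation for that pair
    when the candidate is present. Candidates with no effect are omitted.
    """
    results = {}
    for candidate in candidates:
        blocked = [pair for pair in interacting_pairs if blocks(candidate, pair)]
        if blocked:
            results[candidate] = blocked
    return results

# Hypothetical toy data: one invented compound disrupts one invented pair.
TOY_BLOCKED = {("compound1", ("proteinA", "proteinB"))}

def toy_blocks(candidate, pair):
    return (candidate, pair) in TOY_BLOCKED

pooled_pairs = [("proteinA", "proteinB"), ("proteinC", "proteinD")]
candidates = ["compound1", "compound2"]
inhibitor_hits = screen_inhibitors(pooled_pairs, candidates, toy_blocks)
print(inhibitor_hits)
```

Because every pooled interaction is tested against each candidate, the model reflects the point that multiple protein-protein interactions are screened against a prospective inhibitor in a single pass.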
The major advantages of these methods are as follows. From a population of proteins characteristic of a particular tissue or cell-type, all possible detectable protein-protein interactions that occur can be identified and the genes encoding these proteins can be isolated. Thus, parallel analyses of two cell types enumerate the protein-protein interactions that are common to both and those that are specific to one (differentially expressed in one cell type and not the other). Such an analysis has value since protein-protein interactions specific to a disease state can serve as therapeutic points of intervention.
Furthermore, inhibitors of such protein-protein interactions can be isolated in a rapid fashion. Such inhibitors can be of therapeutic value or serve as lead compounds for the synthesis of therapeutic compounds. This system can also be used to identify novel peptide inhibitors of protein-protein interactions. One advantage of this method over existing methods is that peptides or chemicals are identified by an ability to block protein-protein interactions. In many existing methods, molecules are identified by an ability only to bind to one of a pair of interacting proteins; such binding does not necessarily imply that the protein-protein interaction will be blocked by the same agent. Another advantage of the method is that multiple protein-protein interactions can be screened against a prospective inhibitor in a single assay.
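The comparison of interactions between two populations, noted above as an advantage of the method, amounts to set operations once the interacting pairs from each population have been catalogued. The sketch below is illustrative only; the function names, the population labels, and the protein identifiers are hypothetical. Each interaction is treated as an unordered pair, so that the same two proteins recovered in opposite hybrid orientations are counted once.

```python
def normalize(pairs):
    """Represent each interaction as an unordered pair (a frozenset)."""
    return {frozenset(pair) for pair in pairs}

def compare_interactions(pairs_first, pairs_second):
    """Split two catalogues of interactions into shared and population-specific sets."""
    first, second = normalize(pairs_first), normalize(pairs_second)
    return {
        "common": first & second,
        "only_first": first - second,
        "only_second": second - first,
    }

# Hypothetical catalogues, e.g. from a normal and a diseased cell type.
normal_pairs = [("p1", "p2"), ("p3", "p4")]
disease_pairs = [("p2", "p1"), ("p5", "p6")]
comparison = compare_interactions(normal_pairs, disease_pairs)
print(comparison)
```

In this toy example the interaction between p5 and p6 appears only in the second catalogue, which is the kind of population-specific interaction that the text identifies as a potential therapeutic point of intervention.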