The present invention relates to an improved method for the identification and, optionally, the characterisation of interacting molecules, designed to distinguish positive clones from the rather large numbers of false positive clones isolated by conventional two-hybrid systems. The method of the invention relies on a novel combination of selection steps used to distinguish clones that express interacting molecules from false positive clones. The present invention provides for high-throughput interaction screens for the reliable identification of interacting molecules, which in turn can lead to the identification of substances inhibiting said interactions. Such inhibitors can find their use in the formulation of a pharmaceutical composition. The present invention further relates to kits useful for carrying out the method of the invention.
Protein-protein interactions are essential for nearly all biological processes, such as replication, transcription, secretion, signal transduction and metabolism. Classical methods for identifying such interactions, such as co-immunoprecipitation or cross-linking, are not available for all proteins or may not be sufficiently sensitive. Said methods further have the disadvantage that potentially interacting partners and the corresponding nucleic acid fragments or sequences can be identified only with a great deal of effort. Usually, this is effected by protein sequencing or production of antibodies, followed by the screening of an expression library.
An important development for the convenient identification of protein-protein interactions was the yeast two-hybrid (2H) system presented by Fields and Song (1989). This genetic procedure not only allows the rapid demonstration of in vivo interactions, but also the simple isolation of the corresponding nucleic acid sequences encoding the interacting partners. The yeast 2H system makes use of a feature shared by a wide variety of eukaryotic transcription factors, which carry two separable functional domains: a DNA binding domain and a second domain which activates the RNA-polymerase complex (activation domain). In the classical 2H system, a so-called “bait” protein comprising a DNA binding domain (GAL4bd or LexA) and a protein of interest “X” is expressed as a fusion protein in yeast (“bait hybrid”). The same yeast cell simultaneously expresses a so-called “fish” protein comprising an activation domain (GAL4ad or VP16) and a protein “Y” (“fish hybrid”). Upon the interaction of a bait protein with a fish protein, the DNA binding and activation domains of the fusion proteins are brought into close proximity and the resulting protein complex triggers the expression of the reporter genes, e.g. HIS3 or lacZ. Said expression can be easily monitored by cultivating the yeast cells on selective medium without histidine and by detecting the activation of the lacZ gene. The genetic sequence encoding, for example, an unknown fish protein may easily be identified by isolating the corresponding plasmid and subsequent sequence analysis. Meanwhile, a number of variants of the 2H system have been developed.
The most important of those are the “one hybrid” system for the identification of DNA-binding proteins, the “tri-hybrid” system for the identification of RNA-protein interactions, the “reverse two hybrid” system, and some systems transferring the 2H approach to cellular systems other than yeast, namely bacterial and mammalian (Li and Hershowitz, 1993; SenGupta et al., 1996; Plutz et al., 1996; Vidal et al., 1996; Dove et al., 1997; Fearon et al., 1992). It should be noted that some 2H systems do not utilise a transactivation approach; some rely, for example, on the functional reconstitution of enzymatic activity.
The classical 2H system for the identification of protein-protein interactions has, until today, only been carried out on a laboratory scale. Although recent developments have taken on the challenges of large-scale 2H screening (e.g. Bartel et al., 1996), a successful large-scale search for interacting proteins, for example on the basis of a library vs. library screen, has not been reported. However, on the laboratory scale, it is only possible to screen for interactions between gene products which are known and/or which are suspected to interact, as the probability of finding an interaction by random chance is less than 10⁻³. The true power of the 2H system, namely finding previously unsuspected interactions, and even interactions between previously unknown proteins and protein families, in screening whole genomes, can only be brought forward in a large-scale approach.
One major difficulty in implementing large-scale 2H systems lies in eliminating the large numbers of false positives not representing any biologically meaningful interactions between binding partners. In currently applied 2H systems, in which proteins of interest, optionally encoded by cDNA libraries, are fused to a DNA binding domain and an activation domain, respectively, false positives may arise by several different mechanisms:
A peptide or protein cloned into the bait hybrid might itself have activating properties, activating transcription of a reporter gene independent of an interaction with the fish hybrid (herein: “False Positives Class 1”).
A peptide or protein cloned into the fish hybrid might itself constitute a DNA binding domain, binding to the DNA binding site or to the basal portion of the promoter, activating transcription of a reporter gene independent of an interaction with the bait hybrid (herein: “False Positives Class 2”).
A peptide or protein cloned into the fish hybrid might specifically bind to the DNA binding domain of the bait hybrid, or, vice versa, a peptide or protein cloned into the bait hybrid might specifically bind to the activation domain of the fish hybrid, reconstituting activation of the reporter gene independent of an interaction between the bait and fish proteins. This may include binding to epitope tags fused to the DNA binding domain or activation domain (herein: “False Positives Class 3”).
Certain peptides or proteins are able to bind non-specifically to many different other structures (commonly denoted: “Sticky Proteins”). These will result in a large number of positives with one common genetic element.
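The four mechanisms listed above can be summarised as a simple decision scheme. The following is merely an illustrative sketch in Python, not part of any described method: the `Hybrid` record, its field names and the `classify` function are hypothetical, and the model assumes that each property of a cloned insert is already known, which in practice is exactly what a screen has to establish experimentally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hybrid:
    """Hypothetical properties of the peptide or protein cloned into one hybrid."""
    activates_alone: bool = False       # bait insert activates transcription by itself
    binds_dna: bool = False             # fish insert binds the binding site or basal promoter
    binds_partner_domain: bool = False  # insert binds the other hybrid's domain or epitope tag
    sticky: bool = False                # insert binds non-specifically to many structures


def classify(bait: Hybrid, fish: Hybrid, true_interaction: bool) -> str:
    """Label a clone in this toy model.

    Artefact labels take precedence: a clone showing one of the
    false positive mechanisms is not counted as a reliable positive,
    even if bait and fish also interact.
    """
    if bait.activates_alone:
        return "False Positives Class 1"
    if fish.binds_dna:
        return "False Positives Class 2"
    if bait.binds_partner_domain or fish.binds_partner_domain:
        return "False Positives Class 3"
    if bait.sticky or fish.sticky:
        return "Sticky Protein"
    return "positive" if true_interaction else "negative"


# Example: an auto-activating bait is flagged regardless of the fish hybrid.
print(classify(Hybrid(activates_alone=True), Hybrid(), False))
```

In this sketch the order of the checks is an arbitrary design choice; a clone exhibiting more than one mechanism is simply reported under the first class that applies.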
A number of strategies have been previously described which remove some of the above classes of false positives (Allen et al., 1995; Bartel et al., 1993).
The use of two reporter genes (Bartel et al., 1993): One of these genes usually expresses a selectable marker (e.g. HIS3) and the other reporter gene a measurable marker activity (e.g. lacZ), and the reporter gene promoters usually are different. Scoring positives according to activation of both reporter genes allows removal of a certain part of the False Positives Class 2, since an interaction with both of the different promoters is less likely to occur.
The use of selectable markers and preselection (Bartel et al., 1996): This method employs replica plating of yeast clones that express one fusion protein from plates containing selective medium corresponding to the selectable marker introduced with the plasmid that encoded said one fusion protein to plates containing selective medium corresponding to a reporter gene product (e.g. LEU2 as selectable marker on plasmid, HIS3 as reporter gene). Yeast clones that showed growth on selective medium corresponding to the reporter gene product were identified as False Positives Class 1 or Class 2, respectively, and were subsequently not used for interaction mating.
The use of counterselectable genes and preselection (Vidal et al., 1996a): Two populations of mating competent yeast host cells of different mating type are provided that contain (a) the bait hybrid plasmid and one counterselectable reporter gene in the population of cells of the first mating type, and (b) the fish hybrid plasmid and the same or another counterselectable reporter gene in the population of cells of the second mating type. When these first and second populations are kept individually under conditions such that expression of said counterselectable reporter gene inhibits the growth of said host cells, False Positives Class 1 and False Positives Class 2 are hypothetically removed.
The use of a second, different bait hybrid protein: Several approaches have been described, all of which are performed on positive clones after scoring of positives: (a) curing of the bait hybrid plasmid, transfection with a second bait hybrid plasmid containing an unrelated bait protein fused to the same DNA binding domain as in the original bait hybrid plasmid; expression of the reporter gene(s) indicates False Positives Class 2 as well as a Sticky Protein or False Positives Class 3 being fused to the activation domain (Harper et al., 1993); (b) curing of the bait hybrid plasmid, transfection with a second bait hybrid plasmid containing an unrelated bait protein fused to a different DNA binding domain that binds to a second DNA binding site controlling a second site comprising the reporter gene; expression of the reporter gene indicates a Sticky Protein or certain types of False Positives Class 3 being fused to the activation domain (Le Douarin et al., 1995); (c) transfection with a control hybrid plasmid encoding a fusion protein comprising the bait protein and a second DNA binding domain that binds to a second DNA binding site controlling a second reporter gene; lack of expression of the second reporter gene indicates a False Positive Class 1 (Hurd et al., 1997).
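The logic of the first two strategies, preselection against clones that activate the reporter on their own and scoring of positives on two reporter genes, can be illustrated as two successive filters. The following Python sketch is only a schematic illustration under stated assumptions: the `Clone` record, its field names and the example clones are hypothetical and do not represent experimental data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clone:
    name: str                  # hypothetical clone identifier
    grows_alone_on_his: bool   # clone with one hybrid only grows without histidine
    his3_with_partner: bool    # HIS3 reporter active when both hybrids are present
    lacz_with_partner: bool    # lacZ reporter active when both hybrids are present


def preselect(clones):
    """Discard clones that activate the reporter without a partner hybrid."""
    return [c for c in clones if not c.grows_alone_on_his]


def score_positives(clones):
    """Keep only clones in which both reporter genes are activated."""
    return [c for c in clones if c.his3_with_partner and c.lacz_with_partner]


clones = [
    Clone("A", False, True, True),    # passes both filters
    Clone("B", True, True, True),     # removed by preselection
    Clone("C", False, True, False),   # activates only one reporter
    Clone("D", False, False, False),  # no reporter activity
]

candidates = score_positives(preselect(clones))
print([c.name for c in candidates])  # ['A']
```

Each filter corresponds to a separate handling step at the bench, which is why combining several such strategies multiplies the labour involved.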
All of these strategies are time-consuming and labour-intensive, which is particularly inconvenient in cases where large numbers of clones are to be analysed, and, in order to eliminate all false positives, a combination would have to be used, necessitating even more handling steps. An efficient method for the elimination of false positives is, however, inherently more necessary in a library vs. library screen than in the screening of one bait protein against a library of fish proteins, because randomly chosen peptides or proteins/protein fragments fused to a DNA binding domain are much more likely to auto-activate expression of a reporter gene than randomly chosen peptides or proteins/protein fragments fused to an activation domain. As a consequence, false positive rates of up to 50% would be expected in a library vs. library screen, which, together with the high total number of clones, renders such a screen unfeasible with conventional 2H methods.
Moreover, as yeast is not the host cell of choice in a variety of investigations (e.g. when a mammalian protein suspected to interact with a second protein requires substantial post-translational modifications), it would be desirable for a high-throughput 2H system to be versatile with regard to the type of host cell employed. All systems put forward so far that are geared to eliminate the difficulties of 2H screening, although mostly claiming to be applicable to all types of cells, have been designed around the specific biological properties of the yeast two-hybrid system, and cannot necessarily be transferred to, for example, bacterial or mammalian cell systems.
The technical problem underlying the present invention was therefore to provide a method that allows the fast and reliable elimination of false positives. This method should, moreover, be suitable for large-scale library vs. library screens using a high-throughput approach. Preferably, this method would be applicable to a range of different host cell systems, such as yeast, bacterial, mammalian, plant and insect cells. Such a method could routinely be applied to the identification of pathways of molecular interactions in biological systems, and the interconnections between such pathways. Ultimately, the identification of molecules involved in interactions that form part of such pathways can be employed in order to pinpoint targets for pharmaceuticals.
The solution to said technical problem is achieved by providing the embodiments characterised in the claims.