Virtually all biological processes in living organisms are controlled by the function of specifically interacting proteins. Specific analysis of protein interactions makes it possible to isolate unknown proteins and to assign them to functional groups and also to elucidate the molecular mechanism of action of known proteins. Cellular signal transduction which comprises the transfer of extracellular signals to specific intracellular alterations is the key mechanism for controlling a cell during development and in the response to environmental changes. The transfer of these signals is controlled by strictly regulated cascades of specifically interacting proteins. In addition, virtually all important cellular functions which are usually coupled to signal transduction are carried out by controlled protein interactions (Pawson and Scott 1997). These include, inter alia, control of the cell cycle, protein synthesis and protein degradation, prevention or induction of apoptosis, transport processes, detection and induction of stimuli, gene expression, mRNA processing, DNA synthesis and DNA repair and the entire energy metabolism. All these processes are very dynamic, i.e. they are subject to alterations which are embedded in the overall state of the cell. This is reflected, at the protein level, by the regulated alteration of the composition of the function-performing complexes (Ashman, Moran et al. 2001).
Proteins and their interactions are subject to great and dynamic alterations in the cell which are frequently brought about by signal transduction cascades. Their function or the composition of protein complexes may also change due to allosteric effects or a change of intracellular location. The regulation of the function of proteins is particularly dependent on activation by enzymes of the signal transduction cascades which catalyze protein modifications on specific target proteins. Post-translational modifications often have a dramatic influence on the activity of a multiplicity of proteins. Regulated protein phosphorylation, protein acetylation, protein methylation, protein sulfation, protein acylation, protein prenylation, protein ribosylation, protein glycosylation, protein ubiquitination or proteolytic activation and inactivation are known modifications some of whose effects and regulatory mechanisms are only little understood. Modification-dependent protein interactions have been described for a multiplicity of transduction pathways in different cellular adaptation events (Hunter 2000).
The activation or deactivation of particular signal transduction cascades must be exactly controlled in the cell, with respect to both timing and form. A multiplicity of pathological processes in cells is caused by interference in the control of signal transduction and may lead to metabolic disorders, carcinogenesis, immunological disorders or neurological deficits. Pathological processes of this kind may be caused in particular by specific mutations in the genes coding for proteins having an important function in neuronal signal transduction. Thus, for example, particular mutations in the genes of NMDA receptor subunits alter the composition and signal properties of the NMDA receptor protein complex (Migaud, Charlesworth et al. 1998). Molecular mechanisms which are responsible for the precise temporal control of cellular events, and which are often themselves controlled by feedback mechanisms, are only beginning to be understood (Marshall 1995). Time-limited protein-protein interactions, controlled, for example, by reversible post-translational modifications, play a key role here (Yasukawa, Sasaki et al. 2000) (Hazzalin and Mahadevan 2002).
Currently there is a lack of suitable methods for identifying and analyzing regulated, time-limited or modification-dependent protein interactions.
A multiplicity of methods of characterizing protein-protein interactions have been described.
Biochemical methods of protein purification, coupled to mass-spectrometric analytical methods, enable protein complexes to be characterized (Ashman, Moran et al. 2001). Thus it was possible, for example, with the aid of tandem affinity purification (TAP) (Rigaut, Shevchenko et al. 1999), denaturing one-dimensional gel electrophoresis and tryptic digestion of single bands in combination with mass-spectrometric analysis, to isolate a multiplicity of protein complexes from yeast and to determine the components thereof (Gavin, Bosche et al. 2002). Using an optimized immunoprecipitation protocol, mass spectrometry and the Western blot technique, a multiplicity of the components of the neuronal NMDA receptor signal processing complex was characterized (Husi, Ward et al. 2000). These biochemical-biophysical methods seem particularly suitable for the analysis of stable and static protein complexes, but have several experimental and fundamental disadvantages. Thus, all biochemical methods are experimentally very demanding and require a very large amount of biological starting material. Moreover, already optimized purification methods (e.g. the TAP method) require the corresponding fusion proteins to be transgenically and stably expressed in the organisms of choice. In the analysis of complex tissues, weakly expressed or cell type-specifically expressed proteins may readily be below the limit of detection. A fundamental disadvantage of all biochemical methods arises from the necessity of cell disruption or of the solubilization of large membrane complexes. Weak or transient interactions may readily be lost during the course of the biochemical workup, which usually requires a plurality of steps (Ashman, Moran et al. 2001).
Analyses of the interactions of proteins with extreme physicochemical properties, such as, for example, membrane proteins or proteins having a high total charge, in each case need an optimized protocol or, in individual cases, are not possible.
Previous mass-spectrometric methods have only limited suitability for characterization of post-translationally modified proteins, since modifications such as specific phosphate groups may readily be lost during fragmentation (Ashman, Moran et al. 2001). The composition and functional activity of protein complexes are subject to constant dynamics and are controlled by a multiplicity of regulatory mechanisms. Moreover, it has recently become clear that specific physicochemical properties or specific environments, such as specific membrane topologies and membrane compositions for example, strongly influence protein-protein interactions. Thus, for example, the lipid composition of the different intracellular membrane systems plays a large part in the assembly of protein complexes (Huttner and Schmidt 2000). Owing to their particular molecular composition, “lipid rafts”, subdomains in the cell membrane, allow completely different interactions than neighboring regions in the lipid bilayer (Simons and Ikonen 1997). Biochemical methods have only limited suitability for detailed analysis of the specific interactions or interaction domains of proteins within a complex or for analysis of possible transient interactions of associated proteins, due to the relatively high experimental costs.
Microarrays are another tool for systematically analyzing protein-protein interactions, protein-peptide interactions or the interactions of proteins with low molecular weight substances. This method involves applying in-vitro translated or recombinantly produced proteins or peptides, antibodies, specific ligands or low molecular weight substances to a support material, analogously to DNA microarrays. Complex protein mixtures or substance libraries may thus be simultaneously screened, inter alia, for specific interactions (Ashman, Moran et al. 2001) (Xu, Piston et al. 1999). However, the analyses using protein or substance arrays, which are carried out completely in vitro, likewise have great disadvantages. Thus, proteins are prepared or analyzed under completely artificial conditions in this method. Furthermore, the appearance of high unspecific background signals, the limited sensitivity and difficulties in detecting proteins having particular physicochemical properties greatly limit the number of possible applications of these array-based in-vitro analytic methods for detecting and analyzing protein interactions (Ashman, Moran et al. 2001).
Furthermore, various methods are known which involve detecting protein interactions in the cell, and thus in vivo, by indirectly activating genetic reporters. Most of the familiar methods are limited to detection of binary protein interactions in the cell nucleus and are based on the functional modularity of transcription factors, such as the 2-hybrid system in yeasts, for example (Fields and Song 1989). The 2-hybrid system involves expressing in yeast cells one or more proteins or protein sections as fusion proteins with a DNA binding protein without transactivation capability and using them as “bait” for detecting interacting components. A second protein or protein fragment is expressed as fusion protein with a transcriptional transactivation domain and is the “prey” component. The “prey” component frequently is a fusion protein which comprises a gene product of a complex cDNA library in addition to the transcriptional transactivation domain. The interaction of the bait and prey fusion proteins results in functional reconstitution of the activating transcription factor. The reporter genes used in the 2-hybrid system encode enzymes which can be used to detect said protein interaction either by growth selection or by a simple colorimetric assay and which are very sensitive. A reporter gene frequently used in the 2-hybrid system, which makes possible a positive growth selection of cells with specific protein-protein interactions, is the histidine 3 (HIS3) gene, which encodes an enzyme essential for histidine biosynthesis and whose protein-protein interaction-dependent expression enables the cells to grow on histidine-deficient medium. The most frequently used reporter gene whose protein-protein interaction-dependent expression is detectable by a simple colorimetric assay is the beta-galactosidase gene.
The 2-hybrid system was originally developed in yeast, but subsequently variants of the 2-hybrid system have also been described for application in E. coli and in higher eukaryotic cells (Luban and Goff 1995) (Karimova, Pidoux et al. 1998) (Fearon, Finkel et al. 1992) (Shioda, Andriole et al. 2000).
A substantial disadvantage of classical 2-hybrid-based systems is, inter alia, the relatively high rate of false-negatively and false-positively detected interaction partners. This is due, on the one hand, to the high sensitivity of the reporters, but also to spatial coupling of the interaction and the basal transcription machinery. Recently, interaction systems for yeast cells have been described, which spatially decouple the place of interaction from the activation of the reporter genes used or the selection mechanisms used for detection (Maroun and Aronheim 1999). Related systems in yeast also allow at least one interaction partner which may be an integral or membrane-associated protein to be analyzed (Hubsman, Yudkovsky et al. 2001) (Ehrhard, Jacoby et al. 2000).
The functional complementation of proteins or enzymes which is also the basis of classical 2-hybrid-based systems is a method of analyzing interactions in living cells and bacteria, which has been known and applied for some time (Ullmann, Perrin et al. 1965) (Fields and Song 1989) (Mohler and Blau 1996) (Rossi, Charlton et al. 1997) (Pelletier, Campbell-Valois et al. 1998). Transcomplementation means the separation of an intact and functional protein or protein complex into two artificial subunits at the gene level. The two subunits here are per se inactive with respect to the function of the complete protein but are active with respect to their corresponding subunit function and are incapable of self-reconstitution. By providing in a protein interaction-dependent manner close spatial proximity between the two separated subunits, the fusion of such subunits to proteins or protein domains interacting with one another results in complementation of the divided protein, thereby rendering it functional again. The regaining of the function of the protein (e.g. an enzyme) by protein interaction is utilized here directly or indirectly for detection of said interaction (Mohler and Blau 1996) (Rossi, Charlton et al. 1997). The best-known example is transcomplementation of the transcription factor Gal4 which is the basis of the classical yeast 2-hybrid system (Fields and Song 1989).
In addition, transcomplementations and, coupled thereto, methods of detecting protein interactions have been described for different proteins with enzymatic activity, inter alia beta-galactosidase (bGal), dihydrofolate reductase (DHFR) and beta-lactamase (bLac) (Michnick and Remy 2001) (Rossi, Charlton et al. 1997) (Michnick and Galarneau 2001). Protein interactions can be detected indirectly in these systems after transcomplementation of the abovementioned enzymes by way of growth selection or of fluorimetric or colorimetric enzyme detection assays. Depending on the substrate used, detection may usually be carried out only after disruption of the cells and addition of the substrate in vitro.
In contrast, the DHFR-based system enables the interaction or transcomplementation of proteins to be detected also in vivo. For this purpose, a cell-permeable fluorescently labeled antagonist (methotrexate) is added which binds only to the intact protein. The disadvantage here, however, is the fact that the antagonist is not a substrate of the enzyme but binds the enzyme as a competitive inhibitor. Therefore no enzymatic enhancement of the detection signal whatsoever is produced. This results in a strong reduction in the sensitivity of this detection method compared to detection of positive clones by way of positive growth selection under the appropriate culture conditions. Moreover, the detection signal in the DHFR-based system for analyzing protein interactions is visible only directly after addition of the fluorescently labeled inhibitor, i.e. it is not possible to detect dynamic processes in cells, which are frequently accompanied by dynamic or transient protein interactions, by means of these nonpermanent signals which are detectable only for a short time. In addition, the inhibitor described (methotrexate) is a highly cytotoxic substance which, after application, greatly impairs and alters cell growth, metabolism and other intracellular processes and thus distorts the normal in vivo conditions.
Although a DHFR-based method of analyzing protein interactions which is based on positive growth selection of cells by transcomplemented DHFR, and not on application of methotrexate, offers correspondingly higher sensitivity, it requires periods of several days to weeks, which makes this method poorly suited to high throughput methods, for example high throughput screening. Owing to these disadvantages, these systems are also of only very limited suitability for analyzing transient or stimulus-induced interactions, since short-time transcomplementation is insufficient to enable positive cells to be selected by growth over a longer period. On the other hand, binding of fluorescent antagonists such as methotrexate by the transcomplemented DHFR can be detected only if the measurement coincides with the exact moment of the transient or stimulus-induced interaction.
These problems in detecting transient, i.e. time-limited, protein interactions also relate to the classical 2-hybrid system and to its variants known according to the prior art, since here too reporter systems are used which do not produce any permanent detection signal.
Transient protein interactions result in expression of the reporter genes used only for the short time during which they occur. Since the gene products of previously used reporter genes can generate a detection signal only during their lifetime, only a relatively short time is available in the 2-hybrid system and its variants for detecting the protein interaction-dependent signals, in particular in the analysis of weak and transient protein interactions. This is a major problem, in particular when a large number of different potential interaction partners of a bait protein need to be tested simultaneously for interaction, as in screening methods, in particular in high throughput screening methods, for example. Screening methods must enable a multiplicity of different potential interactions, which may also take place sequentially and may possess different strengths of interaction and lifetimes, to be analyzed simultaneously at a defined point in time. However, if particular detection signals are detectable only in narrow “time windows”, which possibly do not even overlap for the different interactions to be detected, the 2-hybrid system, its variants and the known transcomplementation-based detection systems for analyzing protein interactions may not be able to record certain interactions, in particular weak and transient protein interactions.
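The time-window argument above can be made concrete with a small numerical sketch. Everything in it is an assumption made for illustration and is not taken from the cited literature: the reporter is assumed to be synthesized at a constant rate while an interaction lasts and to decay with first-order kinetics afterwards, and the function names, the half-life, the detection threshold and the three example interactions are hypothetical.

```python
import math

def reporter_level(t, onset, duration, half_life, rate=1.0):
    """Reporter amount at time t for an interaction active on [onset, onset + duration].

    Assumed toy model: constant synthesis at `rate` while the interaction
    lasts, first-order decay with the given half-life at all times.
    """
    k = math.log(2) / half_life          # decay constant from the half-life
    if t <= onset:
        return 0.0                       # interaction has not started yet
    end = onset + duration
    active_time = min(t, end) - onset    # how long synthesis has been running
    level = (rate / k) * (1.0 - math.exp(-k * active_time))
    if t <= end:
        return level                     # still accumulating
    return level * math.exp(-k * (t - end))  # pure decay after the interaction ends

def detectable(interactions, readout, half_life, threshold):
    """Names of interactions whose reporter level reaches the threshold at the readout time."""
    return [
        name
        for name, (onset, duration) in interactions.items()
        if reporter_level(readout, onset, duration, half_life) >= threshold
    ]

# Hypothetical interactions as (onset, duration) in hours.
INTERACTIONS = {
    "early": (0.0, 1.0),    # transient, occurs at the start
    "late": (10.0, 1.0),    # transient, occurs ten hours later
    "stable": (0.0, 24.0),  # permanent over the whole experiment
}

if __name__ == "__main__":
    for readout in (2.0, 12.0):
        hits = detectable(INTERACTIONS, readout, half_life=2.0, threshold=0.5)
        print(f"readout at {readout} h: {hits}")
```

With these assumed values, a readout at 2 h reports the early and the stable interaction, and a readout at 12 h reports the late and the stable interaction. No single readout time captures both transient interactions, which is the situation described above for detection windows that do not overlap.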
Another selection system which is based on a specific type of protein transcomplementation is the split ubiquitin system, originally developed for studies in yeast and recently also applied in mammalian cells. This system utilizes the separation of ubiquitin into two nonfunctional moieties, an N- and a C-terminal fragment (Nub and Cub) (Johnsson and Varshavsky 1994) (Rojo-Niersbach, Morley et al. 2000). Ubiquitin is a small protein which labels proteins typically fused to its C terminus for cellular degradation. This biological mechanism is used in the split ubiquitin system for detecting protein interactions. In one embodiment of the split ubiquitin system, a first fusion protein comprising the C-terminal fragment Cub, a selection marker protein or fluorescent protein coupled thereto and a first interaction partner, and a second fusion protein comprising the N-terminal fragment Nub and the second interaction partner are heterologously expressed in the cell. A specific interaction of the corresponding fusion proteins restores a correctly folded ubiquitin which the proteasome can detect and process, with the coupled, initially active reporter then being degraded. Accordingly, the system allows negative growth selection focused on the absence of the selection marker or observation of the disappearance of a fluorescent reporter. These embodiments of the split ubiquitin system, described in scientific publications, thus have two major weaknesses: firstly, negative selection makes rapid and unambiguous detection of relevant interactions difficult and, secondly, no signal increase takes place in the cell after the interaction. The latter point makes it virtually impossible to detect weak or transient interactions.
The patent WO 95/29195 discloses an embodiment of the split ubiquitin system in which two different fusion proteins which comprise in each case one interaction partner and one part of the ubiquitin are expressed in a cell. One of the two fusion proteins here furthermore comprises a reporter protein which can be proteolytically removed by a ubiquitin-specific protease. Said reporter protein is removed by a ubiquitin-specific protease, and thereby activated, only after a specific protein-protein interaction has occurred. However, this embodiment of the split ubiquitin system does not overcome the fact that a reporter molecule can be released or activated only once for each interaction which has taken place. This makes it virtually impossible to detect weak or transient interactions. In addition, the reporter is coupled directly to one of the interaction partners, meaning that the amount of reporter strongly depends on the level of expression and the stability of the interaction partner to which it has been fused. This leads to the possibility of an unstable or readily degradable protein delivering a false-positive signal in the analysis.
Another system, described for mammalian cells, is based on activation and dimerization of modified type I cytokine receptors (Eyckerman, Verhee et al. 2001). Activation of the STAT3 signal pathway in this system can only take place if an interaction between the bait receptor fusion protein and the prey fusion protein occurs. The prey protein is fused to gp130 which carries STAT binding sites. The receptor-associated Janus kinases phosphorylate gp130 only after bait-prey interaction, resulting in binding, phosphorylation and subsequent nuclear translocation of STAT3 transcription factors. STAT-regulated reporter genes are expressed as a function of the bait-prey interaction. To identify novel intracellular interaction partners, a selection strategy was established which confers puromycin resistance (Eyckerman, Verhee et al. 2001). Although the method allows detection of a protein-protein interaction on the membrane by way of expression of a reporter gene, it nevertheless requires at least one interaction partner to be coupled to said membrane receptor. Moreover, the complex quaternary structure of the receptor-kinase-gp130 multimer is not suitable for analyzing difficult protein classes such as membrane proteins. Owing to insufficient amplification and stability of the signal, the system is incapable of analyzing transient or weak interactions.
Methods based on the transfer of energy quanta of a donor molecule to an acceptor molecule when said molecules are brought into very close proximity have theoretically few limits. These methods may use various variants of the green fluorescent protein (GFP) from Aequorea victoria, which are capable of fluorescence resonance energy transfer (FRET) owing to their specific spectral properties (Siegel, Chan et al. 2000). A similar method is based on an energy coupling of the bioluminescence of the luciferase-luciferin reaction as energy donor and GFP as energy acceptor. This energy transfer is referred to as the bioluminescence resonance energy transfer (BRET) effect (Xu, Piston et al. 1999). However, the addition of appropriate luciferase substrates is required here. The substantial disadvantages of these methods stem from their limited sensitivity and the difficulty of detection. The detection of FRET effects in vivo requires both very strong expression and complicated analysis. Strong overexpression of proteins in heterologous cells often results in the formation of aggregates, misfolding or misdirected subcellular localization. The method provides no possibility of signal enhancement or signal amplification, a decisive disadvantage which rules out detection of weak interactions or interactions of weakly expressed proteins. In order to be able to detect a protein interaction of spectrally compatible GFP fusion proteins by way of FRET effects in the cell, background subtractions and photobleaching analyses must be carried out (Haj, Verveer et al. 2002). Owing to the complex technique, the method is not suitable for high throughput methods of analyzing protein interactions and, in addition, requires complicated analyses and great experience, this being an obstacle to broad application.
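The requirement of very close proximity for FRET can be illustrated with the standard Förster relation between transfer efficiency and donor-acceptor distance. The relation is general textbook physics rather than part of the cited studies. The symbols E (transfer efficiency), r (donor-acceptor distance) and R_0 (Förster radius, the pair-specific distance at which half of the excitation energy is transferred) are introduced here for illustration only, and the distances evaluated are arbitrary multiples of R_0.

```latex
E(r) = \frac{1}{1 + \left( r / R_0 \right)^{6}}

% Worked values for three illustrative distances:
E(0.5\,R_0) = \frac{1}{1 + 1/64} \approx 0.98, \qquad
E(R_0)      = \frac{1}{1 + 1}    = 0.50, \qquad
E(2\,R_0)   = \frac{1}{1 + 64}   \approx 0.015
```

Because of the sixth-power dependence, doubling the distance from R_0 to 2 R_0 reduces the transfer efficiency from 50% to about 1.5%. Only fusion proteins held in very close proximity therefore yield a measurable signal, and since the signal is not amplified, this is consistent with the need for strong expression noted above.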
All indirect methods previously described have the fundamental disadvantage that they couple the protein interaction to a detection method which either requires constant activation or allows interactions, in particular transient interactions, to be analyzed only in a very narrow time window. It is therefore possible only in a very limited way, if at all, to analyze protein interactions in post-mitotic cells or to identify transient interactions. Automatable detection of interactions with unknown kinetics is thus not possible.
The previously described methods of analyzing or detecting protein-protein interactions thus have one or more of the following disadvantages:
    The interactions do not take place in vivo, or at least not in mammalian cells.
    Large amounts of biological material are required.
    The interactions must be permanent or the analysis must take place at exactly the right time.
    Measuring the interaction requires complicated measurements and complicated apparatus.
    Detection sensitivity is very limited.
    The analysis of endogenously very weakly expressed genes is restricted.
    Only binary interactions are detected.
    The rate of false-positive or false-negative interactions is high.
    The analysis of cell type-specifically expressed genes in a complex tissue assemblage or in cell lines is virtually impossible in biochemical methods.
    The detection methods are automatable only with difficulty.