The present invention is directed to a method for selecting a prey polypeptide that is able to interact with a bait polypeptide of interest, to a prey polynucleotide encoding the prey polypeptide as well as to the prey polypeptide itself. The invention also concerns plasmids used for performing the method of the invention as well as prokaryotic or eukaryotic recombinant host organisms containing such plasmids and also a collection of said recombinant host organisms consisting in a DNA library, such as a collection of recombinant haploid Saccharomyces cerevisiae. Finally, the invention is also directed to a technical medium containing the whole information concerning the interactions between metabolically related bait and prey polypeptides and/or polynucleotides coding for bait and prey polypeptides.
Most biological processes involve specific protein-protein interactions. General methodologies to identify interacting proteins or to study these interactions have been extensively developed. Among them, the yeast two-hybrid system currently represents the most powerful in vivo approach to screen for polypeptides that could bind to a given target protein. Originally developed by Fields and coworkers (Fields et al., 1989; Chien et al., 1991; U.S. Pat. No. 5,283,173, granted on Feb. 1, 1994, and U.S. Pat. No. 5,468,614, granted on Nov. 21, 1995, both to Fields, S. and Song, O. and both herein incorporated by reference), the two-hybrid system utilizes hybrid genes to detect protein-protein interactions by means of direct activation of reporter-gene expression (Allen et al., 1995; Transy et al., 1995). In essence, the two putative protein partners are genetically fused to the DNA-binding domain of a transcription factor and to a transcriptional activation domain, respectively. A productive interaction between the two proteins of interest will bring the transcriptional activation domain into proximity with the DNA-binding domain and will directly trigger the transcription of an adjacent reporter gene (usually lacZ or a nutritional marker), giving a screenable phenotype. The transcription can be activated through the use of two functional domains of a transcription factor: a domain that recognizes and binds to a specific site on the DNA and a domain that is necessary for activation, as reported by Keegan et al. (1986) and Ma et al. (1987).
Recently, Rossi et al. (1997) described a different approach, a mammalian "two-hybrid" system, which uses β-galactosidase complementation (Ullmann et al., 1968) to monitor protein-protein interactions in intact eukaryotic cells.
The number of available genome sequences of prokaryotic as well as eukaryotic host organisms is increasing exponentially, and there is a great need for new tools directed to the functional and global study of these newly characterized complete or partial genomes. As an illustrative example, the genome of the yeast Saccharomyces cerevisiae is now completely sequenced (Goffeau et al., 1996). Despite the tremendous and successful genetic work in past years, 60% of yeast genes have no assigned function and half of those encode putative proteins without any homology with known proteins (Dujon, 1996). In yeast, genetic analyses, such as suppressor or synthetic lethal screens, have suggested many functional links between gene products, some of which have later been confirmed by biochemical means. Altogether, these approaches have led to a rather extensive knowledge of defined biochemical pathways. However, the integration of these pathways in the complexity of a living cell remains to be accomplished. To explore the integrative functions and find the molecular factors sustaining them, some authors have attempted to design new screens. However, these screens are usually very specific and cannot be applied directly to many different cellular functions. In addition, few yeast genes are essential, leading to an additional difficulty for genetic screens. Other approaches developed by cellular biologists seek to precisely localize proteins within the cell. The assumption is that colocalization of factors is indicative of functional interactions. This approach has been very successful, despite the fact that it is usually very elaborate and is rarely considered practicable for a systematic approach (Burns et al., 1994).
Bartel et al. (1996) extended the typical two-hybrid approach, in which a known protein forming part of a DNA-binding domain hybrid is assayed against a library of all possible proteins present as transcriptional activation domain hybrids. Using the genome of bacteriophage T7, they also analyzed a second library of all possible proteins fused to the DNA-binding domain. This genome-wide approach to two-hybrid searches identified 25 interactions among the proteins of T7.
However, the currently available two-hybrid methodology is not suitable for a large scale project without specific methodological improvements. Although the two-hybrid strategy has been a major tool in proving protein-protein interactions between factors known to be functionally related (Fields and Song, 1989), its use for an exhaustive and reliable search for unknown partners of a given protein is more problematic. Thus, in most cases, the two-hybrid screen constitutes an initial screen in which many different interactions are found. Among the identified candidates, only some are favored due to their appealing sequence. Subsequent functional assays are required for establishing their possible biological significance. For these reasons, the two-hybrid methodology has been considered a difficult, if not misleading, experimental approach for screening.
Finley et al. (1994) and Bendixen et al. (1994) have described two-hybrid systems that include a step of mating colonies of yeast cells by replica-plating. Finley et al. (1994) have, in a first step, selected specific inserts of a DNA library using, as selection criteria, the probability for a specific insert contained in the library to comprise a large coding region of an ORF or to contain a coding region associated with a specific biological function. For example, Finley et al. (1994) have made a collection of strains, each of which expressed a different bait (in fact two cyclin-dependent kinases [Cdks], namely DmCdc2 and DmCdc2c), and mated them, by replica-plating, with test strains that contained different activation-tagged Cdis (cyclin-dependent kinase interactors). Then, each of the selected baits was used in order to screen the prey Cdis of the DNA library. Examination of the resulting interaction matrices showed that each Cdi (prey) associates specifically with a distinct spectrum of Cdks (baits).
Although these authors state that their results suggest a number of applications of their method to the genetic characterization of larger sets of proteins, it must be pointed out that these prior art screening experiments lead to the constitution of interactor polypeptide matrices that are restricted, like those of the numerous two-hybrid systems of the prior art, by the initial choice of the potentially interesting polynucleotide inserts initially identified and/or initially selected in the DNA library.
Moreover, the replica-plating step that makes use of yeast cell colonies does not allow the mating of numerous different recombinant yeast cell colonies in a single culture dish, thus rendering very tedious, or even materially impossible, the study of potential interactions between a given bait polypeptide and a wide collection of prey polypeptides, such as a collection of prey polypeptides encoded by polynucleotides originating from the whole genome of an organism such as a bacterial, viral or yeast organism.
The aim of the present invention is to provide a new method for selecting a polynucleotide encoding a prey polypeptide in a two-hybrid screening system, said method making use of mating recombinant haploid yeast cells instead of recombinant yeast cell colonies, said method providing significant advantages over prior art.
Thus, the present invention provides a method for selecting a polynucleotide encoding a prey polypeptide, said prey polypeptide being able to interact with a bait polypeptide, comprising the steps of
a) subjecting a bait polynucleotide encoding the bait polypeptide, to a two-hybrid screening method, wherein said two-hybrid screening method comprises a step of mating at least one first haploid recombinant yeast cell containing the prey polynucleotide to be assayed with a second haploid recombinant yeast cell containing the bait polynucleotide, provided that one haploid yeast cell among the first recombinant yeast cell or the second recombinant yeast cell also contains at least one detectable gene that is activated by a polypeptide including a transcriptional activation domain;
b) selecting the recombinant diploid yeast cell obtained at step a) for which the detectable gene has been expressed to a degree greater than expression in the absence of interaction between the bait polypeptide and the prey polypeptide;
c) optionally characterizing the prey polynucleotide contained in each diploid yeast cell selected at step b).
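By way of illustration only, the selection of step b) and the optional characterization of step c) can be sketched in Python as a threshold comparison. The `Diploid` record, its `reporter_signal` field and the `background` value are hypothetical names introduced for this sketch; the description does not prescribe how expression of the detectable gene is quantified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Diploid:
    """Hypothetical record for one diploid yeast cell obtained at step a)."""
    prey_id: str            # label of the prey polynucleotide carried by the cell
    reporter_signal: float  # expression of the detectable gene (arbitrary units)


def select_positive_diploids(diploids, background):
    """Step b): keep diploids whose reporter expression exceeds the level
    observed in the absence of a bait-prey interaction (`background`)."""
    return [d for d in diploids if d.reporter_signal > background]


def characterize(selected):
    """Step c), optional: list the distinct prey polynucleotides recovered."""
    return sorted({d.prey_id for d in selected})


# Hypothetical toy data: two cells carry preyA, one carries preyB.
cells = [Diploid("preyA", 12.0), Diploid("preyB", 0.8), Diploid("preyA", 9.5)]
positives = select_positive_diploids(cells, background=1.0)
print(characterize(positives))
```

In this sketch the same prey may be recovered from several diploid cells, so the characterization step reports each prey only once.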
By a bait polynucleotide according to the present invention, it is intended a chimeric polynucleotide encoding a chimeric polypeptide comprising i) a DNA-binding domain that recognizes a binding site on a detectable gene that is contained in a host organism and ii) a polypeptide that is to be tested for interaction with at least one prey polypeptide.
By a prey polynucleotide according to the present invention, it is intended a chimeric polynucleotide encoding a chimeric polypeptide comprising i) a transcriptional activation domain and ii) a polypeptide that is to be tested for interaction with a bait polypeptide.
Among the numerous improvements brought by the method of the invention over the existing screening systems, said method
i) allows, in a single step, the screening of far more prey polynucleotides with a given bait polynucleotide than the prior art systems because the mating is performed between haploid yeast cells and not between yeast cells and a yeast cell colony (or between yeast colonies);
ii) as a consequence of i), allows the screening of a whole DNA library without the need for a first step consisting of selecting the potentially interesting polynucleotide inserts contained therein, and allows an objective analysis of the potential interactor polypeptides;
iii) because the mating step is performed between haploid recombinant yeast cells and not between recombinant yeast cells and a yeast cell colony, and thus because the mating step is not affected by the prior differential growth properties of different recombinant yeast cell colonies, the method of the invention is far more exhaustive as well as reproducible than the conventional two-hybrid screening systems. Moreover, an efficient mating step of short duration that is performed between individual recombinant haploid yeast cells has two other advantages, namely a) there is a high percentage of recombinant diploid yeast cells, and not only several diploid recombinant yeast cells dispersed in a single colony, and b) the short period of time (less than 5 hours) that is necessary for mating the two populations of haploid yeast cells is not sufficient for significant growth of the haploid colonies which have not successfully undergone the mating step, nor for the doubling of the recombinant diploid yeast cells before plating.
As another advantageous characteristic of the mating step according to the two-hybrid screening method of the invention, the inventors have adjusted the experimental protocol in order to obtain up to a 50% increase in the efficiency of the mating procedure, this percentage being expressed as a ratio of diploid cells generated, and not as a ratio of recombinant colonies, such as expressed in prior art mating experiments.
The above-described characteristics of the two-hybrid screening method of the invention lead to a nearly perfect standardization of the diploid yeast cell population under testing. In other words, all the characteristics of the DNA library used as starting material are faithfully reflected in the resulting recombinant diploid yeast cell population after mating. This is why the present two-hybrid screening method is highly reproducible from one screen to another, and why the interactions identified are highly reliable.
In a specific embodiment of the method according to the invention, which is a further improvement over the prior art methods, the DNA library is presented as a ready-to-use biological material consisting of a collection of recombinant haploid yeast cells containing all the inserts generated during the construction of the DNA library in the form of prey polynucleotides as defined above, said collection of yeast cells being frozen in multiple vials, each vial containing an identical biological material.
Consequently, one vial is thawed for each screening experiment and is directly used in the cell-to-cell mating step, in contrast to prior art methods, for example as described by Bendixen et al. (1994), that need a first step of separate culture, in suitable selective medium, both of the recombinant yeast cell clones containing the bait polynucleotide and of the recombinant yeast cell clones containing the chosen prey polynucleotides, then a second step of clone-to-clone replica-plating which is also performed in rich culture medium, before another step of clone-to-clone replica-plating for selecting the recombinant diploid yeast cells contained in the mated clones in a selection culture medium.
The above-described characteristics of this specific embodiment of the screening method according to the invention ensure that said method is fast, exhaustive and reproducible, in contrast with the prior art techniques.
The two-hybrid screening method according to the invention is far more reproducible, both quantitatively and qualitatively, than the prior art methods that include a step of transformation of yeast cells with plasmid DNA, for example DNA originating from inserts contained in a DNA library prepared in E. coli.
Indeed, a primary DNA library in E. coli only allows a reduced number of successive screenings without the need for a further culture of the E. coli clones in order to make more starting DNA material available. In these circumstances, the further culture of the recombinant E. coli clones of the DNA library necessarily introduces discrepancies in the representation of the different inserts initially contained in the DNA library, as the different clones may have various growth rates.
In contrast, the method according to the invention allows, in a single step, the preparation of a large quantity of starting DNA material in the form of recombinant haploid yeast cells, said starting material being subsequently stored in a large number of identical vials, thus ensuring that each run of the method uses strictly identical starting material representing the whole DNA library initially prepared.
The great reproducibility and exhaustiveness of the above-described method allow one skilled in the art to reiterate said method using each of the prey polynucleotides selected at step b) as a bait, in order to identify and characterize polynucleotides that are systematically selected as coding for interactor polypeptides of biological significance.
The successive reiterations of the screening allow one skilled in the art to identify important interactions between polypeptides encoded by diverse polynucleotide inserts contained in the initial DNA library, which interactions are of statistical and biological significance. Three reiterations of steps a) to c) of the method according to the invention allow one skilled in the art to determine which polynucleotide inserts of the initial DNA library are systematically selected for interaction with an initial bait polypeptide and/or another polynucleotide insert of the initial DNA library, and thus which are statistically of great metabolic and/or physiological interest.
Consequently, a specific embodiment of the method according to the invention comprises repeating at least once steps a) to c) using, for performing each reiteration, at least one previously selected and/or characterized prey polynucleotide as the bait polynucleotide.
The number of repeats of steps a) to c) is no more than 10, preferably no more than 5 and in a most preferred embodiment the number of reiterations of steps a) to c) is 3.
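By way of illustration only, the reiteration of steps a) to c) can be sketched in Python as a bounded loop in which each newly selected prey becomes a bait for the next round. The function `iterate_screens`, the callable `screen` (standing in for one complete run of steps a) to c)) and the toy data are hypothetical names introduced for this sketch, which assumes that a polynucleotide already used as a bait is not screened a second time.

```python
def iterate_screens(initial_bait, screen, max_repeats=3):
    """Run an initial screen (round 0), then repeat steps a) to c) up to
    `max_repeats` times, using previously selected preys as new baits.

    `screen` is a hypothetical callable: bait -> iterable of selected preys.
    Returns a list of (round_number, bait, prey) tuples.
    """
    if not 1 <= max_repeats <= 10:
        raise ValueError("the number of repeats should be between 1 and 10")

    interactions = []
    used_as_bait = set()
    baits = [initial_bait]
    for round_no in range(max_repeats + 1):
        next_baits = []
        for bait in baits:
            if bait in used_as_bait:
                continue  # already screened in this or an earlier round
            used_as_bait.add(bait)
            for prey in screen(bait):
                interactions.append((round_no, bait, prey))
                if prey not in used_as_bait and prey not in next_baits:
                    next_baits.append(prey)
        if not next_baits:
            break  # no new prey to promote to bait
        baits = next_baits
    return interactions


# Hypothetical screening results, for illustration only.
toy_results = {"A": ["B", "C"], "B": ["A", "D"], "C": [], "D": ["E"], "E": ["F"]}


def toy_screen(bait):
    return toy_results.get(bait, [])


for record in iterate_screens("A", toy_screen, max_repeats=3):
    print(record)
```

The default of three repeats mirrors the most preferred embodiment, and the upper bound of ten mirrors the stated maximum.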
In a preferred embodiment of the method of the invention in which steps a) to c) are reiterated, the bait polynucleotide used that corresponds to a selected prey belongs to the group of polynucleotides consisting of:
a) a polynucleotide that is identical to said selected prey polynucleotide;
b) a polynucleotide containing the complete ORF including said selected prey polynucleotide;
c) a polynucleotide which is any polynucleotide fragment comprised in the complete ORF including said selected prey polynucleotide.
A polynucleotide fragment of a complete ORF may be obtained by digestion with a restriction endonuclease, as described in Sambrook et al., by digestion with an exonuclease such as Bal31, by DNA synthesis, as described by Sonveaux et al. (1986), Hsiung et al. (1980), Froehler et al. (1986), Alvarado-Urbina (1986), Crea et al. (1978) or Urdea et al. (1983), or by PCR as described in the Examples.
Using, as the bait polynucleotide, a complete ORF corresponding to a given prey selected at a given round, for example first round, of the method of the invention, for performing the next round allows the one skilled in the art to select exhaustively all the potential polypeptides that are able to interact with the translation product of said complete ORF bait polynucleotide. Consequently, all the possible polypeptides interacting with any peptide domain of the polypeptide encoded by the complete ORF including the previously selected prey are identified.
Preferably, the first screening using the method according to the invention will comprise a limited number of screenings with a limited number of bait polynucleotides, usually bait polynucleotides encoding bait polypeptides already characterized as being involved in a given physiological and/or metabolic pathway, for example pre-mRNA splicing.
When several, preferably three, reiterations of the method of the invention are performed and common bait and prey polypeptides are thus selected, a map of all the interactions between these polypeptides may be designed that takes into account the known and/or suspected biological function of each of the polypeptide interactor molecules. Such an interactors map may help one skilled in the art to decipher a whole metabolic and/or physiological pathway that is functionally active within the host organism from which the initial DNA library is derived, as will be seen in the examples presented hereunder.
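By way of illustration only, an interactors map of this kind can be sketched in Python as an adjacency structure built from (bait, prey) pairs. The functions `build_interaction_map` and `reciprocal_pairs`, and the toy pairs, are hypothetical names introduced for this sketch; treating a pair found in both orientations as noteworthy is an assumption of the sketch, not a criterion stated in the description.

```python
from collections import defaultdict


def build_interaction_map(pairs):
    """Aggregate (bait, prey) pairs into a map: bait -> {prey: times selected}."""
    graph = defaultdict(lambda: defaultdict(int))
    for bait, prey in pairs:
        graph[bait][prey] += 1
    # Convert to plain dicts so the result is easy to print or serialize.
    return {bait: dict(preys) for bait, preys in graph.items()}


def reciprocal_pairs(graph):
    """Return pairs found in both orientations, that is, X used as bait
    selects Y and Y used as bait selects X."""
    found = set()
    for bait, preys in graph.items():
        for prey in preys:
            if bait != prey and bait in graph.get(prey, {}):
                found.add(tuple(sorted((bait, prey))))
    return sorted(found)


# Hypothetical selections pooled from several screening rounds.
pairs = [("A", "B"), ("B", "A"), ("A", "C"), ("A", "B")]
interaction_map = build_interaction_map(pairs)
print(interaction_map)
print(reciprocal_pairs(interaction_map))
```

Known or suspected biological functions could be attached to each node of such a map as a separate annotation table keyed by polypeptide label.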
Another object of the present invention consists in a representative and exhaustive genomic DNA library of a prokaryotic or a eukaryotic host organism that is prepared according to the invention.
Preferably, the method of the invention is performed using, as the genomic DNA library starting material, inserts provided by the fragmentation of the genome of a host organism that does not contain, and/or contains a small number of, intronic sequences.
Preferably, such an exhaustive genomic DNA library is prepared from the genomic DNA of a host organism endowed with a compact genome, that is to say a genome containing at least 50% of coding sequences, more preferably at least 65% of coding sequences and most preferably 75% of coding sequences. Host organisms having a compact genome as defined hereinabove include prokaryotic organisms such as viruses and bacteria, and also eukaryotic organisms such as yeast.
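By way of illustration only, the compactness criterion can be sketched in Python as a comparison of the coding fraction of a genome against the stated thresholds. The functions `coding_fraction` and `compactness_tier` and the tier labels are hypothetical names introduced for this sketch, which reads "most preferably 75%" as "at least 75%"; the base-pair counts in the example are invented and do not describe any real genome.

```python
def coding_fraction(coding_bp, genome_bp):
    """Fraction of the genome made up of coding sequences."""
    if genome_bp <= 0:
        raise ValueError("genome_bp must be positive")
    if not 0 <= coding_bp <= genome_bp:
        raise ValueError("coding_bp must lie between 0 and genome_bp")
    return coding_bp / genome_bp


def compactness_tier(fraction):
    """Map a coding fraction to the preference tiers stated in the text.

    Assumption of this sketch: the 75% figure is treated as a lower bound.
    """
    if fraction >= 0.75:
        return "most preferred"
    if fraction >= 0.65:
        return "more preferred"
    if fraction >= 0.50:
        return "compact"
    return "not compact"


# Invented figures: 7 Mb of coding sequence in a 10 Mb genome.
fraction = coding_fraction(7_000_000, 10_000_000)
print(fraction, compactness_tier(fraction))
```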
A further object of the present invention consists in a representative and exhaustive genomic DNA library derived from Saccharomyces cerevisiae, designated as the FRYL library, which is used when performing the two-hybrid screening method of the invention.
The invention also concerns an improved recombinant plasmid used to express the prey polynucleotides to be selected according to the method of the invention, as well as a recombinant host organism containing said plasmid.
The invention is also directed to a collection of recombinant cell clones consisting in a collection of recombinant host organisms as described hereinabove.
The present invention also concerns a recombinant diploid yeast cell selected by the method of the invention.
Also part of the invention are a polynucleotide that has been selected according to the method of the invention, as well as a polypeptide which is encoded by such a polynucleotide.
The invention is further directed to a technical medium containing an interactors map representing at least a set of interaction events that have taken place between the polypeptides encoded by the polynucleotides selected by the method of the invention.