This invention relates to methods for detecting protein interactions and isolating novel proteins.
In general, the invention features methods for detecting interactions among proteins.
Accordingly, in one aspect, the invention features a method of determining whether a first protein is capable of physically interacting with a second protein. The method includes (a) providing a host cell which contains (i) a reporter gene operably linked to a DNA-binding-protein recognition site; (ii) a first fusion gene which expresses a first fusion protein, the first fusion protein comprising the first protein covalently bonded to a binding moiety which is capable of specifically binding to the DNA-binding-protein recognition site; and (iii) a second fusion gene which expresses a second fusion protein, the second fusion protein including the second protein covalently bonded to a gene activating moiety and being conformationally-constrained; and (b) measuring expression of the reporter gene as a measure of an interaction between the first and said second proteins.
Preferably, the second protein is a short peptide of at least 6 amino acids in length and is less than or equal to 60 amino acids in length; includes a randomly generated or intentionally designed peptide sequence; includes one or more loops; or is conformationally-constrained as a result of covalent bonding to a conformation-constraining protein, e.g., thioredoxin or a thioredoxin-like molecule. Where the second protein is covalently bonded to a conformationally constraining protein the invention features a polypeptide wherein the second protein is embedded within the conformation-constraining protein to which it is covalently bonded. Where the conformation-constraining protein is thioredoxin, the invention also features an additional method which includes a second protein which is conformationally-constrained by disulfide bonds between cysteine residues in the amino-terminus and in the carboxy-terminus of the second protein.
In another aspect, the invention features a method of detecting an interacting protein in a population of proteins, comprising: (a) providing a host cell which contains (i) a reporter gene operably linked to a DNA-binding-protein recognition site; and (ii) a fusion gene which expresses a fusion protein, the fusion protein including a test protein covalently bonded to a binding moiety which is capable of specifically binding to the DNA-binding-protein recognition site; (b) introducing into the host cell a second fusion gene which expresses a second fusion protein, the second fusion protein including one of said population of proteins covalently bonded to a gene activating moiety and being conformationally-constrained; and (c) measuring expression of the reporter gene. Preferably, the population of proteins includes short peptides of between 1 and 60 amino acids in length.
The invention also features a method of detecting an interacting protein within a population wherein the population of proteins is a set of randomly generated or intentionally designed peptide sequences, or where the population of proteins is conformationally-constrained by covalently bonding to a conformation-constraining protein. Preferably, where the population of proteins is conformationally-constrained by covalent bonding to a conformation-constraining protein, the population of proteins is embedded within the conformation-constraining protein. The invention further features a method of detecting an interacting protein within a population wherein the conformation-constraining protein is thioredoxin. Preferably, the population of proteins is inserted into the active site loop of the thioredoxin.
The invention further features a method wherein each of the population of proteins is conformationally-constrained by disulfide bonds between cysteine residues in the amino-terminus and in the carboxy-terminus of said protein.
In preferred embodiments of various aspects, the host cell is yeast; the DNA binding domain is LexA; the interacting protein includes one or more loops; and/or the reporter gene is assayed by a color reaction or by cell viability.
In other embodiments the bait may be Cdk2 or a Ras protein sequence.
In another related aspect, the invention features a method of identifying a candidate interactor. The method includes (a) providing a reporter gene operably linked to a DNA-binding-protein recognition site; (b) providing a first fusion protein, which includes a first protein covalently bonded to a binding moiety which is capable of specifically binding to the DNA-binding-protein recognition site; (c) providing a second fusion protein, which includes a second protein covalently bonded to a gene activating moiety and being conformationally-constrained, the second protein being capable of interacting with said first protein; (d) contacting said candidate interactor with said first protein and/or said second protein; and (e) measuring expression of said reporter gene.
The invention features a method of identifying a candidate interactor wherein the first fusion protein is provided by providing a first fusion gene which expresses the first fusion protein and wherein the second fusion protein is provided by providing a second fusion gene which expresses said second fusion protein. Alternatively, the reporter gene, the first fusion gene, and the second fusion gene are included on a single piece of DNA.
The invention also features a method of identifying candidate interactors wherein the first fusion protein and the second fusion protein are permitted to interact prior to contact with said candidate interactor, and a related method wherein the first fusion protein and the candidate interactor are permitted to interact prior to contact with said second fusion protein.
In a preferred embodiment, the candidate interactor is conformationally-constrained and may include one or more loops. Where the candidate interactor is an antagonist, reporter gene expression is reduced. Where the candidate interactor is an agonist, reporter gene expression is increased. The candidate interactor is a member selected from the group consisting of proteins, polynucleotides, and small molecules. In addition, a candidate interactor can be encoded by a member of a cDNA or synthetic DNA library. Moreover, the candidate interactor can be a mutated form of said first fusion protein or said second fusion protein.
In a preferred embodiment of any of the above aspects, the candidate interactor is isolated in vitro and shown to function in vivo, i.e., as a conformationally constrained intracellular peptide.
In a related aspect, the invention features a population of eukaryotic cells, each cell having a recombinant DNA molecule encoding a conformationally-constrained intracellular peptide, there being at least 100 different recombinant molecules in the population, each molecule being in at least one cell of said population.
Preferably, the intracellular peptides within the population of cells are conformationally-constrained because they are covalently bonded to a conformation-constraining protein.
In preferred embodiments the intracellular peptide is embedded within the conformation-constraining protein, preferably thioredoxin; the intracellular peptide is conformationally-constrained by disulfide bonds between cysteine residues in the amino-terminus and in the carboxy-terminus of said intracellular peptide; the intracellular peptide includes one or more loops; the population of eukaryotic cells are yeast cells; the recombinant DNA molecule further encodes a gene activating moiety covalently bonded to said intracellular peptide; and/or the intracellular peptide physically interacts with a second recombinant protein inside said eukaryotic cells.
In another aspect, the invention features a method of assaying an interaction between a first protein and a second protein. The method includes: (a) providing a reporter gene operably linked to a DNA-binding-protein recognition site; (b) providing a first fusion protein including a first protein covalently bonded to a binding moiety which is capable of specifically binding to the DNA-binding-protein recognition site; (c) providing a second fusion protein including a second protein which is conformationally constrained (and may include one or more loops) and is covalently bonded to a gene activating moiety; (d) combining the reporter gene, the first fusion protein, and the second fusion protein; and (e) measuring expression of the reporter gene.
In a preferred embodiment, the invention further features a method of assaying the interaction between two proteins wherein the first fusion protein is provided by providing a first fusion gene which expresses the first fusion protein and wherein the second fusion protein is provided by providing a second fusion gene which expresses the second fusion protein. In another preferred embodiment, the interaction is assayed in vitro and shown to function in vivo, i.e., as a conformationally constrained intracellular peptide.
In yet other aspects, the invention features a protein including the sequence Leu-Val-Cys-Lys-Ser-Tyr-Arg-Leu-Asp-Trp-Glu-Ala-Gly-Ala-Leu-Phe-Arg-Ser-Leu-Phe (SEQ ID NO: 1), preferably conformationally-constrained; a protein including the sequence Met-Val-Val-Ala-Ala-Glu-Ala-Val-Arg-Thr-Val-Leu-Leu-Ala-Asp-Gly-Gly-Asp-Val-Thr (SEQ ID NO: 2), preferably conformationally-constrained; a protein including the sequence Pro-Asn-Trp-Pro-His-Gln-Leu-Arg-Val-Gly-Arg-Val-Leu-Trp-Glu-Arg-Leu-Ser-Phe-Glu (SEQ ID NO: 3), preferably conformationally-constrained; a protein including the sequence Ser-Val-Arg-Met-Arg-Tyr-Gly-Ile-Asp-Ala-Phe-Phe-Asp-Leu-Gly-Gly-Leu-Leu-His-Gly (SEQ ID NO: 9), preferably conformationally-constrained; a protein including the sequence Glu-Leu-Arg-His-Arg-Leu-Gly-Arg-Ala-Leu-Ser-Glu-Asp-Met-Val-Arg-Gly-Leu-Ala-Trp-Gly-Pro-Thr-Ser-His-Cys-Ala-Thr-Val-Pro-Gly-Thr-Ser-Asp-Leu-Trp-Arg-Val-Ile-Arg-Phe-Leu (SEQ ID NO: 10), preferably conformationally-constrained; a protein including the sequence Tyr-Ser-Phe-Val-His-His-Gly-Phe-Phe-Asn-Phe-Arg-Val-Ser-Trp-Arg-Glu-Met-Leu-Ala (SEQ ID NO: 11), preferably conformationally-constrained; a protein including the sequence Gln-Val-Trp-Ser-Leu-Trp-Ala-Leu-Gly-Trp-Arg-Trp-Leu-Arg-Arg-Tyr-Gly-Trp-Asn-Met (SEQ ID NO: 12), preferably conformationally-constrained; a protein including the sequence Trp-Arg-Arg-Met-Glu-Leu-Asp-Ala-Glu-Ile-Arg-Trp-Val-Lys-Pro-Ile-Ser-Pro-Leu-Glu (SEQ ID NO: 13), preferably conformationally-constrained; a protein including the sequence Trp-Ala-Glu-Trp-Cys-Gly-Pro-Val-Cys-Ala-His-Gly-Ser-Arg-Ser-Leu-Thr-Leu-Leu-Thr-Lys-Tyr-His-Val-Ser-Phe-Leu-Gly-Pro-Cys-Lys-Met-Ile-Ala-Pro-Ile-Leu-Asp (SEQ ID NO: 17), preferably conformationally-constrained; a protein including the sequence Leu-Val-Cys-Lys-Ser-Tyr-Arg-Leu-Asp-Trp-Glu-Ala-Gly-Ala-Leu-Phe-Arg-Ser-Leu-Phe (SEQ ID NO: 18), preferably conformationally-constrained; a protein including the sequence Tyr-Arg-Trp-Gln-Gln-Gly-Val-Val-Pro-Ser-Asn-Trp-Ala-Ser-Cys-Ser-Phe-Arg-Cys-Gly (SEQ ID NO: 19), preferably conformationally-constrained; a protein including the sequence Ser-Ser-Phe-Ser-Leu-Trp-Leu-Leu-Met-Val-Lys-Ser-Ile-Lys-Arg-Ala-Ala-Trp-Glu-Leu-Gly-Pro-Ser-Ser-Ala-Trp-Asn-Thr-Ser-Gly-Trp-Ala-Ser-Leu-Ala-Asp-Phe-Tyr (SEQ ID NO: 20), preferably conformationally-constrained; a protein including the sequence Arg-Val-Lys-Leu-Gly-Tyr-Ser-Phe-Trp-Ala-Gln-Ser-Leu-Leu-Arg-Cys-Ile-Ser-Val-Gly (SEQ ID NO: 21), preferably conformationally-constrained; a protein including the sequence Gln-Leu-Tyr-Ala-Gly-Cys-Tyr-Leu-Gly-Val-Val-Ile-Ala-Ser-Ser-Leu-Ser-Ile-Arg-Val (SEQ ID NO: 22), preferably conformationally-constrained; a protein including the sequence Gln-Gln-Arg-Phe-Val-Phe-Ser-Pro-Ser-Trp-Phe-Thr-Cys-Ala-Gly-Thr-Ser-Asp-Phe-Trp-Gly-Pro-Glu-Pro-Leu-Phe-Asp-Trp-Thr-Arg-Asp (SEQ ID NO: 23), preferably conformationally-constrained; a protein including the sequence Arg-Pro-Leu-Thr-Gly-Arg-Trp-Val-Val-Trp-Gly-Arg-Arg-His-Glu-Glu-Cys-Gly-Leu-Thr (SEQ ID NO: 24), preferably conformationally-constrained; a protein including the sequence Pro-Val-Cys-Cys-Met-Met-Tyr-Gly-His-Arg-Thr-Ala-Pro-His-Ser-Val-Phe-Asn-Val-Asp (SEQ ID NO: 25), preferably conformationally-constrained; a protein including the sequence Trp-Ser-Pro-Glu-Leu-Leu-Arg-Ala-Met-Val-Ala-Phe-Arg-Trp-Leu-Leu-Glu-Arg-Arg-Pro (SEQ ID NO: 26); and substantially pure DNA encoding the immediately foregoing proteins.
The invention also includes novel proteins and other candidate interactors identified by the foregoing methods. It will be appreciated that these proteins and candidate interactors may either increase or decrease reporter gene activity and that these changes in activity may be measured using assays described herein or known in the art. Also included in the invention are methods for using conformationally constrained interactor proteins. For example, the conformationally constrained proteins of the invention may be used as reagents in assays for protein detection that involve formation of a complex between the conformationally constrained protein and a protein of interest to which it specifically binds, followed by complex detection (for example, by an immunoprecipitation, Western blot, or affinity column technique that utilizes the conformationally constrained protein as the complex-forming reagent).
Finally, the invention features a method of assaying an interaction between a first protein and a second protein, involving: (a) providing the first protein; (b) providing a fusion protein including the second protein, the second protein being conformationally-constrained; (c) contacting the first protein with the fusion protein under conditions which allow complex formation; (d) detecting the complex as an indication of an interaction; and (e) determining whether the first protein interacts with the fusion protein inside a cell.
As used herein, by “reporter gene” is meant a gene whose expression may be assayed; such genes include, without limitation, lacZ, amino acid biosynthetic genes, e.g., the yeast LEU2, HIS3, LYS2, TRP1, or URA3 genes, nucleic acid biosynthetic genes, the mammalian chloramphenicol transacetylase (CAT) gene, or any surface antigen gene for which specific antibodies are available. Reporter genes may encode any protein that provides a phenotypic marker, for example, a protein that is necessary for cell growth or a toxic protein leading to cell death, or may encode a protein detectable by a color assay leading to the presence or absence of color (e.g., fluorescent proteins and derivatives thereof). Alternatively, a reporter gene may encode a suppressor tRNA, the expression of which produces a phenotype that can be assayed. A reporter gene according to the invention includes elements (e.g., all promoter elements) necessary for reporter gene function.
By “operably linked” is meant that a gene and a regulatory sequence(s) are connected in such a way as to permit gene expression when the appropriate molecules (e.g., transcriptional activator proteins or proteins which include transcriptional activation domains) are bound to the regulatory sequence(s).
By “covalently bonded” is meant that two domains are joined by covalent bonds, directly or indirectly. That is, the “covalently bonded” proteins or protein moieties may be immediately contiguous or may be separated by stretches of one or more amino acids within the same fusion protein.
By “providing” is meant introducing the fusion proteins into the interaction system sequentially or simultaneously, and directly (as proteins) or indirectly (as genes encoding those proteins).
By “protein” is meant a sequence of amino acids of any length, constituting all or a part of a naturally-occurring polypeptide or peptide, or constituting a non-naturally-occurring polypeptide or peptide (e.g., a randomly generated peptide sequence or one of an intentionally designed collection of peptide sequences).
By a “binding moiety” is meant a stretch of amino acids which is capable of directing specific polypeptide binding to a particular DNA sequence (i.e., a “DNA-binding-protein recognition site”).
By “weak gene activating moiety” is meant a stretch of amino acids which is capable of weakly inducing the expression of a gene to whose control region it is bound. As used herein, by “weakly” is meant below the level of activation effected by GAL4 activation region II (Ma and Ptashne, Cell 48:847, 1987) and preferably at or below the level of activation effected by the B112 activation domain of Ma and Ptashne (Cell 51:113, 1987). Levels of activation may be measured using any downstream reporter gene system and comparing, in parallel assays, the level of expression stimulated by the GAL4 region II-polypeptide with the level of expression stimulated by the polypeptide to be tested.
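The parallel-assay comparison described above can be sketched in a few lines of Python. This is only an illustration: the function name and the numeric readouts are hypothetical, and the sketch assumes both reporter levels were measured in the same units under the same conditions.

```python
def is_weak_activator(test_level: float, gal4_region_ii_level: float) -> bool:
    """Return True if the test polypeptide activates the reporter below the
    level obtained with the GAL4 region II reference in a parallel assay.

    Both arguments are reporter readouts in the same (arbitrary) units.
    """
    if gal4_region_ii_level <= 0:
        raise ValueError("reference level must be positive")
    return test_level < gal4_region_ii_level


# Hypothetical readouts, for illustration only.
print(is_weak_activator(120.0, 900.0))   # test moiety well below the reference
print(is_weak_activator(950.0, 900.0))   # test moiety above the reference
```

The stricter preferred criterion (at or below the B112 level) could be checked the same way by substituting a B112 reference readout and using `<=`.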
By “altering the expression of the reporter gene” is meant an increase or decrease in the expression of the reporter gene to the extent required for detection of a change in the assay being employed. It will be appreciated that the degree of change will vary depending upon the type of reporter gene construct or reporter gene expression assay being employed.
By “conformationally-constrained” is meant a protein that has reduced structural flexibility because its amino and carboxy termini are fixed in space. As a result of this constraint, the protein may form “loops” (i.e., regions of amino acids of any shape which extend away from the constrained amino and carboxy termini). Preferably, the conformationally-constrained protein is displayed in a structurally rigid manner. Conformational constraint according to the invention may be brought about by exploiting the disulfide-bonding ability of a natural or recombinantly-introduced pair of cysteine residues, one residing at or near the amino-terminal end of the protein of interest and the other at or near the carboxy-terminal end. Alternatively, conformational constraint may be facilitated by embedding the protein of interest within a conformation-constraining protein.
By “conformation-constraining protein” is meant any peptide or polypeptide which is capable of reducing the flexibility of another protein’s amino and/or carboxy termini. Preferably, such proteins provide a rigid scaffold or platform for the protein of interest. In addition, such proteins preferably are capable of providing protection from proteolytic degradation and the like, and/or are capable of enhancing solubility. Examples of conformation-constraining proteins include thioredoxin and other thioredoxin-like proteins, nucleases (e.g., RNase A), proteases (e.g., trypsin), protease inhibitors (e.g., bovine pancreatic trypsin inhibitor), antibodies or structurally-rigid fragments thereof, conotoxins, and the pleckstrin homology domain. A conformation-constraining peptide can be of any appropriate length and can even be a single amino acid residue.
“Thioredoxin-like proteins” are defined herein as amino acid sequences substantially similar, e.g., having at least 18% homology, with the amino acid sequence of E. coli thioredoxin over an amino acid sequence length of 80 amino acids. Alternatively, a thioredoxin-like DNA sequence is defined herein as a DNA sequence encoding a protein or fragment of a protein characterized by having a three dimensional structure substantially similar to that of human or E. coli thioredoxin, e.g., glutaredoxin, and optionally by containing an active-site loop. The DNA sequence of glutaredoxin is an example of a thioredoxin-like DNA sequence which encodes a protein that exhibits such substantial similarity in three-dimensional conformation and contains a Cys . . . Cys active-site loop. The amino acid sequence of E. coli thioredoxin is described in Eklund et al., EMBO J. 3:1443-1449 (1984). The three-dimensional structure of E. coli thioredoxin is depicted in FIG. 2 of Holmgren, J. Biol. Chem. 264:13963-13966 (1989). A DNA sequence encoding the E. coli thioredoxin protein is set forth in Lim et al., J. Bacteriol. 163:311-316 (1985). The three dimensional structure of human thioredoxin is described in Forman-Kay et al., Biochemistry 30:2685-98 (1991). A comparison of the three dimensional structures of E. coli thioredoxin and glutaredoxin is published in Xia, Protein Science 1:310-321 (1992). These four publications are incorporated herein by reference for the purpose of providing information on thioredoxin-like proteins that is known to one of skill in the art. Examples of thioredoxin-like proteins are described herein.
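As a rough illustration of the sequence criterion above, the following Python sketch scores two sequences over an 80-residue window. It rests on assumptions not stated in the text: “homology” is read here as percent identity, the two sequences are taken to be already aligned to equal length (with '-' marking gaps), and the function names are hypothetical. The sequences in the example are placeholders, not thioredoxin.

```python
def max_window_identity(candidate: str, reference: str, window: int = 80) -> float:
    """Highest percent identity over any `window`-long stretch of two
    pre-aligned, equal-length sequences ('-' marks an alignment gap)."""
    if len(candidate) != len(reference):
        raise ValueError("sequences must be pre-aligned to equal length")
    if len(candidate) < window:
        raise ValueError("alignment is shorter than the window")
    # 1 where residues are identical (gaps never count as matches), else 0.
    matches = [int(a == b and a != "-") for a, b in zip(candidate, reference)]
    current = sum(matches[:window])
    best = current
    # Slide the window one position at a time, updating the match count.
    for i in range(window, len(matches)):
        current += matches[i] - matches[i - window]
        best = max(best, current)
    return 100.0 * best / window


def is_thioredoxin_like(candidate: str, reference: str,
                        threshold: float = 18.0, window: int = 80) -> bool:
    """Apply the 'at least 18% over 80 amino acids' criterion (identity assumed)."""
    return max_window_identity(candidate, reference, window) >= threshold


# Placeholder sequences only: 15 of 80 positions identical (18.75%).
reference = "A" * 80
candidate = "A" * 15 + "G" * 65
print(max_window_identity(candidate, reference))
print(is_thioredoxin_like(candidate, reference))
```

In practice the alignment itself would come from a dedicated alignment tool; the sketch only covers the thresholding step.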
By “candidate interactors” is meant proteins (“candidate interacting proteins”) or compounds which physically interact with a protein of interest; this term also encompasses agonists and antagonists. Agonist interactors are identified as compounds or proteins that have the ability to increase reporter gene expression mediated by a pair of interacting proteins. Antagonist interactors are identified as compounds or proteins that have the ability to decrease reporter gene expression mediated by a pair of interacting proteins. Candidate interactors also include so-called peptide “aptamers” which specifically recognize target proteins and may be used in a manner analogous to antibody reagents; such aptamers may include one or more loops.
“Compounds” include small molecules, generally under 1000 MW, carbohydrates, polynucleotides, lipids, and the like.
By “test protein” is meant one of a pair of interacting proteins, the other member of the pair generally referred to as a “candidate interactor” (supra).
By “randomly generated” is meant sequences having no predetermined sequence; this is contrasted with “intentionally designed” sequences which have a DNA or protein sequence or motif determined prior to their synthesis.
By “mutated” is meant altered in sequence, either by site-directed or random mutagenesis. A mutated form of a protein encompasses point mutations as well as insertions, deletions, or rearrangements.
By “intracellular” is meant that the peptide is localized inside the cell, rather than on the cell surface.
By an “activated Ras” is meant any mutated form of Ras which remains bound to GTP for a period of time longer than that exhibited by the corresponding wild-type form of the protein. By “Ras” is meant any form of Ras protein including, without limitation, N-ras, K-ras, and H-ras.
The interaction trap systems described herein provide advantages over more conventional methods for isolating interacting proteins or genes encoding interacting proteins. For example, applicants’ systems provide rapid and inexpensive methods having very general utility for identifying and purifying genes encoding a wide range of useful proteins based on the protein’s physical interaction with a second polypeptide. This general utility derives in part from the fact that the components of the systems can be readily modified to facilitate detection of protein interactions of widely varying affinity (e.g., by using reporter genes which differ quantitatively in their sensitivity to a protein interaction). The inducible nature of the promoter used to express the interacting proteins also increases the scope of candidate interactors which may be detected since even proteins whose chronic expression is toxic to the host cell may be isolated simply by inducing a short burst of the protein’s expression and testing for its ability to interact and stimulate expression of a reporter gene.
If desired, detection of interacting proteins may be accomplished through the use of weak gene activation domain tags. This approach avoids restrictions on the pool of available candidate interacting proteins which may be associated with stronger activation domains (such as GAL4 or VP16); although the mechanism is unclear, such a restriction apparently results from low to moderate levels of host cell toxicity mediated by the strong activation domain.
In addition, the claimed methods make use of conformationally-constrained proteins (i.e., proteins with reduced flexibility due to constraints at their amino and carboxy termini). Conformational constraint may be brought about by embedding the protein of interest within a conformation-constraining protein (i.e., a protein of appropriate length and amino acid composition to be capable of locking the candidate interacting protein into a particular three-dimensional structure). Examples of conformation-constraining proteins include, but are not limited to, thioredoxin (or other thioredoxin-like proteins), nucleases (e.g., RNase A), proteases (e.g., trypsin), protease inhibitors (e.g., bovine pancreatic trypsin inhibitor), antibodies or structurally-rigid fragments thereof, conotoxins, and the pleckstrin homology domain.
Alternatively, conformational constraint may be accomplished by exploiting the disulfide-bonding ability of a natural or recombinantly-introduced pair of cysteine residues, one residing at the amino terminus of the protein of interest and the other at its carboxy terminus. Such disulfide bonding locks the protein into a rigid and therefore conformationally-constrained loop structure. Disulfide bonds between amino-terminal and carboxy-terminal cysteines may be formed, for example, in the cytoplasm of E. coli trxB mutant strains. Under some conditions disulfide bonds may also form within the cytoplasm and nucleus of higher organisms harboring equivalent mutations, for example, an S. cerevisiae YTR4− mutant strain (Furter et al., Nucl. Acids Res. 14:6357-6373, 1986; GenBank Accession Number P29509). In addition, the thioredoxin fusions described herein (trxA fusions) are amenable to this alternative means of introducing conformational constraint, since the cysteines at the base of peptides inserted within the thioredoxin active-site loop are at a proper distance from one another to form disulfide bonds under appropriate conditions.
Conformationally-constrained proteins as candidate interactors are useful in the invention because they are amenable to tertiary structural analysis, thus facilitating the design of simple organic molecule mimetics with improved pharmacological properties. For example, because thioredoxin has a known structure, the protein structure between the conformationally constrained regions may be more easily solved using methods such as NMR and X-ray difference analysis. Certain conformation-constraining proteins also protect the embedded protein from cellular degradation and/or increase the protein’s solubility, and/or otherwise alter the capacity of the candidate interactor to interact.
Once isolated, interacting proteins can also be analyzed using the interaction trap system, with the signal generated by the interaction being an indication of any change in the proteins’ interaction capabilities. In one particular example, an alteration is made (e.g., by standard in vivo or in vitro directed or random mutagenesis procedures) to one or both of the interacting proteins, and the effect of the alteration(s) is monitored by measuring reporter gene expression. Using this technique, interacting proteins with increased or decreased interaction potential are isolated. Such proteins are useful as therapeutic molecules (for example, agonists or antagonists) or, as described above, as models for the design of simple organic molecule mimetics.
Protein agonists and antagonists may also be readily identified and isolated using a variation of the interaction trap system. In particular, once a protein-protein interaction has been recorded, an additional DNA coding for a candidate agonist or antagonist, or preferably, one of a library of potential agonist- or antagonist-encoding sequences, is introduced into the host cell, and reporter gene expression is measured. Alternatively, candidate interactor agonist or antagonist compounds (i.e., including polypeptides as well as non-proteinaceous compounds, e.g., single stranded polynucleotides) are introduced into an in vivo or in vitro interaction trap system according to the invention and their ability to affect reporter gene expression is measured. A decrease in reporter gene expression (compared to a control lacking the candidate sequence or compound) indicates an antagonist. Conversely, an increase in reporter gene expression (compared again to a control) indicates an agonist. Interaction agonists and antagonists are useful as therapeutic agents or as models to design simple mimetics; if desired, an agonist or antagonist protein may be conformationally-constrained to provide the advantages described herein. Particular examples of interacting proteins for which antagonists or agonists may be identified include, but are not limited to, the IL-6 receptor-ligand pair, TGF-β receptor-ligand pair, IL-1 receptor-ligand pair and other receptor-ligand interactions, protein kinase-substrate pairs, interacting pairs of transcription factors, interacting components of signal transduction pathways (for example, cytoplasmic domains of certain receptors and G-proteins), pairs of interacting proteins involved in cell cycle regulation (for example, p16 and CDK4), and neurotransmitter pairs.
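The decision rule in the preceding paragraph (decrease relative to control indicates an antagonist, increase indicates an agonist) can be sketched as follows. The function name, the fold-change cutoff, and the readout values are assumptions introduced for illustration; the appropriate detection threshold depends on the reporter construct and assay being used.

```python
def classify_candidate(with_candidate: float, control: float,
                       min_fold_change: float = 2.0) -> str:
    """Classify a candidate interactor from reporter gene expression.

    with_candidate:  reporter readout in the presence of the candidate
    control:         readout from a control lacking the candidate
    min_fold_change: assumed cutoff for calling a change detectable (>= 1)
    """
    if control <= 0 or with_candidate < 0:
        raise ValueError("control must be positive and readout non-negative")
    if min_fold_change < 1:
        raise ValueError("min_fold_change must be at least 1")
    ratio = with_candidate / control
    if ratio >= min_fold_change:
        return "agonist"        # reporter expression increased
    if ratio <= 1 / min_fold_change:
        return "antagonist"     # reporter expression decreased
    return "no detectable effect"


# Hypothetical readouts in arbitrary units.
for readout in (50.0, 800.0, 210.0):
    print(readout, classify_candidate(readout, control=200.0))
```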
Also included in the present invention are libraries encoding conformationally-constrained proteins. Such libraries (which may include natural as well as synthetic DNA sequence collections) are expressed intracellularly or, optionally, in cell-free systems, and may be used together with any standard genetic selection or screen or with any of a number of interaction trap formats for the identification of interacting proteins, agonist or antagonist proteins, or proteins that endow a cell with any identifiable characteristic, for example, proteins that perturb cell cycle progression. Accordingly, peptide-encoding libraries (either random or designed) can be used in selections or screens which either are or are not transcriptionally-based. These libraries (which preferably include at least 100 different peptide-encoding species and more preferably include 1000, or 100,000 or greater individual species) may be transformed into any useful prokaryotic or eukaryotic host, with yeast representing the preferred host. Alternatively, such peptide-encoding libraries may be expressed in cell-free systems.
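To make the notion of a randomly generated library concrete, the sketch below builds a set of distinct random peptide sequences in Python. It is only an in-silico illustration under stated assumptions: real libraries are constructed as DNA, the 20-residue length is chosen here simply to match several of the sequences listed above, and the function name and parameters are hypothetical.

```python
import random

# One-letter codes for the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def random_peptide_library(n_species: int, length: int = 20, seed=None) -> list:
    """Return `n_species` distinct random peptide sequences of a given length."""
    if n_species > len(AMINO_ACIDS) ** length:
        raise ValueError("more species requested than possible sequences")
    rng = random.Random(seed)  # seeded generator for reproducibility
    library = set()
    while len(library) < n_species:
        library.add("".join(rng.choice(AMINO_ACIDS) for _ in range(length)))
    return sorted(library)


# A library meeting the "at least 100 different species" preference.
library = random_peptide_library(100, length=20, seed=0)
print(len(library), "distinct peptides, e.g.", library[0])
```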
Other features and advantages of the invention will be apparent from the following detailed description thereof, and from the claims.