The present invention is related to the reassembly of fusion peptides into a functionally active protein complex. Specifically, the present invention provides a method of forming peptide complexes that associate through the combination of helical domains to form an antiparallel leucine zipper. The present invention is also related to the use of assays to investigate protein-protein interactions. The assays of the present invention involve the association of fusion proteins comprising GFP fragments and heterologous polypeptides into functionally active GFP that exhibits fluorescence.
All publications and patent applications herein are incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference.
Green Fluorescent Protein
Green fluorescent protein (GFP), a relatively small protein comprising 238 amino acids, is the ultimate source of fluorescent light emission in the jellyfish Aequorea victoria. The gene for GFP was first cloned by Prasher et al. (1992, Gene, 111:229-233), and cDNA for the protein produces a fluorescent product identical to that of native protein when expressed in prokaryotic (E. coli) and eucaryotic (C. elegans) cells (Chalfie et al., 1994, Science, 263, 802-805).
The GFP excitation spectrum shows an absorption band (blue light) maximally at 395 nm with a minor peak at 470 nm, and an emission peak (green light) at 509 nm. The longer-wavelength excitation peak has greater photostability than the shorter peak, but is relatively low in amplitude (Chalfie et al., 1994, Science, 263: 802-805). The crystal structure of the protein and of several point mutants has been solved (Ormo et al., 1996, Science 273, 1392; Yang et al., Nature Biotechnol. 14, 1246). The fluorophore, consisting of a tripeptide at residues 65-67, is buried inside a relatively rigid beta-can structure, where it is almost completely protected from solvent access. The GFP absorption bands and emission peak arise from an internal p-hydroxybenzylideneimidazolidinone chromophore, which is generated by cyclization and oxidation of the tripeptide sequence Ser-Tyr-Gly at residues 65-67 (Cody et al., 1993, Biochemistry 32: 1212-1218).
GFP fluorescence in procaryotic and eucaryotic cells does not require exogenous substrates and cofactors. Accordingly, GFP is considered to have tremendous potential in methods to monitor gene expression, cell development, or as an in situ tag for fusion proteins (Heim et al., 1994, P.N.A.S. USA, 91,12501-12504). Chalfie and Prasher, WO 95/07463 (Mar. 16, 1995), describe various uses of GFP, including a method of examining gene expression and protein localization in living cells. Methods are described wherein: 1) a DNA molecule is introduced into a cell, said DNA molecule having DNA sequence of a particular gene linked to DNA sequence encoding GFP such that the regulatory element of the gene will control expression of GFP; 2) the cell is cultured in conditions permitting the expression of the fused protein; and 3) expression of GFP is detected in the cell, thereby indicating the expression of the gene in the cell. Methods such as those described by Chalfie and Prasher are advantageous compared to previously reported methods which utilized β-galactosidase fusion proteins (Silhavy and Beckwith, 1985, Microbiol. Rev., 49, 398; Gould and Subramani, 1988, Anal. Biochem., 175, 5; Stewart and Williams, 1992, J. Gen. Microbiol., 138,1289) or luciferases, in that the need to fix cell preparations and/or add exogenous substrates and cofactors is eliminated.
GFP is a valuable marker for intracellular protein localization. However, the fusion of GFP with structural proteins can alter their properties, resulting in loss of fusion protein localization, decreased GFP fluorescence or both. The fluorescence of this protein is sensitive to a number of point mutations (Phillips, G. N., 1997, Curr. Opin. Struct. Biol. 7, 821-27). The fluorescence appears to be a sensitive indication of the preservation of the native structure of the protein, since any disruption of the structure allowing solvent access to the fluorophoric tripeptide will quench the fluorescence. Abedi et al. (1998, Nucleic Acids Res., 26, 623-30) have inserted peptides between residues contained in several GFP loops. Inserts of the short sequence LEEFGS (SEQ ID NO: 9) between adjacent residues at 10 internal insertion sites were tried. Of these, inserts at three sites, between residues 157-158, 172-173 and 194-195 gave fluorescence of at least 1% of that of wild type GFP. Only inserts between residues 157-158 and 172-173 had fluorescence of at least 10% of wild type GFP.
Protein Reassembly using Leucine Zipper
The unassisted reconstitution of proteins from peptide fragments has been demonstrated for several proteins, including ribonuclease (Richards et al., 1959, J. Biol. Chem. 234, 1459-1465), chymotrypsin inhibitor-2 (Gay et al., 1994, Biochemistry, 33, 7957-7963), tRNA synthetases (Shiba et al., 1992, Proc. Natl. Acad. Sci. U.S.A., 89, 1880-1884), and inteins (Southworth, et al., 1998, EMBO J., 17, 918-926). Protein reassembly has thus become an important avenue for understanding enzyme catalysis (Richards et al., 1959, J. Biol. Chem. 234, 1459-1465), protein folding (Gay et al., 1994, Biochemistry, 33, 7957-7963), and protein evolution (Shiba et al., 1992, Proc. Natl. Acad. Sci. U.S.A., 89, 1880-1884). Recently, assisted protein reassembly or "fragment complementation" has been applied to the in vivo detection of protein-protein interactions in such systems as dihydrofolate reductase (DHFR) (Pelletier et al., 1998, Proc. Natl. Acad. Sci. U.S.A., 95, 12141-12146; Remy et al., 1999, Proc. Natl. Acad. Sci. U.S.A., 96, 5394-5399; Pelletier et al., 1999, Nat. Biotechnol., 17, 683-690), ubiquitin (Karimova et al., 1998, Proc. Natl. Acad. Sci. U.S.A., 95, 5752-5756; Johnsson et al., 1994, Proc. Natl. Acad. Sci. U.S.A., 91, 10340-10344), and β-galactosidase (Rossi et al., 1997, Proc. Natl. Acad. Sci. U.S.A., 94, 8405-8410). These reassembly processes are contingent upon the proper choice of a dissection site within a protein and can be aided by techniques such as limited proteolysis, circular permutation (Baird et al., 1999, Proc. Natl. Acad. Sci. U.S.A., 96, 11241-11246; Topell et al., 1999, FEBS Lett., 457, 283-289; Zhang et al., 1993, Biochemistry, 32, 12311-12318; Regan, L., 1999, Curr. Opin. Struc. Biol., 9, 494-499) and loop insertions (Abedi et al., 1998, Nucleic Acid Res., 26, 623-630; Nobuhide et al., 1999, FEBS Lett., 453, 305-307).
The dissection and subsequent reassembly of a protein from peptidic fragments provide an avenue for controlling its tertiary structure and hence its function. Although a majority of leucine zippers associate in a parallel fashion, recent examples of both naturally occurring and designed antiparallel leucine zippers have appeared in the literature (Lupas, A., 1996, Trends Biochem. Sc. 21, 375-382; Kohn, W. D. et al., 1997, J. Biol. Chem. 272, 2583-2586; Bryson, J. W. et al., 1995, Science, 270, 935-941; Oakley M. G. et al., 1998, Biochemistry, 37, 12603-12610; Oakley, M. G. et al., 1997, Biochemistry, 36, 2544-2548). However, the prior art does not disclose the attachment of antiparallel leucine zippers to polypeptide fragments to form fusion proteins for reassembling the polypeptide fragments into functional proteins.
In contrast to parallel zippers, the helices of antiparallel zippers are oriented in opposite directions. Antiparallel zippers have the advantage of occurring less frequently in natural proteins. Thus, antiparallel leucine zippers will interfere to a lesser extent with natural cellular proteins than parallel leucine zippers. Antiparallel attachment of leucine zippers to protein fragments (between a dissected peptide bond of the parent protein) requires a shorter amino acid linker region. As shown by the inventors of the present invention, as a preferred embodiment, a linker having 4-6 amino acids is sufficient (see Examples). Similar attachment of parallel leucine zippers would require greater than 10 amino acids to span the necessary distance. Such long unstructured linkers would be prone to proteolytic cleavage and would be less stable in in vivo assays.
Katz et al. (1998, Biotechniques, 25, 298) describe a targeting approach based on noncovalent heterodimerization of GFP and cytoplasmic structural proteins using a leucine zipper designed to form high-affinity heterodimers. The complexes localized accurately to specific sites within cells, providing selective fluorescence labeling of subcellular structures such as microfilaments or focal contacts.
Protein-Protein Interaction Assays
The association and dissociation of proteins are crucial to all aspects of cell function. Examples of protein-protein interactions are evident in hormones and their respective receptors, in intracellular and extracellular signalling events mediated by proteins, in enzyme substrate interactions, in intracellular protein trafficking, in the formation of complex structures like ribosomes, viral coat proteins, and filaments, and in antigen-antibody interactions. Intracellular assays for detection of protein interactions and identification of their inhibitors have received wide attention with the completion of the human genome sequence.
U.S. Pat. No. 5,585,245 discloses a first fusion protein comprising an N-terminal subdomain of ubiquitin, fused to a non-ubiquitin protein or peptide and a second fusion protein comprising a C-terminal subdomain of ubiquitin, fused to the N-terminus of a non-ubiquitin protein or peptide. The patent discloses the use of these fusion proteins for studying protein-protein interactions. When contacted with one another, provided that the non-ubiquitin proteins or peptides interact (bind) with one another, the N- and C-terminal ubiquitin subdomains associate to reconstitute a quasi-native ubiquitin moiety which is recognized and cleaved by ubiquitin-specific proteases. However, this assay requires the use of additional cellular factors, such as the ubiquitin-specific proteases, for detection of protein-protein interaction. Thus, this assay is not feasible for high throughput screening of cDNA libraries.
U.S. Pat. No. 5,362,625 discloses omega-acceptor and omega-donor polypeptides (comprising about two-thirds and one-third of the β-galactosidase molecule amino and carboxyl termini, respectively), prepared by recombinant DNA techniques, DNA synthesis, or chemical polypeptide synthesis techniques, which are capable of interacting to form an active enzyme complex having catalytic activity characteristic of β-galactosidase. The patent also describes the use of these polypeptides in enzyme complementation assays for qualitative and quantitative determination of a suspected analyte in a sample.
The yeast two-hybrid system for detecting protein-protein interactions in Saccharomyces cerevisiae (Fields and Song, 1989, Nature, 340:245-246; U.S. Pat. No. 5,283,173 by Fields and Song) is well known in the art. This assay utilizes the reconstitution of a transcriptional activator like GAL4 (Johnston, 1987, Microbiol. Rev., 51:458-476) through the interaction of two protein domains that have been fused to the two functional units of the transcriptional activator: the DNA-binding domain and the activation domain. This is possible due to the bipartite nature of certain transcription factors like GAL4. Being characterized as bipartite signifies that the DNA-binding and activation functions reside in separate domains and can function in trans (Keegan et al., 1986, Science 231:699-704). The reconstitution of the transcriptional activator is monitored by the activation of a reporter gene like the lacZ gene that is under the influence of a promoter that contains a binding site (Upstream Activating Sequence or UAS) for the DNA-binding domain of the transcriptional activator. This method is most commonly used either to detect an interaction between two known proteins (Fields and Song, 1989, Nature, 340:245-246) or to identify interacting proteins from a population that would bind to a known protein (Durfee et al., 1993, Genes Dev., 7:555-569; Gyuris et al., 1993, Cell, 75:791-803; Harper et al., 1993, Cell, 75:805-816; Vojtek et al., 1993, Cell, 74:205-214). As in the ubiquitin system, additional factors are required for detection of the protein-protein interaction. Additionally, in the yeast two-hybrid system, the protein interaction must occur in the nucleus of the yeast.
WO 98/34120 describes protein fragment complementation assays for detecting bimolecular interactions. The assays comprise coexpression of fusion peptides consisting of N- and C-terminal fragments of murine DHFR fused to GCN4 leucine zipper sequences in E. coli to form colonies. Colony formation only occurs when both DHFR fragments are present and contain leucine-zipper forming sequences. The published patent application contemplates the use of the assay to study molecular interactions including protein-protein, protein-DNA, protein-RNA, protein-carbohydrate, and protein-small molecule interactions, and for screening cDNA libraries for binding of a target protein with unknown proteins or libraries of small organic molecules for biological activity. WO 98/34120 also contemplates the use of GFP in the protein fragment complementation assay. However, the published patent application does not suggest fusing an antiparallel leucine zipper to DHFR or GFP for reconstitution. GCN4, which is disclosed in the published application and routinely used by skilled artisans to reassemble proteins, especially in the yeast two-hybrid system, is a parallel zipper. Antiparallel and parallel zippers orient proteins in opposite directions; thus, it is not predictable that an antiparallel zipper can be substituted for a parallel zipper.
Additionally, all protein reassembly strategies disclosed in WO 98/34120 are for reassembly of multidomain proteins such as DHFR. The two dissected domains of DHFR can fold separately and only need to be brought into close proximity by attached proteins. WO 98/34120 provides no precedent for the rational dissection of a single domain protein such as GFP. WO 98/34120 does not teach how to rationally dissect single domain proteins that can be subsequently reassembled. Finally, the ability to identify and characterize appropriate sites for dissecting a single domain protein is not validated or demonstrated in WO 98/34120.
U.S. Pat. No. 6,180,343 relates to the use of fluorescent proteins, particularly green fluorescent protein (GFP), in fusion constructs with random and defined peptides and peptide libraries, to increase the cellular expression levels, decrease the cellular catabolism, increase the conformational stability relative to linear peptides, and to increase the steady state concentrations of the random peptides and random peptide library members expressed in cells for the purpose of detecting the presence of the peptides and screening random peptide libraries. The patent contemplates neither the use of an antiparallel leucine zipper for reconstituting GFP nor the use of peptides that associate with each other to reconstitute GFP and to provide a detection signal.
The present invention provides protein complexes comprising a first and second peptide, each of said peptides being joined, operably linked, or fused to a heterologous helical domain, said helical domains being noncovalently associated to form an antiparallel leucine zipper. The peptides of the protein complexes form a functional signaling moiety such as a reporter, a marker, or a biosensor upon non-covalent association of the helical domains into an antiparallel leucine zipper. In one embodiment, each of the peptides is joined to a helical domain via a linker. In a preferred embodiment, each of the helical domains comprises an amino acid sequence as set forth in SEQ ID NO: 1 or SEQ ID NO: 2. Preferably, each of the first and second peptides comprises a distinct portion of green fluorescent protein (GFP).
In one aspect, the present invention provides fusion proteins comprising a peptide and a helical domain, said helical domain forming an antiparallel leucine zipper when it noncovalently associates with a complementary helical domain. The helical domain is a heterologous or distinct protein or polypeptide fragment, relative to the peptide of the fusion protein. The fusion protein may further comprise a linker moiety interposed between the peptide and the helical domain. In a preferred embodiment, the peptide comprises a peptide derived from green fluorescent protein (GFP).
In another aspect, the present invention provides nucleic acids encoding fusion proteins comprising a peptide and helical domain, said helical domain forming an antiparallel leucine zipper when it noncovalently associates with a complementary helical domain.
The present invention provides a method of assembling a protein complex comprising (a) providing first and second helical domains that non-covalently associate to form an antiparallel leucine zipper; (b) providing first and second peptides; (c) producing fusion proteins by separately fusing said first helical domain to said first peptide and said second helical domain to said second peptide; and, (d) allowing the fusion proteins to form a protein complex mediated by the non-covalent association of the first and second helical domains into an antiparallel leucine zipper. The first and second peptides are distinct peptides. Preferably, they are distinct peptides derived from GFP, such that they comprise different GFP fragments.
In one embodiment of the disclosed method of assembling a protein complex, the protein complex comprises a signaling moiety and the helical domains comprise a leucine rich hydrophobic core. The helical domains may further comprise acidic residues and basic residues. The helical domains may further comprise a buried asparagine residue. The pair of helical domains preferably have the amino acid sequences as set forth in SEQ ID NO: 1 and SEQ ID NO: 2. In an alternative embodiment of the method, the step of producing the fusion proteins further comprises interposing a linker moiety between the peptide and the helical domain.
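The assembly steps described above (dissecting a parent protein, then fusing each fragment to a helical domain through a short linker) can be sketched in code as a purely illustrative bookkeeping model. Everything in the sketch is hypothetical: the names Fusion, dissect and build_fusions, the toy parent sequence, the dissection site, the 4-residue linker and the placeholder helices are assumptions chosen for illustration and are not GFP, SEQ ID NO: 1 or SEQ ID NO: 2. The sketch also assumes that both helical domains are attached at the dissected peptide bond, that is, at the C-terminus of the N-terminal fragment and at the N-terminus of the C-terminal fragment.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Fusion:
    """One fusion protein: a protein fragment, a linker, and a helical domain."""
    fragment: str
    linker: str
    helix: str
    helix_at_c_terminus: bool  # True: fragment-linker-helix; False: helix-linker-fragment

    def sequence(self) -> str:
        parts = [self.fragment, self.linker, self.helix]
        if not self.helix_at_c_terminus:
            parts.reverse()  # reverses the order of the parts, not the residues
        return "".join(parts)


def dissect(parent: str, after_residue: int) -> Tuple[str, str]:
    """Split a parent sequence after the given 1-based residue number."""
    if not 0 < after_residue < len(parent):
        raise ValueError("dissection site must fall inside the sequence")
    return parent[:after_residue], parent[after_residue:]


def build_fusions(parent: str, after_residue: int,
                  helix_a: str, helix_b: str, linker: str) -> Tuple[Fusion, Fusion]:
    """Attach one helical domain to each fragment at the dissected bond."""
    n_frag, c_frag = dissect(parent, after_residue)
    first = Fusion(n_frag, linker, helix_a, helix_at_c_terminus=True)
    second = Fusion(c_frag, linker, helix_b, helix_at_c_terminus=False)
    return first, second


# Placeholder inputs (hypothetical, for illustration only).
PARENT = "MSTNQHVWYR"   # toy 10-residue "protein", not GFP
HELIX_A = "EEEEEEE"     # stand-in for an acidic helical domain
HELIX_B = "KKKKKKK"     # stand-in for a basic helical domain
LINKER = "GSGS"         # 4 residues, within the 4-6 residue range mentioned above

first, second = build_fusions(PARENT, 6, HELIX_A, HELIX_B, LINKER)
```

In this model the two fragments always concatenate back to the parent sequence, which mirrors the requirement that the first and second peptides be distinct, complementary portions of the same protein. The model says nothing about folding or fluorescence.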
The present invention also provides a method of identifying a polypeptide that interacts with a known polypeptide comprising (a) producing a first fusion protein comprising the known polypeptide linked to a first GFP fragment; (b) producing a second fusion protein comprising a test polypeptide linked to a second GFP fragment, wherein association of the first and second GFP fragments results in a GFP that exhibits detectable fluorescence; (c) allowing the first fusion protein to associate with the second fusion protein to form a complex mediated by the non-covalent association of the known polypeptide and test polypeptide; and, (d) detecting whether, or to what extent, association of the first and second GFP fragments occurs, wherein association of the GFP fragments indicates that the test polypeptide interacts with the known polypeptide. Preferably, the first GFP peptide is NGFP and the second GFP peptide is CGFP.
In one aspect, the present invention provides a method of identifying a polypeptide that interacts with a known polypeptide comprising (a) producing a nucleic acid encoding a fusion protein comprising the known polypeptide linked to a first GFP fragment; (b) producing a plurality of nucleic acids encoding fusion proteins comprising a test polypeptide linked to a second GFP fragment, wherein association of the first and second GFP fragments results in a GFP that exhibits detectable fluorescence; (c) cotransforming or cotransfecting the nucleic acids of steps (a) and (b) into a host cell for expression of the encoded fusion proteins; (d) selecting colonies that exhibit fluorescence; and, (e) culturing the selected colonies to identify the test polypeptides that interact with the known polypeptide.
In a preferred embodiment of the constructs and methods of the present invention, the first GFP peptide is NGFP and the second GFP peptide is CGFP. Also, preferably, the nucleic acids of step (b) of the foregoing identification step are produced in the form of a combinatorial library.
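The colony selection step of the library screen described above can be illustrated with a minimal sketch. The text only states that colonies exhibiting fluorescence are selected; the rule used here (signal greater than a chosen multiple of a background reading), the function name select_hits, the clone names and all numeric values are hypothetical assumptions made for illustration and are not experimental data.

```python
from typing import Dict, List


def select_hits(readings: Dict[str, float], background: float,
                fold: float = 3.0) -> List[str]:
    """Return the colonies whose fluorescence exceeds fold x background.

    The fold-over-background rule is an assumed selection criterion,
    not one taken from the specification.
    """
    if background <= 0:
        raise ValueError("background reading must be positive")
    cutoff = fold * background
    return sorted(name for name, value in readings.items() if value > cutoff)


# Made-up fluorescence readings (arbitrary units) for four library clones.
readings = {
    "clone_01": 120.0,
    "clone_02": 950.0,
    "clone_03": 80.0,
    "clone_04": 400.0,
}

# With a background of 100 and the default 3-fold rule, the cutoff is 300.
hits = select_hits(readings, background=100.0)
```

Colonies passing such a cutoff would then be cultured to identify the encoded test polypeptides, as in step (e) of the method.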
In another aspect, the present invention provides a method of identifying a molecule that inhibits the activity of a known protein comprising (a) producing a first fusion protein comprising a first known polypeptide linked to a first GFP fragment; (b) producing a second fusion protein comprising a second polypeptide linked to a second GFP fragment, wherein the second polypeptide is known to interact with the first polypeptide and wherein association of the first and second GFP fragments results in a GFP that exhibits detectable fluorescence; (c) allowing the first fusion protein to associate with the second fusion protein to form a GFP complex mediated by the non-covalent association of the first and second polypeptide; (d) incubating a test molecule with the GFP complex; and, (e) detecting disassembly of the complex, wherein disassembly of the complex indicates that the test molecule inhibits the activity of the known protein. Preferably, the first GFP peptide is NGFP and the second GFP peptide is CGFP.
The present invention also contemplates a method of detecting protein-protein interactions comprising (a) producing a first fusion protein comprising a known polypeptide linked to a first GFP fragment; (b) producing a second fusion protein comprising a test polypeptide linked to a second GFP fragment, wherein association of the first and second GFP fragments results in a GFP that exhibits detectable fluorescence; (c) allowing the first fusion protein to associate with the second fusion protein to form a complex mediated by the non-covalent association of the known polypeptide and test polypeptide; and, (d) detecting reassembly of GFP, wherein reassembly of GFP indicates that the test polypeptide interacts with the known polypeptide.
A related method may further comprise obtaining nucleic acids encoding the first and second fusion proteins and cotransfecting or cotransforming the nucleic acids into a cell to obtain the first and second fusion protein.