The present invention relates to polypeptides having the capacity to display auto-cleavage, polynucleotides encoding such polypeptides, and uses of such polypeptides and polynucleotides for reversibly binding proteins to specific substrates, reversibly binding specific substrates to each other, and for splicing amino acid sequences. More particularly, the present invention relates to chimeric polypeptides capable of auto-cleaving at defined locations, including auto-cleaving resulting in defined auto-splicing, to polynucleotides suitable for expressing such polypeptides, and to methods of using such polypeptides and polynucleotides for protein purification, affinity selection of display phages, and post-translational ligation of proteins.
Autoprocessing protein domains, such as inteins and Hogs, have the capacity to post-translationally auto-cleave or auto-splice flanking polypeptide sequences and thereby serve as unique and potent protein engineering tools useful in various applications, including protein purification, affinity selection of display phages, generation of cytotoxic proteins, segmental modification or labeling of proteins, protein or peptide cyclization, and generation of reactive polypeptide termini in expressed proteins for various biochemical reactions, including protein ligation (Perler and Adam, 2000. Curr Opin Biotechnol. 11, 377-83). However, the usefulness of the presently available repertoire of autoprocessing polypeptides is hampered by various limitations, as described in further detail hereinbelow.
Inteins are internal protein domains naturally occurring in a variety of host proteins (Hirata et al., 1990. J. Biol. Chem. 265, 6726-6733; Kane et al., 1990. Science 250, 651-657; Perler et al., 1994. Nucl. Acids Res. 22, 1125-1127; Noren et al., 2000. Angew. Chem. Int. Ed. 39, 450). Inteins have been found in organisms from all three domains of life, including in yeast and algal chloroplasts (eukaryotes), mycobacteria and cyanobacteria (bacteria), and thermophilic archaea (archaea). So far, no essential biological role has been shown for inteins, and all of their identified functions involve their own preservation and maintenance, with no apparent benefit to the host protein and organism (reviewed in Pietrokovski, 2000. Trends in Genetics 17, 465-472). At least some inteins are multifunctional, being able both to catalyze their own protein splicing and to home a copy of their gene into intein-less alleles (Gimble and Thorner, 1992. Nature 357, 301; Chong et al., 1996. J Biol. Chem. 271, 22159). Hogs are protein domains found in Hedgehogs, which are proteins composed of an amino terminal Hedge protein domain and a carboxy terminal Hog protein region (Aspock G., 1999. Genome Res. 9, 909; Hammerschmidt et al., 1997. Trends Genet. 13, 14). Other protein domains, such as various Caenorhabditis elegans carboxy terminal domains, are believed, similarly to Hogs, to autocatalytically cleave themselves from host proteins, thereby modulating the activity of the amino terminal parts (Burglin, 1996. Curr Biol. 6, 1047; Porter et al., 1996. Cell 86, 21).
Members of the intein and Hog protein domain families share the capacity to autocatalytically cleave the peptide bond joining them to polypeptides flanking their amino terminal ends (“amino terminal cleavage”). Inteins have the further capacity to cleave the peptide bond joining them to polypeptides flanking their carboxy terminal ends (“carboxy terminal cleavage”) while splicing polypeptides flanking their amino and carboxy terminal ends (termed “exteins”), resulting in self-excision of the intein from the host protein, and concomitant ligation of the flanking extein domains with a peptide bond. Thus, intein-containing host proteins undergo a switch from an intein-containing state to an intein-less state via such a process. Most reported inteins furthermore contain an endonuclease domain whose function is to mediate the copying of the intein gene into specific unoccupied genomic insertion points, thereby enabling intein propagation.
Inteins and Hogs share a similar structural fold and contain characteristic “Hint” consensus motifs which mediate the biochemical reactions involved in the autocatalytic activities of these protein domains (Hall T M., 1997. Cell 91, 85; Pietrokovski S., 1994. Protein Sci. 3, 2340; Pietrokovski S., 1998. Protein Sci. 7, 64; Paulus, 2000. Annu. Rev. Biochem. 69, 447). These Hint motif-mediated biochemical reactions are similar in both inteins and Hogs, but are involved in different biological processes (Dalgaard et al., 1997. J Comput Biol. 4, 193; Hall et al., 1997. Cell 91, 85; Pietrokovski S., 1998. Protein Sci. 7, 64; Xu and Perler, 1996. EMBO J. 15, 5146). The initial biochemical reactions of intein and Hog amino terminal cleavage are identical: the peptide bond attaching the amino terminal end of the Hint domain to an amino terminal-flanking sequence is converted into a thioester (or ester) bond, and a trans-esterification reaction then covalently attaches the sequence flanking the carboxy terminal end of the intein, or a cholesterol molecule in the case of Hog proteins, to the amino terminal flanking sequence, thereby cleaving the bond attaching the amino terminal sequence to the Hint domain. In a process essential for organismal development, Hint-mediated autocatalytic excision of the carboxy terminal Hog protein domain from the amino terminal Hedge protein domain in Hedgehogs leads to covalent attachment of a cholesterol molecule to the carboxy end of the Hedge domain, leading to its activation and secretion from the cell (Porter J A. et al., 1996. Cell 86, 21; Porter, J A. et al., 1996. Science 274, 255). In the case of inteins, protein splicing is effected sequentially by cleavage of the bond attaching the intein amino terminal end to the amino terminal extein, ligation of the amino and carboxy terminal exteins, and cleavage of the bond attaching the intein carboxy terminal end to the carboxy terminal extein.
Mechanistic studies have determined the roles of highly conserved residues positioned near the intein/extein junctions in the splicing reaction (Chong et al., 1996. J. Biol. Chem. 271, 22159-22168; Xu et al., 1996. EMBO J. 15, 5146-5153; Stoddard et al., 1998. Nat. Struct. Biol. 5, 3). These residues include: the Cys, Ser or Thr residue forming the amino terminal end of the intein, which initiates splicing with an acyl shift; the conserved Cys, Ser or Thr residue flanking the carboxy terminal end of the intein, which ligates the exteins through nucleophilic attack; and the conserved Asn forming the carboxy terminal end of the intein, which releases the intein from the ligated exteins via succinimide formation. The amino terminal acyl shift and the carboxy terminal succinimide formation cleavage activities of the intein are separable. The amino terminal cleavage takes place in two separate steps. In the first step, as described above, the peptide bond between the intein and the amino terminal extein is converted to a thioester (or ester in some cases). In the second step, the thioester bond is cleaved by a nucleophilic attack from the side-chain of the residue flanking the carboxy terminal end of the intein, causing a transesterification reaction.
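By way of illustration only, the net outcome of the splicing reaction described above can be sketched as a string operation on a one-letter amino acid sequence. The sketch does not model the chemistry (acyl shift, transesterification, succinimide formation); it merely checks for the conserved junction residues named above and returns the ligated exteins and the excised intein. The function name `splice`, the index arguments, and the toy sequence are hypothetical and are not taken from any cited work.

```python
from typing import Tuple

# Residues able to act as the junction nucleophile: Cys, Ser or Thr.
NUCLEOPHILES = {"C", "S", "T"}


def splice(precursor: str, intein_start: int, intein_end: int) -> Tuple[str, str]:
    """Toy model: return (ligated_exteins, excised_intein).

    The intein is assumed to occupy precursor[intein_start:intein_end].
    """
    n_extein = precursor[:intein_start]
    intein = precursor[intein_start:intein_end]
    c_extein = precursor[intein_end:]
    if not intein or not c_extein:
        raise ValueError("precursor needs an intein and a carboxy terminal extein")
    # First intein residue initiates splicing through an acyl shift.
    if intein[0] not in NUCLEOPHILES:
        raise ValueError("first intein residue must be Cys, Ser or Thr")
    # Last intein residue releases the intein via succinimide formation.
    if intein[-1] != "N":
        raise ValueError("last intein residue must be Asn")
    # First carboxy terminal extein residue ligates the exteins.
    if c_extein[0] not in NUCLEOPHILES:
        raise ValueError("first C-extein residue must be Cys, Ser or Thr")
    return n_extein + c_extein, intein


# Hypothetical precursor: N-extein "MKV", intein "CLAGDTHN", C-extein "SGEF".
product, excised = splice("MKVCLAGDTHNSGEF", 3, 11)
```

In this toy case the exteins are joined directly, so `product` is the intein-less sequence and `excised` is the intein alone; a precursor lacking any of the three conserved residues is rejected.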
Because the structural information required for splicing exists entirely within inteins, and since the process of splicing has no energy requirements (for example hydrolysis of ATP), such protein domains can be used in a variety of applications involving intein insertion into foreign contexts. Various methods have been used in attempts to control and alter intein-mediated functions. Since endonuclease activity is not required for protein splicing, mini-inteins with accurate splicing activity have been generated by deletion of this central domain (Derbyshire et al., 1997. Proc. Natl. Acad. Sci. USA. 94, 11466; Chong et al., 1997. J. Biol. Chem. 272, 15587; and Shingledecker et al., 1998. Gene 207, 187). Also, mutation of residues near the intein/extein junctions has been used to alter intein activity, for example, to yield isolated cleavage at one or both of the intein-extein junctions (Chong et al., 1998. J. Biol. Chem. 273, 10567).
Thus, the ability to modulate the function of autoprocessing polypeptides such as inteins has broad potential application, as described above. In the case of protein purification where an autoprocessing polypeptide is used in conjunction with an affinity group to purify a desired target protein (Chong et al., 1997. Gene 192, 271-281; Chong et al., 1998. Nucl. Acids Res. 26, 5109), purification of the target protein is effected by expressing the target protein as part of a fusion protein containing a purification tag in one terminal segment, an internal autoprocessing polypeptide, and the target protein forming the other terminal segment. Such fusion proteins are exposed to affinity purification matrices designed to capture the tagged molecule. The target protein is then selectively released from the purification matrix by inducing autoprocessing polypeptide-mediated auto-cleavage of the peptide bond attaching the target protein to the autoprocessing polypeptide. Such a procedure is advantageous since autoprocessing polypeptide cleavage affects the fusion protein only, and thus non-specifically bound contaminant proteins are not released into the product stream. Furthermore, such a method does not employ contaminating and expensive proteases, such as those used in technologies employing protease-mediated cleavage of purification-tagged target proteins. The aforementioned strategy forms the basis of protein purification systems such as the commercially available IMPACT-CN system (New England Biolabs, Beverly, Mass.).
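The logic of the purification scheme described above can be sketched, under simplifying assumptions, as a capture step followed by an induced cleavage step. The sketch below is a bookkeeping model only: the class `Fusion`, the function `purify`, the `sticky` set of non-specifically bound contaminants, and all protein names are hypothetical, and no binding or reaction kinetics are represented.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple, Union


@dataclass(frozen=True)
class Fusion:
    tag: str      # purification tag in one terminal segment
    domain: str   # internal autoprocessing polypeptide
    target: str   # target protein forming the other terminal segment


Protein = Union[Fusion, str]  # plain strings stand for host contaminants


def purify(
    lysate: List[Protein], tag: str, sticky: FrozenSet[str] = frozenset()
) -> Tuple[List[str], List[str]]:
    """Return (released_targets, material_left_on_matrix)."""
    # Capture: tagged fusions bind specifically; "sticky" contaminants
    # bind non-specifically. Everything else is washed away.
    bound = [
        p for p in lysate
        if (isinstance(p, Fusion) and p.tag == tag)
        or (isinstance(p, str) and p in sticky)
    ]
    # Induced auto-cleavage acts on fusion proteins only, so only the
    # target is released; contaminants stay on the matrix.
    released = [p.target for p in bound if isinstance(p, Fusion)]
    retained = [
        f"{p.tag}-{p.domain}" if isinstance(p, Fusion) else p for p in bound
    ]
    return released, retained


lysate = [Fusion("tag", "intein", "target"), "host_a", "host_b"]
released, retained = purify(lysate, "tag", frozenset({"host_b"}))
```

Here the non-specifically bound `host_b` is captured together with the fusion protein but does not appear in `released`, which reflects the stated advantage that auto-cleavage affects the fusion protein only.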
However, prior art methods of applying such autoprocessing polypeptides have numerous drawbacks. In applied systems such as IMPACT-CN, cleavage of the thioester bond formed between the intein and the extein during amino terminal cleavage must be effected with an accessory molecule, namely a strong thiol-containing nucleophile such as 2-mercaptoethanol or dithiothreitol (DTT), both of which are strong reducing agents which modify the carboxy terminal end of the extein. In such systems, although initial thioester formation is mediated by the intein, the actual cleavage of the extein is effected via non-enzymatic chemical cleavage of a thioester bond by a small nucleophilic molecule, thereby severely limiting the maximal reaction rates achievable. While such systems allow carboxy terminal cleavage, such cleavage has the drawback of resulting in undesirable amino terminal cleavage, thereby requiring the amino terminal fragment to be removed in an additional purification step. Furthermore, despite insights into intein structure and function, modifications often result in unacceptably low activity, poor precursor stability, or insolubility (Derbyshire et al., 1997. Proc. Natl. Acad. Sci. USA. 94, 11466; Chong et al., 1997. Gene 192, 271-281; Shingledecker et al., 1998. Gene 207, 187; Chong et al., 1998. Nucl. Acids Res. 26, 5109).
Thus, all prior art approaches have failed to provide an adequate solution for providing autoprocessing polypeptides optimal for protein engineering applications.
There is thus a widely recognized need for, and it would be highly advantageous to have, autoprocessing polypeptides devoid of the above limitations.