The present invention relates to protein tags and methods of protein purification using various recombinant DNA techniques. More particularly, the invention is directed to novel identification polypeptides and DNA vectors encoding novel identification polypeptides containing multiple antigenic domains joined in tandem. Also provided are methods for using such identification polypeptides for the purification of target peptides and methods of constructing DNA vectors encoding the novel identification polypeptides and DNA expression vectors encoding the identification polypeptides linked to a target peptide.
Proteinaceous molecules such as enzymes, hormones, storage proteins, binding proteins, transport proteins and signal transduction proteins may be produced and purified using various recombinant DNA techniques. For instance, DNA fragments coding for a selected protein, together with appropriate DNA sequences for a promoter and ribosome binding site, are ligated into a plasmid vector. The plasmid is inserted into a prokaryotic or eukaryotic host cell. Transformed host cells are identified, isolated and then cultivated to cause expression of the proteinaceous molecules. One method used to purify hybrid polypeptides is the poly-arginine system, in which a hybrid polypeptide is selectively purified on a cation exchange resin. See Sassenfeld, H. M. and Brewer, S. J. BioTechnology, 2:76 (1984); U.S. Pat. No. 4,532,207. Sassenfeld and Brewer reported a carboxy-terminal extension of five arginine residues fused to a target protein. This basic polyarginine extension allowed the purification of the hybrid polypeptide on an SP-Sephadex resin. An analogous protein expression and purification system employs a polyhistidine tract or tag at either the amino- or carboxy-terminus of the hybrid polypeptide. The fusion protein is purified by chromatography on a Ni²⁺ metal affinity resin. See Porath, J., Protein Expression and Purification, 3:7995 (1992).
Additionally, various affinity purification protocols are currently employed to facilitate the isolation of fusion proteins. Affinity chromatography is based on the capacity of proteins to bind specifically and noncovalently with a ligand. Used alone, it can isolate proteins from very complex mixtures with not only a greater degree of purification than possible by sequential ion-exchange and gel column chromatography, but also without significant loss of activity. Typically, a ligand capable of binding with high specificity to an affinity matrix is chosen as the fusion partner. For example, p-aminophenyl-β-D-thiogalactosidyl-succinyldiaminohexyl-Sepharose selectively binds to β-galactosidase allowing the purification of β-gal fusion proteins. See Germino et al., Proc. Natl. Acad. Sci. USA 80:6848 (1983). Other expression systems which permit the affinity purification of fusion proteins include fusion proteins made with glutathione-S-transferase, which are selectively recovered on glutathione-agarose. See Smith, D. B. and Johnson, K. S. Gene 67:31 (1988). IgG-Sepharose can be used to affinity purify fusion proteins containing staphylococcal protein A. See Uhlen, M. et al. Gene 23:369 (1983). The maltose-binding protein domain from the malE gene of E. coli has been used as a fusion partner and allows the affinity purification of the fusion protein on amylose resins.
Another method used to detect and isolate proteins is the use of an epitope tag. Epitope tagging utilizes antibodies against guest peptides to study protein localization at the cellular and subcellular levels. See Kolodziej, P. A. and Young, R. A., Methods Enzymol., 194:508-519 (1991). Using recombinant DNA technology, a sequence of nucleotides encoding the epitope is inserted into the coding region of the cloned gene, and the hybrid gene is introduced into a cell by a method such as transformation. When the hybrid gene is expressed, the result is a chimeric protein containing the epitope as a guest peptide. If the epitope is exposed on the surface of the protein, it is available for recognition by the epitope-specific antibody, allowing the investigator to observe the protein within the cell using immunofluorescence or other immunolocalization techniques. Further, fusion proteins labeled with such epitope tags are frequently purified by affinity purification techniques.
Thus, epitope tagging has become a powerful tool for the detection and purification of expressed proteins. See Kolodziej, P. A. and Young, R. A., Methods Enzymol., 194:508-519 (1991). Many types of tags have been used, with c-myc and FLAG® tags being two of the most popular epitopes used. See Evan et al., Mol. Cell Biol. 5:3610-3616 (1985). Generally, these epitopes are fused to the amino- or carboxy-terminus of the expressed protein, making them more accessible to the antibody for detection and less likely to cause severe structural or functional perturbations.
Fusion proteins having the FLAG® octapeptide Asp-Tyr-Lys-Asp-Asp-Asp-Asp-Lys (SEQ ID NO:1) at the amino-terminus can be affinity purified on an immuno-affinity resin containing an antibody specific for the octapeptide. See Hopp, T. P., et al. Biotechnology, 6:1204 (1988); Prickett, K. S., et al., BioTechniques, 7:580 (1989); and U.S. Pat. No. 4,851,341. The FLAG® epitope tag has been effectively used to detect and purify protein in mammalian and bacterial systems. The original FLAG® sequence is recognized by two antibodies, M1 and M2, and a FLAG® sequence with an initiator methionine attached is recognized by a third antibody, M5. The last five amino acids of the FLAG® sequence form a recognition site for the protease enterokinase, thus allowing for removal of the FLAG® epitope. The FLAG® epitope has been used in various expression systems for detection and purification of heterologous proteins, e.g., in E. coli (Brizzard et al., BioTechniques, 16:730-735 (1994)), Saccharomyces cerevisiae (Lee et al., Nature, 372:739-746 (1994); Prickett et al., BioTechniques, 7:580-589 (1989)), Drosophila (Xu et al., Development, 117:1223-1237 (1993)), Baculovirus (Dent et al., Mol. Cell Biol., 15:4125-4135 (1995); Ritchie et al., Biochem Journal, 338:305-10 (1999)), and mammalian systems (Overholt et al., Clin. Cancer Res., 3:185-191 (1997); Schulte am Esch et al., Biochemistry, 38:2248-2258 (1999)). However, in many mammalian expression systems, protein expression levels are low and effective detection of expressed foreign proteins using established methods can be difficult.
There is therefore a need for an epitope tag and expression system employing such epitope tags which would allow for increased sensitivity and detection of recombinant proteins.
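By way of illustration only, the FLAG octapeptide of SEQ ID NO:1 and the enterokinase recognition site formed by its last five residues can be written out in a short Python sketch. The helper names (to_one_letter, cleave_after_site) and the three-residue stand-in for a target protein are hypothetical and do not appear in the cited references. The sketch further assumes that enterokinase cuts immediately after the lysine of Asp-Asp-Asp-Asp-Lys, and it models cleavage as a simple string split rather than an enzymatic reaction.

```python
from typing import Tuple

# Three-letter to one-letter codes for the residues found in the FLAG octapeptide.
THREE_TO_ONE = {"Asp": "D", "Tyr": "Y", "Lys": "K"}

# SEQ ID NO:1 as written in the text.
FLAG_THREE_LETTER = "Asp-Tyr-Lys-Asp-Asp-Asp-Asp-Lys"


def to_one_letter(three_letter_seq: str) -> str:
    """Convert a hyphen-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[residue] for residue in three_letter_seq.split("-"))


FLAG = to_one_letter(FLAG_THREE_LETTER)

# The last five residues of the tag form the enterokinase recognition site.
ENTEROKINASE_SITE = FLAG[-5:]


def cleave_after_site(protein: str, site: str) -> Tuple[str, str]:
    """Split a sequence immediately after the first occurrence of `site`.

    Assumption: cleavage occurs directly after the last residue of the site.
    If the site is absent, the protein is returned uncut.
    """
    index = protein.find(site)
    if index == -1:
        return protein, ""
    cut = index + len(site)
    return protein[:cut], protein[cut:]


if __name__ == "__main__":
    # "MKT" is a hypothetical stand-in for the start of a target protein.
    tag, target = cleave_after_site(FLAG + "MKT", ENTEROKINASE_SITE)
    print(FLAG, ENTEROKINASE_SITE, tag, target)
```

Because the recognition site sits at the carboxy-terminal end of the tag, an amino-terminal FLAG fusion is split so that no tag residues remain on the released target.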
The present invention addresses one or more of the foregoing problems by providing methods and vehicles which can be used to produce high yields of recombinant proteins. Accordingly, among the several objects of the present invention may be noted the provision of a novel identification polypeptide, a hybrid molecule composed of a target peptide fused to the novel identification polypeptide, and recombinant DNA vectors encoding the same. Also provided are methods for the purification of the target peptide wherein a single ligand or multiple ligands, preferably antibodies, may be employed to isolate and purify substantially all protein molecules expressed by transformed host cells, whether antigenic or not. A further object of the present invention is to provide processes which can be used to purify to a high degree any protein molecule produced by recombinant DNA methods, including those that are not susceptible to affinity chromatography procedures.
Briefly, therefore, the present invention is directed to an identification polypeptide comprising multiple copies of an antigenic domain joined together in tandem. The identification polypeptide may contain a linking sequence containing a cleavable site located adjacent to the target peptide wherein the cleavable site is not located in or interposed between the individual antigenic domains. Each antigenic domain is capable of eliciting an antigenic response and can be bound by a ligand, preferably an antibody. Further, each antigenic domain is comprised of a combination of at least two, preferably three or more different amino acids.
Also provided are fusion proteins of the present invention comprising the novel identification polypeptide fused to a target peptide. The identification polypeptide contains a linking sequence which is characterized by being cleavable at a specific amino acid residue adjacent to the target peptide by use of a sequence-specific proteolytic agent. Such cleavable site is located adjacent to either the carboxy-terminus or amino-terminus of the target peptide, preferably immediately adjacent to the amino-terminus of the target peptide. Ideally, the amino acid sequence of the cleavable site is unique, thus minimizing the possibility that the proteolytic agent will cleave the target peptide. In a preferred embodiment, the cleavable site comprises amino acids specific for enterokinase, thrombin or Factor Xa.
In accordance with this particular construct of the fusion protein, the target peptide may be isolated by affinity chromatography techniques. Thus, it is an object of the invention to provide methods for the purification of the target peptide. This is accomplished by constructing an affinity column with immobilized ligands specific for the antigenic domains of the identification polypeptide, thereby binding the fusion protein. It will be appreciated that by virtue of the present invention, a single antibody or multiple antibodies may be used to bind to the individual antigenic domains of the identification polypeptide. The bound fusion protein can then be liberated from the column and the identification polypeptide cleaved with an appropriate proteolytic agent, thus releasing a purified target peptide. In a preferred embodiment, the proteolytic agent used to cleave the target peptide from the identification polypeptide is selected from the group consisting of enterokinase, thrombin and Factor Xa.
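The arrangement described above, in which tandem antigenic domains are followed by a single cleavable site adjacent to the amino-terminus of the target peptide, can be sketched in Python as follows. This is a minimal illustration under stated assumptions and not a disclosed sequence: the domain "DYKDHDG", the target "MKTAYIAKQR" and all function names are hypothetical placeholders, the enterokinase site is taken to be Asp-Asp-Asp-Asp-Lys with cleavage directly after the lysine, and proteolysis is modeled as a string split.

```python
from typing import Tuple


def count_occurrences(sequence: str, motif: str) -> int:
    """Count occurrences of `motif` in `sequence`, including overlapping ones."""
    return sum(
        1
        for i in range(len(sequence) - len(motif) + 1)
        if sequence.startswith(motif, i)
    )


def build_identification_polypeptide(domain: str, copies: int, cleavable_site: str) -> str:
    """Join `copies` of one antigenic domain in tandem, then append the cleavable site."""
    if copies < 2:
        raise ValueError("multiple copies of the antigenic domain are required")
    if len(set(domain)) < 2:
        raise ValueError("the domain must combine at least two different amino acids")
    tandem = domain * copies
    # The cleavable site must not lie in or between the antigenic domains.
    if count_occurrences(tandem, cleavable_site) != 0:
        raise ValueError("cleavable site occurs within the tandem domains")
    return tandem + cleavable_site


def fuse(identification_polypeptide: str, target: str, cleavable_site: str) -> str:
    """Fuse the tag to the amino-terminus of the target; require a unique site."""
    fusion = identification_polypeptide + target
    if count_occurrences(fusion, cleavable_site) != 1:
        raise ValueError("cleavable site is not unique in the fusion protein")
    return fusion


def release_target(fusion: str, cleavable_site: str) -> Tuple[str, str]:
    """Model proteolysis: split directly after the single cleavable site."""
    cut = fusion.index(cleavable_site) + len(cleavable_site)
    return fusion[:cut], fusion[cut:]


if __name__ == "__main__":
    SITE = "DDDDK"         # enterokinase site (assumed cleavage after K)
    DOMAIN = "DYKDHDG"     # hypothetical antigenic domain, for illustration only
    TARGET = "MKTAYIAKQR"  # hypothetical target peptide
    tag = build_identification_polypeptide(DOMAIN, 3, SITE)
    fusion = fuse(tag, TARGET, SITE)
    print(release_target(fusion, SITE))
```

The uniqueness check in fuse reflects the stated preference that the proteolytic agent cut only at the linking sequence and not within the target peptide.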
A further object of the present invention is to provide a recombinant cloning vector containing DNA encoding the identification polypeptide. The vector encoding the identification polypeptide also includes DNA sequences coding for a multiple cloning site comprised of multiple restriction enzyme sites, which may be located between the antigenic domains or on either side of the antigenic domains, and which will enable one skilled in the art to insert any number of DNA sequences encoding any desired protein. This DNA sequence may be inserted within a cloning vector such as a plasmid, by use of appropriate restriction endonucleases and ligases. The recombinant plasmid is employed to transform compatible prokaryotic or eukaryotic host cells for replication of the plasmid and expression of the hybrid affinity domain/protein molecule. Ideally, the plasmid has a phenotypic marker gene for identification and isolation of transformed host cells. In a preferred embodiment, DNA sequences encoding a secreted signal peptide will be joined either to the DNA vector or to the plasmid, thus enabling the transformed host cells to be readily identified and separated from cells which do not undergo transformation.
Other objects and features will be in part apparent and in part pointed out hereinafter.