This invention describes a method of removing N-terminal alanine residues from polypeptides, preferably recombinant proteins, using an aminopeptidase derived from the marine bacterium Aeromonas proteolytica. Accordingly, Aeromonas aminopeptidase (AAP; E.C. 3.4.11.10) can be used to remove N-terminal alanyl residues from derivatives of human somatotropin (hST, human growth hormone or hGH), porcine somatotropin (pST), and bovine somatotropin (bST), for example, to yield proteins having their native amino acid sequences. The enzyme reactions can be carried out in free solution, or the AAP can be immobilized on a solid support, for reactions carried out in vitro. An efficient method for converting Ala-hGH to hGH, for example, comprises expression of Ala-hGH in E. coli, recovery of inclusion bodies, solubilization and refolding in detergent, detergent removal by ultrafiltration, selective precipitation, enzyme cleavage, followed by two column chromatography steps.
Recombinant proteins that mimic or have the same structure as native proteins are highly desired for use in therapeutic applications, as components in vaccines and diagnostic test kits, and as reagents for structure/function studies. Mammalian, bacterial, and insect cells are commonly used to express recombinant proteins for such applications. Bacterial expression systems, however, are often used when large quantities of protein are needed for experimental or clinical studies, and the protein is capable of being refolded to its proper conformation. Bacterial systems, in particular, offer significant cost advantages over other expression vector systems when eukaryotic post-translational modifications (e.g., glycosylation) are not required, or desired, in the final protein product.
Recombinant proteins expressed in bacteria, such as E. coli, are often sequestered into insoluble inclusion bodies. Heterologous proteins harvested from inclusion bodies often retain an additional amino acid residue such as methionine at their amino terminus. This methionine residue (encoded by the ATG start codon) is often not present, however, on native or recombinant proteins harvested from eukaryotic host cells. The amino termini of many proteins made in the cytoplasm of E. coli, however, are processed by enzymes, such as methionine aminopeptidase (Ben Bassat et al., J. Bacteriol. 169: 751-757, 1987), so that upon expression the methionine is ordinarily cleaved off the N-terminus.
The amino acid composition of protein termini is biased in several ways (Berezovsky et al., Protein Engineering 12(1): 23-30, 1999). Systematic examination of N-exopeptidase activities led to the discovery of the ‘N-terminal’ or ‘N-end rule’: the N-terminal (f)Met is cleaved if the next amino acid is Ala, Cys, Gly, Pro, Ser, Thr, or Val. If this next amino acid is Arg, Asp, Asn, Glu, Gln, Ile, Leu, Lys, or Met, the initial (f)Met remains as the first amino acid of the mature protein. The radii of hydration of the amino acid side chains were proposed as the physical basis for these observations (Bachmair et al., Science, 234: 179-186, 1986; Varshavsky, Cell, 69: 725-735, 1992). The half-life of a protein (from 3 min to 20 hours) is dramatically influenced by the chemical structure of the N-terminal amino acid (Stewart et al., J. Biol. Chem., 270: 25-28, 1995; Grigoryev et al., J. Biol. Chem., 271: 28521-28532, 1996). Site-directed mutagenesis was subsequently used to confirm the ‘N-end rule’ by monitoring the life-span of recombinant proteins containing altered N-terminal amino acid sequences (Varshavsky, Proc. Natl. Acad. Sci. USA, 93: 12142-12149, 1996). A statistical analysis of the amino acid sequences at the amino termini of proteins suggested that Met and Ala residues are over-represented at the first position, whereas at positions +2 and +5, Thr is preferred (Berezovsky et al., Protein Engineering 12(1): 23-30, 1999). C-terminal biases, however, show a preference for charged amino acids and Cys residues (Berezovsky et al., Protein Engineering, 12(1): 23-30, 1999).
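The (f)Met-removal rule stated above can be expressed as a simple lookup. The following Python sketch is illustrative only: the function name `initiator_met_removed` and the use of one-letter residue codes are assumptions introduced here, and the sketch encodes nothing beyond the two residue lists given in the text. Residues that appear in neither list return `None` rather than a guess.

```python
# Illustrative sketch of the initiator-Met removal rule as stated in the text.
# Residue sets are transcribed from the paragraph above (one-letter codes).
MET_CLEAVED_IF_NEXT = set("ACGPSTV")     # Ala, Cys, Gly, Pro, Ser, Thr, Val
MET_RETAINED_IF_NEXT = set("RDNEQILKM")  # Arg, Asp, Asn, Glu, Gln, Ile, Leu, Lys, Met


def initiator_met_removed(sequence):
    """Return True if the N-terminal Met is predicted to be cleaved,
    False if it is predicted to be retained, and None if the text
    states no rule for this sequence (hypothetical helper)."""
    seq = sequence.upper()
    if len(seq) < 2 or seq[0] != "M":
        return None  # the rule only concerns an N-terminal Met
    second = seq[1]
    if second in MET_CLEAVED_IF_NEXT:
        return True
    if second in MET_RETAINED_IF_NEXT:
        return False
    return None  # residue not listed in either group in the text


if __name__ == "__main__":
    for example in ("MAPT", "MKT", "MFG"):
        print(example, initiator_met_removed(example))
```

A three-valued result is used so that the sketch does not assert behavior for residues (such as Phe) on which the passage is silent.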
Recombinant proteins that retain the N-terminal methionine, in some cases, have biological characteristics that differ from the native species lacking the N-terminal methionine. Human growth hormone that retains its N-terminal methionine (Met-hGH), for example, can promote the induction of undesirable antibodies, compared to hGH purified from natural sources or recombinant hGH that is prepared in such a way that it has the same primary sequence as native hGH (lacking an N-terminal methionine). Low-cost methods of generating recombinant proteins that mimic the structure of native proteins are often highly desired for therapeutic applications (Sandman et al., Bio/Technology 13:504-6 (1995)).
One method of preparing native proteins in bacteria is to express the desired protein as part of a larger fusion protein containing a recognition site for an endoprotease that specifically cleaves upstream from the start of the native amino acid sequence. The recognition and cleavage sites can be those recognized by native signal peptidases, which specifically clip the signal peptide off the N-terminal end of a protein targeted for delivery to a membrane or for secretion from the cell. In other cases, recognition and cleavage sites can be engineered into the gene encoding a fusion protein so that the recombinant protein is susceptible to other non-native endoproteases in vitro or in vivo. The blood clotting factor Xa, collagenase, and the enzyme enterokinase, for example, can be used to release different fusion tags from a variety of proteins. Economic considerations, however, generally preclude use of endoproteases on a large scale for pharmaceutical use.
Another method of preparing native proteins in bacteria is to use the enzyme methionine aminopeptidase (MAP) to process the N-terminal methionine from E. coli-derived recombinant proteins. Met-hGH, for example, can be treated with MAP to generate hGH. U.S. Pat. Nos. 4,870,017 and 5,013,662 describe the cloning, expression, and use of E. coli methionine aminopeptidase to remove Met from a variety of peptides and Met-IL-2. The ability to release amino acids from a variety of peptide substrates was analyzed, revealing that MAP cleaves only N-terminal methionine on peptides that are at least three amino acids long. The nature of the amino acids in the second and third positions also appears to be significant. Methionine was released, for example, from Met-Ala-Met, Met-Gly-Met, Met-Ala-Pro-Thr-Ser-Ser-Ser-Thr-Lys-Lys-Thr-Gln-Leu (SEQ ID NO: 1), and Met-Pro-Thr-Ser-Ser-Ser-Thr-Lys-Lys-Gln-Cys (SEQ ID NO: 2), but not from Met-Phe-Gly, Met-Leu-Phe, or Met-Met-Met, among others. No amino acids were released from Leu-Ala-Pro-Thr-Ser-Ser-Ser-Thr-Lys-Lys-Thr-Gln-Leu (SEQ ID NO: 3), Ala-Pro-Thr-Ser-Ser-Ser-Thr-Lys-Lys-Thr-Gln-Leu (SEQ ID NO: 4), or Pro-Thr-Ser-Ser-Ser-Thr-Lys-Lys-Gln-Cys (SEQ ID NO: 5).
WO 84/02351 discloses a process for preparing ripe (native) proteins such as human growth hormone or human proinsulin from fusion proteins using leucine aminopeptidase. A fusion protein having the amino acid sequence (Ym . . . Y2Y1)-(Pro)p(X1X2 . . . Xn), in which (Ym . . . Y2Y1)-(Pro)p is the pro-sequence and the rest is the ripe protein, m is an integer greater than 2, Y is an arbitrary amino acid, p is 0 if X1 or X2 is Pro and 1 if X1 or X2 is different from Pro, X is an arbitrary amino acid, and n is an integer greater than or equal to 4, is converted by stepwise cleavage with aminopeptidase, removing amino acids Ym . . . Y2 if p=1 or X=Pro, or the groups Ym . . . Y2-Y1 if X2=Pro; the two amino acids Y1-Pro, if p=1, are then cleaved off enzymatically in one or two steps in a manner known per se, and similarly Y1 alone is cleaved off if X1=Pro.
European Patent Application EP 0 204 527 A1 discloses a method of removing the N-terminal methionine from proteins of the formula H-Met-X-Pro-Y-OH to yield a protein represented by the formula H-X-Pro-Y-OH, where X is an amino acid other than proline and Y is a peptide chain. Aminopeptidase M was preferred, but leucine aminopeptidase, aminopeptidase PO, or aminopeptidase P could also be used. The N-terminal methionine was removed from derivatives of human interleukin-2 and human growth hormone.
Aeromonas aminopeptidase (AAP), an exo-peptidase isolated from the marine bacterium Aeromonas proteolytica, can also be used to facilitate the release of N-terminal amino acids from peptides and proteins (Wilkes et al., Eur. J. Biochem. 34(3): 459-66, 1973). The most favorable sequence is X-//-Y-, where Y is a hydrophobic residue, preferably an aromatic amino acid such as phenylalanine. Residues susceptible to hydrolysis include all hydrophobic, aromatic, and basic amino acids, plus proline. Aspartyl, glutamyl, and cysteic acid residues were not removed from the amino terminus of any substrate tested, even at high enzyme concentrations. Asparagine, glutamine, and aminoethylated cysteine, however, were released from oligopeptide substrates. Glycine was generally resistant to hydrolysis, but was slowly released from some substrates, depending on the adjacent residues. The activity of AAP on peptide substrates can also be enhanced by changing the counter metal ions of the free enzyme, for example to Cu2+ and Ni2+ (Prescott et al., Biochem. Biophys. Res. Comm. 114(2): 646-652, 1983). AAP is a 29.5 kDa metalloenzyme containing two disulfide bonds.
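The substrate preferences summarized above can be captured in a small classification table. The Python sketch below is a hedged illustration, not a predictive model: the function name `aap_n_terminal_release` is hypothetical, and the assignment of individual residues to the "hydrophobic", "aromatic", and "basic" classes is an assumption made here, since the text names the classes without listing their members. Modified residues (cysteic acid, aminoethylated cysteine) are omitted, and residues the passage does not mention are reported as "not stated".

```python
# Assumed class membership (one-letter codes); the source text names these
# classes but does not enumerate which residues belong to each.
HYDROPHOBIC = set("AVLIM")
AROMATIC = set("FWY")
BASIC = set("KRH")


def aap_n_terminal_release(residue):
    """Classify an N-terminal residue according to the behavior the text
    reports for Aeromonas aminopeptidase (hypothetical helper)."""
    r = residue.upper()
    if r in HYDROPHOBIC | AROMATIC | BASIC or r == "P":
        return "released"            # hydrophobic, aromatic, basic, plus proline
    if r in {"N", "Q"}:
        return "released"            # Asn and Gln were released from oligopeptides
    if r in {"D", "E"}:
        return "not released"        # Asp and Glu were not removed
    if r == "G":
        return "slow or resistant"   # depends on the adjacent residues
    return "not stated"              # e.g. Ser, Thr, Cys are not discussed


if __name__ == "__main__":
    for res in "AFDGS":
        print(res, aap_n_terminal_release(res))
```

The table reflects only the oligopeptide observations cited above; behavior on a folded protein substrate may differ, as the superoxide dismutase example later in the background indicates.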
European patent EP 0191827 B1 and U.S. Pat. No. 5,763,215 describe the sequential removal of N-terminal amino acids from analogs of eukaryotic proteins, formed in a foreign host, by use of Aeromonas aminopeptidase. When 8 mg methionyl-human growth hormone (Met-hGH) (dissolved in 1 mL pH 9.5 10 mM Na borate buffer) was mixed with Aeromonas aminopeptidase (dissolved at 0.4725 mg/mL in pH 9.5 Tris buffer) at a ratio of 900 to 19 and incubated at 37° C., methionine cleavage was complete in 15 min. Cleavage of the new N-terminal amino acid, leucine, was slight after 22 h. The N-terminal methionine was also removed by AAP from Met-pST, Met-Interferon, Met-IGF-1, Met-interleukin-2 (Met-IL2), and Met-Apolipoprotein E. An N-terminal alanine, however, was not removed from mature superoxide dismutase.
More complicated methods can also be used to generate recombinant proteins with a native amino terminus. U.S. Pat. No. 5,783,413, for example, describes the simultaneous or sequential use of (a) one or more aminopeptidases, (b) glutamine cyclotransferase, and (c) pyroglutamine aminopeptidase to treat amino-terminally-extended proteins of the formula NH2-A-glutamine-Protein-COOH to produce a desired native protein. The first aminopeptidase(s) (selected from the group consisting of dipeptidylaminopeptidase I, Aeromonas aminopeptidase, aminopeptidase P, and proline aminopeptidase), catalyze the removal of residues amino-terminal to the glutamine. The glutamine cyclotransferase catalyzes the conversion of the glutamine to pyroglutamine, and the pyroglutamine aminopeptidase catalyzes the removal of pyroglutamine to produce the desired protein product.
U.S. Pat. Nos. 5,565,330 and 5,573,923 disclose methods of removing dipeptides from the amino-terminus of precursor polypeptides involving treatment of the precursor with dipeptidylaminopeptidase (dDAP) from the slime mold Dictyostelium discoideum, which has a mass of about 225 kDa and a pH optimum of about 3.5. Precursors of human insulin, analogues of human insulin, and human growth hormone containing dipeptide extensions were processed by dDAP when the dDAP was in free solution and when it was immobilized on a suitable solid support surface.
More efficient strategies to process amino acids from the amino terminus of recombinant proteins are desirable to reduce the cost of generating therapeutic proteins that mimic the structure of native proteins. Methods that increase the levels of expression or facilitate the downstream processing of recombinant proteins will also accelerate the selection and development of small chemical molecules and other protein-based molecules destined for large scale clinical trials.
It is an object of the invention to describe a method of removing an N-terminal alanyl group from a recombinant protein which comprises contacting said recombinant protein with Aeromonas aminopeptidase such that said N-terminal alanyl group is removed and recovering the resulting recombinant protein.
Preferably, the recombinant protein is of eukaryotic origin. Even more preferably, the recombinant protein is selected from the group consisting of human growth hormone (hGH), bovine somatotropin (bST), porcine somatotropin (pST), and human tissue factor pathway inhibitor (TFPI). Most preferably, the recombinant protein is hGH.
Preferably, the contacting process is carried out at a pH from about pH 7 to about pH 11. Even more preferably, the contacting process is carried out at a pH from about pH 8 to about pH 10. Most preferably, the contacting process is carried out at a pH of about pH 8.0 to about pH 9.5.
Preferably, the contacting process is carried out in the presence of a buffer selected from the group consisting of borate, CHES, sodium bicarbonate, sodium phosphate, and Tris-HCl. More preferably, the buffer is borate, phosphate, or Tris-HCl.
Preferably, the contacting process can be carried out wherein the aminopeptidase is immobilized. Preferably, the aminopeptidase is immobilized on a chromatography resin, chromatography surface, or chromatography gel.
Preferably, the recombinant protein is passed through a column containing the aminopeptidase immobilized on a chromatography resin.
Alternatively, the contacting process can be carried out wherein the aminopeptidase is not immobilized (e.g., in free solution).
It is also conceivable that Aeromonas aminopeptidase may permit the processing of proteins containing two or more closely-spaced alanyl residues in the N-terminal regions of polypeptides. The aminopeptidase may proceed sequentially from the N-terminus of the polypeptide or perhaps recognize additional alanine residues within a short distance of exposed N-terminal polypeptide residues. Elucidation of a consensus sequence for the alanyl-specific recognition and cleavage sites can be evaluated on a variety of protein and peptide substrates.
Aeromonas aminopeptidase may also facilitate the processing of non-alanyl residues in the N-terminal regions of proteins under the conditions disclosed in this application. Elucidation of a consensus sequence for the recognition and cleavage of these sites can be evaluated on a variety of protein and peptide substrates.
It is another object of the invention to describe a method of removing amino-terminal amino acids from a precursor polypeptide of the formula X-Y-Pro-Z with Aeromonas aminopeptidase to yield a polypeptide of the formula Y-Pro-Z, wherein X is one or more amino acids except proline, Y is any amino acid except proline, and Z is one or more amino acids.
Preferably, X is alanine.
Preferably, Y is selected from the group consisting of phenylalanine, methionine, threonine, and aspartic acid. More preferably, Y is phenylalanine.
Most preferably, X is alanine and Y is phenylalanine.
Preferably, the precursor polypeptide is Ala-hGH.
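The X-Y-Pro-Z formula above can be illustrated with a short string-matching sketch. The Python function below, `trim_to_y_pro_z`, is a hypothetical helper introduced here; it assumes one-letter residue codes and assumes that the Pro in the formula is the first proline in the sequence, so that every residue before Y is treated as part of X. It models only the formula, not enzyme kinetics or the actual stopping behavior of the aminopeptidase.

```python
def trim_to_y_pro_z(precursor):
    """Return the Y-Pro-Z product for a precursor matching X-Y-Pro-Z,
    or None if the sequence does not fit the formula (hypothetical helper).

    Assumptions: one-letter codes; the first Pro in the sequence is the
    Pro of the formula; X is every residue preceding Y.
    """
    seq = precursor.upper()
    pro_index = seq.find("P")
    # Need at least one X residue and one Y residue before Pro (index >= 2),
    # and at least one Z residue after Pro.
    if pro_index < 2 or pro_index == len(seq) - 1:
        return None
    # Residues before the first Pro are non-proline by construction,
    # so X and Y automatically satisfy the "except proline" conditions.
    return seq[pro_index - 1:]


if __name__ == "__main__":
    # Illustrative fragment consistent with X = Ala, Y = Phe (as in Ala-hGH).
    print(trim_to_y_pro_z("AFPTIPLSR"))
```

For the illustrative fragment shown, X is the single Ala and the returned product begins with Phe-Pro, matching the most preferred case in which X is alanine and Y is phenylalanine.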
The following is a list of abbreviations and the corresponding meanings as used interchangeably herein:
g=gram(s)
HPLC=high performance liquid chromatography
kb=kilobase(s)
mb=megabase(s)
mg=milligram(s)
ml, mL=milliliter(s)
RP-HPLC=reverse phase high performance liquid chromatography
ug, μg=microgram(s)
ul, uL, μl, μL=microliter(s)
The following is a list of definitions of various terms and the corresponding meanings as used interchangeably herein:
The terms “aapr” and “AAP” mean Aeromonas aminopeptidase.
The terms “ap” and “AP” mean aminopeptidase.
The term “amino acid(s)” means all naturally occurring L-amino acids, including norleucine, norvaline, homocysteine, and ornithine.
The term “fusion molecule” means a protein-encoding molecule or fragment thereof that, upon expression, produces a fusion protein.
The term “fusion protein” means a protein or fragment thereof that comprises one or more additional peptide regions not derived from that protein.
The term “promoter” is used in an expansive sense to refer to the regulatory sequence(s) that control mRNA production.
The term “protein fragment” means a peptide or polypeptide molecule whose amino acid sequence comprises a subset of the amino acid sequence of that protein.
The term “protein molecule/peptide molecule” means any molecule that comprises five or more amino acids.
The term “recombinant” means any agent (e.g., DNA, peptide, etc.), that is, or results from, however indirectly, human manipulation of a nucleic acid molecule.
The term “specifically bind” means that the binding of an antibody or peptide is not competitively inhibited by the presence of non-related molecules.
The term “substantially-purified” means that one or more molecules that are or may be present in a naturally-occurring preparation containing the target molecule will have been removed or reduced in concentration.