The Need for Longer Synthetic Polypeptides
With the recent advances in knowledge coming from gene sequencing and direct protein identification projects, there is a great need for proteins and polypeptides with which to study function. Ideally, such proteins are made chemically, since this approach allows not only rapid access but also full flexibility in incorporating reporter groups (fluorophores, stable isotope labels, etc.) and other components which are not genetically coded. Several chemically synthesized polypeptides and proteins are already in use as in vivo diagnostic and therapeutic agents (drugs), and others are being evaluated as potential agents. The chemical synthesis of polypeptides is now routine: for the solution approach, see e.g. Sakakibara, S (1999) Biopolymers 51:279-296 “Chemical synthesis of proteins in solution”, and Proc Natl Acad Sci USA (1998) 95:13549-13554; for the solid phase approach, see Methods in Enzymology vol. 289 “Solid phase peptide synthesis” and e.g. Miranda L P & Alewood P F (1999) Proc Natl Acad Sci USA 96:1181-1186 “Accelerated chemical synthesis of peptides and small proteins”, and Kochendoerfer, G G & Kent, S B (1999) Curr Opin Chem Biol 3:665-671 “Chemical protein synthesis”. Nonetheless, it is currently difficult to synthesize and purify polypeptides greater than about 60 residues in length, so longer polypeptides are generally not made directly by solid phase peptide synthesis (SPPS). Instead, two or more shorter polypeptides are synthesized, deprotected, purified and coupled together in pairs in solution, and the final product is then repurified. In general, peptides possessing an N-terminal Cys are used in a coupling reaction referred to as “native chemical ligation” (Cotton, G J & Muir T W (1999) Chemistry & Biology 6:R247-R256 “Peptide ligation and its application to protein engineering”).
Methods capable of facilitating ligation at residues other than Cys would be most useful, since they would extend the range of polypeptides accessible to total chemical synthesis. Such methods are being developed by various groups, and it is now possible to ligate using an N-terminal Gly, homo-Cys (which becomes Met upon alkylation), or His. The ability to synthesize polypeptides of 100-120 residues (i.e. entire small proteins, or fragments for chemical ligation) routinely by solid phase methodology would have a major impact on the proportion of proteins coded for by the genome which are synthetically accessible, and modular chemical synthesis (fragments) and resin splitting permit easy access to variants and labelled versions.
The Problem of Deletions Arising from Incomplete Coupling
It would be useful to be able to synthesize longer polypeptides directly by SPPS, thus avoiding the time and effort needed to synthesize, purify and ligate several shorter fragments. Also, if longer polypeptides could be made in good yield and purity, then even longer polypeptides could be made by ligating two or more such fragments. Unfortunately, as is well known, the chemistry used to add amino acid residues during SPPS is not quite quantitative, and so each cycle gives rise to an impurity (the first member of a set of impurities that grows with every succeeding cycle) which contains a deletion at that cycle. Thus, with every cycle, there is a small loss of yield of correct (full-length) product, and in particular there is an increase in the complexity of the range of deletion peptides present. For example, if the coupling reaction achieves 99% yield at each step, after 100 such reactions there will be 0.99^100 × 100% = 37% correct (full-length) product, and 63% (in molar terms) of an astronomical number (2^100 ≈ 10^30) of impurities lacking at least one amino acid residue. In practice, this astronomical number is limited by the Avogadro number: since most syntheses are performed on a millimole scale, the number of impurities is limited to about 10^20. If the coupling reaction is 95% efficient, the yield after 100 steps falls to 0.95^100 × 100% = 0.59%, and the mixture of (theoretically) 10^20 to 10^30 impurities now accounts for 99.41% (on a molar basis). Clearly, it is important to force coupling (and deprotection) reactions to be as quantitative as possible in order to obtain good yield, and it is also important to be able to purify the wanted full-length product from a myriad of impurities. This problem of deletions arising from incomplete couplings is well known (Methods in Enzymology, vol 289, devoted to Solid Phase Peptide Synthesis).
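The arithmetic in the preceding paragraph can be checked with a short calculation. The following is only an illustrative sketch: the function names are hypothetical, and the model assumes, as the text does, that every cycle has the same coupling efficiency and that an uncapped chain which misses a coupling simply continues to grow in later cycles.

```python
from math import log10


def full_length_yield(coupling_efficiency: float, cycles: int) -> float:
    """Fraction of chains that coupled successfully at every one of the cycles."""
    return coupling_efficiency ** cycles


def max_deletion_species(cycles: int) -> int:
    """Theoretical number of distinct coupled/not-coupled patterns over all cycles.

    Without capping, each cycle either adds its residue or skips it, giving
    2**cycles patterns (one of which is the full-length product).
    """
    return 2 ** cycles


# Reproduce the two cases discussed in the text: 99% and 95% per-step efficiency.
for efficiency in (0.99, 0.95):
    y = full_length_yield(efficiency, 100)
    print(f"{efficiency:.0%} per step, 100 steps: "
          f"{y:.2%} full-length, {1 - y:.2%} deletion impurities")

# Order of magnitude of the theoretical number of deletion species.
print(f"2^100 is about 10^{log10(max_deletion_species(100)):.0f}")
```

The steep fall from 37% to 0.59% on going from 99% to 95% per step illustrates why even small gains in coupling efficiency matter for long chains.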
Capping Reduces Complexity
Synthesis and subsequent purification of polypeptides can be facilitated by a strategy involving “capping” and “affinity isolation”, both of which are now explained. By quantitatively “capping” the last trace of free amine remaining after each coupling with a high concentration of a powerful and unhindered reagent such as acetic anhydride, crude product complexity is reduced, as the capped chains are terminated and cannot give rise to further (exponential) complexity through deletions during later cycles. In the final cycle, provided capping has been performed after this cycle as after every previous cycle, the last residue to be added is found uniquely on full-length material, not on the capped (truncated, terminated) chains. This capping (with an irreversible acyl group such as acetyl) does not increase yield, which remains at 37% for 100 steps at 99%, but it drastically cuts the complexity of the impurity profile. If capping achieves complete termination of deletion chains at each cycle, the final product is contaminated with 101 impurities (the most abundant of which is present at 1% and arose at the first cycle, and the least abundant of which is present at 0.37% and arose at the 100th cycle) instead of 10^20 to 10^30. This approach (capping) to the problem of deletions arising from incomplete couplings is well known (Methods in Enzymology, vol 289).
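The abundances quoted above for the capped (truncated) chains follow from a simple geometric model, sketched below. This is an illustration under stated assumptions, not a description of any particular synthesis: the function name is hypothetical, coupling efficiency is taken as constant at every cycle, and capping is assumed to terminate every chain that failed to couple.

```python
def capped_profile(coupling_efficiency: float, cycles: int):
    """Return (full_length_fraction, truncated_fractions) under complete capping.

    truncated_fractions[k - 1] is the molar fraction of chains that coupled
    at cycles 1..k-1, failed at cycle k, and were then capped (terminated).
    """
    failure = 1.0 - coupling_efficiency
    truncated = [
        coupling_efficiency ** (k - 1) * failure
        for k in range(1, cycles + 1)
    ]
    return coupling_efficiency ** cycles, truncated


full_length, truncated = capped_profile(0.99, 100)

print(f"full-length product:        {full_length:.2%}")
print(f"chains capped at cycle 1:   {truncated[0]:.2%}")
print(f"chains capped at cycle 100: {truncated[-1]:.2%}")
# Every chain is either full-length or capped at exactly one cycle.
print(f"total accounted for:        {full_length + sum(truncated):.2%}")
```

Under these assumptions the full-length fraction is unchanged by capping; only the number of distinct impurity species, one per cycle at which a failure can occur, is reduced.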
Isolation Relying on Special Properties of the Amino-Terminal Residue
Solid phase synthesis of long polypeptides proceeds from C-terminus (attached to the resin via a linker) to N-terminus. When isolating the full-length polypeptide from the capped impurities (truncated chains) present, it is better to rely on unique properties (all or nothing) of the N-terminal residue or group than to try a brute-force separation based on general factors such as size, hydrophobicity, charge, etc., which do not differ greatly between one long polypeptide and another. An approach to isolation which relies on the properties of the N-terminal residue is also useful when isolating recombinant DNA-derived polypeptides or polypeptides from natural sources.