The sequencing of the human genome has created the promise and opportunity for understanding the function of all genes and proteins relevant to human biology and disease, Peltonen and McKusick, Science, 291: 1224-1229 (2001). However, several important hurdles must be overcome before this promise can be fully attained. First, even with the human genome sequence available, it is still difficult to distinguish genes and the sequences that control their expression. Second, although monitoring gene expression at the transcript level has become more robust with the development of microarray technology, a great deal of variability and control of function originates in post-transcriptional events, such as alternative splicing and post-translational processing and modification. Finally, because of the scale of human molecular biology (about a third of the estimated 30-40 thousand genes appear to give rise to multiple splice variants and most appear to encode protein products with a plethora of post-translational modifications), potentially many tens of thousands of genes and their expression products will have to be isolated and tested in order to understand their role in health and disease, Dawson and Kent, Annu. Rev. Biochem., 69: 923-960 (2000).
With regard to scale, conventional recombinant methods for cloning, expressing, recovering, and isolating proteins remain time-consuming and labor-intensive, so their use in screening large numbers of different gene products to determine function has been limited. Recently, a convergent synthesis approach has been developed that may address the need for facile access to highly purified research-scale amounts of protein for functional screening, Dawson and Kent (cited above); Dawson et al, Science, 266: 776-779 (1994). In its most attractive implementation, an unprotected oligopeptide intermediate having a C-terminal thioester reacts with an N-terminal cysteine of another oligopeptide intermediate under mild aqueous conditions to form a thioester linkage, which spontaneously rearranges to a natural peptide linkage, Kent et al, U.S. Pat. No. 6,184,344. The approach has been used to assemble oligopeptides into active proteins both in solution phase, e.g. Kent et al, U.S. Pat. No. 6,184,344, and on a solid phase support, e.g. Canne et al, J. Am. Chem. Soc., 121: 8720-8727 (1999) and U.S. Pat. No. 6,326,468. More recently, the technique has been extended to permit coupling of C-terminal thioester fragments to a wider range of N-terminal amino acids of co-reactant peptides by using a removable ethylthio moiety attached to the N-terminal nitrogen of the co-reactant, thereby mimicking the function of an N-terminal cysteine, Low et al, Proc. Natl. Acad. Sci., 98: 6554-6559 (2001).
Unfortunately, when the polypeptide to be synthesized by this approach exceeds 100-150 amino acids, it is usually necessary to join three or more fragments, as it is currently difficult to synthesize and purify oligopeptide intermediates longer than about 60 residues. In this case, the internal oligopeptide intermediates contain not only a C-terminal thioester moiety but also an N-terminal cysteine. During the assembly process, the cysteine or cysteine mimic of such an internal intermediate, if left free, will react with the C-terminal thioester of the same intermediate molecule or of a different intermediate molecule, thereby interfering with the desired ligation reaction by forming an undesired cyclic peptide or a concatemer of the intermediate. This problem can be circumvented by employing a protecting group for the N-terminal amino acid with the following properties: i) it must be stable to the conditions used to synthesize the oligopeptide and cleave it from the synthesis resin, ii) it must be removable after a native chemical ligation has been completed, and iii) preferably, its removal takes place in the same ligation reaction mixture before purification, so that the ligation reaction and amino acid deprotection can be conducted in one pot.
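The fragment arithmetic described above can be illustrated with a minimal Python sketch. It assumes the approximate 60-residue limit on oligopeptide intermediates stated in the text and splits the target purely by length; the function name `plan_fragments` and its output fields are hypothetical, and the sketch ignores where suitable ligation sites actually occur in a real sequence.

```python
def plan_fragments(length, max_fragment=60):
    """Estimate the minimum number of fragments for a target polypeptide and
    flag which fragments would need N-terminal protection.

    Assumption: max_fragment=60 reflects the approximate practical limit
    mentioned in the text; it is not a hard chemical constant.
    """
    if length <= 0 or max_fragment <= 0:
        raise ValueError("length and max_fragment must be positive")

    # Integer ceiling division: smallest count with every fragment <= max_fragment.
    count = -(-length // max_fragment)

    fragments = []
    for i in range(count):
        # Every fragment except the C-terminal one carries a C-terminal thioester.
        c_thioester = i < count - 1
        # Every fragment except the N-terminal one carries an N-terminal
        # cysteine (or cysteine mimic).
        n_cys = i > 0
        fragments.append({
            "index": i + 1,
            "c_thioester": c_thioester,
            "n_cys": n_cys,
            # Internal fragments carry both reactive ends, so they can
            # cyclize or concatemerize unless the N-terminus is protected.
            "needs_n_protection": c_thioester and n_cys,
        })
    return fragments


if __name__ == "__main__":
    for frag in plan_fragments(150):
        print(frag)
```

Under these assumptions, a 150-residue target yields three fragments, of which only the middle one carries both reactive ends and therefore needs a protected N-terminus; a 120-residue target yields two fragments and no internal intermediate.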
The extension of the chemical ligation methodology by the use of auxiliary groups on the N-terminal amino acids of peptide reactants has given rise to a need for effective protecting groups for this class of reagents, as well as methods for their synthesis and use.