The covalent site-specific modification of proteins is a crucial process in biological systems. Enzymatic modifications (such as phosphorylation, glycosylation, sulfation, acetylation, methylation, isoprenylation, ubiquitination, sumoylation, and neddylation) play a defining role in cellular processes including protein localization and trafficking, signal transduction, transcriptional regulation, and targeted protein destruction. Reproducing natural modifications on the surface of proteins is invaluable for studying their function.
The site-specific covalent addition of ‘unnatural’ moieties, such as fluorophores, affinity labels, spin-label probes, radiolabels and other (bio-orthogonal) functional groups, to proteins and peptides has also proven useful for a wide variety of applications and processes, both in vivo and in vitro.
With the advances in biocompatible synthetic organic chemistry, a whole new field of opportunity has opened up, affording high diversity in the nature of the ligation components as well as a choice of ligation reactions. Chemical ligation involves the chemoselective covalent linkage of a first chemical component to a second chemical component. Unique, mutually reactive functional groups present on the protein or peptide and on the second ligation component can be used to render the ligation reaction chemoselective. Developing robust methods for selectively modifying proteins or peptides nevertheless remains a challenge, owing to the variety and number of functionalities present in a typical protein. There is, in general, a desire or need to modify the protein site-specifically in order to optimally combine the conjugated functionality with the intrinsic properties of the protein or peptide. For controlled, selective access to such modified proteins, a unique, bio-orthogonal chemical handle or attachment site is thus often required for ligation or conjugation of a desired molecule thereto.
Most methods to produce bioconjugates exploit the nucleophilicity of either amines (lysine side chains and the N-terminus) or thiols (cysteine side chains) on the protein surface. For site-specific modification of proteins, methods have been proposed that include designing a mutant in which all of the lysine residues of the protein are replaced, or a mutant in which all but one of the cysteine residues are replaced, in order to limit the modification site to a single location.
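By way of illustration only, the number of competing amine and thiol sites in a given protein can be estimated from its sequence. The following minimal sketch assumes a one-letter amino acid sequence; the function name `candidate_sites` and the example sequence are hypothetical and are not taken from any cited work. It ignores surface accessibility, disulfide bonding and N-terminal processing, so it gives only an upper bound on the number of reactive sites.

```python
def candidate_sites(sequence: str) -> dict:
    """List nucleophilic sites targeted by common amine- and thiol-reactive reagents.

    Positions are 1-based. Every lysine and cysteine is counted, whether or not
    it is exposed on the protein surface (a simplifying assumption).
    """
    seq = sequence.upper()
    lysines = [f"K{i}" for i, aa in enumerate(seq, start=1) if aa == "K"]
    cysteines = [f"C{i}" for i, aa in enumerate(seq, start=1) if aa == "C"]
    # The alpha-amine of the first residue competes with the lysine side chains.
    amines = (["N-term"] if seq else []) + lysines
    return {"amines": amines, "thiols": cysteines}


# Hypothetical 13-residue sequence, used only to show the output format.
sites = candidate_sites("MKTCAYKLLGKEC")
print(sites["amines"])  # amine sites: N-terminus plus each lysine
print(sites["thiols"])  # thiol sites: each cysteine
```

Any list containing more than one entry corresponds to a protein for which the respective chemistry would not, without further measures such as mutagenesis, be limited to a single location.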
Activated acids such as N-hydroxysuccinimidyl esters can be used to target amines; however, proteins typically contain multiple amines and, even with the reduced pKa of the amino-terminus of the protein compared to the lysine side chain, this labelling is generally nonspecific. To overcome this problem, researchers have gone as far as evolving the protein target to eliminate all surface lysines. In one example, phage display selections were performed on tumour necrosis factor-α (TNF-α) to evolve a protein without any lysines, allowing for the site-specific PEGylation of the single remaining amine at the N-terminus. The evolved TNF-α with modification at just the N-terminus has improved stability and bioactivity compared to the randomly labelled wild-type TNF-α. In general, however, large-scale amino acid substitution that replaces one or more given types of amino acid at most or all sites has been associated with the drawback of reduced protein activity.
Cysteine labelling is typically more specific than amine labelling as the thiol group is more nucleophilic and a single cysteine can be introduced by site-directed mutagenesis without affecting the function of the protein. Cysteines introduced by mutagenesis are labelled with small molecule-linked maleimides, α-haloketones, or other electrophilic groups that preferentially react with thiols. However, cysteine-based methods alone cannot easily perform multiple modifications to a protein (such as introducing two fluorophores for fluorescence resonance energy transfer (FRET) analysis) except when the reactivity of the two cysteines is quite different, and not all protein targets allow for the introduction of cysteines without impairing function.
Further methodologies that target additional functionalities present in the natural amino acids are considered to be valuable additions to the chemist's toolbox for performing protein modification.
Francis and co-workers (J. Am. Chem. Soc. 2004, 126, 10256) have identified reactions that can specifically target both tryptophan and tyrosine residues. Using rhodium carbenoids, the selective modification of the indole functionality in tryptophan (at N-1 or at C-2) can be achieved. When applied to either myoglobin or subtilisin Carlsberg (with 2 and 1 surface tryptophans, respectively), modification occurs exclusively at the tryptophan residues in about 50-60% yield. While the system requires a co-solvent to solubilize the organic reagents, this co-solvent (ethylene glycol) is not expected to denature the protein targets. A more significant limitation of the chemistry is the requirement of a very low pH (1.5-3.5), which causes denaturation of most proteins as well as loss of myoglobin's heme group. While methods to reconstitute myoglobin could recover that protein's structure, these extreme conditions will be incompatible with many targets. The results do suggest, however, that transition metal complexes can be used for specific bioconjugation of aromatic residues in proteins.
Francis and co-workers (J. Am. Chem. Soc. 2004, 126, 15942) have also described a Mannich-type reaction to selectively target tyrosines using an aldehyde and an aniline at pH 6.5. By attaching a fluorophore to the aniline, conjugation of rhodamine to chymotrypsinogen A is achieved without altering the protein's activity. The relative surface accessibility of different tyrosines allows for selectivity for particular tyrosine residues; for chymotrypsinogen A, one of the three surface tyrosines is preferentially targeted.
Some strategies make use of functional groups that do not occur naturally. Such amino acids can be introduced by methods including total synthesis, semi-synthesis by ligation of synthetic and biologically expressed fragments, transfer reactions, etc. The presence of these non-naturally occurring functional bio-orthogonal groups enables individual targeting without inadvertent cross-linking or other reactions with naturally occurring amino acid side chain groups.
An example includes the controlled introduction of azides in proteins as the bio-orthogonal ligation handle. A reaction that has proven useful in this regard is the Sharpless-modified Huisgen cycloaddition of an alkyne and an azide. In the presence of catalytic amounts of Cu(I), this cycloaddition occurs chemoselectively to produce a 1,4-disubstituted triazole ring. Cross-reactivity with common biological functional groups is not seen for either the alkyne or the azide, and both groups are stable under biological conditions. This reaction has been used for applications ranging from the modification of viral particles and proteins on cell surfaces to the identification of carbohydrate-binding proteins and cellular proteins tagged by small-molecule activity probes.
As will be clear from the above, despite the extraordinary interest in selectively modified proteins, each of the techniques currently available suffers from specific drawbacks. Moreover, generally usable and flexible methods are lacking. Hence, additional chemical strategies for providing selective (bio-orthogonal) ligation handles are still needed to complement and expand upon methods from biology. It is the objective of the present invention to provide such strategies.