For the analysis of a complex protein sample (a tissue sample or a body fluid such as serum or urine) by proteomic methods, the sample is generally cleaved into peptides, which are then separated and analysed by mass spectrometry. The biggest problem to be overcome in such a “shotgun” approach is reducing the complexity of the mixture of cleaved peptides. Digesting the proteins with proteases yields fragments suitable for analysis by currently used mass spectrometry instrumentation (the optimal analysis range of these machines is in the order of 2000-4000 m/z), but it also dramatically increases the number of molecules to be analysed compared to the original sample. As different peptides in the mixture originate from the same parent protein upon cleavage, the analysis of all of these peptides may in some cases be redundant. On the other hand, the analysis of several peptides corresponding to the same protein can serve as a further confirmation in the identification of the parent protein.
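The growth in the number of analytes upon digestion can be illustrated with a minimal in silico sketch. It assumes trypsin as the protease and uses the common simplified rule of cleavage after Lysine (K) or Arginine (R) unless the next residue is Proline (P). The function name `tryptic_digest` and the two toy sequences are hypothetical and chosen for illustration only; they do not represent real proteins or data.

```python
import re


def tryptic_digest(sequence: str) -> list[str]:
    """Split a protein sequence after K or R, except when followed by P.

    Simplified trypsin rule; missed cleavages are not modelled.
    """
    # Zero-width split point: preceded by K/R and not followed by P.
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    # A cleavage site at the very end of the sequence yields an empty piece.
    return [p for p in pieces if p]


# Hypothetical toy sequences (assumption, not real proteins).
sample = {
    "toy_protein_1": "AAKCCRPDDKEER",
    "toy_protein_2": "MKLLRGGKDDE",
}

peptides = {name: tryptic_digest(seq) for name, seq in sample.items()}
n_proteins = len(sample)
n_peptides = sum(len(p) for p in peptides.values())

for name, peps in peptides.items():
    print(name, peps)
print(f"{n_proteins} proteins -> {n_peptides} peptides")
```

Even in this toy case each protein gives rise to several peptides, so the digested mixture contains more distinct molecules than the original sample, and several of them point back to the same parent protein.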
In some analyses, however, it is not necessary that all peptides originating from a protein are analysed. One or a few peptides are often sufficient to identify the presence of a certain protein in a sample. To reduce the number of peptides after cleavage, different functional groups of amino acids have been used to selectively isolate or label specific peptides. Functional groups which have been described as suitable for this purpose are the thiol group of Cysteine, the carboxyl groups of Aspartic acid, Glutamic acid and the carboxyterminus, and the amine groups of Lysine and of the aminoterminus of the peptides themselves.
Cysteine is often used as a functional group in proteomics for the selective isolation or labelling of peptides. It occurs in about 85% of all proteins and, on average, with a frequency of about 2 to 3% within a polypeptide. Accordingly, analysis of only the Cysteine-comprising peptides of a protein sample has been considered adequately representative of the entire protein pool. Furthermore, the thiol group of Cysteine does not occur in any other amino acid, allowing specific tagging or labelling. One disadvantage, however, is that Cysteine residues within a protein often form disulfide bridges, which contribute to the tertiary or quaternary structure of a protein, and thus are often present in pairs. The use of Cysteine for the labelling of peptides will thus often label at least two peptides from the same protein, resulting in the generation of redundant information. Moreover, Cysteine residues are often located in domains of a protein which contribute to the structure of the protein. Such domains are often conserved between proteins with different functions. Accordingly, unequivocal identification of a protein based on a Cysteine-comprising peptide thereof may be difficult. A further disadvantage of Cysteine is that, though the occurrence of Cysteine in proteins is relatively high, its distribution is somewhat uneven. While numerous Cysteine-rich proteins exist, not all proteins contain Cysteine (e.g. ribosomal proteins). Thus some proteins will be overlooked by Cysteine-directed tagging or labelling, while others will be excessively represented. Finally, during manipulation of a sample, Cysteine residues may become oxidised. Such oxidised amino acids can no longer be used for thiol-specific tagging or labelling.
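The trade-off described above, fewer peptides to analyse at the cost of redundant and overlooked proteins, can be sketched in a few lines. The example assumes a simplified tryptic digest (cleavage after K or R, not before P) and selects peptides containing the one-letter code C. All names (`tryptic_digest`, `cys_peptides_by_protein`) and the three toy sequences are hypothetical and serve only as an illustration, not as real data.

```python
import re


def tryptic_digest(sequence: str) -> list[str]:
    """Simplified trypsin rule: cleave after K/R unless followed by P."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def cys_peptides_by_protein(proteins: dict[str, str]) -> dict[str, list[str]]:
    """Return, per protein, the digested peptides that contain Cysteine."""
    return {
        name: [pep for pep in tryptic_digest(seq) if "C" in pep]
        for name, seq in proteins.items()
    }


# Hypothetical toy sequences (assumption, not real proteins).
proteins = {
    "prot_a": "MKCAAKDDCRGGK",  # two Cysteines in different peptides
    "prot_b": "MAAKLLRDDEK",    # no Cysteine at all
    "prot_c": "MCKAAR",         # a single Cysteine
}

selected = cys_peptides_by_protein(proteins)

# Proteins represented by more than one selected peptide (redundant).
redundant = [name for name, peps in selected.items() if len(peps) > 1]
# Proteins with no Cysteine-comprising peptide (missed by the selection).
overlooked = [name for name, peps in selected.items() if not peps]

print("selected:", selected)
print("redundant:", redundant)
print("overlooked:", overlooked)
```

In this sketch the selection shrinks the peptide pool, but `prot_a` is still represented twice and `prot_b` is not represented at all, mirroring the redundancy and uneven-distribution drawbacks noted for Cysteine-directed approaches.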
The imidazole group of Histidine, another unique functional group in proteins, is widely used for affinity chromatography, based on its inherent affinity for metal ions, but has rarely been used as a target for protein modification.
Histidine-tagged biomolecules, typically comprising a tail of 6 Histidine residues, are purified by immobilised metal ion affinity chromatography (IMAC). Proteins with an artificially added or naturally occurring sequence comprising multiple Histidines, Cysteines or Tryptophans in the correct configuration can be bound to a column matrix containing covalently bound chelated metal ions. Most commonly, Ni chelate affinity chromatography is used to purify recombinant proteins that have been expressed as fusion proteins with one or more His6 tags at the N- or C-terminus of the protein. Typically, the nickel ions are attached to the column matrix via nitrilotriacetate groups and interact with Histidine residues in the tagged protein in exchange for water. Elution is brought about using either a gradient of increasing imidazole concentration or a stepwise procedure. Apart from Ni2+, other metal ions of interest for IMAC include Cu2+, Zn2+, Co2+, Fe3+ and Hg2+.
The use of IMAC for the selection of Histidine-containing peptides from tryptic digests is reviewed in Mirzaei & Regnier (2005) J. Chrom. B 817, 23-34. The use of arylboronic acid to couple diagnostic shells to antibodies via the imidazole group is described in WO2006064451, which is based on a copper-catalysed reaction described in Collman et al. (2001) J. Org. Chem. 66, 1528-1531 (see FIG. 1). Boronic acid is known in the medical field as a reagent which covalently binds to cis-diol groups of sugars above pH 8.0. The use of boronic acid-modified peptides has been described for the inhibition of serine proteases.