Collective developments in mass spectrometry, separations, enrichment techniques, and related sample processing and data analysis have shifted the experimental focus of proteomics applications from simple protein catalogs to construction of dynamic networks, whereby changes in protein expression and post-translational modification status are monitored as a function of biological state (disease, oxidative stress, etc.) or perturbation (injury, drug treatment, etc.). The ability to quantitatively monitor proteins directly, rather than biological surrogates such as mRNA, and then use these data in models that support predictions in the context of cellular physiology will have a profound impact on human health.
Although quantitative proteomics is still considered an emerging technology with respect to instrumentation and standardization, general trends have nonetheless emerged. For example, comparison of cellular proteomes was traditionally performed with two-dimensional gel electrophoresis (2DGE) (Gygi, S. P., et al. Proc Natl Acad Sci USA 97, 9390-9395 (2000); O'Farrell, P. H. J Biol Chem 250, 4007-4021 (1975)). Despite its high resolving power, this technique suffers from limited dynamic range, incompatibility with membrane and basic proteins, and low throughput (Baggerman, G., et al. Comb Chem High Throughput Screen 8, 669-677 (2005); Wolff, S. et al. Mol Cell Proteomics 5, 1183-1192 (2006)). In addition, relative quantification of proteins is usually performed with image analysis software, and hence gel-to-gel variability can yield unacceptably high errors. More recently, LC-MS/MS approaches that utilize stable isotope dilution have been developed for quantitative proteomics, and these methods are collectively displacing 2DGE as the technique of choice for comparison of protein expression and post-translational modification status (Bantscheff, M., et al. Anal Bioanal Chem 389, 1017-1031 (2007); Pan, S. et al. Methods Mol Biol 367, 209-218 (2007); Gevaert, K. et al. PROTEOMICS 8, 4873-4885 (2008)). Proteins or peptides can be labeled with stable isotopes of hydrogen, oxygen, carbon, and/or nitrogen. After labeling, samples are mixed and analyzed by LC-MS/MS, and relative abundances are determined from the measured ratios of peptide precursors (MS scan) or fragments (MS/MS scan).
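As a minimal sketch of ratio-based relative quantification, the Python example below converts light and heavy precursor intensities into log2 ratios and summarizes them per protein. The function names, the example intensities, and the use of the median as the protein-level summary are illustrative assumptions, not details taken from the cited methods.

```python
import math
from statistics import median


def peptide_log2_ratio(light_intensity: float, heavy_intensity: float) -> float:
    """Return log2(heavy / light) for one light/heavy peptide pair."""
    if light_intensity <= 0 or heavy_intensity <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(heavy_intensity / light_intensity)


def protein_log2_ratio(peptide_pairs):
    """Summarize a protein as the median of its peptide log2 ratios.

    The median is an assumed choice here; it limits the influence of a
    single outlying peptide measurement.
    """
    return median(peptide_log2_ratio(light, heavy) for light, heavy in peptide_pairs)


# Hypothetical (light, heavy) precursor intensities for three peptides
# assigned to the same protein.
pairs = [(1.0e6, 2.0e6), (4.0e5, 8.0e5), (2.5e5, 1.0e6)]
fold_change = 2 ** protein_log2_ratio(pairs)
print(f"heavy/light protein ratio: {fold_change:.2f}")
```

Working in log space treats a two-fold increase and a two-fold decrease symmetrically, which is why the ratios are averaged as logarithms and only converted back to a fold change at the end.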
As reviewed recently by Pan and Aebersold, stable isotope labeling schemes for quantitative proteomics may be conveniently divided into three classes: (i) metabolic, (ii) enzymatic, and (iii) chemical (Pan, S. et al. Methods Mol Biol 367, 209-218 (2007)). The first is typically referred to as SILAC, or stable isotope labeling by amino acids in cell culture (Ong, S. E. et al. Mol Cell Proteomics 1, 376-386 (2002); Veenstra, T. D., et al. J Am Soc Mass Spectrom 11, 78-82 (2000)). In this strategy, cells are cultured in normal media or in media that contain amino acids enriched with stable isotopes of carbon, nitrogen, or oxygen. After several passages, heavy amino acids are metabolically incorporated into cellular proteins at a level >95%. Because light and heavy cell cultures are combined just prior to lysis, no bias due to sample handling is introduced during subsequent processing steps. Relative quantification is based on measured ratios of peptide precursor abundances in MS scans. Recent reports have extended the concept of metabolic labeling to facilitate quantitative analysis of proteins in animal models (Krüger, M. et al. Cell 134, 353-364 (2008); McClatchy, D. B., et al. J Proteome Res 6, 2005-2010 (2007); McClatchy, D. B., et al. Genome Res 17, 1378-1388 (2007)). Enzymatic incorporation of stable isotopes during protein digestion is another strategy commonly used for quantitative proteomics (Mirgorodskaya, O. A. et al. Rapid Commun Mass Spectrom 14, 1226-1232 (2000); Yao, X., et al. Anal Chem 73, 2836-2842 (2001)). In this approach, protein samples are digested in buffers formulated in either normal (H₂¹⁶O) or heavy (H₂¹⁸O) water. Hydrolysis of amide bonds at lysine and arginine (in the case of the enzyme trypsin) leads to incorporation of oxygen atoms from water at the newly formed peptide C-termini. Relative quantification is based on measured ratios of peptide precursor abundances in MS scans.
Unfortunately, incorporation of ¹⁸O is often incomplete, so that peptides digested in heavy water appear as mixtures of species that contain either one or two ¹⁸O atoms.
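The consequence of incomplete labeling for quantification can be illustrated with a short Python sketch. It assumes that the three peak intensities (no label, one 18O, two 18O) have already been corrected for overlap with natural isotope envelopes, and that every peptide from the heavy digest carries at least one 18O atom. Both are simplifying assumptions, and the function names and intensities are hypothetical.

```python
MASS_16O = 15.9949146  # monoisotopic mass of 16O in Da
MASS_18O = 17.9991596  # monoisotopic mass of 18O in Da
DELTA_18O = MASS_18O - MASS_16O  # about 2.0042 Da per incorporated 18O


def mz_shift(n_18o: int, charge: int) -> float:
    """Expected m/z offset of a peptide carrying n_18o heavy oxygens."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return n_18o * DELTA_18O / charge


def heavy_to_light_ratio(i_unlabeled: float, i_one_18o: float, i_two_18o: float) -> float:
    """Heavy/light ratio when both labeled species are attributed to the heavy sample.

    Assumes the intensities are already corrected for natural isotope overlap.
    """
    if i_unlabeled <= 0:
        raise ValueError("unlabeled intensity must be positive")
    return (i_one_18o + i_two_18o) / i_unlabeled


def double_label_fraction(i_one_18o: float, i_two_18o: float) -> float:
    """Fraction of the heavy signal that carries two 18O atoms."""
    total = i_one_18o + i_two_18o
    if total <= 0:
        raise ValueError("labeled intensity must be positive")
    return i_two_18o / total


# Hypothetical corrected intensities for one doubly charged peptide.
print(f"m/z offsets at 2+: {mz_shift(1, 2):.4f}, {mz_shift(2, 2):.4f}")
print(f"heavy/light ratio: {heavy_to_light_ratio(1.0e6, 5.0e5, 1.5e6):.2f}")
print(f"doubly labeled fraction: {double_label_fraction(5.0e5, 1.5e6):.2f}")
```

Summing the singly and doubly labeled species keeps the ratio independent of labeling efficiency under the stated assumptions; in real spectra the natural isotope peaks of the light peptide overlap the labeled species and must be subtracted first.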
Chemical labels are synthesized de novo to meet specific physicochemical characteristics, and hence represent the most versatile class of compounds for quantitative proteomics. These reagents can be broadly characterized by their (i) target site for derivatization, (ii) elemental composition of heavy isotopes, (iii) impact on peptide gas-phase basicity, (iv) incorporation of an affinity tag for enrichment, (v) compatibility with, and quantitative readout in, MS and MS/MS scans, and (vi) compatibility with typical sample processing protocols in proteomics. Each of these properties must be carefully considered during the design of a labeling strategy to maximize analytical figures of merit for the final reagent. For example, the original isotope-coded affinity tag (ICAT) reagent targeted free thiol groups in cysteine side chains and included a biotin affinity tag for efficient enrichment of cysteine-containing peptides. However, deuterium constituted the mass tag, and hence the light and heavy labeled peptides did not co-elute under the typical reversed-phase chromatographic conditions used in LC-MS/MS. In addition, the relatively large linker scaffold was prone to fragmentation under typical MS/MS conditions, complicating interpretation of peptide sequence data. The ICAT reagent was subsequently redesigned to include ¹³C as the mass tag and a cleavable linker to improve overall performance, although in general this reagent is only applicable to quantification of cysteine-containing proteins (Li, J., et al. Mol Cell Proteomics 2, 1198-1204 (2003)).
Similarly, a host of reagents have been developed that target peptide N- and C-termini; details are thoroughly discussed in a review by Leitner and Lindner (Leitner, A. & Lindner, W. J Chromatogr B Analyt Technol Biomed Life Sci 813, 1-26 (2004)). For example, carboxyl-directed tags will target the side chains of acidic amino acids in addition to the peptide C-terminus. In practice, the ubiquitous use of trypsin for digestion results in peptides with an indeterminate number of acidic residues, and hence of labeling sites. Conversely, the use of reagents that target primary amines generally yields peptides with three or fewer labels. However, the activated esters that are commonly used to target primary amines are unstable in aqueous conditions, and hence use of these reagents often requires organic solvents that must be removed via lyophilization or vacuum centrifugation prior to LC-MS analysis. In addition, many of these compounds act as acetylating reagents, reducing the gas-phase basicity of peptide primary amines and leading to a concomitant reduction in ionization efficiency. Quaternary amines can also be used to affix a permanent charge at primary amines, but these reagents often adversely affect fragmentation of peptides under low-energy MS/MS conditions, leading to a reduction in peptide and protein identification.
Despite the limitations described above, the physicochemical properties of small-molecule labels can in principle be fine-tuned for optimum performance in quantitative proteomics. The ideal reagent would selectively target peptide side chains or termini, provide rapid and complete derivatization, maintain the overall gas-phase basicity of peptides, introduce a mass difference of at least 4 Da via stable isotopes that do not shift peptide chromatographic elution time relative to the light counterparts, support labeling of proteins or peptides derived from a wide range of biological samples, and maintain the peptide fragmentation patterns typically observed with low-energy MS/MS activation schemes.
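To make the mass-difference criterion concrete, the following Python sketch estimates the mass shift produced by a given set of heavy atoms in a label and checks it against the 4-Da minimum stated above. The isotope mass differences are standard monoisotopic values rounded to seven decimals; the function names and the example label compositions are hypothetical and do not describe any particular reagent.

```python
# Mass added when one light atom is replaced by its heavy isotope (Da).
ISOTOPE_DELTA = {
    "2H": 1.0062767,   # 2H vs 1H
    "13C": 1.0033548,  # 13C vs 12C
    "15N": 0.9970349,  # 15N vs 14N
    "18O": 2.0042450,  # 18O vs 16O
}


def label_mass_shift(heavy_atoms: dict) -> float:
    """Total heavy-minus-light mass difference for a label composition.

    heavy_atoms maps an isotope name (e.g. "13C") to the number of
    positions substituted in the heavy form of the label.
    """
    return sum(ISOTOPE_DELTA[isotope] * count for isotope, count in heavy_atoms.items())


def meets_min_shift(heavy_atoms: dict, min_shift: float = 4.0) -> bool:
    """Check a label composition against a minimum mass difference in Da."""
    return label_mass_shift(heavy_atoms) >= min_shift


# Hypothetical label compositions for comparison.
for composition in ({"13C": 3}, {"13C": 4}, {"13C": 2, "18O": 1}):
    shift = label_mass_shift(composition)
    print(composition, f"{shift:.4f} Da", meets_min_shift(composition))
```

Compositions built from 13C, 15N, or 18O are the ones consistent with the co-elution requirement, since the deuterium-based mass tag described earlier caused light and heavy peptides to separate under reversed-phase conditions.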