The ability to identify proteins and determine their chemical structures has become central to the life sciences. The amino acid sequence of proteins provides a link between proteins and their coding genes via the genetic code, and, in principle, a link between cell physiology and genetics. The identification of proteins provides a window into complex cellular regulatory networks.
Mass spectrometry (MS), including but not limited to triple quadrupole and ion trap mass spectrometers, is among the most widely used platforms for molecular analysis and identification, with applications spanning natural products, pharmaceuticals and biologics. Most mass spectrometer-based experiments begin with the isolation of a group of compounds from a set of samples by an extraction technique, such as extraction of proteins from tissues, cell lysates or fluids followed by proteolytic digestion of those proteins into peptides (i.e., bottom-up proteomics). Frequently, but not necessarily, mass spectrometers are coupled with some form of separation, such as electrophoretic or chromatographic separation systems. Over the course of just a few hours, mass spectral instruments can autonomously interrogate tens of thousands of molecular species via tandem mass spectrometry (MS/MS).
Quantitative analysis in chemistry is the determination of the absolute or relative abundance of one, several, or all of the substances present in a sample. For biological samples, quantitative analysis performed via mass spectrometry can determine the relative abundance of peptides and proteins. Mass spectrometric quantitation is conventionally performed on a mass spectrometer capable of MS/MS fragmentation (e.g., a triple quadrupole or ion trap spectrometer). The quantitation process can involve isobaric tagging of peptide precursors, which, when combined with post-acquisition software, provides the relative abundance of peptides. However, when a peptide precursor is selected for tandem mass spectrometry, there are often interfering species with similar mass-to-charge ratios that are co-isolated and subjected to activation. These species are often other isobarically tagged peptides with different relative abundances, which therefore disturb the quantitative measurement of the peptide of interest.
Protein identification technologies have rapidly matured such that constructing catalogs of the thousands of proteins that make up a cell using mass spectrometry is now relatively straightforward [de Godoy, L. M. F. et al. Nature 455, 1251-1255 (2008); Swaney, D. L., Wenger, C. D. & Coon, J. J. J. Proteome Res. 9, 1323-1329 (2010)]; however, knowing how the abundance of these molecules changes under various circumstances is not [Ong, S. E. & Mann, M. Nat. Chem. Biol. 1, 252-262 (2005)]. Stable isotope labeling by amino acids in cell culture (SILAC) provides a means to make binary or ternary comparisons [Jiang, H. & English, A. M. J. Proteome Res. 1, 345-350 (2002); Ong, S. E. et al. Mol. Cell. Proteomics 1, 376-386 (2002)]. By interlacing these two- or three-way experiments, higher-order comparisons can be obtained [Olsen, J. V. et al. Sci. Signal. 3, ra3 (2010)]. Such large-scale multiplexed experiments are invaluable, as they (1) allow measurement of time-course experiments, (2) permit collection of biological replicates, and (3) enable direct comparison of transcriptomic and proteomic data.
Constructing this type of multi-faceted proteomics study, however, is an arduous undertaking and has only been accomplished in a handful of experiments by an even smaller group of researchers. The first impediment is the requirement to grow multiple groups of cells with various labels. This step is actually less limiting than the second major obstacle: each binary or ternary set must be analyzed separately. When combined with the need for extensive pre-MS fractionation and technical replicates, a large-scale experiment via SILAC demands three to six months of constant instrument usage.
Isobaric tagging [Thompson, A. et al. Anal. Chem. 75, 1895-1904 (2003); Ross, P. L. et al. Mol. Cell. Proteomics 3, 1154-1169 (2004)] is an elegant solution to this problem, allowing relative quantification of up to eight proteomes simultaneously [Choe, L. et al. Proteomics 7, 3651-3660 (2007); Dayon, L. et al. Anal. Chem. 80, 2921-2931 (2008)]. Further, it is compatible with mammalian tissues and biofluids, unlike metabolic approaches. Despite its potential to enable fast, multiplexed quantitative proteomics, isobaric tagging has not been widely embraced for large-scale studies [Lu, R. et al. Nature 462, 358-U126 (2009)], chiefly because of precursor interference. This problem does not exist for SILAC because abundance measurements are made from high-resolution MS1 scans rather than from tandem mass spectra. Even for very complex samples having tens or hundreds of co-eluting peptides, high-resolving-power mass analyzers can easily distinguish the target from neighboring peaks less than 0.01 Th away.
In the isobaric approach, however, the target peptide is isolated at much lower resolution, typically 1-3 Th, and dissociated to produce reporter tags. Therefore, the quantitative signal in the reporter region is compiled from every species in the isolation window [Ow, S. Y. et al. J. Proteome Res. 8, 5347-5355 (2009)]. For highly complex mixtures, like those analyzed in large-scale experiments, co-isolation of multiple species is the rule, not the exception (vide infra). This problem erodes quantitative accuracy, as measured ratios tend to be compressed toward the median ratio of 1:1, and thus has restricted isobaric tagging to applications with lower sample complexity.
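The effect of isolation width described above can be illustrated with a minimal sketch. The function name, the m/z values, and the symmetric-window model below are assumptions chosen for illustration only; they are not taken from any instrument or cited study.

```python
def co_isolated(target_mz, peak_mzs, window_th):
    """Return every peak that falls inside an isolation window.

    Assumes a simple symmetric window of total width `window_th` (in Th)
    centered on `target_mz`; real isolation profiles are not this sharp.
    """
    half_width = window_th / 2.0
    return [mz for mz in peak_mzs if abs(mz - target_mz) <= half_width]


# Hypothetical co-eluting precursor peaks (m/z, in Th).
peaks = [648.90, 650.30, 650.85, 651.20, 652.60]
target = 650.30

# A 2 Th window, within the 1-3 Th range typical of isobaric experiments,
# admits the target and two neighboring species.
print(co_isolated(target, peaks, 2.0))

# A 0.02 Th window, comparable to what high-resolution MS1 analysis can
# distinguish, admits only the target.
print(co_isolated(target, peaks, 0.02))
```

Under these assumed values, every species returned for the wide window would contribute to the reporter region of the resulting MS/MS spectrum.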
Isobaric labeling, such as iTRAQ and other types of isobaric tagging reagents, is an important quantitative method as it allows for multiplexing and is directly applicable to clinical samples. A significant source of error, however, occurs when another eluting peptide ion has an m/z value that is very near that of the selected precursor (~50% in many experiments). The result is the isolation of both species, which are consequently co-dissociated to produce a composite MS/MS spectrum. The resulting reporter ion ratios do not accurately reflect the relative abundances of either peptide, limiting both the precision and dynamic range of quantitation, as the median peptide ratio is close to 1:1.
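The compression of measured ratios in a composite spectrum can be shown with simple arithmetic. The sketch below assumes that reporter intensities from co-isolated species add linearly in each channel; the function name and the intensity values are hypothetical.

```python
def observed_ratio(target, interferer):
    """Reporter-ion ratio measured when two species are co-dissociated.

    Each argument is a (channel_a, channel_b) pair of reporter intensities.
    Assumes the two species' reporter signals simply sum in each channel.
    """
    channel_a = target[0] + interferer[0]
    channel_b = target[1] + interferer[1]
    return channel_a / channel_b


# Hypothetical target peptide with a true 10:1 ratio between two samples.
target = (1000.0, 100.0)

# Hypothetical interfering peptide at the typical 1:1 ratio.
interferer = (500.0, 500.0)

# (1000 + 500) / (100 + 500) = 2.5, far below the true ratio of 10.
print(observed_ratio(target, interferer))
```

In this assumed case the composite ratio matches neither peptide: it understates the target's 10:1 change and overstates the interferer's 1:1 ratio.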
The increasing popularity of iTRAQ for quantitative proteomics applications has spurred increased efforts to evaluate its relevance, accuracy, and precision for biological interpretation. Recently, some researchers have begun to assess the accuracy and precision of iTRAQ quantification as well as drawbacks which hinder the applicability and attainable dynamic range of iTRAQ. Some results suggest that crosstalk between interfering factors can result in underestimations [Ow et al., “iTRAQ Underestimation in Simple and Complex Mixtures: ‘The Good, the Bad and the Ugly’”, Journal of Proteome Research, web publication Sep. 16, 2009]. It is clear that there is tantalizing potential for iTRAQ and other protein labeling methods to provide accurate quantification spanning several orders of magnitude. This potential can be limited, however, by several factors. First, the existence of isotopic impurities often requires correction of mass spectral data to provide accurate quantitation, which currently requires the availability of accurate isotopic factors. Second, the interference of mixed MS/MS contribution occurring during precursor selection is a problem that is currently very difficult to minimize.
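The first factor, isotopic impurity correction, can be sketched for a simplified two-channel case. The linear cross-talk model, the function name, and the spill fractions below are all assumptions made for illustration; actual reagents have more channels, and their isotopic factors are supplied by the manufacturer.

```python
def correct_two_channels(observed, spill_ab, spill_ba):
    """Undo isotopic cross-talk between two reporter channels.

    Assumed model (illustrative only):
        observed_a = (1 - spill_ab) * true_a + spill_ba * true_b
        observed_b = spill_ab * true_a + (1 - spill_ba) * true_b
    where spill_ab is the fraction of channel A signal that appears in
    channel B, and spill_ba is the reverse. Solved with Cramer's rule.
    """
    m11, m12 = 1.0 - spill_ab, spill_ba
    m21, m22 = spill_ab, 1.0 - spill_ba
    det = m11 * m22 - m12 * m21
    if abs(det) < 1e-12:
        raise ValueError("impurity matrix is singular; cannot correct")
    obs_a, obs_b = observed
    true_a = (m22 * obs_a - m12 * obs_b) / det
    true_b = (m11 * obs_b - m21 * obs_a) / det
    return true_a, true_b


# Hypothetical observed intensities and spill fractions (5% and 2%).
corrected = correct_two_channels((952.0, 148.0), spill_ab=0.05, spill_ba=0.02)
print(corrected)  # approximately (1000.0, 100.0)
```

The correction is only as good as the isotopic factors supplied to it, which is why accurate factors are a prerequisite. It also does nothing about the second factor: signal contributed by co-isolated precursors remains in the corrected intensities.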
What is needed is a method of improving the accuracy of mass spectrometry analysis and quantification of samples, particularly samples labeled with isobaric tags.