The science of proteomics seeks to efficiently identify, quantify, and characterize the large number of proteins and peptides present in a biological system. Two approaches to proteomics are being pursued today. One focuses on cataloging all the proteins in a biological system, the cellular components with which each of these proteins interacts, the pathways of which they are a part, and the locations in which they reside. The other strategy, known as “comparative proteomics,” is based on a comparison of proteins in a biological system in two different states. Disease and a wide range of other stimuli cause biological systems to pass into a new, chemically distinct state distinguished by changes in the occurrence and amount of specific proteins. Comparing samples taken from organisms in the normal state and in an altered state can be used to recognize proteins involved in the transition.
Although “proteomics” is a recently coined term, comparative proteomics has actually been practiced for at least three decades. For example, early studies used 2-D gel electrophoresis to examine thousands of proteins in blood. These reports, and many since, depended on resolution of complex protein mixtures with 2-D gels and differential staining to compare samples and recognize differences. This classical method is labor intensive, quantification is poor, and it is difficult to identify spots thought to be important.
With the advent of huge DNA and protein sequence databases, there is an increasing dependence on mass spectrometry to identify proteins and peptides obtained from separation systems. The speed and resolution of mass spectrometers are shifting the focus in proteomics toward more rapid delivery of peptide mixtures to mass spectrometers and toward obtaining quantitative as well as qualitative data during mass analysis.
Stable isotope coding strategies are of great value for distinguishing changes in comparative proteomics. These coding techniques may be broadly characterized as internal standard methods in which components from control samples are derivatized with an isotopically distinct coding agent, mixed with experimental samples, and then used as standards for determining the relative concentration of components in experimental samples derivatized with a different isoform of the coding agent (PCT WO 01/86306, published Nov. 15, 2001; Ji et al., J. Chromatogr. B, (2000) 745, 197-210). Most of the coding agents used today are labeled with deuterium, and relative concentration measurements are based on isotope abundance ratio determinations with either matrix assisted laser desorption ionization-mass spectrometry (MALDI-MS) or electrospray ionization-mass spectrometry (ESI-MS).
As with all internal standard methods, it is important that the behavior of analytes and standards be as nearly alike as possible before the final step of abundance ratio measurement. Ideally, segregation would occur only in the final step during quantification. The attractive feature of creating internal standards through isotopic labeling is that discrimination is minimized, particularly when the internal standard and analyte vary by a single heavy atom.
The problem with current stable isotope coding methods for proteomics is that as the number of deuterium atoms is increased to enlarge the mass difference between isotopically coded standards and analytes, there is a corresponding increase in chromatographic resolution of the isotopic isoforms, particularly in the case of reversed-phase chromatography (Zhang et al., Anal. Chem., 2001, 73, 5142-5149). As a result, the concentration ratio of isoforms varies continuously across the elution profile of the two components.
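The drift in isoform ratio across an elution profile can be illustrated with a minimal numerical sketch. It assumes, purely for illustration, that both isoforms elute as Gaussian peaks of equal width and equal amount, and that the deuterated isoform elutes slightly earlier. The function names, peak width, retention time, and shift are hypothetical values chosen for the example and are not taken from the cited work.

```python
import math


def gaussian(t, center, sigma, area=1.0):
    """Intensity at time t of a Gaussian elution peak with the given total area."""
    norm = area / (sigma * math.sqrt(2.0 * math.pi))
    return norm * math.exp(-0.5 * ((t - center) / sigma) ** 2)


def ratio_profile(times, shift, center=100.0, sigma=5.0, true_ratio=1.0):
    """Heavy/light intensity ratio observed at each time point.

    Assumption: the heavy (deuterated) isoform elutes `shift` seconds
    earlier than the light isoform and has the same peak width.
    """
    ratios = []
    for t in times:
        light = gaussian(t, center, sigma)
        heavy = gaussian(t, center - shift, sigma, area=true_ratio)
        ratios.append(heavy / light)
    return ratios


# Five spectra taken across the peak (hypothetical scan times, in seconds).
times = [90.0, 95.0, 100.0, 105.0, 110.0]

coeluting = ratio_profile(times, shift=0.0)  # no isotope effect
shifted = ratio_profile(times, shift=2.0)    # 2 s partial resolution

for t, r0, r1 in zip(times, coeluting, shifted):
    print(f"t={t:6.1f} s  co-eluting ratio={r0:.2f}  shifted ratio={r1:.2f}")
```

Under these assumptions the co-eluting pair reports the true ratio of 1 in every spectrum, whereas a 2-second shift makes the apparent ratio fall from above 2 on the leading edge to below 0.5 on the trailing edge, even though the two isoforms are present in equal amounts.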
This “isotope effect” has a number of undesirable consequences. Because deuterated and non-deuterated isoforms are partially resolved during chromatographic separation, the accuracy of abundance ratio measurement is greatly compromised, particularly in MALDI-MS. In online ESI-MS analysis, the deuterium isotope effect forces a serious trade-off between accurate quantification and MS/MS peptide sequencing.
For example, with existing methods (e.g., abundance ratio measurements) it is not possible to determine a relative change in the concentration of components from a single mass spectrum. Instead, relative concentration must be obtained by comparing peak areas from integrated extracted ion chromatograms in the case of ESI-MS, or from eluate fractions with MALDI-MS. High-quality MS/MS data of peptides are crucial to the reliability of protein identification and characterization. Unfortunately, acquiring such data requires instrument tuning and ion accumulation. In other words, it takes time. It is therefore highly desirable to selectively fragment only the peptides of interest.
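The area comparison described above can be sketched as follows. The extracted ion chromatograms are invented for illustration: the heavy isoform is given twice the amount of the light isoform and elutes one scan earlier. The helper `trapezoid_area` and all intensity values are hypothetical; the sketch only shows why an integrated area ratio differs from the ratio seen in any single spectrum.

```python
def trapezoid_area(times, intensities):
    """Integrate a sampled chromatogram with the trapezoidal rule."""
    area = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        area += 0.5 * (intensities[i] + intensities[i - 1]) * dt
    return area


# Hypothetical extracted ion chromatograms, one intensity per scan.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
light = [0.0, 0.0, 1.0, 4.0, 6.0, 4.0, 1.0, 0.0]
heavy = [0.0, 2.0, 8.0, 12.0, 8.0, 2.0, 0.0, 0.0]  # 2x amount, one scan earlier

# Ratio from integrated peak areas: available only after both peaks have eluted.
area_ratio = trapezoid_area(times, heavy) / trapezoid_area(times, light)

# Ratio seen in each individual spectrum where the light isoform is detected.
scan_ratios = [h / l for h, l in zip(heavy, light) if l > 0.0]

print("area ratio:", area_ratio)
print("per-scan ratios:", [round(r, 2) for r in scan_ratios])
```

In this example the integrated areas recover the assumed twofold difference, while the per-scan ratios range from 8 on the leading edge to 0 on the trailing edge, so no single spectrum reports the correct value.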
In the case of ESI-MS, the analyte peak must completely elute from the LC-MS system before it can be determined whether the peptide has changed in concentration and is therefore of interest (Griffin et al., Journal of the American Society for Mass Spectrometry, 2001, 12, 1238-1246). This effectively precludes the use of on-line, intelligent data acquisition and analysis (IDA), also referred to as real-time data dependent analysis (DDA). Either a second chromatographic run is required after the abundance ratio is determined in order to selectively perform MS/MS analysis on the peptides of interest, or the mass spectrometer has to acquire MS/MS data on every peptide to be sure data has been acquired on peptides that changed in concentration. In either case, instrument and computer time are wasted.
Feedback control software is provided with some commercially available mass spectrometers. The most abundant peaks are typically selected for further analysis on-line in “real time”. The most abundant peaks, however, do not necessarily coincide with analytes that are significantly up- or down-regulated, which in comparative proteomics are more often the analytes of interest. In other commercially available instruments, LC/MS is run twice, first to obtain abundance ratios and second to selectively fragment peptides based on the abundance ratios. This method may yield information on analytes that are up- or down-regulated, but it has a major drawback: because two chromatographic runs are used, abundance ratio based feedback control is not performed in “real time”.
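The difference between the two selection strategies can be made concrete with a small sketch. It is not the logic of any particular instrument's software; the `PeptidePair` record, the intensity values, and the twofold threshold are hypothetical and serve only to contrast selection by abundance with selection by abundance ratio.

```python
from dataclasses import dataclass


@dataclass
class PeptidePair:
    """Intensities of one isotopically coded peptide pair (hypothetical values)."""
    name: str
    control: float       # isoform derived from the control sample
    experimental: float  # isoform derived from the experimental sample


def select_most_abundant(pairs, n):
    """Abundance-based selection: pick the n most intense pairs."""
    ranked = sorted(pairs, key=lambda p: p.control + p.experimental, reverse=True)
    return [p.name for p in ranked[:n]]


def select_by_ratio(pairs, fold=2.0):
    """Ratio-based selection: pick pairs changed by at least `fold` in either direction."""
    selected = []
    for p in pairs:
        if p.control <= 0.0:
            # No control signal: treat as changed (an assumption of this sketch).
            selected.append(p.name)
            continue
        ratio = p.experimental / p.control
        if ratio >= fold or ratio <= 1.0 / fold:
            selected.append(p.name)
    return selected


pairs = [
    PeptidePair("A", control=1000.0, experimental=1050.0),
    PeptidePair("B", control=800.0, experimental=790.0),
    PeptidePair("C", control=50.0, experimental=200.0),
    PeptidePair("D", control=120.0, experimental=30.0),
]

print("by abundance:", select_most_abundant(pairs, n=2))
print("by ratio:    ", select_by_ratio(pairs, fold=2.0))
```

With these invented intensities, abundance-based selection picks the two intense but unchanged pairs A and B, while ratio-based selection picks the weaker pairs C and D, which are the ones that actually changed in concentration.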
Reconstruction of extracted ion chromatograms is really only possible with ESI-MS. Reconstruction of peaks from MALDI-MS is very difficult in complex mixtures unless 10-60 fractions are collected across each peak. Anywhere from 2,000 to 10,000 fractions would have to be collected and analyzed by MALDI-MS to reconstruct extracted ion chromatograms for the thousands of peptides encountered in a single reversed-phase chromatographic separation. This is so cumbersome that MALDI-MS by any approach but continuous deposition is essentially precluded when quantification accuracy is an issue.
Integration of peak areas is even more difficult and inaccurate when isotopically labeled peptides are fractionated in one or more of the early steps in a multi-dimensional separation experiment. For example, isotopically labeled peptides could be separated in ion exchange chromatography followed by reversed-phase chromatography, or reversed-phase chromatography followed by ion mobility separation.
Further complications can arise when ionization efficiency of the isoforms varies with time in ESI-MS or between fractions in MALDI-MS. Suppression of ionization between peptides has been noted in ESI-MS when total peptide concentration is high, as when one peptide is eluting in a large background of another. This means that ionization efficiency can vary across a chromatographic peak. In MALDI-MS, the chromatographic fractions used for spectral analysis may be enriched in one isotopic isoform over the other and may differ widely in matrix components. The peptide isoforms could be suppressed to very different degrees when they are not eluted simultaneously and thus ionized with a different matrix (i.e., with other co-eluting peptides). These effects compromise abundance ratio measurements by potentially producing both significant systematic errors and a higher level of random errors (Zhang et al., Anal. Chem. 2001, 73, 5142-5149), even when relative peak areas are estimated with extracted ion chromatograms.
Finally, there is the most serious problem of all. When the isotopic isoforms are completely resolved, as is frequently the case with labeled peptides that contain many deuterium atoms, whether or not the two peaks are related cannot be determined unless the peptides are sequenced or mass is measured with very high accuracy. This can lead to the erroneous conclusion that the singlet cluster seen in one of the resolved peaks is representative of the peptide and that the peptide has undergone a major change in concentration.
Clearly, minimizing chromatographic resolution of analyte isoforms would improve measurement accuracy and enable real-time measurements.