The combination of chromatographic separation and mass spectrometric detection holds a central position in the analysis of complex biological mixtures. Survey analyses, in which all components having analytically detectable characteristics are sought, are becoming increasingly common. They are of value in proteomics, metabolomics, and pharmaceutical studies, to name a few. However, the range of component concentrations that can be distinguished in a single chromatographic run depends on the number of components in the sample detectable by the means employed, the peak capacity of the chromatogram, and the dynamic range and discriminating power of the detector. The enhancement of peak capacity is one of the primary research goals in chromatography. In the case of complex natural samples such as breath, blood serum, or urine, the number of components of interest exceeds currently achievable peak capacities by many orders of magnitude. Researchers estimate that all components with a response less than ~1% of the most abundant component will not be observed. These unresolved components produce minor detector responses spread widely throughout the chromatogram, creating a background signal often referred to as “chemical noise.”
Peak capacity can be increased through the use of multichannel detection, such as the separate mass-to-charge (m/z) values in mass spectrometry. With multiple channels of detection, components that co-elute can be separately detected, thus increasing peak capacity. This added discrimination can reduce the number of unresolved components, thereby extending the concentration range of detectable components by another order of magnitude or so. However, to detect minor components co-eluting with major components, it is necessary to have a wide dynamic range for each channel of detection. Most mass spectrometers multiplex the m/z channels using the same detector at different times. This makes the dynamic adjustment of gain on each channel difficult to achieve. Furthermore, being able to detect only the components above, say, 0.1% of the most abundant component is a debilitating limitation for many areas of biomedical research. The reason for this limitation is fundamental and therefore requires a breakthrough in technology to solve.
In 1983, Joe Davis and Cal Giddings quantified the degree of peak overlap that would occur during chromatographic separation of a complex mixture, assuming a random retention time for each component. The Poisson statistics they employed had earlier been reliably used to predict peak overlap in photon and ion counting and in the detection of nuclear events. The results of the Davis and Giddings study were quite remarkable: in a chromatogram of 50 components in which there would be room for 100 distinguishable peaks (peak capacity of 100), only 18 of the components would not suffer overlap from others. Doublets (7), triplets (3), and even a quadruplet were predicted. They further stated that, “ . . . a chromatogram must be approximately 95% vacant in order to provide a 90% probability that a given component of interest will appear as an isolated peak.” This early paper has received 170 citations since its publication. Davis and others have gone on to demonstrate its validity and to refine and extend the theory of peak overlap (Davis 1997; Davis 1999).
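The figures quoted above can be reproduced with a short calculation. The sketch below assumes the usual closed forms of the statistical overlap model, which are not spelled out in the text: with saturation α = m/n_c (components divided by peak capacity), the expected number of isolated components is m·e^(−2α) and each further multiplet order is scaled by (1 − e^(−α)). The function names are illustrative, and equating "vacant" with 1 − α is an interpretation made here.

```python
import math


def overlap_counts(m, peak_capacity, max_multiplet=4):
    """Expected peak and multiplet counts for m randomly placed components.

    Assumes Poisson-distributed retention times (statistical overlap model).
    Returns (expected observed peaks, {multiplet order: expected count}).
    """
    alpha = m / peak_capacity              # saturation of the chromatogram
    isolated = math.exp(-2 * alpha)        # no neighbour on either side
    merge = 1 - math.exp(-alpha)           # chance the next component overlaps
    counts = {
        n: m * isolated * merge ** (n - 1)
        for n in range(1, max_multiplet + 1)
    }
    observed_peaks = m * math.exp(-alpha)  # singlets plus all multiplets
    return observed_peaks, counts


def vacancy_for_isolation(probability):
    """Fraction of peak capacity left unused (taken here as 1 - alpha)
    for a given component to appear isolated with the stated probability."""
    alpha = -math.log(probability) / 2
    return 1 - alpha


if __name__ == "__main__":
    peaks, counts = overlap_counts(m=50, peak_capacity=100)
    print(f"observed peaks: {peaks:.1f}")
    for order, count in counts.items():
        print(f"multiplet order {order}: {count:.2f}")
    print(f"vacancy for 90% isolation: {vacancy_for_isolation(0.90):.3f}")
```

For 50 components and a peak capacity of 100, the expected counts round to 18 singlets, 7 doublets, 3 triplets, and 1 quadruplet, and the vacancy needed for a 90% isolation probability comes out near 0.95, in line with the quoted statement.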
A major difference between the response of components in a chromatogram and the response of ion, photon, or gamma ray detectors is that the components in a natural sample have a range of responses as a result of differences in component sensitivity and concentration (Dondi 1997). Thus, in later analyses, workers have come to refer to “detectable peaks” rather than “number of components”. Statistical analysis (Davis 1994; Davis 1997; Dondi, Bassi et al. 1998) and Fourier transform analysis (Felinger, Pasti et al. 1990; Felinger, Pasti et al. 1991; Felinger, Vigh et al. 1999) have been used to predict the number of detectable peaks in complex chromatograms. In one recent comparison of these approaches (Felinger and Pietrogrande 2001), a chromatogram of diesel fuel showed 180 clearly identifiable peaks. Since the chromatogram is essentially filled with peaks, the peak capacity must be on the order of 200. Statistical and Fourier transform analyses project that the number of detectable peaks in the sample is 244 and 242, respectively. This is only a tiny fraction of the actual number of components in the sample.
Attempts to deconvolute overlapped peaks of single-detector chromatograms into their separate components by mathematical means cannot get beyond the modest improvement indicated by this diesel fuel example. Acknowledgement of this fact has led chromatographers to devise methods to increase chromatographic peak capacity. The method providing the greatest improvement is 2-D chromatography, in which fractions from the first chromatogram are chromatographed again with a different type of stationary phase. Depending on how different the selectivity criteria are between the stationary phases, the resulting effective peak capacity can be as much as the product of the individual peak capacities. The majority of authors citing the Davis/Giddings paper do so to justify the need for 2-D chromatography. This approach is most often used with only selected sections of the first chromatogram, because a full 2-D chromatogram can take many hours or even days to perform. Alternatively, investigators gain concentration range through a variety of prior sample separation steps (extraction, absorption, etc.) to remove the most abundant components (e.g., albumin, ubiquitin, and other abundant proteins in biological samples). Problems with this latter approach include the added time, the greater operator expertise required, and the potential for losing some of the minor components that get trapped with the major components. The foreseeable improvements in single-chromatogram resolution are modest compared to the orders of magnitude needed.
In general, there are two classes of deconvolution methods applied to chromatographic data. One is simply the attempt to resolve peak shoulders and broadening into the separate peaks that make up the resulting response shapes (Felinger 1998). It follows that the resolved components must have responses of roughly the same order of magnitude or there would be no discernible effect on the majority peak shape. In fact, the maximum number of resolved components afforded by such techniques is given by the statistical and Fourier transform analyses referred to above, or roughly 133% of the apparent peak count. To get beyond this modest increase in resolution, one clearly needs additional data. Such additional data can be provided by multiple parallel detection channels having different selectivity. This, in effect, divides the chromatographic response pattern among the several detectors. The greatest gain in this respect is achieved by the largest number of detectors, each monitoring a unique property of the sample components. In this area, the collection of an optical or mass spectrum at successive small increments of chromatographic time affords the greatest amount of useful additional data.
Thirty years ago, Biller and Biemann (Biller and Biemann 1974) recognized that the hundreds (or thousands) of independent detection channels of mass spectrometry can help deconvolute overlapping chromatographic peaks and separately characterize each component. When spectra covering a range of masses are collected at a rate that provides at least several spectra per chromatographic peak width, the response at each mass can be plotted as a function of chromatographic time. Such plots are called mass or ion chromatograms. Each plot is effectively that of a mass-selective detector for the chromatogram. Since the number of channels used could reasonably be in the hundreds (or even thousands for high-resolution mass spectra), the chromatographic peak capacity is multiplied by roughly this number. This is a huge gain in peak capacity, especially considering that it requires no additional analysis time to achieve. Despite using a scanning sector mass spectrometer and a crude data system, Biller and Biemann demonstrated an increase in the concentration range of components that could be detected. However, as chromatography advanced through narrower peaks and decreased sample size, the ability of mass spectrometers to provide the data required fell behind. Scanning instruments lose sensitivity in proportion to the requisite mass range and scanning rate, and their mass chromatograms are not perfectly synchronized.
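The idea of reading a mass chromatogram out of a series of spectra can be illustrated with a small, idealized sketch. Everything in it is hypothetical: the two components, their m/z values and abundances, the Gaussian peak shape, and the function names are chosen only for illustration, and the data are noiseless and perfectly synchronized, which real scanning instruments are not.

```python
import math

# Hypothetical co-eluting components:
# (apex scan, peak height, {m/z: relative abundance})
COMPONENTS = [
    (50.0, 1000.0, {91: 1.0, 65: 0.3}),   # major component
    (52.0, 10.0, {105: 1.0, 77: 0.5}),    # minor component, 100x weaker
]
PEAK_SIGMA = 2.0  # chromatographic peak width (standard deviation, in scans)


def gaussian(t, center, sigma, height):
    """Idealized elution profile of one component."""
    return height * math.exp(-0.5 * ((t - center) / sigma) ** 2)


def acquire_scans(n_scans=100):
    """Simulate one spectrum per scan; each spectrum maps m/z -> intensity."""
    scans = []
    for t in range(n_scans):
        spectrum = {}
        for center, height, pattern in COMPONENTS:
            elution = gaussian(t, center, PEAK_SIGMA, height)
            for mz, rel in pattern.items():
                spectrum[mz] = spectrum.get(mz, 0.0) + elution * rel
        scans.append(spectrum)
    return scans


def ion_chromatogram(scans, mz):
    """Response at a single m/z as a function of scan number."""
    return [spectrum.get(mz, 0.0) for spectrum in scans]


def total_ion_chromatogram(scans):
    """Summed response over all m/z channels (single-detector view)."""
    return [sum(spectrum.values()) for spectrum in scans]


def apex(trace):
    """Index of the scan with the highest response."""
    return max(range(len(trace)), key=trace.__getitem__)


if __name__ == "__main__":
    scans = acquire_scans()
    print("total ion apex:", apex(total_ion_chromatogram(scans)))
    print("m/z 91 apex:   ", apex(ion_chromatogram(scans, 91)))
    print("m/z 105 apex:  ", apex(ion_chromatogram(scans, 105)))
```

In the summed trace the minor component is buried under the major one and only a single apex appears, while the m/z 105 trace peaks two scans later and reveals the second component. This is the sense in which each m/z channel acts as a separate selective detector.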
The importance of a greater concentration range is that it has a huge effect on the number of components in a complex mixture that can be determined. Nagels et al. (Nagels, Creten et al. 1983) counted peak frequency vs. peak area in chromatograms of a large number of plant extracts. FIG. 1 is a plot of their data. They demonstrated that the relative response for components of a complex mixture is an approximately exponential function. However, even though they pointed out that an exponential function did not provide a good fit, they are widely cited as evidence that the concentration distribution function is in fact exponential (El Fallah and Martin 1987; Felinger 1998).
However, as explained in further detail below, new mathematical models for determining the total number of components in a complex sample based on the number of detectable components in the sample indicate that a dynamic response spanning about five orders of magnitude will be required to detect the top 99% of component responses. This dynamic response must be achieved with short and uniform detector integration times. Such a capability is not available with any of the current mass spectrometer detector systems.
The use of multiple mass chromatograms for the resolution of overlapping chromatographic peaks has evolved into two areas of application. Scanning mass analyzers such as the quadrupole are used in scanning or multiple-ion mode at normal chromatographic speeds. A variety of mathematical approaches to unskew the spectra and determine overlapping compounds has been developed (Sato and Mitsui 1994; Abbassi, Mestdagh et al. 1995; Windig and Smith 2007). For GC/MS, methods of data treatment to extract the mass spectra of overlapped components have been developed by several researchers (Biller and Biemann 1974; Abbassi, Mestdagh et al. 1995; Windig, Phalp et al. 1996; Fraga 2003; Windig and Smith 2007). The losses incurred in scanning result in noisy ion chromatograms for the minor components, although components with peak heights as low as 1/60 of the largest peaks have been detected (Fraga 2003). These methods have been applied to biological studies with LC separation and electrospray ionization in TOFMS systems with high mass resolution at ~3 spectra per second (Aberg, Torgrip et al. 2008) and in quadrupole scanning instruments at 0.1 amu resolution and 1 spectrum per second (Govorukhina, Reijmers et al. 2006). All of the cited examples show results for components having maximum response ratios of only 100:1 at best.
The other area of application of deconvolution aims to reduce the time of gas chromatographic analysis by using short columns, high flow rates and the rapid spectral generation rates afforded by TOFMS analyzers (Holland, McLane et al. 1992; van Deursen, Beens et al. 2000). Component detection with a response range of two orders of magnitude has been demonstrated (Veriotti and Sacks 2001). An analytical instrument based on the use of spectral deconvolution to compensate for the increased component overlap has been commercialized (LECO Corporation). A deconvolution method involving isotope ratios has been used in the analysis of mixtures of polychlorinated compounds by GC/TOFMS (Imasaka, Nakamura et al. 2009). Again, even in these successful approaches, the ratio of peak heights of identified compounds is less than 100:1.
Four commonly used methods of data treatment were recently compared for their effectiveness in reducing the problems of background noise (Fredriksson, Petersson et al. 2007). It is the nature of this noise that is of particular interest. Aberg et al. (Aberg, Torgrip et al. 2008) say, “Much of the chemical noise in the data originates from substances in the analyzed sample that are present at too low concentrations to give stable detectable signals in consecutive scans. . . . Such signals will not be tracked [detected] because they have (i) unpredictable m/z values due to bad ion statistics and (ii) too many scans with missing data, and thus the Kalman filter discards these signals as noise.” The data in all these papers and the emphasis on noise reduction clearly indicate that there is too little ion flux information to obtain reliable signals for those components with responses less than 1% of the most abundant compounds. Batch mass analysis instruments are limited in the ion flux they can tolerate in the mass analyzer (ion trap, Orbitrap©, and FT MS), and in TOFMS the ion detector and limited ion throughput limit the concentration ratio of the most abundant to the least detectable component. Therefore, an increase in ion detection rate is key to increasing the useful concentration range of component detection.
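The link between ion flux and the detectable concentration range can be made concrete with ordinary counting statistics. The sketch below assumes that ion arrivals follow a Poisson distribution, so a channel expecting N ions per spectrum has a relative standard deviation of 1/√N and an empty spectrum with probability e^(−N). The ceiling of 1000 ions per spectrum for the most abundant component is a hypothetical value chosen for illustration, not an instrument specification, and the function name is likewise illustrative.

```python
import math


def ion_statistics(ions_major, response_ratio):
    """Counting statistics for a minor component in a single spectrum.

    ions_major     -- ions detected per spectrum for the most abundant
                      component (hypothetical instrument ceiling)
    response_ratio -- minor-to-major response ratio, e.g. 0.01 for 1%

    Returns (expected ions, relative standard deviation,
             probability that the spectrum contains no ions at all).
    """
    expected = ions_major * response_ratio
    relative_sd = 1 / math.sqrt(expected)   # Poisson: sd = sqrt(mean)
    p_empty = math.exp(-expected)           # Poisson probability of zero counts
    return expected, relative_sd, p_empty


if __name__ == "__main__":
    for ratio in (1.0, 0.1, 0.01, 0.001):
        expected, rsd, p_empty = ion_statistics(1000, ratio)
        print(f"ratio {ratio:>6}: {expected:7.1f} ions, "
              f"RSD {rsd:6.1%}, empty spectra {p_empty:6.1%}")
```

Under these assumptions, a component at 0.1% of the major response averages one ion per spectrum, with a relative standard deviation of 100% and more than a third of its spectra empty. That matches the "bad ion statistics" and "missing data" described in the quotation, and raising the ion detection rate is what moves this limit.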