Mass spectrometry is an analytical tool that can be used to determine the molecular weights of chemical compounds by generating ions from the chemical compounds, and separating these ions according to their mass-to-charge ratio (m/z). The ions are generated by inducing either a loss or a gain of a charge by the chemical compounds, such as via electron ejection, protonation, or deprotonation. The ions are then separated according to their m/z values and detected. The resulting data are often presented as a spectrum, a two-dimensional (2-D) plot with m/z on the x-axis and abundance of ions on the y-axis. Thus, this spectrum shows the distribution of m/z values in the population of ions being analyzed. This distribution is characteristic for a given compound. Therefore, if the sample is a pure compound or contains only a few compounds, mass spectrometry can reveal the identity of the compound(s) in the sample.
Electrospray ionization mass spectra of biological macromolecules and protein complexes contain series of ion signals corresponding to the same chemical species in a sequence of charge states. The masses and intensities (ion currents) of the analyzed chemical species, as represented by an entire neutral-mass spectrum, can be inferred from the mass over charge measurements by computational deconvolution.
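The charge-state series described above can be illustrated with a minimal sketch. It assumes positive-mode ions whose only charge carriers are protons; the function name, the constant, and the example mass of 150,000 Da are illustrative assumptions, not values taken from any particular instrument or software.

```python
# Assumed charge carrier: a proton, mass in Daltons.
PROTON_MASS = 1.007276


def charge_state_series(neutral_mass, charges):
    """Return (z, m/z) pairs for one neutral mass observed as [M + zH]^z+ ions."""
    return [(z, (neutral_mass + z * PROTON_MASS) / z) for z in charges]


# Hypothetical species of 150,000 Da observed in charge states 48+ to 52+.
series = charge_state_series(150_000.0, range(48, 53))
for z, mz in series:
    print(f"z = {z}+   m/z = {mz:.3f}")
```

Each pair is one ion signal of the same chemical species; deconvolution works in the opposite direction, starting from the m/z values and inferring the mass and charges.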
A complex sample usually contains too many chemical compounds to be analyzed meaningfully by mass spectrometry alone, because ionization of different chemical compounds may result in ions with the same m/z value. The more chemical compounds a sample contains, the more likely ions of the same m/z values will be generated from different compounds. Therefore, a complex sample is typically resolved to some extent prior to mass spectrometry, such as by liquid chromatography, gas chromatography, or capillary electrophoresis. In this sample separation step, the chemical compounds in the sample are separated based on how long they stay in the sample separation medium. Once a chemical compound goes through the sample separation medium, it enters a mass spectrometer system, and the ionization/ion separation/detection process begins as described above. The resulting data for each ion thus has one more property, retention time, which is the time the chemical compound that gives rise to the ion stays in the sample separation medium. Thus, mass spectral data of a sample that is analyzed by a sample separation method before mass spectrometry can be presented as a three-dimensional (3-D) plot, with retention time, m/z value and ion abundance on the three axes of the plot.
Even with a sample separation method, it is still not an easy task to analyze mass spectral data from a complex sample due to the vast number of peaks.
All charge deconvolution algorithms in use today are iterative algorithms that converge to a deconvolved neutral mass spectrum along with charge distributions for the neutral masses that together explain the observed m/z (mass over charge) spectrum. The most widely used deconvolution algorithm, with implementations called MaxEnt and ReSpect, was developed about 25 years ago and licensed to most of the mass spectrometry (MS) instrument manufacturers. This algorithm converges to a deconvolved neutral mass spectrum that optimizes an objective function that measures the quality of the result using criteria such as fit to the observed data, peak width, correlation between neighboring charge states, and, as its defining characteristic, the Shannon entropy of the neutral-mass spectrum. A more recent algorithm, UniDec, leaves out the entropy term, and builds in expected correlation between neighboring charge states by blending them with a smoothing filter. UniDec also includes specific support for ion mobility data and nanodisc analysis. Other recent work has focused on peak enhancement of m/z spectra in order to improve the performance of maximum entropy charge deconvolution for native mass spectrometry.
Methods to deconvolute mass spectral data based on compound properties such as isotopic clusters (see, e.g., U.S. 2007/0176088) have been proposed. In one method, 3-D peaks that share the same retention time are examined, and isotopic clusters of the same compound are grouped together, thereby reducing the complexity of the mass spectral data significantly. This method, however, is most useful for analytes with relatively small molecular weights. Large molecules, such as most intact proteins, are often too large for their isotopomers to be resolved in a mass spectrometer. As a result, an accurate monoisotopic mass cannot be calculated for the given isotopic cluster using the charge state spacing of the isotopomers.
Deconvolution methods transform an m/z (mass divided by charge) spectrum to a neutral mass spectrum by deducing the charges of the ions in the m/z spectrum, and then multiplying m/z values by the appropriate values of z (charge) and subtracting the masses of the charge carriers (typically protons) to determine neutral mass. Charge is deduced by relationships among peaks in the m/z spectrum, relying on the fact that an ion with charge 50+ is also likely to be observed with charges 48+, 49+, 51+, 52+, etc. Two types of artifacts are commonly observed: “harmonic” artifacts (akin to harmonics in acoustic signals) in which charge 50+ might be mistaken for 25+, 52+ mistaken for 26+, and so forth, and “off-by-one” artifacts in which charge 50+ is mistaken for 49+ or 51+. A neutral mass spectrum with harmonic and off-by-one artifacts may actually fit the observed m/z spectrum better than one without artifacts, because the artifacts have many more degrees of freedom to explain small shape and intensity variations in the observed m/z peaks. Current charge deconvolution algorithms based on “maximum entropy” all give some level of artifact, because the algorithms do not have special steps to bias against artifacts, and indeed entropy is larger for a mass spectrum with artifacts.
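The arithmetic behind the transformation and the two artifact types can be sketched as follows. This is an illustration only, assuming protons as the sole charge carriers and a hypothetical species of 150,000 Da; the helper names are not drawn from any deconvolution package.

```python
# Assumed charge carrier: a proton, mass in Daltons.
PROTON_MASS = 1.007276


def mz_of(mass, z):
    """m/z of a neutral mass carrying z protons."""
    return mass / z + PROTON_MASS


def neutral_mass(mz, z):
    """Multiply m/z by the assigned charge and subtract the charge carriers."""
    return z * mz - z * PROTON_MASS


true_mass = 150_000.0
# Consider only the even charge states of the series for the harmonic case.
peaks = {z: mz_of(true_mass, z) for z in (48, 50, 52)}

# Correct charge assignment recovers the true mass for every peak.
correct = [neutral_mass(mz, z) for z, mz in peaks.items()]

# Harmonic artifact: every charge halved (50+ read as 25+, 52+ as 26+, ...).
# All peaks then agree on half the true mass, so the wrong answer looks consistent.
harmonic = [neutral_mass(mz, z // 2) for z, mz in peaks.items()]

# Off-by-one artifact: the 50+ peak read as 49+ or 51+.
off_by_one_low = neutral_mass(peaks[50], 49)
off_by_one_high = neutral_mass(peaks[50], 51)

print(correct, harmonic, off_by_one_low, off_by_one_high)
```

In this sketch the harmonic assignment is self-consistent only because the odd charge states were left out; a 49+ or 51+ peak would require a half-integer charge under the halved assignment, which is the kind of relationship among peaks that charge deduction relies on.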
Despite these problems, the most common method for intact protein mass determination is the maximum entropy deconvolution method. As mentioned, MaxEnt biases output towards a smoother (higher-entropy) mass spectrum, which may reduce background noise and resolve closely spaced masses; however, it may also suppress small signals and retain “harmonic” artifacts. Furthermore, even with relatively narrow target mass ranges, off-by-one charge assignments produce another type of artifact, side lobes on either side of the true masses, for example 3000 Daltons (Da) too low and too high if the strongest m/z signal is around m/z 3000. Both harmonic and off-by-one artifacts increase entropy of the deconvolution, so the entropy term in the objective function, which helps the algorithm resolve closely spaced masses, has the undesired side effect of promoting artifacts. Artifacts are a minor problem in some scenarios, but they can be quite misleading in other practical applications: (1) automated workflows that forgo expert human inspection; (2) analysis of antibodies, including bispecifics, where harmonic artifacts may be mistaken for half-mAbs, aggregations, or mispairings; (3) antibody-drug conjugates (ADCs), where off-by-one artifacts may bias quantitation of drug loading; and (4) heavily glycosylated or other highly modified proteins.
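The effect of artifacts on entropy can be illustrated with a small sketch. It assumes the plain Shannon entropy of the normalized intensities; the entropy term actually used in any given maximum entropy implementation may be defined differently (for example, relative to a prior), and the intensities below are invented for illustration.

```python
import math


def shannon_entropy(intensities):
    """Shannon entropy (in nats) of a spectrum's normalized intensities."""
    total = sum(intensities)
    probs = [i / total for i in intensities if i > 0]
    return sum(-p * math.log(p) for p in probs)


# Hypothetical neutral-mass spectra, keyed by mass in Da.
clean = {150_000: 1.0}
with_artifacts = {
    75_000: 0.05,    # harmonic artifact at half the true mass
    147_000: 0.05,   # off-by-one side lobe, about 3000 Da low
    150_000: 0.85,   # true mass
    153_000: 0.05,   # off-by-one side lobe, about 3000 Da high
}

h_clean = shannon_entropy(clean.values())
h_art = shannon_entropy(with_artifacts.values())
print(h_clean < h_art)
```

Spreading intensity from one true mass onto artifact masses raises the entropy, so an objective function that rewards entropy does not, by itself, penalize the artifact-laden result.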
Regardless of the algorithmic details, the deconvolution iteration generally converges to a local rather than a global optimum. Two important user-controlled parameters for deconvolution are the input m/z range and the output mass range. Deconvolution algorithms usually assume that all the ions (except perhaps some low-charge m/z peaks, recognizable by resolved isotopes) in the input range represent chemical species in the mass range. This assumption allows deconvolution of lower signal-to-noise spectra by limiting the number of masses and charges that the algorithm must consider, but it runs the risk that chemical species outside the mass range may be undetected or give false additional masses within the user-set target mass range. A practical solution entails deconvolution of the m/z range onto a wide mass range to survey the masses, followed by deconvolution of selected m/z ranges onto narrow mass ranges to capture more detailed information.
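One way to see why a narrower output mass range limits the search is to compute the charges that are consistent with a given pair of ranges. The sketch below assumes protonated ions, so that z = M / (m/z - proton mass); the function name and the example ranges are hypothetical and do not describe how any specific algorithm selects charges.

```python
import math

# Assumed charge carrier: a proton, mass in Daltons.
PROTON_MASS = 1.007276


def charge_range(mz_lo, mz_hi, mass_lo, mass_hi):
    """Smallest and largest integer charge that can place a mass in
    [mass_lo, mass_hi] at an m/z value in [mz_lo, mz_hi]."""
    z_min = max(1, math.ceil(mass_lo / (mz_hi - PROTON_MASS)))
    z_max = math.floor(mass_hi / (mz_lo - PROTON_MASS))
    return z_min, z_max


# Survey pass: wide mass range.
survey = charge_range(2000.0, 4000.0, 10_000.0, 200_000.0)
# Follow-up pass: narrow mass range around a mass found in the survey.
narrow = charge_range(2000.0, 4000.0, 145_000.0, 155_000.0)
print(survey, narrow)
```

The narrow pass admits a smaller span of charges for the same m/z window, which is the reduction in candidate masses and charges that the assumption above is meant to provide.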
It is desirable to have a better method for deconvoluting complex mass spectral data from samples comprising large molecules. Thus, it would be beneficial to provide methods and apparatuses that address the problems described above.