Mass spectrometry (MS) is an analytical technique that measures the mass-to-charge ratio of charged particles. It is widely used for the qualitative and quantitative analysis of chemical compounds, including compound identification as well as interrogation of a compound's structure, selective reactivity, stability, etc. Commercially available modern mass spectrometers not only employ different methods of ion separation, but also vary in evaporation/ionization techniques and in detection schemes. This results in an ever-broadening range of scientific applications based on or related to mass spectrometric measurements.
The first generation of commercial mass spectrometers for analytical chemistry used an electron impact ionization technique, which, in its optimal mode of 70 eV electron energy, routinely overexcited analyte molecules, resulting in rapid gas-phase unimolecular decomposition of a significant portion of the parent ions. This produced a characteristic analyte “signature” spectrum: a mixture of peaks of the parent ion and its fragments. These spectra were quickly recorded and organized into so-called MS libraries, which are used even today as an identification tool for mass spectrometry. However, electron impact is a gas-phase ionization technique and relies on up-front evaporation of the sample, which is easily realized only for volatile analyte molecules in the low-to-medium mass range. For analytes above 300 Da, not only is evaporation problematic, but electron-impact-induced fragmentation also becomes too complex.
The discovery of modern “soft” ionization techniques, which produce “cold” analyte molecular ions, began with chemical ionization. These techniques solved the problem of post-ionization dissociation and of the complexity of the resulting spectra, but added to the ambiguity of parent ion identity assignment. Without the background dissociation, the “signature” feature of an analyte molecular ion was gone. The only real advance of chemical ionization was thus the elimination of post-ionization dissociation. Since it was the first of the soft techniques, scientists had roughly a decade to find ways to re-introduce dissociation into mass spectrometry, hoping to regain the “signature” feature as a tool for analyte identification. The solution came in the form of commercially developed tandem mass spectrometers: devices in which creation of the analyte molecular ion is separated (in space or time) from the event of its fragmentation.
Chemical (charge transfer) ionization enjoyed a renaissance of about a decade, between 1975 and 1985, before the advantages of new ionization techniques opened up a new era in analytical mass spectrometry. Fast atom bombardment (FAB), electrospray ionization (ESI) and matrix-assisted laser desorption ionization (MALDI) couple the two principal steps involved in mass spectrometry analysis, evaporation and ionization, thus allowing for mass-spectral analysis of large molecules. It took the scientific community about a decade to catch up with the technology, culminating in the current explosion of mass spectrometry-based applications in analytical synthetic chemistry, pharmacology, ecology, biology, food science, etc.
Regardless of which physical or chemical properties of a compound are of ultimate scientific interest, the first goal of mass spectrometry is to establish the compound's identity. At the most basic level of identification, the molecular formula (elemental composition) of the ion needs to be determined. Higher-order information (molecular structure, conformation, stability, etc.) can be revealed through gas-phase chemistry in multi-stage mass spectrometry experiments, and/or in combination with “orthogonal” techniques, such as chromatography, electrophoresis, ion mobility, spectroscopy, etc. For relatively small molecules, knowledge of the exact mass and the relative abundance of isotopes may be sufficient to reveal the molecular formula, even in the absence of background fragmentation. In any case, candidate molecular formulas need to be either looked up in previously established lists (databases) or generated by nested loop summations, which enumerate possible combinations of the numbers of atoms of different types (carbon, oxygen, hydrogen, etc.) in an attempt to match experimentally observed masses with the required precision. Historically, the latter approach of generating formulas as atomic combinations was the only one available. The former approach, the so-called “known unknown” target analysis, has gained popularity as publicly available databases of known compounds continue to grow. Even today, however, in special applications such as large polymer synthesis, the information offered by public databases may not be sufficient to provide an identification base for a researcher's needs. In such cases, a mass spectrometry specialist still needs to revisit the old formula generation approach in an attempt to assign molecular formulas to experimentally observed mass spectrometric peaks.
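The nested loop summation described above can be illustrated with a minimal Python sketch. It is not any particular vendor's or author's generator: the function name `generate_chno_formulas`, the restriction to the four elements C, H, N and O, and the choice of tolerance are assumptions made purely for illustration. The sketch also treats the target as a neutral molecule and applies no chemical valence or isotope-pattern filters. The monoisotopic masses are standard values rounded to eight decimal places.

```python
# Monoisotopic masses in Da (standard values, rounded).
M_C, M_H, M_N, M_O = 12.0, 1.00782503, 14.00307401, 15.99491462


def generate_chno_formulas(target_mass, tolerance):
    """Brute-force nested-loop search over C/N/O/H atom counts.

    Returns a list of (formula_dict, mass) pairs whose summed monoisotopic
    mass lies within `tolerance` Da of `target_mass`. Illustrative only:
    neutral molecules, no valence rules, no isotope-pattern check.
    """
    limit = target_mass + tolerance
    matches = []
    for c in range(int(limit // M_C) + 1):
        rem_c = limit - c * M_C          # mass budget left after carbon
        for n in range(int(rem_c // M_N) + 1):
            rem_n = rem_c - n * M_N      # ... after nitrogen
            for o in range(int(rem_n // M_O) + 1):
                rem_o = rem_n - o * M_O  # ... after oxygen
                for h in range(int(rem_o // M_H) + 1):
                    mass = c * M_C + n * M_N + o * M_O + h * M_H
                    if abs(mass - target_mass) <= tolerance:
                        matches.append(({"C": c, "H": h, "N": n, "O": o}, mass))
    return matches


# Usage: candidates within 1 mDa of 180.0634 Da (close to the mass of C6H12O6).
for formula, mass in generate_chno_formulas(180.0634, 0.001):
    print(formula, round(mass, 5))
```

Each additional element adds one more loop level, which is why the cost of this style of search rises so quickly as the element set or the target mass grows.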
Unfortunately, current formula generator algorithms are based on nested loops and are inherently susceptible to an exponential dependence of computational cost on (a) the number of atom types assumed to make up the potential formula, and (b) the mass of the target ion.
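A rough feel for this growth can be obtained by counting how many atom-count combinations an unpruned nested loop would visit. The sketch below is an illustration under stated assumptions, not a benchmark of any real generator: the helper `search_space_size` is hypothetical, it bounds each element's count only by the target mass, and practical implementations prune part of this space. It therefore gives an upper bound on the number of loop iterations.

```python
from math import prod

# Monoisotopic masses in Da (standard values, rounded).
MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "P": 30.973762,
    "S": 31.972071,
}


def search_space_size(target_mass, elements):
    """Number of combinations visited by an unpruned nested loop.

    Each element's loop runs from 0 to floor(target_mass / atomic_mass),
    so the total is the product of the individual loop lengths.
    """
    return prod(int(target_mass // MONOISOTOPIC_MASS[e]) + 1 for e in elements)


# Compare element sets of growing size at several target masses.
for elements in (("C", "H"), ("C", "H", "N", "O"), ("C", "H", "N", "O", "P", "S")):
    for mass in (200.0, 400.0, 800.0):
        print("".join(elements), mass, search_space_size(mass, elements))
```

Every added element multiplies the count by another loop length, and every loop length itself grows with the target mass, so the two factors compound.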
Algorithmic improvements to the molecular formula generation models remain very relevant even today, as a potential way to improve the first step of mass spectrometric investigations: identification of the atomic composition of ionic species.