Methods for identifying at least the monoisotopic mass, or a parameter correlated with the mass of the isotopes of the isotope distribution, of one species of molecules, and mostly of various species of molecules, are generally available. These methods are preferably used to identify the monoisotopic mass of large molecules such as peptides, proteins, nucleic acids, lipids and carbohydrates, which typically have a mass between 200 u and 5,000,000 u, preferably between 500 u and 100,000 u and particularly preferably between 5,000 u and 50,000 u.
These methods are used to investigate samples. These samples may contain species of molecules which can be identified by their monoisotopic mass or by a parameter correlated with the mass of the isotopes of their isotope distribution.
A species of molecules is defined as a class of molecules having the same molecular formula (e.g. water has the molecular formula H2O and methane the molecular formula CH4).
Alternatively, the investigated sample can be better understood by means of ions which are generated from the sample by at least an ionization process. The ions may preferably be generated by electrospray ionization (ESI), matrix-assisted laser desorption ionization (MALDI), plasma ionization, electron ionization (EI), chemical ionization (CI) or atmospheric pressure chemical ionization (APCI). The generated ions are charged particles, mostly having a molecular geometry and a corresponding molecular formula. In the context of this patent application the term “species of molecules originated from a sample by at least an ionization process” shall be understood as referring to the molecular formula of an ion which originates from a sample by at least an ionization process. Thus, the monoisotopic mass, or a parameter correlated with the mass of the isotopes of the isotope distribution, of a species of molecules originated from a sample by at least an ionization process can be deduced from the ion which originates from the sample by at least an ionization process, by taking the molecular formula of the ion after the charge of the ion has been reduced to zero and changing the molecular formula according to the ionization process as described below.
Within a species of molecules, all molecules have the same composition of atoms according to the molecular formula. However, most atoms of the molecule can occur as different isotopes. For example, the basic element of organic chemistry, carbon, occurs in two stable isotopes: the 12C isotope with a natural probability of occurrence of 98.9% and the 13C isotope (having one more neutron in its atomic nucleus) with a natural probability of occurrence of 1.1%. Due to these probabilities of occurrence of the isotopes, complex molecules of higher mass consisting of a higher number of atoms in particular have many isotopomers, in which the atoms of the molecule exist as different isotopes. In the whole context of this patent application these isotopomers of a species of molecules are designated as the “isotopes of the species of molecules”. These isotopes have different masses, resulting in a mass distribution of the isotopes of the species of molecules, which in the context of this patent application is named the isotope distribution (short term: ID) of the species of molecules. The molecules of a species of molecules can therefore have different masses, but for a better understanding and identification of a species of molecules a monoisotopic mass is assigned to each molecule. This is the mass of a molecule when each atom of the molecule exists as the isotope with the lowest mass. For example, a methane molecule has the molecular formula CH4, and hydrogen has the isotopes 1H, having only a proton in its nucleus, and 2H (deuterium), having an additional neutron in its nucleus. Thus, the isotope of lowest mass of carbon is 12C and the isotope of lowest mass of hydrogen is 1H. Accordingly, the monoisotopic mass of methane is 16 u. However, there is a small probability of other methane isotopes having the masses 17 u, 18 u, 19 u, 20 u and 21 u. All these other isotopes belong to the isotope distribution of methane and can be visible in the mass spectrum of a mass spectrometer.
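The methane example above can be reproduced with a short calculation. The following Python sketch is only an illustration and not part of any described method. It uses nominal (integer) isotope masses and the carbon abundances stated above. The hydrogen abundances, the `ISOTOPES` table and all function names are assumptions introduced for this example.

```python
from collections import defaultdict

# Nominal isotope masses (u) and natural abundances.
# Carbon values are taken from the text; hydrogen values are
# approximate literature values assumed for this illustration.
ISOTOPES = {
    "C": {12: 0.989, 13: 0.011},
    "H": {1: 0.999885, 2: 0.000115},
}


def convolve(dist_a, dist_b):
    """Combine two mass distributions (mass -> probability)."""
    out = defaultdict(float)
    for mass_a, prob_a in dist_a.items():
        for mass_b, prob_b in dist_b.items():
            out[mass_a + mass_b] += prob_a * prob_b
    return dict(out)


def isotope_distribution(formula):
    """Isotope distribution of a formula given as {element: count}."""
    dist = {0: 1.0}
    for element, count in formula.items():
        for _ in range(count):
            dist = convolve(dist, ISOTOPES[element])
    return dist


def monoisotopic_nominal_mass(formula):
    """Mass when every atom is present as its lightest isotope."""
    return sum(min(ISOTOPES[el]) * n for el, n in formula.items())


methane = {"C": 1, "H": 4}
dist = isotope_distribution(methane)

print("monoisotopic mass:", monoisotopic_nominal_mass(methane), "u")
for mass in sorted(dist):
    print(f"{mass} u: {dist[mass]:.3e}")
```

The printed distribution covers the nominal masses 16 u to 21 u, with the monoisotopic peak at 16 u being by far the most probable one for such a small molecule. For large molecules the same convolution shifts most of the probability away from the monoisotopic peak, which is why its identification becomes harder.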
The monoisotopic mass, or a parameter correlated with the mass of the isotopes of the isotope distribution, of at least one species of molecules is identified by measuring a mass spectrum of the investigated sample with a mass spectrometer. In general, every kind of mass spectrometer known to a person skilled in the art can be used to measure a mass spectrum of the sample. In particular, it is preferred to use a mass spectrometer of high resolution, such as a mass spectrometer having an ORBITRAP mass analyzer, an FT mass spectrometer, an ICR mass spectrometer or an MR-TOF mass spectrometer. Other mass spectrometers to which the inventive method can be applied are, in particular, TOF mass spectrometers and mass spectrometers with an HR quadrupole mass analyzer. However, identifying the monoisotopic mass, or a parameter correlated with the mass of the isotopes of the isotope distribution, of species of molecules is difficult with the known methods of identification if the mass spectrum is measured with a mass spectrometer having a low resolution, in particular because neighboring peaks of isotopes having a mass difference of 1 u cannot be distinguished.
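To illustrate why low resolution makes neighboring isotope peaks indistinguishable, the following Python sketch estimates the resolving power needed to separate two peaks whose masses differ by 1 u. It is a rough estimate under stated assumptions: resolving power is taken as m/z divided by the peak spacing in m/z, the mass of the charge carriers is neglected, and both function names are hypothetical.

```python
def isotope_peak_spacing(charge, mass_difference_u=1.0):
    """Spacing on the m/z axis between isotope peaks of an ion with the given charge."""
    return mass_difference_u / abs(charge)


def required_resolving_power(mass_u, charge, mass_difference_u=1.0):
    """Rough resolving power (m/z over spacing) needed to separate neighboring isotope peaks.

    Assumption: m/z is approximated as mass / |charge|, ignoring the
    mass of the charge carriers (e.g. protons or electrons).
    """
    mz = mass_u / abs(charge)
    return mz / isotope_peak_spacing(charge, mass_difference_u)


# A 50,000 u molecule observed as a 25-fold charged ion (illustrative values).
print(isotope_peak_spacing(25))
print(required_resolving_power(50_000, 25))
```

Under these assumptions the required resolving power is roughly equal to the mass of the molecule in u, independent of the charge state, so heavier species place correspondingly higher demands on the mass analyzer.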
On the one hand, molecules already present in the sample are set free and are merely charged by the ionization process, e.g. by the reception and/or emission of electrons. The method of the invention is able to assign to these species of molecules contained in the sample their monoisotopic mass on the basis of their ions which are detected in the mass spectrum of the mass spectrometer.
On the other hand, the ionization process can change the molecules contained in the sample, either by fragmentation into smaller charged particles or by addition of atoms or molecules to the molecules contained in the sample, resulting in larger molecules which are charged due to the process. An ionization process can also split the matrix of a sample into molecules which are charged. All these ions therefore originate from the sample by a described ionization process. Thus, for these ions the corresponding species of molecules originated from the sample have to be investigated by a method for identification of the monoisotopic mass, or a parameter correlated with the mass of the isotopes of the isotope distribution, of at least one species of molecules.
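The two cases described above, charging by electrons only and charging by addition of atoms, can be sketched as a simple bookkeeping step on the molecular formula. The Python sketch below is a simplified illustration and not the procedure of the patent application. It assumes that an ion is either charged purely by electron transfer (formula unchanged) or by adding or removing one adduct atom per unit charge, for example protonation or deprotonation. The function name, its parameters and the glycine example are assumptions chosen for illustration.

```python
def neutral_formula(ion_formula, charge, adduct=None):
    """Return the molecular formula after reducing the ion charge to zero.

    ion_formula: {element: count} of the detected ion.
    charge: signed charge of the ion (e.g. +1, -2).
    adduct: element added (positive charge) or removed (negative charge)
            once per unit charge; None means the ion was charged only by
            reception or emission of electrons. This model is an assumption.
    """
    formula = dict(ion_formula)
    if adduct is not None:
        remaining = formula.get(adduct, 0) - charge
        if remaining < 0:
            raise ValueError("ion formula contains too few adduct atoms")
        formula[adduct] = remaining
    return {element: n for element, n in formula.items() if n > 0}


# Electron ionization of methane: the formula stays CH4.
print(neutral_formula({"C": 1, "H": 4}, charge=+1))

# Protonated glycine [M+H]+ (C2H6NO2) maps back to C2H5NO2.
print(neutral_formula({"C": 2, "H": 6, "N": 1, "O": 2}, charge=+1, adduct="H"))

# Deprotonated glycine [M-H]- (C2H4NO2) maps back to C2H5NO2 as well.
print(neutral_formula({"C": 2, "H": 4, "N": 1, "O": 2}, charge=-1, adduct="H"))
```

Fragmentation and the splitting of a sample matrix are not covered by this sketch, because in those cases the detected ion no longer corresponds to a single species of molecules contained in the sample.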
To date, many methods to identify monoisotopic masses of isotopic peaks in mass spectra have been published, including Patterson functions, Fourier transforms, or a combination thereof (M. W. Senko et al., J. Am. Soc. Mass Spectrom. 1995, 6, 52; D. M. Horn et al., J. Am. Soc. Mass Spectrom. 2000, 11, 320; L. Chen & Y. L. Yap, J. Am. Soc. Mass Spectrom. 2008, 19, 46), m/z accuracy scores (Z. Zhang & A. G. Marshall, J. Am. Soc. Mass Spectrom. 1998, 9, 225), fits of experimentally observed peak patterns to theoretical models (P. Kaur & P. B. O'Connor, J. Am. Soc. Mass Spectrom. 2006, 17, 459; X. Liu et al., Mol. Cell Proteomics 2010, 9, 2772), and entropy-based deconvolution algorithms (B. B. Reinhold & V. N. Reinhold, J. Am. Soc. Mass Spectrom. 1992, 3, 207). These methods are often targeted at specific applications such as peptides and/or intact proteins, and the reported execution times are in the range of seconds on a 2.2-GHz CPU (Liu et al., 2010), which is not fast enough for an online detection and subsequent selection of species for a further MS analysis, as in standard methods of MS proteomics. An unpublished method of P. Yip et al. has been optimized for the analysis of intact proteins, using a high number of correlations of potentially related peaks, which have previously been transformed from the original data to a logarithmic m/z axis with binary intensity information. However, its speed is not sufficient for use with a Fourier-transform mass spectrometer. Evidently, a holistic approach is required for areas of application where acquisition speed, i.e., the amount of data that can be analyzed experimentally per unit of time, is essential. Such an approach should be suitable not only for a broader range of applications, including peptides, small organic molecules, and intact proteins, but also for a fast online analysis directly after the data acquisition (without delaying the acquisition of subsequent scans).