In recent years, mass spectrometers capable of performing MSn analysis have been widely used for the structural analysis of various polymer compounds, including proteins. Specifically, when an ion originating from a substance of interest contained in a sample is dissociated by collision-induced dissociation (CID), a molecular bond is broken at a specific site depending on the bond energy or other factors, and various product ions and neutral losses are produced. Therefore, an ion having a specific mass-to-charge ratio m/z corresponding to the substance of interest is selected from the various ions produced from the sample, the selected ion is dissociated by CID, and the various product ions (fragment ions) produced by the dissociation are subjected to mass spectrometry to obtain an MS2 spectrum. Since the MS2 spectrum includes information about various fragments (including product ions and neutral losses) originating from the substance of interest, the chemical structure of the substance of interest can be estimated by analyzing the MS2 spectrum data. In the case where the ion cannot be dissociated into ions having sufficiently small mass-to-charge ratios by a single CID operation, the structural analysis of the substance of interest is sometimes performed utilizing the MSn spectrum (where n is equal to or greater than 3) obtained by repeating the CID operation a plurality of times.
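The relationship between a selected precursor ion, its product ions, and the corresponding neutral losses can be illustrated with a minimal Python sketch. The function name `neutral_losses` and the m/z values are hypothetical and chosen only for illustration, and the sketch assumes that the precursor ion and the product ions carry the same charge.

```python
from typing import List, Tuple


def neutral_losses(
    precursor_mz: float, product_mzs: List[float], charge: int = 1
) -> List[Tuple[float, float]]:
    """Return (product m/z, neutral loss mass) pairs for one MS2 spectrum.

    Assumption: precursor and product ions carry the same charge, so the
    mass of the neutral fragment is the m/z difference times the charge.
    Peaks at or above the precursor m/z are ignored.
    """
    losses = []
    for mz in product_mzs:
        if mz < precursor_mz:
            loss = round((precursor_mz - mz) * charge, 4)
            losses.append((mz, loss))
    return losses


if __name__ == "__main__":
    # Hypothetical singly charged precursor and product ion peaks.
    for product_mz, loss in neutral_losses(300.0, [282.0, 150.0]):
        print(f"product m/z {product_mz:.4f}  neutral loss {loss:.4f} Da")
```

Each neutral loss mass, like each product ion m/z, constrains which partial structures can have been cleaved from the substance of interest.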
On the other hand, in a mass spectrometer equipped with an ion source using electron ionization (EI) or the like, a peak or peaks of product ions fragmented from an ion originating from a sample component can be obtained in an MS1 spectrum by a method called in-source decay. Therefore, in some cases the structural analysis of the substance of interest can be performed utilizing, instead of the MSn spectrum, the MS1 spectrum in which a product ion or ions originating from the substance of interest are observed as described here. Hereinafter, the MS1 spectrum and the MSn spectrum (where n is equal to or greater than 2) in both of which a product ion or ions are observed are together simply referred to as the MSn spectrum.
The most common method for estimating the structure of an unknown substance using the MSn spectrum is a database search based on comparison of MSn spectrum patterns. Specifically, the compound name, the molecular weight, the composition formula, the structural formula, the MSn spectrum pattern, and other data are registered in an identification database (sometimes referred to as a library) for various known compounds. When a measured MSn spectrum is obtained for an unknown substance, the database is searched, under a predetermined search condition, for a compound whose peak pattern matches or is similar to the measured MSn spectrum, whereby the unknown substance is identified and its structural formula is retrieved. As such an identification database, databases created by users themselves and various existing databases made publicly available by public institutions are utilized.
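A database search of this kind can be sketched as follows. The text does not specify the similarity measure or the search condition, so this sketch assumes cosine similarity on peaks binned to a fixed m/z width and a minimum score threshold. The names `bin_spectrum`, `cosine_similarity`, `search_library`, and the library entries are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def bin_spectrum(peaks: List[Peak], bin_width: float = 1.0) -> Dict[int, float]:
    """Sum peak intensities into m/z bins of the given width."""
    binned: Dict[int, float] = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(round(mz / bin_width))] += intensity
    return dict(binned)


def cosine_similarity(a: List[Peak], b: List[Peak], bin_width: float = 1.0) -> float:
    """Cosine similarity between two binned spectra (0.0 if either is empty)."""
    va, vb = bin_spectrum(a, bin_width), bin_spectrum(b, bin_width)
    dot = sum(va[k] * vb[k] for k in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(v * v for v in va.values()))
    norm_b = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search_library(
    query: List[Peak], library: Dict[str, List[Peak]], min_score: float = 0.8
) -> List[Tuple[str, float]]:
    """Return (compound name, score) hits at or above min_score, best first."""
    hits = [(name, cosine_similarity(query, ref)) for name, ref in library.items()]
    hits = [hit for hit in hits if hit[1] >= min_score]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical library entries and measured spectrum.
    library = {
        "compound_A": [(91.0, 100.0), (119.0, 40.0)],
        "compound_B": [(77.0, 100.0), (105.0, 60.0)],
    }
    measured = [(91.0, 95.0), (119.0, 45.0)]
    print(search_library(measured, library))
```

In this sketch the threshold `min_score` plays the role of the predetermined search condition. A compound whose spectrum is absent from `library` yields no hit, however complete the measured spectrum is.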
Although the amount of data stored in the identification database as described above is generally huge, not all the compounds that may become the object of analysis are stored in the database. For example, among agricultural chemicals, pharmaceuticals, and metabolites produced in vivo from such substances, there exist many analogous compounds that share a common basic skeleton and differ only in a substituted part of the structure (for example, a methyl group is replaced with an ethyl group, or chlorine is replaced with bromine). It is practically impossible to store all such compounds in the identification database. Therefore, it often occurs that such a compound cannot be identified and its structural formula cannot be determined even if the database search is performed.
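To illustrate why such analogs fail to match a stored spectrum, the mass shift caused by each of the substitutions mentioned above can be computed from monoisotopic masses. This is a minimal sketch. The helper names are hypothetical, and the masses are rounded values for 1H, 12C, 35Cl, and 79Br.

```python
from typing import Dict

# Rounded monoisotopic masses (Da) of the lightest isotopes.
MONOISOTOPIC_MASS: Dict[str, float] = {
    "H": 1.007825,
    "C": 12.000000,
    "Cl": 34.968853,
    "Br": 78.918338,
}


def formula_mass(formula: Dict[str, int]) -> float:
    """Monoisotopic mass of a group given as {element: atom count}."""
    return sum(MONOISOTOPIC_MASS[element] * count for element, count in formula.items())


def substitution_shift(removed: Dict[str, int], added: Dict[str, int]) -> float:
    """Mass change when the 'removed' group is replaced by the 'added' group."""
    return formula_mass(added) - formula_mass(removed)


# Methyl (CH3) replaced with ethyl (C2H5): a net gain of CH2.
methyl_to_ethyl = substitution_shift({"C": 1, "H": 3}, {"C": 2, "H": 5})
# Chlorine replaced with bromine.
chloro_to_bromo = substitution_shift({"Cl": 1}, {"Br": 1})

if __name__ == "__main__":
    print(f"methyl -> ethyl: {methyl_to_ethyl:+.5f} Da")
    print(f"Cl -> Br:        {chloro_to_bromo:+.5f} Da")
```

Every fragment that retains the substituted group is displaced by the same amount, so the peak pattern of the analog differs from that of the registered compound even though the basic skeleton is common.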
In order to solve this problem, in the method for mass spectrometry data analysis described in Patent Literature 1, when analyzing the structure of an unknown substance whose structure is known to be similar to that of a known substance having a known structure, the structure of the unknown substance is estimated by a combination of a fragment prediction and a known structural change pattern, where the fragment prediction is a prediction of fragments for a peak or peaks having a mass-to-charge ratio or ratios m/z appearing commonly in the MSn spectra of the known substance and the unknown substance. The “structural change pattern” is information about replacement of substituents, addition of a component, elimination of a component, or the like. Owing to such estimation, it becomes possible to identify, and determine the structural formula of, a compound that is not stored in the identification database.
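The two ingredients of this approach, peaks common to both spectra and a table of registered structural change patterns, can be sketched as below. This is not the procedure of Patent Literature 1 itself, only an illustration under stated assumptions: singly charged ions (so that an m/z shift equals a mass shift), a fixed matching tolerance, and a hypothetical pattern table `STRUCTURAL_CHANGES` with hypothetical peak values.

```python
from typing import Dict, List

# Hypothetical registered structural change patterns: name -> mass shift (Da).
STRUCTURAL_CHANGES: Dict[str, float] = {
    "methyl -> ethyl (+CH2)": 14.01565,
    "Cl -> Br": 43.94949,
    "hydroxylation (+O)": 15.99491,
}


def common_peaks(known: List[float], unknown: List[float], tol: float = 0.01) -> List[float]:
    """m/z values of the known spectrum also found in the unknown one (+/- tol)."""
    return [mz for mz in known if any(abs(mz - u) <= tol for u in unknown)]


def matching_changes(known_precursor: float, unknown_precursor: float,
                     tol: float = 0.01) -> List[str]:
    """Registered patterns whose mass shift explains the precursor difference."""
    shift = unknown_precursor - known_precursor
    return [name for name, delta in STRUCTURAL_CHANGES.items()
            if abs(shift - delta) <= tol]


if __name__ == "__main__":
    # Hypothetical MS2 peak lists (m/z) for a known and an unknown substance.
    known_peaks = [91.054, 135.080, 250.100]
    unknown_peaks = [91.054, 149.096, 264.116]
    print("common peaks:", common_peaks(known_peaks, unknown_peaks))
    print("candidate changes:", matching_changes(250.1000, 264.1157))
```

Common peaks indicate partial structures shared by the two substances, while the matched pattern suggests how the remainder differs. A shift that corresponds to no entry in the table returns an empty list, so the estimate depends entirely on which patterns have been registered in advance.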
However, in the method for mass spectrometry data analysis described in Patent Literature 1, the structure cannot be estimated for an unknown substance in which a structural change that is not registered as a structural change pattern has occurred. For example, a drug metabolite is considered to have a structure formed in such a way that a partial structure of the substance before metabolism is eliminated and then another partial structure is added to the elimination site or to a site different from the elimination site. In this case, the eliminated partial structure varies widely depending on the substance before metabolism, and it is difficult to register all such partial structures as structural change patterns in advance. For such a substance, therefore, it is likely that the structure of the unknown substance cannot be determined.