In recent years, chromatograph mass spectrometry, which combines chromatography and mass spectrometry, has been widely used in various fields, such as medicine, pharmaceuticals, food, and the environment. There are various analysis methods for data obtained by chromatograph mass spectrometry. One of them is differential analysis, in which differences among the data of two or more groups are examined. An example of differential analysis is analysis processing for finding a specific protein that is not seen in a non-patient and is seen only in a cancer patient (that is, a biomarker for the cancer). In chromatograms and mass spectra, a search is made for peaks that are not present in the group of biological samples taken from a plurality of non-patients and are present in the group of biological samples taken from a plurality of cancer patients.
A procedure for the above differential analysis using a liquid chromatograph mass spectrometry apparatus will be outlined.
First, one or a plurality of proteins contained in a sample taken from a subject (a non-patient or a patient) are broken down into a plurality of peptides with a digestive enzyme to obtain a mixture of peptides. Next, the peptide mixture is introduced into a liquid chromatograph to separate the peptides according to their respective retention times. Each peptide-containing specimen thus separated according to retention time is measured by a mass spectrometry apparatus to obtain data that include the signal intensities of ions derived from the peptide.
FIG. 7 is a schematic diagram of three-dimensional chromatogram data obtained in this manner. That is, data obtained by a liquid chromatograph mass spectrometry apparatus is a collection of data of a signal intensity (intensity of ions) at a certain time and a certain mass-to-charge ratio m/z.
A peak corresponding to a particular peptide appears at a retention time (RT) corresponding to the peptide and a mass-to-charge ratio m/z corresponding to the peptide. In other words, the position of a peak derived from a peptide is represented by [RT, m/z] coordinates, that is, a two-dimensional vector having these two components. Therefore, in order to examine whether or not a peptide is present in a certain sample group, one can examine whether or not a peak corresponding to the peptide is present at the position [RT, m/z]. However, generally, the reproducibility of chromatogram data (that is, reproducibility in the time direction) is lower than the reproducibility of mass spectrum data (that is, reproducibility in the mass-to-charge ratio direction). Therefore, for the same substance, the retention time of the corresponding peak may differ depending on the sample, the analysis conditions, or other factors.
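The lookup at a position [RT, m/z] described above can be sketched as follows. This is a minimal illustration only: the function name `has_peak`, the tolerance parameters `rt_tol` and `mz_tol`, and all peak values are hypothetical and are not taken from the specification. A wider tolerance is allowed in the time direction than in the m/z direction, reflecting the lower reproducibility of retention times.

```python
from typing import List, Tuple

# A peak is (retention time in minutes, m/z, signal intensity).
Peak = Tuple[float, float, float]


def has_peak(peaks: List[Peak], rt: float, mz: float,
             rt_tol: float = 0.5, mz_tol: float = 0.01) -> bool:
    """Return True if any peak lies within the tolerances around [rt, mz].

    rt_tol and mz_tol are illustrative defaults, not recommended values.
    """
    return any(
        abs(p_rt - rt) <= rt_tol and abs(p_mz - mz) <= mz_tol
        for p_rt, p_mz, _intensity in peaks
    )


# Hypothetical peak lists for one Control sample and one Treatment sample.
control = [(12.1, 500.25, 1.0e4), (30.4, 742.38, 5.0e3)]
treatment = [(12.3, 500.25, 9.0e3), (30.9, 742.38, 4.0e3), (41.0, 615.30, 7.0e3)]

# A peak found only in the Treatment sample is a candidate difference.
only_in_treatment = (has_peak(treatment, 41.0, 615.30)
                     and not has_peak(control, 41.0, 615.30))
```

If the retention time of the same substance drifts by more than `rt_tol` between samples, such a lookup fails even though the substance is present, which is why the time axis is corrected beforehand.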
Thus, generally, before actually searching for a difference in peaks among a plurality of groups, correction in the time axis is performed within each group and further among the groups, so that mass spectrum peaks derived from the same substance appear at the same retention time position. One method for such correction is “RT alignment using TICs” in which correction is performed using total ion chromatograms (TICs) obtained by plotting along time the sum of signal intensity values in a mass spectrum at each measurement time point.
In the following description, unless particularly described, “RT alignment using TICs” is simply referred to as “RT alignment.” In other words, “RT alignment” means “RT alignment using TICs” in the present specification.
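As a minimal sketch of how a TIC is derived from the three-dimensional chromatogram data, the example below assumes that the data are held as a list of mass spectra, one per measurement time point, each being a list of signal intensities over m/z bins. The function name `total_ion_chromatogram` and the numerical values are illustrative assumptions.

```python
from typing import List, Sequence


def total_ion_chromatogram(intensity: Sequence[Sequence[float]]) -> List[float]:
    """Sum the signal intensities of each mass spectrum.

    intensity[i][j] is the signal at the i-th measurement time point and
    the j-th m/z bin; the result has one value per time point.
    """
    return [sum(spectrum) for spectrum in intensity]


# Hypothetical data: three time points, three m/z bins.
retention_times = [0.0, 0.5, 1.0]          # minutes
intensity = [
    [0.0, 10.0, 5.0],
    [20.0, 30.0, 0.0],
    [1.0, 2.0, 3.0],
]

tic = total_ion_chromatogram(intensity)    # [15.0, 50.0, 6.0]
```

Plotting `tic` against `retention_times` gives the TIC waveform used for the alignment.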
When a liquid chromatograph and a mass spectrometry apparatus are used in combination, an ion source using atmospheric pressure ionization, such as electrospray ionization (ESI) or atmospheric pressure chemical ionization (APCI), is often used. Generally, the reproducibility of mass spectra obtained in such a liquid chromatograph mass spectrometry apparatus is high, and for the same peptide, a peak having substantially the same signal intensity is obtained. Therefore, the reproducibility of the waveform of TICs is also high, and the similarity in the waveform of a plurality of TICs obtained for specimens of the same type is high.
Making use of such similarity in the waveform of TICs, chromatogram peaks are corrected in the time axis by RT alignment using a known algorithm, such as dynamic programming (DP), so that the TIC waveform of a sample designated as “Treatment” is as close as possible to the TIC waveform of a sample designated as “Control.” At this time, the analyst varies parameters, such as the calculation conditions of dynamic programming, visually compares the TIC waveform of the Treatment and the TIC waveform of the Control after the correction, and searches for the most appropriate parameters (that is, parameters for which the positions of the peaks and the waveforms in the two TICs are closest in terms of time). Thus, good RT alignment can be achieved.
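The idea of aligning one TIC to another by dynamic programming can be illustrated with a plain dynamic time warping sketch. This is not the method of any cited document: the local cost used here is the absolute intensity difference, chosen only for brevity, and the function names `dtw_path` and `corrected_times`, as well as the toy TIC values, are hypothetical. The tunable parameters that an analyst would adjust in practice are omitted.

```python
from typing import Dict, List, Sequence, Tuple


def dtw_path(control: Sequence[float],
             treatment: Sequence[float]) -> Tuple[float, List[Tuple[int, int]]]:
    """Return (total cost, warping path) aligning treatment to control.

    The path is a list of (control index, treatment index) pairs.
    """
    n, m = len(control), len(treatment)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(control[i - 1] - treatment[j - 1])   # local cost
            cost[i][j] = d + min(cost[i - 1][j - 1],     # match
                                 cost[i - 1][j],         # stretch treatment
                                 cost[i][j - 1])         # stretch control
    # Backtrack from the end to recover the optimal path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path


def corrected_times(path: List[Tuple[int, int]],
                    control_times: Sequence[float]) -> Dict[int, float]:
    """Map each Treatment index to the retention time of a matched Control point."""
    mapping = {}
    for i, j in path:
        mapping[j] = control_times[i]   # a later match overwrites an earlier one
    return mapping


# Toy TICs: the Treatment peak elutes one time step later than the Control peak.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
control_tic = [0.0, 0.0, 1.0, 5.0, 1.0, 0.0, 0.0, 0.0]
treatment_tic = [0.0, 0.0, 0.0, 1.0, 5.0, 1.0, 0.0, 0.0]

total_cost, path = dtw_path(control_tic, treatment_tic)
new_times = corrected_times(path, times)   # Treatment apex (index 4) maps to 3.0
```

Because the two toy waveforms have the same shape, the warping path pairs the Treatment apex with the Control apex. The sketch relies on exactly that waveform similarity.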
For example, in Non-Patent Document 1, RT alignment by dynamic programming using a correlation coefficient is disclosed. One example of the results of actually performing RT alignment using this technique is shown in FIG. 8.
FIG. 8(a) is the TIC of a sample designated as Control, and FIG. 8(b) is the TIC of a sample designated as Treatment. A graph in which the TIC of the Control and the TIC of the Treatment after RT alignment are overlap-displayed is shown in FIG. 8(c). In addition, a graph in which the time range around 25 to 35 minutes in FIG. 8(c) is enlarged is shown in FIG. 8(d). From this diagram, it is found that the positions of the peak tops and the peak bottoms, the peak widths, and the like in both TIC waveforms match quite well. In other words, in this case, it can be said that the RT alignment is performed with good precision.
However, according to a study by the inventors of this application, it has become clear that the above-described RT alignment method is suitable for three-dimensional chromatogram data acquired by a liquid chromatograph mass spectrometry apparatus using an ion source based on ESI or the like, but is not suitable for three-dimensional chromatogram data acquired by a liquid chromatograph mass spectrometry apparatus using an ion source based on matrix-assisted laser desorption ionization (MALDI). The reason is thought to be as follows.
The RT alignment by dynamic programming or the like described above presumes that the similarity in the waveform of a plurality of TICs (for example, the TICs of Control and Treatment) to be aligned is rather high, and the similarity of the TICs is increased by shifting the retention times of peaks around areas in which the match between the TIC waveforms is low, or by other measures. However, in a MALDI ion source, variations in the amount of ions produced per laser pulse irradiation are large, and the reproducibility of ion production efficiency for each mass-to-charge ratio is not very good. Therefore, in mass spectra obtained by a MALDI mass spectrometry apparatus, the reproducibility of the signal intensities of peaks is low, and the reproducibility of the total signal intensity value at a certain measurement time point, obtained by adding all signal intensity values in each mass spectrum, is also low. As a result, the similarity in TIC waveform among different samples is also poor. Among TICs having such poor similarity, even if dynamic programming can be executed, retention times cannot be suitably corrected.
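The effect of poor intensity reproducibility on TIC similarity can be illustrated numerically. The sketch below uses the Pearson correlation coefficient as one possible similarity measure and applies made-up gain factors to a toy Control TIC: small factors stand in for a source with stable ion production, and large factors stand in for large variations in ion production. All values and names are hypothetical illustrations, not measured data.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length waveforms."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Toy Control TIC with two peaks (arbitrary units).
control_tic = [1.0, 4.0, 9.0, 4.0, 1.0, 3.0, 8.0, 3.0]

# Hypothetical per-time-point gain factors (illustrative only).
stable_gain = [1.05, 0.95, 1.0, 1.05, 0.95, 1.0, 1.05, 0.95]
unstable_gain = [0.3, 2.0, 0.2, 1.8, 2.5, 0.4, 0.3, 2.2]

stable_tic = [c * g for c, g in zip(control_tic, stable_gain)]
unstable_tic = [c * g for c, g in zip(control_tic, unstable_gain)]

r_stable = pearson(control_tic, stable_tic)      # close to 1
r_unstable = pearson(control_tic, unstable_tic)  # far below r_stable
```

No retention-time shift is involved in this toy case. The drop in correlation comes from intensity fluctuation alone, so warping the time axis cannot restore the similarity.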
In addition, as described above, whether or not the RT alignment is successful is judged by the analyst visually checking the overlap display of TICs as shown in FIGS. 8(c) and 8(d). For this purpose, the similarity between the two TIC waveforms must be high to a certain extent even at the stage before retention times are corrected. However, when a MALDI ion source is used as described above, the similarity in the TIC waveforms among different samples is poor. Therefore, even if overlap display is performed, it is difficult for the analyst to visually determine whether or not the RT alignment has been properly executed. Consequently, the optimal adjustment of parameters for dynamic programming is also difficult.