Mass spectrometry is increasingly used not only for identification of samples but also for determination of their absolute or relative quantities. The identification and quantitation of disease-related and/or treatment-related changes in the abundance of biological molecules, such as proteins, is an important area of research. Changes in the abundance of particular proteins or their modifications, for example, can make the difference between health and disease for an organism. Furthermore, the development of proteomics techniques that focus on the identification of protein differences can improve our understanding of disease and the effectiveness of therapeutic interventions. In this area, mass spectrometry-based proteomics is now a widely used technology.
Typically, such quantification by mass spectrometry is done by calibration of the mass spectrometric peak of a sample with a peak of a reference (i.e. a sample of known quantity). When only relative quantification is required, this may be done by comparison of two sample peaks or comparison of a sample peak and a reference peak, which is typically isotopically labelled. In one such labelling experiment known as Stable Isotope Labeling by Amino acids in Cell culture (SILAC), two cell populations are fed with an amino acid that is isotopically labelled differently in each case, so that proteins containing the labelled amino acid are easily identified in the mass spectrum owing to the known mass difference between the isotopes. The proteins from both cell populations can be combined and analysed together by mass spectrometry, and the ratio of their identified peak intensities reflects their relative abundance. Methods utilising isotopic labelling are described, for example, in Horii, Y. et al, “Polychlorinated Dibenzo-p-dioxins, Dibenzofurans, Biphenyls, and Naphthalenes in Plasma of Workers Deployed at the World Trade Center after the Collapse”, Environmental Science & Technology, American Chemical Society, 2010, 44, 5188-5194; Armenta, J. M. et al, “Differential Protein Expression Analysis Using Stable Isotope Labelling and PQD Linear Ion Trap MS Technology”, J. Am. Soc. Mass Spectrom., 2009, 20, 1287-1302; and Cantin, G. T. et al, “Combining Protein-Based IMAC, Peptide-Based IMAC, and MudPIT for Efficient Phosphoproteomic Analysis”, Journal of Proteome Research, 2008, 7, 1346-1351.
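By way of illustration only, the relative quantification described above for a labelling experiment can be sketched as a ratio of paired peak intensities. The function name, the peptide keys and the intensity values below are hypothetical and are not taken from any of the cited methods; the sketch assumes that the light and heavy peaks of each peptide have already been identified and paired.

```python
from math import log2


def silac_ratios(light, heavy):
    """Return the heavy/light intensity ratio for each peptide seen in both channels.

    `light` and `heavy` map a peptide identifier to its peak intensity.
    Peptides missing from one channel, or with a non-positive intensity,
    are skipped because no meaningful ratio can be formed for them.
    """
    ratios = {}
    for peptide, light_intensity in light.items():
        heavy_intensity = heavy.get(peptide)
        if heavy_intensity is None or light_intensity <= 0 or heavy_intensity <= 0:
            continue
        ratios[peptide] = heavy_intensity / light_intensity
    return ratios


# Hypothetical paired peak intensities (arbitrary units).
light = {"PEPTIDEA": 2.0e6, "PEPTIDEB": 5.0e5, "PEPTIDEC": 1.0e5}
heavy = {"PEPTIDEA": 4.0e6, "PEPTIDEB": 5.0e5}

ratios = silac_ratios(light, heavy)
# Log2 ratios are symmetric around zero for up- and down-regulation.
log2_ratios = {peptide: log2(ratio) for peptide, ratio in ratios.items()}
```

Because both peaks of a pair are measured in the same spectrum, any fluctuation of the instrument response affects numerator and denominator alike and largely cancels in the ratio.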
For ease of quantification using labelling, the “MaxQuant” package by the Max Planck Institute (MPI) for Biochemistry, Germany is currently at the forefront of data handling, as described in Cox, J. & Mann, M., “MaxQuant enables high peptide identification rates, individualized ppb-range mass accuracies and proteome-wide protein quantification”, Nature Biotechnology 26, 1367-1372 (2008); and Cox, J. & Mann, M., “Computational Principles of Determining and Improving Mass Precision and Accuracy for Proteome Measurements in an Orbitrap”, Journal of the American Society for Mass Spectrometry, 2009, 20, 1477-148.
In US 2008/091359 and in Marko Sysi-Aho et al: “Normalization method for metabolomics data using optimal selection of multiple internal standards”, BMC Bioinformatics, Biomed Central, London, GB, vol. 8, no. 1, 15 Mar. 2007, page 93, a quantitative method for metabolites is described in which multiple internal standards are intentionally spiked into the samples. The standards may have the same chemical structure as some analytes, but they are synthesized using isotopic labelling. Normalization values are calculated in the method based on the known and measured abundances of the limited number of intentionally spiked standards. The same set of standards is used for normalizing all metabolites and there is a clear distinction between the molecules used as standards and those that are analytes. The method estimates the chemical similarity of standards to target analytes by measuring covariability of standards and analytes over a set of LC-MS runs and infers weights for internal standards from this covariability. Overall, the algorithm addresses the problem of instrument response undergoing changes over a set of LC-MS runs, i.e. from run to run, and does not consider changes which appear during a single LC-MS run.
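The general idea of normalizing a run against spiked standards can be illustrated with a much simplified sketch. This is not the covariability-weighted algorithm of the cited publication: it assumes equal weights for all standards and derives a single scale factor per run as the geometric mean of the known-to-measured abundance ratios. All names and values are hypothetical.

```python
from math import exp, log


def run_scale_factor(known, measured):
    """Return one scale factor for a run from its spiked internal standards.

    `known` maps each standard to its spiked (true) abundance and `measured`
    maps it to the abundance observed in this run. The factor is the
    geometric mean of known/measured, i.e. every standard is weighted
    equally (a simplifying assumption of this sketch).
    """
    log_ratios = [
        log(known[name] / measured[name])
        for name in known
        if measured.get(name, 0.0) > 0.0
    ]
    if not log_ratios:
        raise ValueError("no usable internal standards in this run")
    return exp(sum(log_ratios) / len(log_ratios))


def normalize_run(analytes, factor):
    """Apply the same per-run factor to every analyte abundance."""
    return {name: value * factor for name, value in analytes.items()}


# Hypothetical run in which the instrument response was halved.
known = {"std1": 100.0, "std2": 200.0}
measured = {"std1": 50.0, "std2": 100.0}
factor = run_scale_factor(known, measured)
normalized = normalize_run({"metabolite_x": 10.0}, factor)
```

Note that the factor is constant for the whole run, so a correction of this kind acts from run to run only and leaves any change of response within a run untouched.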
The need for isotopic labelling in experiments, however, adds complexity and cost. Furthermore, in some experiments it is not viable to compare peak intensities with calibrants present within the dataset and therefore so-called label-free quantitation is necessary.
A label-free method is described in Wiener, M. et al, “Differential Mass Spectrometry: A Label-Free LC-MS Method for Finding Significant Differences in Complex Peptide and Protein Mixtures”, Analytical Chemistry, 2004, 76, 6085-6096. In that method, an algorithm is used for finding differences in mass spectrometry data taken from two samples. The algorithm uses the mass-to-charge ratio (m/z), the retention time and the intensity to compare the data from the samples at every (m/z, time) combination. Statistically significant differences based on a t-test, which persist in time (i.e. over a sufficient time range), are used for quantitation.
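The comparison at every (m/z, time) combination, followed by a persistence requirement, can be sketched as follows. This is an illustrative simplification and not the published algorithm: it assumes the data are already binned and aligned, uses Welch's t statistic, and applies a fixed cut-off on the magnitude of t in place of a p-value. The data layout, the cut-off `t_cut` and the minimum run length `min_run` are assumptions made for the example.

```python
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two groups of replicate intensities.

    Each group needs at least two replicates. If both groups have zero
    variance, 0.0 is returned for equal means and +inf otherwise; only the
    magnitude is used by the caller.
    """
    squared_error = variance(a) / len(a) + variance(b) / len(b)
    difference = mean(a) - mean(b)
    if squared_error == 0:
        return 0.0 if difference == 0 else float("inf")
    return difference / squared_error ** 0.5


def persistent_differences(group_a, group_b, t_cut=4.0, min_run=3):
    """Find (mz_bin, first_time_bin, last_time_bin) runs of significant bins.

    `group_x[mz_bin][time_bin]` is a list of replicate intensities. A run is
    reported only if |t| >= t_cut holds for at least `min_run` consecutive
    time bins, i.e. the difference persists in time.
    """
    hits = []
    for mz_bin, series_a in group_a.items():
        series_b = group_b[mz_bin]
        start = None
        # Iterate one step past the end so that a run touching the last bin is closed.
        for t in range(len(series_a) + 1):
            significant = (
                t < len(series_a)
                and abs(welch_t(series_a[t], series_b[t])) >= t_cut
            )
            if significant and start is None:
                start = t
            elif not significant and start is not None:
                if t - start >= min_run:
                    hits.append((mz_bin, start, t - 1))
                start = None
    return hits


# Hypothetical data: one m/z bin, five time bins, three replicates per sample.
baseline = [100.0, 102.0, 98.0]
group_a = {500.25: [baseline] * 5}
group_b = {500.25: [[100.0, 101.0, 99.0]] + [[200.0, 205.0, 195.0]] * 3 + [[100.0, 101.0, 99.0]]}
hits = persistent_differences(group_a, group_b)
```

In this example the two samples differ only in time bins 1 to 3 of the single m/z bin, so exactly one persistent region is reported.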
Commercial software called SIEVE, available from Thermo Scientific, provides label-free, semi-quantitative differential expression analysis of proteins, peptides and metabolites and reduces the effects of chromatographic variability between samples.
In US 2003-111596 (Becker et al), a quantitative method utilises a normalization (scaling) of the data but the subject of scaling is the whole mass spectrum, i.e. all ion signals appearing in a single mass spectrum. Thus, Becker et al normalize their “peaks” regardless of their retention time, i.e. uniformly across the whole LC-MS experiment. Since Becker et al use the same normalization value for the whole LC-MS run, they compensate for the average difference in instrumental response between different LC/MS runs, but they are unable to compensate for time-dependent sources of variability, such as the fluctuations in the electrospray current, ionization efficiency and instrumental sensitivity that occur within the same LC-MS run.
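The limitation of a single normalization value per run can be seen in a small numerical sketch. The code below is not the procedure of Becker et al; it merely assumes, for illustration, that a run is scaled by one factor so that its summed intensity matches that of a reference run. The function name and the intensity values are hypothetical.

```python
def global_scale(run, reference_total):
    """Scale every scan of a run by one factor so the summed intensity matches."""
    factor = reference_total / sum(run)
    return [intensity * factor for intensity in run]


# Hypothetical intensity of the same analyte signal over four successive scans.
reference = [100.0, 100.0, 100.0, 100.0]
# Second run: lower overall response, and the response also drifts upward in time.
drifting = [40.0, 40.0, 60.0, 60.0]

scaled = global_scale(drifting, sum(reference))
# Per-scan ratio to the reference after global scaling.
residual = [s / r for s, r in zip(scaled, reference)]
```

After scaling, the totals of the two runs agree, but the per-scan ratios remain 0.8 in the early scans and 1.2 in the late scans: the average difference between runs is removed while the drift within the run is not.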
A label-free approach, however, makes the quantitative data even more susceptible to variations. Such experiments are typically liquid chromatography mass spectrometry (LC/MS) experiments employing electrospray ionization (ESI). Aside from variations in sample preparation and chromatography, which can be minimized, a major contributor to the variation of peptide abundances in a label-free LC/MS experiment is the fluctuation of the ESI current. The fluctuations occur on all time scales, from milliseconds to minutes and hours. Although the total ESI current can be monitored by the instrument and recorded in the dataset, taking it into account presently leads to limited quality improvement or no improvement at all. This is probably because the main contributors to the ESI current are background ions whose composition is very sensitive to the LC gradient, ambient air or nebulizer gas quality and spraying conditions.
In view of the prior art described above, there is a need to improve the accuracy of quantification in LC/MS. It is desirable to bring the accuracy of label-free quantification closer to, or on a level with, that of labeling experiments (e.g. iTRAQ, TMT and SILAC). Against this background the present invention has been made.