The invention provides methods for normalizing mass spectra acquired by imaging mass spectrometry (IMS), particularly by imaging tissue sections using matrix-assisted laser desorption/ionization (MALDI).
Histology is the science of human, animal and plant tissues, in particular, their structure and function. A histologic examination of a tissue sample determines the kind and state of the tissue, e.g. the type(s) and differentiations of the tissue sample, bacterial and parasitic pathogens in the tissue sample, the disease state of the tissue sample or any other change compared to a normal state.
In routine examination, the kind and state of a tissue sample are determined from optical images of tissue sections, acquired with microscopes or scanners. Usually, the tissue sections are only a few micrometers thick and are stained to increase the contrast of the optical images and emphasize structures in the tissue sections. Histology has mainly been based on morphologic characteristics, since the kind and state of a tissue sample are determined according to the presence of specific structures of tissue and cells and their staining properties.
Imaging mass spectrometry (IMS) is a technique used to determine (and visualize) the spatial distribution of compounds in a sample by acquiring spatially resolved mass spectra. In recent years, IMS has been increasingly used to analyze the spatial distributions of compounds in tissue sections (Caprioli; U.S. Pat. No. 5,808,300 A), particularly by using matrix-assisted laser desorption/ionization (MALDI). However, IMS can also be used to analyze other types of samples, like thin layer chromatography plates (Maier-Posner; U.S. Pat. No. 6,414,306 B1), electrophoresis gels or blot membranes. All spatially resolved mass spectra of a sample together constitute a mass spectrometric imaging data set S(x,y,m). The mass spectrometric imaging data set S(x,y,m) of a sample can be viewed as a collection of multiple mass images S(x,y,m_k) of different masses or mass ranges m_k; that is, S(x,y,m) can be divided into mass ranges, each generating a mass image.
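The view of the data set as a collection of mass images can be sketched as follows. This is a minimal illustration only: the in-memory layout (a dictionary mapping each pixel (x, y) to a list of intensities on a shared mass axis) and the names `mass_image`, `dataset` and `mass_axis` are assumptions made for the example and do not come from the disclosure; the numbers are made up.

```python
def mass_image(dataset, mass_axis, m_low, m_high):
    """Return {(x, y): summed intensity} for the mass range [m_low, m_high]."""
    # Indices of the mass channels that fall into the requested mass range m_k.
    channels = [i for i, m in enumerate(mass_axis) if m_low <= m <= m_high]
    # One value per pixel: the intensity summed over the selected channels.
    return {xy: sum(spectrum[i] for i in channels)
            for xy, spectrum in dataset.items()}


# Hypothetical two-pixel data set with four mass channels.
mass_axis = [1000.0, 1001.0, 1002.0, 1003.0]
dataset = {
    (0, 0): [1.0, 5.0, 2.0, 0.0],
    (1, 0): [0.0, 1.0, 1.0, 4.0],
}

image = mass_image(dataset, mass_axis, 1001.0, 1002.0)
```

Calling `mass_image` once per mass range m_k yields the set of mass images into which S(x,y,m) is divided.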
Caprioli has established a raster scan method to acquire spatially resolved MALDI mass spectra of tissue sections. A tissue section is prepared on a sample plate with a matrix layer and then scanned with laser pulses of a focused laser beam in the x- and y-directions, often with several hundred pixels in both directions. In order to raster an entire tissue section, the sample plate is moved by a stage along the x- and y-direction. Every pixel (focus region of the laser beam) on the tissue section is irradiated at least once in the imaging process, and usually ten to a hundred times. The ions generated in the multiple MALDI processes are analyzed in a mass analyzer, most often a time-of-flight mass spectrometer with axial ion injection. The multiple mass spectra acquired at a single pixel are added to a sum spectrum and the sum spectrum is assigned to the pixel.
If the concentrations of compounds are sufficiently high in the tissue section, their spatial distribution can be determined by IMS. The tissue section is characterized by the spatial distribution of compounds, i.e. by molecular information. The compounds can be all kinds of biological substances, like proteins, nucleic acids, lipids and sugars, or drugs. Chemical modifications of compounds, in particular posttranslational modifications of proteins and metabolites of drugs, can be determined across the tissue section. In general, IMS generates spatially resolved mass spectra and thus provides high content molecular information as well as morphologic information, the latter at a limited spatial resolution compared with the optical images.
According to Suckau et al. (U.S. Pat. No. 7,873,478 B2), the spatial distribution of a tissue kind and state can be determined by combining at least two different mass signals at each pixel with predetermined mathematical or logical expressions to generate a measure representing the tissue kind and state at that spot. The different mass signals represent different compounds, i.e., two or more different mass images are combined with predetermined mathematical or logical expressions into a state image of the tissue section. The state image is often displayed together with an optical image of the tissue section.
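As a rough sketch of how two mass images might be combined pixel by pixel into a state image: the comparison used below (signal A exceeding a multiple of signal B) is a placeholder chosen purely for illustration and is not the expression taught by Suckau et al.; the function name, the threshold and the intensity values are likewise hypothetical.

```python
def state_image(image_a, image_b, threshold):
    """Mark a pixel True where signal A exceeds `threshold` times signal B."""
    # Placeholder logical expression, assumed for illustration only.
    return {xy: image_a[xy] > threshold * image_b[xy] for xy in image_a}


# Two made-up mass images defined on the same two pixels.
image_a = {(0, 0): 8.0, (1, 0): 1.0}
image_b = {(0, 0): 2.0, (1, 0): 2.0}

state = state_image(image_a, image_b, threshold=2.0)
```

Here the pixel (0, 0) is flagged and the pixel (1, 0) is not, so the result is a binary map that could be overlaid on an optical image.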
Normalization is the process of multiplying (or dividing) a mass spectrum by an intensity-scaling factor (normalization factor f) to expand or reduce the range of the intensity axis. It is used to compare mass spectra of varying intensity (Baggerly 2003, Morris 2005, Norris 2007, Smith 2006, Villanueva 2005, Wagner 2003, Wolski 2006, Wu 2003; see list at the end of the disclosure). In general, a mass spectrum S is a vector of multiple intensity values s_i (i = 1 ... N) at corresponding masses m_i. The mass spectrum S is multiplied or divided by the normalization factor to generate a normalized mass spectrum.
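The scaling step itself can be sketched in a few lines. In this illustration the factor f is taken, as an assumption, to be the sum of the intensity values of each spectrum; returning an all-zero spectrum unchanged is also a choice made for the example, not something the disclosure prescribes.

```python
def normalize(spectrum, f):
    """Divide every intensity value s_i by the normalization factor f."""
    if f == 0:
        # Assumption: an empty (all-zero) spectrum is returned unchanged.
        return list(spectrum)
    return [s / f for s in spectrum]


# Two made-up spectra with the same shape but a tenfold intensity difference.
weak = [1.0, 2.0, 1.0]
strong = [10.0, 20.0, 10.0]

weak_n = normalize(weak, sum(weak))
strong_n = normalize(strong, sum(strong))
```

After scaling, both spectra have identical intensity values, which is what makes spectra of varying overall intensity comparable.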
Intrinsic properties of a tissue and the preparation of a tissue section for MALDI imaging may influence the normalization of the acquired mass spectra and can lead to artifacts in normalized mass images. For example, an inhomogeneous spatial distribution of salts or endogenous compounds can suppress the formation of ions in the MALDI process and lead to an inhomogeneous mass image of a compound that is homogeneously distributed in the tissue section. The mass signals of lipids present in the tissue can be much more intense than the signals of peptides or proteins. Therefore, there is a risk that highly concentrated lipids suppress the formation of peptide and protein ions.
Further, MALDI imaging requires the preparation of a matrix layer on the tissue section. The properties of the matrix layer, particularly the size of the matrix crystals and their spatial distribution on the tissue section, can affect the mass signals of compounds, like proteins, irrespective of their concentration in the tissue section. This matters because the spatial resolution of a MALDI mass image can actually be finer than the size of the matrix crystals. In addition, contamination of the MALDI ion source can cause the image brightness to fade during the acquisition of the entire MALDI imaging data set.
Besides using an optimized and stable preparation, the influence of the tissue and its preparation on mass images can be minimized by proper normalization. Failure to apply normalization can likewise lead to artifacts in mass images. Normalization is also required to compare mass spectra across different imaging data sets in cohort studies, e.g., for biomarker discovery.
The most commonly used normalization procedures in mass spectrometry are normalization on the total ion count (TIC) and normalization on the vector norm. The TIC norm and the vector norm are special cases of the so-called p-norm of a mass spectrum S:
$$\lVert S \rVert = \left( \sum_i \lvert s_i \rvert^{p} \right)^{1/p}$$
For p=1, the normalization is based on the sum of all intensity values s_i in the mass spectrum S, which is equal to the total ion count (TIC). The TIC-normalized mass spectra have the same integrated area under the spectrum. The normalization factor of the TIC norm is:
$$f_{\mathrm{TIC}} = \sum_i \lvert s_i \rvert$$
For p=2, the p-norm equals the vector norm. The normalization factor of the vector norm is:
$$f_{\mathrm{vector}} = \sqrt{\sum_i s_i^{2}}$$
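The two normalization factors can be computed from one generic p-norm routine. The sketch below is illustrative only: the function name `p_norm` and the example intensities are assumptions, and only finite exponents p >= 1 are handled.

```python
def p_norm(spectrum, p):
    """Return (sum_i |s_i|^p)^(1/p) for a finite exponent p >= 1."""
    if p < 1:
        raise ValueError("p must be at least 1")
    return sum(abs(s) ** p for s in spectrum) ** (1.0 / p)


spectrum = [3.0, 4.0, 0.0]       # made-up intensity values s_i

f_tic = p_norm(spectrum, 1)      # p = 1: sum of the intensities (TIC)
f_vector = p_norm(spectrum, 2)   # p = 2: Euclidean length of the vector
```

For this spectrum the TIC factor is 7 and the vector-norm factor is 5, so the same spectrum is scaled differently depending on the chosen exponent.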
For p→∞, the p-norm becomes the maximum norm, in which each mass spectrum is normalized on its most intense peak (this norm is sometimes used in LC-MS based label-free approaches). The larger the exponent p, the more strongly the most intense signals determine the result of the normalization. This is also true for noise spectra. With the maximum norm, the highest intensity value of a noise spectrum is scaled to the same value as the most intense signal of any other spectrum. Noise spectra are thus increasingly amplified as p increases and are expected to be least problematic in TIC normalization.
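The growing weight of noise spectra with increasing p can be checked numerically. The following sketch uses two made-up spectra, one flat "noise" spectrum and one with a single real peak, and compares how bright the strongest value of each appears after normalization; the helper names are hypothetical.

```python
import math


def p_norm(spectrum, p):
    """p-norm of a spectrum; p = math.inf gives the maximum norm."""
    if math.isinf(p):
        return max(abs(s) for s in spectrum)
    return sum(abs(s) ** p for s in spectrum) ** (1.0 / p)


def peak_after_normalization(spectrum, p):
    """Highest intensity of the spectrum after division by its p-norm."""
    return max(spectrum) / p_norm(spectrum, p)


noise = [1.0, 1.0, 1.0, 1.0]      # made-up spectrum containing only noise
signal = [1.0, 1.0, 10.0, 1.0]    # made-up spectrum with one real peak

# Brightness of the noise spectrum relative to the signal spectrum.
ratios = {
    p: peak_after_normalization(noise, p) / peak_after_normalization(signal, p)
    for p in (1, 2, math.inf)
}
```

The ratio rises from about 0.33 for p=1 to about 0.51 for p=2 and reaches exactly 1 for the maximum norm, where the noise spectrum appears as bright as the spectrum containing a real peak.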
Both TIC normalization and normalization on the vector norm are based on the assumption that a comparable number of signals with more or less similar intensities is present in all mass spectra to be normalized. This assumption is fulfilled for samples, like serum samples or homogenized tissue samples, where only a few signal intensities change against an otherwise constant background. In mass spectrometric imaging data sets, one cannot rely on this condition being met because different types of tissue (or cells) may be present in the same tissue section. As a consequence, it is possible to compare expression levels across samples for comparable types of tissue after TIC normalization. However, the error can be high when comparing expression levels between different types of tissue expressing a heterogeneous set of compounds with quite different spatial distributions. In certain cases, TIC normalization can produce misleading results and possibly lead to wrong conclusions, e.g., regarding the spatial distribution of a potential biomarker, drug or metabolite of a drug. This is typical for tissues in which abundant signals are present in confined areas, such as insulin in the pancreas or beta-amyloid peptides in the brain. The question of whether or not MALDI imaging data sets should be normalized, and of the optimal model for doing so, is still the subject of intense debate at conferences and MALDI imaging workshops.
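The kind of artifact described here can be reproduced with a toy calculation. In the sketch below, all intensities are invented: a compound A is present at the same intensity in two pixels, but one pixel additionally contains an abundant, locally confined compound B (standing in for a signal such as insulin in an islet).

```python
def tic_normalize(spectrum):
    """Divide a spectrum by its total ion count (sum of all intensities)."""
    tic = sum(spectrum)
    return [s / tic for s in spectrum]


# Channels: [compound A, abundant local compound B, remaining background]
outside = [10.0, 0.0, 10.0]    # pixel without the abundant compound
inside = [10.0, 80.0, 10.0]    # pixel inside the confined area

a_outside = tic_normalize(outside)[0]
a_inside = tic_normalize(inside)[0]
```

Although compound A has the same raw intensity in both pixels, its TIC-normalized intensity is 0.5 outside and roughly 0.1 inside the confined area, so a homogeneously distributed compound would appear depleted exactly where the abundant signal is located.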
In principle, every mass spectrometer analyzes ions according to the ratio of their mass to the number of their unbalanced elementary charges (m/z, also termed the “charge-related mass”). Since MALDI is of particular importance for acquiring spatially resolved mass spectra and provides only singly charged ions, the term “mass” rather than “charge-related mass” is used below for the sake of simplicity. Spatially resolved mass spectra of mass spectrometric imaging data sets can be acquired with different kinds of mass spectrometers. At present, time-of-flight mass spectrometers (TOF-MS) with axial ion injection are mainly used for MALDI imaging, but time-of-flight mass spectrometers with orthogonal ion injection, ion traps (electrostatic or high frequency) or ion cyclotron resonance mass spectrometers can also be used for this purpose.