The term “tissue state” here means the state of a small subarea of a tissue section with respect to a stress, a pathological change, an infection or other type of change compared with a normal state of this tissue. The tissue state must therefore be identifiable as a concentration pattern of substances which can be detected in this small subarea by a mass spectrometer. The substances can be peptides or proteins which are under- or overexpressed and hence form a pattern, or they can include posttranslational modifications of proteins, their breakdown products (metabolites), or collections of other substances in the tissue.
Mass spectrometry with ionization of the samples by matrix-assisted laser desorption and ionization (MALDI) has been used successfully for several years for the determination of molecular weights, and for the identification and structural characterization of proteins. In this case, the protein is usually dissolved and mixed with a solution of a matrix substance such as sinapic acid before being applied to the sample support. The solvent then evaporates and the matrix substance crystallizes, the protein crystallizing with it in the matrix crystals. Bombarding the sample obtained in this way with sufficiently energetic short pulses of laser light leads to the matrix substance absorbing energy and evaporating explosively as a result. The proteins are entrained into the gaseous cloud inside the mass spectrometer and ionized by protonation. The ions are then separated in the mass spectrometer according to their mass-to-charge ratios (m/z) and measured as a mass spectrum. Their mass can be determined from the mass spectrum. Since ionization by matrix-assisted laser desorption essentially provides only singly charged ions, in the following, we will simply refer to “mass determination” and not determination of the mass-to-charge ratios and, correspondingly, just the “mass” of the ions instead of their m/z-ratio.
These analyses can be carried out on biological samples, such as tissue homogenates, lysed bacteria or biological fluids (urine, blood serum, lymph, spinal fluid, tears, sputum), the samples generally being subjected to sufficient fractionation beforehand by chromatographic or electrophoretic techniques.
For this purpose it is advisable to free the samples from interfering impurities, such as certain buffers, salts or detergents, which reduce the efficiency of the MALDI analyses. The analysis of biological samples usually involves very time-consuming sample preparation, particularly if, at the same time, information concerning the distribution of a protein in different regions of a tissue is to be obtained. “Laser capture microdissection”, for example, can achieve this, but the time-consuming processing described above is still necessary; there is also the difficulty of obtaining sufficient material for this type of analysis.
Imaging mass spectrometry (IMS) makes it unnecessary to go to these lengths. With this method, a microscopic tissue section is produced from a piece of tissue taken from a human or animal organ of interest using a microtome, for example, and laid on a specimen slide. A matrix capable of absorbing laser energy is then applied to the surface of the specimen, for example by pneumatic spraying onto a moving support (U.S. Pat. No. 5,770,272; Biemann et al.). There are two different methods for the subsequent mass spectrometric scan: The raster scan method and stigmatic imaging of the ions of a small region.
The raster scan method produces a one- or two-dimensional intensity profile for individual proteins by scanning a microscopic tissue section with well-focused laser beam pulses in a MALDI mass spectrometer, the proteins being identifiable in the mass spectra (U.S. Pat. No. 5,808,300; Caprioli). Each spot is therefore irradiated at least once with a finely focused pulse of laser light and provides a mass spectrum which can cover a broad range of molecular weights, for example 1 to 30 kilodaltons. Using suitable software, it is then possible to define an ion mass, which represents a peptide or a protein, or a narrow mass range around this mass, in the spectra and to graphically represent its intensity distribution over the surface of the microscopic tissue section. Using this method, it has been possible to correlate the distribution of neuropeptides in the brain of a rat with specific morphological features, for example, or to depict the distribution of amyloid beta peptides in the brains of Alzheimer animal models. It is possible to visualize sections of the brain affected by “Alzheimer plaques” with precise spatial definition (Stoeckli M, Staab D, Staufenbiel M, Wiederhold K H, Signor L, Anal Biochem. 2002, 311, 33-39: Molecular imaging of amyloid beta peptides in mouse brain sections using mass spectrometry).
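The step of selecting an ion mass and mapping its intensity over the section can be illustrated with a minimal sketch. It assumes that each raster spot is addressed by a (row, column) index and that its spectrum is available as a list of (mass, intensity) pairs; the function name `ion_image` and its parameters are hypothetical and are not taken from the cited patents or from any particular instrument software.

```python
from typing import Dict, List, Tuple

# Assumption: one spectrum per raster spot, given as (mass in Da, intensity) pairs.
Spectrum = List[Tuple[float, float]]


def ion_image(
    spectra: Dict[Tuple[int, int], Spectrum],
    target_mass: float,
    half_window: float,
) -> List[List[float]]:
    """Return a 2D intensity map for one ion mass.

    For every raster spot, all intensities whose mass lies within
    target_mass +/- half_window (inclusive) are summed.
    """
    n_rows = max(row for row, _ in spectra) + 1
    n_cols = max(col for _, col in spectra) + 1
    # Spots without a spectrum keep an intensity of 0.0.
    image = [[0.0] * n_cols for _ in range(n_rows)]
    low, high = target_mass - half_window, target_mass + half_window
    for (row, col), spectrum in spectra.items():
        image[row][col] = sum(
            (intensity for mass, intensity in spectrum if low <= mass <= high),
            0.0,
        )
    return image
```

The resulting nested list can be passed to any plotting routine as a grayscale or false-color image. Summing over a narrow window rather than reading a single mass value is one simple way to tolerate small mass shifts between spots.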
The method of stigmatic imaging irradiates a defined area of up to 200 by 200 micrometers with the laser pulse. The ions formed over the area are imaged ion-optically, spot by spot on a spatially-resolving detector. So far, it has been possible to scan distribution images of these ion masses by selecting individual ion masses with this method (S. L. Luxembourg et al., Anal. Chem. 2003; 75, 1333-41); it is to be expected, however, that very fast cameras will be able to scan complete mass spectra for every spot of the area.
A considerable disadvantage of both methods is the fact that, until now, only individual features in these types of spectra have been utilized analytically, for example a peptide present in a high concentration, which is particularly typical of certain tissue states within a tissue sample. This procedure has limited the method until now and prevented a broader application for those tissue states which cannot be attributed to the appearance of one single peptide or protein.
Independently of such imaging methods, targeted searching for “markers” has developed as an interesting field of clinically oriented research (W. Pusch et al., Pharmacogenetics 2003; 4, 463-476). Here, bodily fluids such as blood, urine or spinal fluid, but also tissue extracts, are typically processed into coarse fractions with a less complex analyte composition by extracting them with chromatographic phases, solid phase extraction or other selective methods before they are mass spectrometrically characterized. The mass spectra obtained by this method display a more or less complex pattern which originates from peptides and proteins. By comparing the mass spectra of samples from healthy and sick individuals it is possible, in individual cases, to find single peptides or proteins which are characteristic of the medical condition of the individuals.
However, there is a general opinion that interesting distinguishing features with better statistical evidence can only be discovered when this method is performed on dozens or hundreds of samples from two so-called cohorts of individuals—one cohort serving as a reference and one cohort in which certain peculiarities or deviations in the spectra are expected because a specific clinical picture, such as intestinal cancer or prostate cancer, is present.
This approach has achieved preliminary successes with the discovery of distinct and statistically significant protein signals in the case of samples from ill persons. In the literature, however, a vehement argument is in progress about whether these markers can be used for diagnosis or not since, as yet, it has not been possible to establish whether these markers might simply be indicative of the patient's type of medication or a general stress situation associated with the illness. For the licensing of such markers for general diagnostic purposes, the United States FDA (Food and Drug Administration) now requires, as a minimum, that the protein found as a marker is unambiguously identified and that knowledge of the protein and its function (or its breakdown pathway, if the substance in question is a breakdown product) is used to at least establish the plausibility of a link with the illness concerned.
The objective of these analyses is naturally to make an early prediction about the possible development or proliferation of various diseases in the future of an individual. It is hoped that it will be possible to identify cancer at a very early stage, for example, and therefore to have a much better chance of fighting it.
In general, however, the mass spectra of the various cohorts do not contain any simple features such as a few individual signals whose intensities differ significantly in the cohorts. Complex mathematical-statistical analyses of the mass spectra of the various cohorts must therefore usually be carried out. These analyses can be carried out using a plurality of methods, which analyze whether it is possible to distinguish between the cohorts of healthy patients and sick patients unambiguously and to a statistically significant degree on the basis of groups of features in the mass spectra.
It is, for example, possible for a principal component analysis (PCA) to determine whether cohorts of sick individuals (or, where possible, even several cohorts with several related diseases) can be distinguished from each other and from cohorts of healthy reference individuals. If this is the case, a further mathematical computational method can use the mass spectrometric signals to calculate disease-specific distinguishing characteristics which make it possible to unambiguously identify the state of an individual with respect to a specific disease. Suitable mathematical transformations can, for example, make it possible for the disease-specific distinguishing characteristics to cover the range from minus infinity to plus infinity, where all values less than zero correspond to a healthy state and all values greater than zero to a diseased one. A very simple distinguishing characteristic can be a simple concentration ratio of two proteins, where the range extends from zero to infinity. Alternatively, the distinguishing characteristics are transformed in such a way that they cover the range from zero to one: healthy state close to zero, diseased state close to one. The detailed computational method for calculating the distinguishing characteristics (both the algorithm and the parameter values) is saved and later used for the diagnosis of this disease using mass spectra scanned from an individual's samples.
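The three value ranges mentioned above can be related to one another with a short sketch. The text does not specify which transformations are meant; the logarithm and the logistic function used here are merely common choices and are assumptions, as are the function names and the `cutoff_ratio` parameter.

```python
import math


def log_ratio_score(conc_a: float, conc_b: float, cutoff_ratio: float = 1.0) -> float:
    """Map a concentration ratio (range 0 to infinity) onto minus to plus infinity.

    Assumption: ratios above cutoff_ratio indicate the diseased state, so the
    returned score is negative for "healthy" and positive for "diseased".
    """
    if conc_a <= 0 or conc_b <= 0 or cutoff_ratio <= 0:
        raise ValueError("concentrations and cutoff_ratio must be positive")
    return math.log(conc_a / conc_b / cutoff_ratio)


def unit_interval_score(score: float) -> float:
    """Map a score from minus/plus infinity onto the range zero to one.

    Uses the logistic function in a form that avoids overflow for
    strongly negative scores.
    """
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)
```

With these choices, a ratio equal to the cutoff yields a score of zero on the unbounded scale and 0.5 on the unit scale, so the decision boundary stays at the same place under both representations.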
Genetic algorithms (GA) generate a decision path along which the medical condition of an individual can be determined. A logical expression can be obtained from the decision path which, in turn, can be transformed into a characteristic which distinguishes between different states. This logical computational method is also saved and later used for diagnosis of other samples.
Other methods for analyzing the differentiation have also been described, including linear discriminant analysis (LDA), support vector machines (SVM), neural networks (NN) and learning vector quantization (LVQ).
From the results of such statistical analyses, it is ultimately possible to obtain detailed computational methods (algorithms plus parameter sets) to calculate distinguishing characteristics that are represented as mathematical or logical expressions, each incorporating several spectral signals. These can also include very weak spectral signals. The distinguishing characteristics also seem to make it possible to represent more subtle differences between samples from different cohorts. However, the number of samples required easily runs into thousands.
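What a saved computational method (algorithm plus parameter set) might look like can be sketched as follows. The example assumes the simplest possible mathematical expression, a weighted sum of several peak intensities plus an offset; the class name, its fields and the peak-matching tolerance are hypothetical, and the parameter values would in practice come from one of the statistical analyses named above.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class LinearCharacteristic:
    """A saved parameter set for a weighted-sum distinguishing characteristic."""

    masses: List[float]   # masses (Da) of the spectral signals used
    weights: List[float]  # one weight per mass, same order
    bias: float           # constant offset added to the weighted sum
    tolerance: float      # mass tolerance (Da) for matching a peak

    def score(self, peaks: Dict[float, float]) -> float:
        """Evaluate the characteristic on a peak list {mass: intensity}."""
        total = self.bias
        for mass, weight in zip(self.masses, self.weights):
            # Sum all peaks that fall within the tolerance of this mass;
            # a missing signal simply contributes zero.
            intensity = sum(
                (i for m, i in peaks.items() if abs(m - mass) <= self.tolerance),
                0.0,
            )
            total += weight * intensity
        return total
```

Because the object is immutable and holds only plain numbers, it can be stored and later applied unchanged to spectra from new samples. Weak signals enter the sum in the same way as strong ones, weighted only by their parameters.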
It is a considerable problem here that the variation in the ion signals in the individual mass spectra, even within one cohort (patient or healthy), is large, and, for example, the age distribution in a group or the gender-specific distribution can have much more influence than the effect which is to be investigated. One of the reasons for this is the fact that the analysis of bodily fluids only provides a remote, or indirect, picture of the occurrence of the disease at the site of action (for example the tumor or the brain in the case of neuro-degenerative diseases). According to present expectations, the problem of the search for markers could be simplified if it were possible to compare healthy and diseased samples from a single individual. But this is not possible when the samples are bodily fluids, because they are homogenized in the body; at best, differences can be observed as a temporal variation over relatively long periods of time.