Pathologists examine tissue under a microscope to discern whether there are any deviations from normal that indicate injury or disease. This practice is prone to subjectivity, resulting in significant variations between experienced pathologists. Quantitative tissue analysis based on automated image analysis has the potential to reduce or eliminate this subjectivity, yielding a more objective basis for diagnosis and for a course of treatment. Quantitative tissue analysis also has a large potential role in research, allowing for rapid and automated processing of large amounts of histopathological data, as required by, for example, the Human Protein Atlas Project [1].
Pathologists rely on multiple, contrasting stains for tissue analysis. For example, hematoxylin, which stains cell nuclei blue, is usually combined with the counter-stain eosin, which stains cytoplasm pink and stromal components in various grades of red/pink, providing local color contrast. But while pathologists can effectively use color in combination with texture and morphological features for visual analysis, automated tissue recognition based on color is fraught with problems. First, there can be large inter- and intra-specimen variations in stain intensity due to tissue preparation factors, including variations in stain concentration, staining duration, tissue thickness, and fixation. In order to use color as a basis for diagnosis, it is essential that tissue classification be based only on the tissue absorption characteristics for a specific stain, without the influence of variations that are introduced in specimen preparation [2].
A second set of problems is the result of aliasing in the image acquisition process, both in the spectral and spatial domains. Different stains may have overlapping absorption spectra, requiring a method that classifies portions of pixels into the correct stain/tissue combinations. Instead of classifying a pixel with two or more stain/tissue combinations as one of the combinations (binary classification), soft classification rules separate the relative contributions of the stains in each pixel, yielding a more accurate classification into density maps. Similarly, aliasing due to limited spatial resolution or tissue thickness may result in multiple tissue components, e.g., cell nuclei and cytoplasm, being collocated within a single pixel. Again, for a more accurate classification, the relative contributions of each stain/tissue combination within pixels must be separated.
A third problem is the result of photon noise at image acquisition. Standard three-channel CCD sensors have a linear response to the number of incident photons, and the dominant noise is Poisson-distributed photon noise. Introducing a noise model into the decomposition increases the accuracy of the results.
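To illustrate why photon noise matters for stain separation, the following minimal sketch simulates Poisson-distributed photon counts and looks at the resulting spread in optical density. The photon counts, the background level, and the function name `od_noise_std` are assumptions made for illustration only; the sketch does not reproduce any particular noise model from the literature.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

def od_noise_std(mean_photons, background_photons=10000.0, samples=100000):
    """Empirical standard deviation of optical density under Poisson photon noise.

    mean_photons and background_photons are hypothetical values, not
    measurements from any real sensor.
    """
    counts = rng.poisson(mean_photons, size=samples)
    counts = np.maximum(counts, 1)  # guard against log(0)
    od = -np.log10(counts / background_photons)
    return od.std()

bright = od_noise_std(8000.0)  # lightly stained pixel, many photons reach the sensor
dark = od_noise_std(100.0)     # heavily stained pixel, few photons reach the sensor
print(f"OD noise, bright pixel: {bright:.4f}")
print(f"OD noise, dark pixel:   {dark:.4f}")
```

Because the variance of a Poisson variable equals its mean, the relative noise grows as the photon count drops, so densely stained pixels carry noisier optical density values than lightly stained ones.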
Color decomposition is a technique developed in fluorescence microscopy based on ideas from remote sensing. Keshava and Mustard [3] describe spectral unmixing as a procedure requiring determination of reference spectra, or colors, and decomposition, i.e., the extraction of a set of gray-level images showing individual contributions of the pixels to each spectral band.
While multispectral solutions offer the advantage that filters may be matched to several stains [4], multispectral imaging is more costly and more time-consuming than three-channel imaging, where three-channel imaging generally refers to red-green-blue (RGB) imaging. Furthermore, Boucheron et al. show that multispectral imaging gives only a statistically insignificant increase in performance in histological image analysis [5].
Color decomposition methods in the literature differ in terms of the reference color determination and the algorithm for the actual decomposition of the original image into density maps [6]. Some methods determine the reference color in a color space, while others model light absorption and thus need to model light-scattering stains separately. Some reference color determination methods may require user input, while others are completely automated. Decomposition may be implemented through either binary or soft pixel classification. With linear decomposition, a soft pixel classification technique based on the linear mixture model, it is possible to estimate the density information on a subpixel level. Finally, only some methods handle linearly dependent color signatures.
Reference color determination in histological applications often relies on clustering techniques implemented directly in color space, without any consideration for stain/tissue interactions or properties of the sensors. Such methods [7, 8, 9] result only in binary classification, which in general leads to a loss of information [10].
Color deconvolution [10] is a decomposition method for transmission bright-field microscopy similar to Castleman's color compensation used in fluorescence microscopy [11]. In color deconvolution, the user manually selects regions in a training image for each stain/tissue combination. The data are then transformed according to the Beer-Lambert law, and normalized average red-green-blue values are computed for each selected stain/tissue combination. These normalized color vectors are then used to build a mixing matrix for the decomposition of the histological tissue image data into density maps, one for each stain/tissue combination.
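The steps of color deconvolution described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of [10]: the function names are hypothetical, the background intensity is assumed to be 255 in every channel, intensities are clipped to avoid taking the logarithm of zero, and the reference colors used in the usage example are invented rather than measured.

```python
import numpy as np

def optical_density(rgb, background=255.0):
    """Beer-Lambert transform per channel: OD = -log10(I / I0)."""
    rgb = np.asarray(rgb, dtype=float)
    # Clipping is an assumption of this sketch; it avoids log(0) for black pixels.
    return -np.log10(np.clip(rgb, 1.0, background) / background)

def reference_vector(region_rgb, background=255.0):
    """Normalized average OD of a user-selected region (array of RGB values)."""
    mean_od = optical_density(region_rgb, background).reshape(-1, 3).mean(axis=0)
    return mean_od / np.linalg.norm(mean_od)

def deconvolve(image_rgb, mixing_matrix, background=255.0):
    """Decompose an H x W x 3 image into one density map per row of mixing_matrix."""
    image_rgb = np.asarray(image_rgb, dtype=float)
    od = optical_density(image_rgb, background).reshape(-1, 3)
    # Linear mixture model: od = densities @ mixing_matrix,
    # so the least-squares estimate is od @ pinv(mixing_matrix).
    densities = od @ np.linalg.pinv(mixing_matrix)
    return densities.reshape(image_rgb.shape[:2] + (mixing_matrix.shape[0],))

# Usage with two invented reference colors (rows), normalized to unit length.
M = np.array([[0.6, 0.7, 0.3], [0.1, 0.9, 0.2]])
M = M / np.linalg.norm(M, axis=1, keepdims=True)

# Synthesize a 1 x 2 image from known densities, then recover them.
true_density = np.array([[[0.5, 0.2], [0.0, 0.8]]])
image = 255.0 * 10.0 ** (-(true_density @ M))
estimated = deconvolve(image, M)
```

In this noise-free setting the estimated density maps match the densities used to synthesize the image; with real data the result depends on how well the selected regions represent each stain/tissue combination.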
Blind methods for determining stain/tissue combination reference colors, borrowed from remote sensing, are based on non-negative matrix factorization (NMF), independent component analysis (ICA), or principal component analysis (PCA). The purpose of these methods is to derive a mixing matrix for multispectral analysis [6, 12]. Recently, Begelman et al. [13] showed excellent results for hyperspectral data using sparse component analysis. Nevertheless, only NMF [14] and PCA [15] have been tested using three-color image data.
Following decomposition, soft pixel classification is often implemented as a matrix multiplication by the pseudoinverse of the mixing matrix [3, 10, 14, 16]. This approach therefore requires that the reference colors of the identified stain/tissue combinations yield a well-conditioned mixing matrix.
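The sensitivity of pseudoinverse-based soft classification to the conditioning of the mixing matrix can be illustrated with a small numerical sketch. The two mixing matrices, the perturbation, and the helper name `unmix` are hypothetical and chosen only to contrast clearly distinct reference colors with nearly identical ones.

```python
import numpy as np

def unmix(od_pixels, mixing_matrix):
    """Soft classification: least-squares densities via the pseudoinverse."""
    return np.asarray(od_pixels, dtype=float) @ np.linalg.pinv(mixing_matrix)

# Hypothetical reference colors (one per row) in optical density space.
distinct = np.array([[0.9, 0.3, 0.3], [0.2, 0.9, 0.4]])
similar = np.array([[0.60, 0.70, 0.40], [0.61, 0.69, 0.40]])

# Condition number: ratio of the largest to the smallest singular value.
cond_distinct = np.linalg.cond(distinct)
cond_similar = np.linalg.cond(similar)

# Same true densities and the same small perturbation of the measured OD.
true_density = np.array([1.0, 1.0])
noise = np.array([0.01, -0.01, 0.0])
err_distinct = np.abs(unmix(true_density @ distinct + noise, distinct) - true_density).max()
err_similar = np.abs(unmix(true_density @ similar + noise, similar) - true_density).max()

print(f"condition numbers: {cond_distinct:.1f} vs {cond_similar:.1f}")
print(f"max density error: {err_distinct:.3f} vs {err_similar:.3f}")
```

When the reference colors are nearly parallel, the smallest singular value of the mixing matrix approaches zero, so a perturbation of about one percent in optical density is amplified into a large error in the estimated densities.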
Spectral angle mapping [2, 17, 18] offers a stable solution even when the mixing matrix is ill-conditioned and allows for a greater number of stain/tissue combinations than color channels. However, the output images of spectral angle mapping are binary; that is, the mapping does not use linear decomposition but rather nearest-neighbor pixel classification by spectral angles.
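Nearest-neighbor classification by spectral angle can be sketched as below. The function name `spectral_angle_map`, the small epsilon guarding against zero-length vectors, and the reference colors are assumptions of this illustration; the example deliberately uses four references with only three channels.

```python
import numpy as np

def spectral_angle_map(od_pixels, references, eps=1e-12):
    """Label each pixel (N x C) with the reference (K x C) at the smallest angle."""
    pixels = np.asarray(od_pixels, dtype=float)
    refs = np.asarray(references, dtype=float)
    # Cosine of the angle between every pixel and every reference vector.
    norms = np.linalg.norm(pixels, axis=1, keepdims=True) * np.linalg.norm(refs, axis=1)
    cos = (pixels @ refs.T) / np.maximum(norms, eps)
    angles = np.arccos(np.clip(cos, -1.0, 1.0))
    # Binary output: one label per pixel, no sub-pixel density information.
    return angles.argmin(axis=1), angles

# Hypothetical references: more stain/tissue combinations (4) than channels (3).
references = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0],
                       [0.0, 0.0, 1.0],
                       [1.0, 1.0, 0.0]])
pixels = np.array([[2.0, 0.1, 0.0],
                   [0.1, 0.1, 3.0],
                   [1.0, 0.9, 0.0]])

labels, angles = spectral_angle_map(pixels, references)
print(labels)  # [0 2 3]
```

Because the angle ignores vector length, scaling a pixel's optical density does not change its label. This is also why the output carries no density information: the third pixel, close to a mixture of the first two references, is simply assigned to the fourth.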