Field of the Subject Disclosure
The present subject disclosure relates to digital pathology. More particularly, the present subject disclosure relates to color unmixing methods and systems for a multiplex IHC image that can accommodate any number of stain colors.
Background of the Subject Disclosure
Multiplex immunohistochemistry (IHC) staining is an emerging technique for detecting multiple biomarkers within a single tissue section, and it has become more popular because of its significant efficiencies and the rich diagnostic information it provides. A multiplex IHC slide has the potential advantage of simultaneously identifying multiple biomarkers in one tissue section, as opposed to labeling a single biomarker on each of multiple slides. Therefore, it is often used for the simultaneous assessment of multiple hallmarks of cancerous tissue. Often, a cancerous tissue slide is stained by the multiplex assay to identify biomarkers. For example, tumors in humans often contain infiltrates of immune cells (e.g., T-cells or B-cells), which may prevent the development of tumors or favor the outgrowth of tumors. In this scenario, multiple stains are used to target different types of immune cells, and the population distribution of each type of immune cell is used to study the clinical outcome of the patients. The stained slide is then imaged, for example, using a CCD color camera mounted on a microscope or a scanner.
In order to conduct accurate detection and classification of the cells, the cells are stained, for example, with chromogenic dyes, fluorescent markers and/or quantum dots, and then imaged. The image is unmixed to obtain the constituent dyes and/or the proportions of each dye in the color mixture, as a prerequisite step for multiplex image analysis, for example, multiplex IHC image analysis. Several techniques exist in the prior art to decompose each pixel of the RGB image into a collection of constituent stains and the fractional contribution of each. For example, color unmixing or deconvolution is used to unmix an RGB image with up to three stains in the converted optical density space. Given the reference color vectors $x_i \in \mathbb{R}^3$ of the pure stains, the method assumes that each pixel of the color mixture $y \in \mathbb{R}^3$ is a linear combination of the pure stain colors and solves a linear system to obtain the combination weights $b \in \mathbb{R}^M$. The linear system is denoted as $y = Xb$, where $X = [x_1, \ldots, x_M]$ ($M \le 3$) is the matrix of reference colors. This technique is the most widely used in the current digital pathology domain; however, the maximum number of stains that can be solved is limited to three, because $X$ is a $3 \times M$ matrix and the linear system has too few equations (it is underdetermined) when $M$ exceeds three. The color unmixing problem may also be formulated as a non-negative matrix factorization, with color decomposition performed in a fully automated manner, wherein no reference stain color selection is required. This method also solves for $y = Xb$ and has the same limitation in dealing with large numbers of stains. Alternatively, a color space may be divided into several systems of up to three colors each by solving a convex framework, with a linear system being used to solve each individual system. Because each pixel is assigned independently to one of these systems, spatial continuity is lost in the unmixed images and artifacts such as holes are observed.
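By way of illustration only, the conventional linear unmixing of up to three stains described above may be sketched in Python with NumPy as follows. This is a minimal sketch and not a definitive implementation. The function names rgb_to_od and unmix_pixel, the white background level of 255, and the numeric reference vectors are assumptions introduced for this example and do not appear in the disclosure.

```python
import numpy as np

def rgb_to_od(rgb, background=255.0):
    """Convert RGB intensities to optical density (assumes a white background of 255)."""
    rgb = np.asarray(rgb, dtype=float)
    # Clip to avoid log(0) on fully dark pixels.
    return -np.log10(np.clip(rgb, 1.0, background) / background)

def unmix_pixel(y, X):
    """Solve y = X b for the stain weights b in the least-squares sense."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b

# Hypothetical reference optical-density vectors, one column per pure stain.
# The values are illustrative only.
X = np.array([
    [0.65, 0.07, 0.27],
    [0.70, 0.99, 0.57],
    [0.29, 0.11, 0.78],
])

# Simulate one pixel from known weights, then recover the weights.
b_true = np.array([0.8, 0.3, 0.5])
rgb = 255.0 * 10.0 ** (-(X @ b_true))  # Beer-Lambert style forward model
y = rgb_to_od(rgb)
b_est = unmix_pixel(y, X)

# With M = 3 and linearly independent columns, b_est matches b_true
# up to floating-point error.
print(np.round(b_est, 3))
```

In this sketch the system is square and invertible, so the weights are recovered exactly. A fourth column added to X would leave three equations for four unknowns, which is the limitation noted above.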
Other methods may work for a larger number of stain colors, such as two-stage methods developed in the remote sensing domain that first learn the reference colors from the image context and then use them to unmix the image; however, these methods are designed for multi-spectral image unmixing, in which the image has more color channels than an RGB image. Sparse models for high dimensional multi-spectral image unmixing adopt the $L_0$ norm to regularize the combination weights $b$ of the reference colors, leading to a solution in which only a small number of reference colors contribute to the stain color mixture, but these models are also designed for multi-spectral images and do not use any prior biological information about the biomarkers, which may lead to undesired solutions for real data. Moreover, these methods cannot be applied to RGB images because of the image acquisition system: multi-spectral imaging captures the image using a set of spectral narrow-band filters instead of a CCD color camera. The number of filters $K$ can be as many as dozens or hundreds, leading to a multi-channel image that provides much richer information than the brightfield RGB image. The linear system constructed from it is always an over-determined system, with $X$ being a $K \times M$ ($K \gg M$) matrix, that leads to a unique solution. However, the scanning process in the multi-spectral imaging system is very time consuming, and only a single field of view manually selected by a technician can be scanned instead of the whole slide, thereby limiting the usage of such methods.
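The difference between the over-determined multi-spectral system and the three-channel RGB system may be illustrated with the following Python sketch, which is offered under stated assumptions only. The values K = 32 and M = 5, the randomly generated reference matrices, and the helper name null_space_dim are hypothetical and chosen purely for illustration. Non-negativity of the weights is ignored here so that only the linear system itself is examined.

```python
import numpy as np

rng = np.random.default_rng(0)
M = 5   # number of stains (assumed for illustration)
K = 32  # number of narrow-band channels (assumed for illustration)

# Hypothetical reference matrices: one column per stain, one row per channel.
X_ms = rng.uniform(0.05, 1.0, size=(K, M))   # multi-spectral, K >> M
X_rgb = rng.uniform(0.05, 1.0, size=(3, M))  # brightfield RGB, 3 < M

def null_space_dim(X):
    """Number of independent directions in which b can change without changing X @ b."""
    return X.shape[1] - np.linalg.matrix_rank(X)

b_true = np.array([0.6, 0.0, 0.4, 0.0, 0.2])

# Over-determined case: least squares recovers the weights.
b_ms, *_ = np.linalg.lstsq(X_ms, X_ms @ b_true, rcond=None)

# Under-determined case: the last right singular vector lies in the null
# space of X_rgb, so adding it to b_true yields a different weight vector
# that produces the same three-channel pixel.
_, _, Vt = np.linalg.svd(X_rgb)
n = Vt[-1]
b_alt = b_true + n
same_pixel = np.allclose(X_rgb @ b_alt, X_rgb @ b_true)

print(null_space_dim(X_ms), null_space_dim(X_rgb), same_pixel)
```

The multi-spectral matrix has an empty null space, so the weights are determined uniquely, whereas the RGB matrix leaves M - 3 free directions, so the three equations alone cannot distinguish between different weight vectors.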
Therefore, there exists no numerical solution for unmixing an image for which the least squares system has more unknown variables than equations. Accurately unmixing an IHC image and differentiating all the stains used is of tremendous clinical importance, since it is the initial key step in multiplex IHC image analysis in digital pathology. Due to the limitations of a CCD color camera, an acquired RGB or brightfield image contains only three channels, and unmixing it into more than three colors is a challenging task. Accordingly, a method for unmixing that compensates for the limitations of the CCD color camera is desirable.