The present invention generally relates to the processing of digital images of histopathological tissue specimens to provide for the objective characterization and analysis of tissue structural features ("tissue information"). Such tissue information allows for comparison and combination with tissue information obtained through studies taking place at different times, with different protocols for the collection, preservation and histological staining of the tissue. The present invention specifically relates to a system and method, based on a physical model for absorption of light by histological stains, to measure the amount of stain within specific locations within tissue.
Generally, accurate and repeatable quantitative analysis of tissue is important to characterize the progression of various pathologies, and to evaluate effects that new therapies might have. To date, little, if any, reliable structural information exists at the tissue level (1-1000 microns, that is, in the microscopic to mesoscopic range). It is believed that if reliable, multi-dimensional tissue structural information existed in readily accessible databases capable of continuous assimilation with newly acquired information, including clinical and molecular (including genetic) information, such information would serve to enhance and accelerate new advances in tissue engineering, drug design and development, gene discovery, proteomics, and genomics research.
Specifically, this invention pertains to the improvement of methods of analysis of digital images of histopathological tissue specimens through the improved detection and interpretation of stains in such specimens. The development of an automated analysis system to identify and quantify structural features of tissues provides for objective comparison between tissues and allows for closer correlation between the presentations of such features and concurrent patterns of gene or protein expression.
There are several requirements for an automated analysis system. An automated system must be able to run unattended for several hours. Slide feeding, slide positioning, tissue detection, auto-focus, and image acquisition are all steps that must be accomplished before stain analysis can begin. This automated process mimics the manual process of capturing and saving tissue images by a pathologist. The technical hurdle in this step is to make this process robust even when histology and tissue placement are poor. This invention discloses a stain detection technique that enables the creation of a robust analysis system.
An example delineates the problem that is solved. Antibody staining is used to detect the presence of specific proteins in tissue, so that an improved method for detection and interpretation of stains in antibody staining helps researchers rapidly identify antibodies that may have potential as therapeutics. Specific antibody staining also locates structural features for objective classification and quantification, as where the CD31 antibody labels endothelial growth factors suggestive of new blood vessel formation. When horseradish peroxidase is used as a marker, the colored compound formed preferentially absorbs green and blue light with little absorption of red light. Where specific antibody staining of tissue occurs, it usually occurs to such a degree that the resulting image shows very strong absorption of green and blue light. The resulting observed color is dark reddish brown. In the areas where the tissue exhibits weak staining, the color varies over a range of light to dark reddish brown. Even if the tissue exhibits areas where staining is weak, this is not significant since the tissue has already been identified as showing specific staining because of the existence of the dark stained areas. As a result, the presence of specific staining can usually be detected by the existence of color within a small range of a single color. Specific stain is determined to exist within a tissue if a percentage of tissue has the color of dark specific staining and this percentage is above a user specified threshold value.
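The single-color detection rule described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function name `specific_stain_detected`, the Euclidean color distance, and the parameters `color_tol` and `min_fraction` are assumptions introduced here, and the tissue mask is assumed to come from an earlier tissue-detection step.

```python
import numpy as np

def specific_stain_detected(rgb, tissue_mask, stain_rgb, color_tol, min_fraction):
    """Return (detected, fraction) for a single reference stain color.

    rgb          : (H, W, 3) array of red, green, blue intensities
    tissue_mask  : (H, W) boolean array, True where tissue is present
    stain_rgb    : reference color of dark specific staining (assumed known)
    color_tol    : hypothetical maximum color distance counted as a match
    min_fraction : user-specified threshold on the fraction of tissue
    """
    pixels = rgb[tissue_mask].astype(float)           # (n, 3) tissue pixels only
    if pixels.shape[0] == 0:
        return False, 0.0
    # Distance of every tissue pixel from the reference stain color.
    dist = np.linalg.norm(pixels - np.asarray(stain_rgb, dtype=float), axis=1)
    fraction = float(np.mean(dist <= color_tol))      # share of matching tissue
    return fraction >= min_fraction, fraction
```

Because the rule tests proximity to one color only, a pixel range from light to dark reddish brown would need either a large tolerance or a different model, which is the limitation the following paragraph describes.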
There is a problem if the tissue only exhibits weak specific staining. Non-specific staining exhibits a much smaller degree of staining, with the resulting image showing much less absorption of green and blue light. The resulting color is a light brown. The single color approach to specific stain detection may not work. Instead, the color that is an indication of specific staining can fall in a range between two colors, from dark reddish brown to light reddish brown. The problem then is to detect weak specific staining while rejecting non-specific staining and maintaining a low false positive detection rate.
The present invention overcomes the problems of the current art. Present visual/manual analysis of tissue is slow, difficult, prone to error, and subjective. Variability in specimen preparation and stain formulation reduces comparability among tissues or tissue sets through visual analysis or application of automated or computer-aided classifiers that apply an external reference for associating stain color with certain tissue components. The present invention describes how color information of a stained tissue image may be transformed to yield the type and amount of stain at each pixel of the stained tissue image. The present invention describes how statistically significant results may be obtained quickly.
The present invention is generally directed to the processing of digital images of histopathological tissue specimens to provide for the objective characterization and analysis of tissue structural features ("tissue information") by means of a robust automated analysis system. The development of a robust system to identify and quantify structural features of tissues provides for objective comparison between tissues and allows for closer correlation between the presentations of such features and concurrent patterns of gene or protein expression.
The present invention specifically discloses a physical model of stain absorption that relates the amount of stain present in an area of a stained histological specimen with the color and intensity of light transmitted through the specimen. This invention allows for improved automated detection of stained tissue areas as well as quantification of the amount of stain present in each tissue area. Because the relative absorption of different stains is a means by which tissue structures are made visible, the disclosed computer-assisted detection of these stains facilitates the analysis of tissue structures.
In general aspects, the present invention is a method for analyzing the amount of stain on tissue specimens, which includes capturing a color image of a tissue specimen. A background image is gathered from a region of a substrate (e.g., a microscope slide used for mounting tissue sections) without any mounted tissue. A mean value in each band of the background image is calculated, and this information from the background image is used to adjust the color image. A fixed number of points is randomly sampled from the color image.
Then, a principal component analysis is performed to obtain three vectors. Finally, the color image is transformed to a colorspace in which the colors of the colorspace show the type and amount of stain present.
The transmitted intensity of monochromatic light passing through an absorbing medium is given by $I = I_0 e^{-\alpha s}$, where $I_0$ is the incident light intensity, $\alpha$ is the absorption coefficient of the absorbing medium, and $s$ is the product of the concentration of the dye in a stained specimen (medium) and the thickness of the tissue containing a given stain(s) that the light passes through.
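The absorption relation can be inverted to recover the quantity s from a measured intensity. The sketch below assumes the incident intensity and the absorption coefficient are already known; the function name `stain_amount` and the clipping of very dark pixels are choices made here for illustration.

```python
import numpy as np

def stain_amount(intensity, incident, alpha):
    """Solve I = I0 * exp(-alpha * s) for s.

    intensity : transmitted intensity I (scalar or array)
    incident  : incident intensity I0
    alpha     : absorption coefficient of the stain
    """
    intensity = np.asarray(intensity, dtype=float)
    # Clip the ratio so that fully dark pixels do not produce log(0)
    # (an assumption of this sketch, not part of the physical model).
    ratio = np.clip(intensity / incident, 1e-6, 1.0)
    return -np.log(ratio) / alpha
```

The logarithm makes the result linear in s, which is why the later aspects work with optical density values rather than raw intensities.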
In one aspect, the present invention is a method for analyzing the amount of stain on tissue specimens, which includes the steps of capturing an image of a stained tissue specimen; gathering a background image from a substrate used for supporting tissue specimens; calculating a mean value in each band of a background image; randomly sampling a fixed number of points from the color image; performing a principal component analysis to obtain three vectors; and transforming the color image to a colorspace in which the colors of the colorspace show the type and amount of stain present.
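The steps listed in this aspect can be sketched end to end as follows. This is an illustration under stated assumptions rather than the disclosed implementation: the function name `stain_transform` is hypothetical, the background adjustment is assumed to be a per-band logarithmic ratio (optical density), and the principal component analysis is done by eigen-decomposition of the sample covariance.

```python
import numpy as np

def stain_transform(image, background, n_samples=1000, seed=0):
    """Transform an RGB image into PCA coordinates of its optical density.

    image      : (H, W, 3) stained-tissue image
    background : (h, w, 3) image of bare substrate
    Returns (transformed image, 3x3 matrix whose rows are the PCA vectors).
    """
    image = image.astype(float)
    # Mean value in each band (R, G, B) of the background image.
    bg_mean = background.reshape(-1, 3).astype(float).mean(axis=0)
    # Adjust the color image with the background (assumed form: optical density).
    od = -np.log(np.clip(image / bg_mean, 1e-6, None))
    flat = od.reshape(-1, 3)
    # Randomly sample a fixed number of points from the image.
    rng = np.random.default_rng(seed)
    count = min(n_samples, flat.shape[0])
    sample = flat[rng.choice(flat.shape[0], size=count, replace=False)]
    # Principal component analysis on the sample: three orthonormal vectors,
    # ordered by decreasing variance.
    mean = sample.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(sample - mean, rowvar=False))
    components = eigvecs[:, np.argsort(eigvals)[::-1]].T
    # Transform every pixel into the new colorspace.
    transformed = (flat - mean) @ components.T
    return transformed.reshape(image.shape), components
```

Sampling keeps the cost of the analysis independent of image size; the transformation itself is a single matrix product applied to all pixels.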
In staining reactions, dyes (hematoxylin and/or eosin, for H&E stain) may be dissolved in the stained substance. A dye may be absorbed on the surface of a structure (e.g., antibody), or dyes may be precipitated within the structure, depending on the pH, temperature, etc. of the staining solution. Color may vary with the specific staining solutions used. A substance that is stained by the basic dye (e.g., H&E dyes) is basophilic. A staining solution containing hematoxylin and eosin dyes is one such staining solution.
In another aspect, a method for analyzing the amount of stain on tissue specimens includes the steps of capturing an image of a stained tissue specimen; transforming the color image to a colorspace representation in which points are represented by triplets of numbers comprising intensities of red, green, and blue colors; defining a stain curve defined in terms of a light source and a certain point in the color image; calculating the distance in colorspace along the stain curve from the triplet representing the light source to the triplet representing the certain point; and calculating the amount of stain.
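The distance along a stain curve can be illustrated numerically. The passage does not give the form of the curve, so this sketch assumes the exponential absorption model in each color band, which yields the curve `light * (point / light) ** t` for t from 0 to 1. The function names `stain_curve` and `distance_along_curve` are hypothetical, and all intensities are assumed to be strictly positive.

```python
import numpy as np

def stain_curve(light, point, n=256):
    """Sample n colorspace points on the assumed stain curve.

    The curve starts at the light-source triplet (t = 0) and ends at the
    pixel triplet (t = 1), following exponential attenuation per band.
    """
    light = np.asarray(light, dtype=float)
    point = np.asarray(point, dtype=float)
    t = np.linspace(0.0, 1.0, n)[:, None]
    return light * (point / light) ** t

def distance_along_curve(light, point, n=256):
    """Approximate the arc length of the stain curve by a polyline."""
    curve = stain_curve(light, point, n)
    segments = np.diff(curve, axis=0)
    return float(np.sum(np.linalg.norm(segments, axis=1)))
```

How the distance is converted into an amount of stain is not specified in this passage, so the sketch stops at the distance itself.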
In still another aspect, a method for analyzing the amount of stain on tissue specimens includes the steps of capturing a color image of a tissue specimen; forming a three-dimensional colorspace describing the color image; performing a principal component analysis on the three-dimensional colorspace; using the results of the principal component analysis to form a transformation matrix; and calculating the amount of stain by using the transformation matrix.
In yet another aspect, a method for analyzing the amount of stain on tissue specimens includes the steps of staining a tissue specimen; creating a colored image of the tissue specimen, wherein the image is comprised of pixels, wherein each pixel comprises a red, green and blue sub-pixel, wherein each sub-pixel has an intensity and the intensities of each pixel define a triple of intensity values; gathering a background image; converting the triple of intensity values into a triple of optical density values; randomly sampling the optical density values to obtain a number of values large enough to obtain statistically significant results; performing a principal component analysis on the randomly sampled optical density values to obtain a new coordinate system; and using the new coordinate system to define a transformation matrix to convert the triple of optical density values into a triple of values, wherein a component of the triple of values is proportional to the amount of stain at that point.
In still another aspect, a method for analyzing the amount of stain on tissue specimens includes the steps of staining a tissue specimen such that a feature of the tissue specimen is stained; creating a colored image of the tissue specimen, wherein the image is comprised of pixels, wherein each pixel comprises a red, green and blue sub-pixel, wherein each sub-pixel has an intensity and the intensities of each pixel define a triple of intensity values; gathering a background image; converting the triple of intensity values into a triple of optical density values; randomly sampling the optical density values to obtain a number of values large enough to obtain statistically significant results; performing a principal component analysis on the randomly sampled optical density values to obtain a new coordinate system; and using the new coordinate system to define a transformation matrix to convert the triple of optical density values into a triple of values, wherein a component of the triple of values is proportional to the amount of stain at that point and wherein the amount of stain at that point is proportional to the amount of the tissue feature.
In a further aspect, a method for analyzing the amount of stain on tissue specimens includes the steps of staining a tissue specimen with two stains, a first stain and a second stain; creating a colored image of the tissue specimen, wherein the image is comprised of pixels, wherein each pixel comprises a red, green and blue sub-pixel, wherein each sub-pixel has an intensity and the intensities of each pixel define a triple of intensity values; converting the triple of intensity values into a triple of difference values linearly related to the amount of stains; randomly sampling the difference values to obtain a number of values large enough to obtain statistically significant results; performing a principal component analysis on the randomly sampled difference values to obtain a new coordinate system; and using the new coordinate system to define a transformation matrix to convert the triple of difference values into a triple of values, wherein one component of the triple of values is proportional to the amount of the first stain at that point and a second component of the triple of values is proportional to the amount of the second stain at that point.
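For the two-stain case, a short sketch shows why a principal component analysis is a natural fit: if the difference values are linear in the two stain amounts, they lie in a plane, and the third principal direction carries essentially no signal. The passage does not state how the axes within that plane are assigned to the individual stains, so the sketch takes the two stain color vectors as known inputs purely for illustration. The function names, the use of an uncentered singular value decomposition, and the example vectors are all assumptions.

```python
import numpy as np

def stain_plane(samples):
    """Return (2x3 basis of the stain plane, singular values).

    samples : (n, 3) randomly sampled difference values.
    The decomposition is left uncentered on the assumption that unstained
    pixels map to the origin, so the plane passes through it.
    """
    _, sv, vt = np.linalg.svd(np.asarray(samples, dtype=float),
                              full_matrices=False)
    return vt[:2], sv

def two_stain_amounts(values, basis, stain_a, stain_b):
    """Convert (n, 3) difference values into (n, 2) stain amounts.

    stain_a, stain_b : hypothetical known color vectors of the two stains.
    """
    coords = np.asarray(values, dtype=float) @ basis.T   # plane coordinates
    # Columns are the two stain vectors expressed in plane coordinates.
    mix = np.stack([stain_a @ basis.T, stain_b @ basis.T], axis=1)
    return np.linalg.solve(mix, coords.T).T
```

A usage example with synthetic data: build values as `amounts @ np.stack([stain_a, stain_b])`, call `stain_plane` on a random sample, then pass all pixels to `two_stain_amounts`; a third singular value near zero indicates that two stains account for the observed colors.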
In another aspect, a method is provided for analyzing the amount of stains on tissue specimens wherein there are three stains on a stained tissue specimen. This method includes the steps of staining a tissue specimen with three stains, a first stain, a second stain, and a third stain; creating a colored image of the tissue specimen, wherein the image is comprised of pixels, wherein each pixel comprises a red, green and blue sub-pixel, wherein each sub-pixel has an intensity and the intensities of each pixel define a triple of intensity values; gathering a background image; using a filter to filter out the color of the third stain; converting the triple of intensity values into a triple of difference values linearly related to the amount of stains; randomly sampling the difference values to obtain a number of values large enough to obtain statistically significant results; performing a principal component analysis on the randomly sampled difference values to obtain a new coordinate system; and using the new coordinate system to define a transformation matrix to convert the triple of difference values into a triple of values, wherein one component of the triple of values is proportional to the amount of the first stain at that point and a second component of the triple of values is proportional to the amount of the second stain at that point.