The present invention relates to automated diagnostic techniques in medicine and biology, and more particularly to a neural network for multi-spectral segmentation of nuclear and cytoplasmic objects.
Automated diagnostic systems in medicine and biology often rely on the visual inspection of microscopic images. Known systems attempt to mimic or imitate the procedures employed by humans. An appropriate example of this type of system is an automated instrument designed to assist a cyto-technologist in the review or diagnosis of Pap smears. In its usual operation such a system will rapidly acquire microscopic images of the cellular content of the Pap smears and then subject them to a battery of image analysis procedures. The goal of these procedures is the identification of images that are likely to contain unusual or potentially abnormal cervical cells.
The image analysis techniques utilized by these automated instruments are similar to the procedures consciously, and often unconsciously, performed by the human cyto-technologist. There are three distinct operations that must follow each other for this type of evaluation: (1) segmentation; (2) feature extraction; and (3) classification.
Segmentation is the delineation of the objects of interest within the micrographic image. In addition to the cervical cells required for an analysis, there is a wide range of “background” material, debris and contamination that interferes with the identification of the cervical cells and therefore must be delineated. Also, for each cervical cell, it is necessary to delineate the nucleus from the cytoplasm.
The Feature Extraction operation is performed after the completion of the segmentation operation. Feature extraction comprises characterizing the segmented regions as a series of descriptors based on the morphological, textural, densitometric and colorimetric attributes of these regions.
The Classification step is the final step in the image analysis. The features extracted in the previous stage are used in some type of discriminant-based classification procedure. The results of this classification are then translated into a “diagnosis” of the cells in the image.
Of the three stages outlined above, segmentation is the most crucial and the most difficult. This is particularly true for the types of images typically encountered in medical or biological specimens.
In the case of a Pap smear, the goal of segmentation is to accurately delineate the cervical cells and their nuclei. The situation is complicated not only by the variety of cells found in the smear, but also by the alterations in morphology produced by the sample preparation technique and by the quantity of debris associated with these specimens. Furthermore, during preparation it is difficult to control the way cervical cells are deposited on the surface of the slide, which leads to a large amount of cell overlap and distortion.
Under these circumstances a segmentation operation is difficult. One known way to improve the accuracy and speed of segmentation for these types of images involves exploiting the differential staining procedure associated with all Pap smears. According to the Papanicolaou protocol the nuclei are stained dark blue while the cytoplasm is stained anything from a blue-green to an orange-pink. The Papanicolaou Stain is a combination of several stains or dyes together with a specific protocol designed to emphasize and delineate cellular structures of importance for pathological analysis. The stains or dyes included in the Papanicolaou Stain are Haematoxylin, Orange G and Eosin Azure (a mixture of two acid dyes, Eosin Y and Light Green SF Yellowish, together with Bismarck Brown). Each stain component is sensitive to or binds selectively to a particular cell structure or material. Haematoxylin binds to the nuclear material, colouring it dark blue. Orange G is an indicator of keratin protein content. Eosin Y stains nucleoli, red blood cells and mature squamous epithelial cells. Light Green SF Yellowish stains metabolically active epithelial cells. Bismarck Brown stains vegetable material and cellulose.
The combination of these stains and their diagnostic interpretation has evolved into a stable medical protocol which predates the advent of computer-aided imaging instruments. Consequently, the dyes present a complex pattern of spectral properties to standard image analysis procedures. Specifically, a simple spectral decomposition based on the optical behaviour of the dyes is not sufficient on its own to reliably distinguish the cellular components within an image. The overlap of the spectral response of the dyes is too large for this type of straight-forward segmentation.
The use of differential staining characteristics is only the means to the end in the solution to the problem of segmentation. Of equal importance is the procedure for handling the information provided by the spectral character of the cellular objects when making a decision concerning identity.
Attempts have been made in the art to automate diagnostic procedures; however, there remains a need for a system for performing the segmentation process.
The present invention provides a Neural-Network Assisted Multi-Spectral Segmentation (also referred to as the NNA-MSS) method and system.
The first stage according to the present invention comprises the acquisition of three images of the same micrographic scene. Each image is obtained using a different narrow band-pass optical filter which has the effect of selecting a narrow band of optical wavelengths associated with distinguishing absorption peaks in the stain spectra. The choice of optical wavelength bands is guided by the degree of separation afforded by these peaks when used to distinguish the different types of cellular material on the slide surface.
The second stage according to the invention comprises a neural network (trained on an extensive set of typical examples) that makes decisions on the identity of material already deemed to be cellular in origin. The neural network decides whether a picture element in the digitized image is nuclear or not nuclear in character. With the completion of this step the system can continue by applying a standard range of image processing techniques to refine the segmentation. The relationship between the cellular components and the transmission intensity of the light images in each of the three spectral bands is complex and non-linear. By using a neural network to combine the information from these three images it is possible to achieve a high degree of success in separating the cervical cells from the background and the nuclei from the cytoplasm, a success that would not be possible with a set of linear operations alone.
The diagnosis and evaluation of Pap smears is aided by the introduction of a differential staining procedure called the Papanicolaou Stain. The Papanicolaou Stain is a combination of several stains or dyes together with a specific protocol designed to emphasize and delineate cellular structures of importance to pathological analysis. The stains or dyes included in the Papanicolaou Stain are Haematoxylin, Orange G and Eosin Azure (a mixture of two acid dyes, Eosin Y and Light Green SF Yellowish, together with Bismarck Brown). Each stain component is sensitive to or binds selectively to a particular cellular structure or material. Haematoxylin binds to the nuclear material colouring it dark blue; Orange G is an indicator of keratin protein content; Eosin Y stains nucleoli, red blood cells and mature squamous epithelial cells; Light Green SF yellowish stains metabolically active epithelial cells; Bismarck Brown stains vegetable material and cellulose.
According to another aspect of the invention, three optical wavelength bands are used in a complex procedure to segment Papanicolaou-stained epithelial cells in digitized images. The procedure utilizes standard segmentation operations (erosion, dilation, etc.) together with the neural-network to identify the location of nuclear components in areas already determined to be cellular material.
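The standard segmentation operations named above (erosion and dilation) can be illustrated with a minimal sketch. It assumes a binary mask stored as a list of rows, a 3x3 structuring element, and that pixels outside the image count as background; none of these choices is specified in the text, and the function names are hypothetical.

```python
def _neighbourhood(mask, y, x):
    """Yield the 3x3 neighbourhood of (y, x); pixels outside the image count as 0."""
    h, w = len(mask), len(mask[0])
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            ny, nx = y + dy, x + dx
            yield mask[ny][nx] if 0 <= ny < h and 0 <= nx < w else 0


def erode(mask):
    """Keep a pixel only if every pixel in its 3x3 neighbourhood is set."""
    return [[int(all(_neighbourhood(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def dilate(mask):
    """Set a pixel if any pixel in its 3x3 neighbourhood is set."""
    return [[int(any(_neighbourhood(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def opening(mask):
    """Erosion followed by dilation: removes specks smaller than the 3x3 element."""
    return dilate(erode(mask))
```

Applied to a mask that holds a 3x3 block plus a single stray pixel, `opening` would return the block without the stray pixel, which is the kind of refinement such operations are typically used for.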
The purpose of the segmentation is to extract the cellular objects, i.e. to distinguish the nucleus of the cell from the cytoplasm. According to this segmentation the multi-spectral images are divided into two classes, cytoplasm objects and nuclear objects, which are separated by a multi-dimensional threshold t in a 3-dimensional space.
The neural network according to the invention comprises a Probability Projection Neural Network (PPNN). The PPNN according to the present invention features fast training for a large volume of data, processing of multi-modal non-Gaussian data distributions, and good generalization combined with high sensitivity to small clusters of patterns representing the useful subclasses of cells. In another aspect, the PPNN is implemented as a hardware-encoded algorithm.
In one aspect, the present invention provides a method for identifying nuclear and cytoplasmic objects in a biological specimen, said method comprising the steps of: (a) acquiring a plurality of images of said biological specimen; (b) identifying cellular material from said images and creating a cellular material map; (c) applying a neural network to said cellular material map and classifying nuclear and cytoplasmic objects from said images.
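Steps (a) to (c) of the method can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the three images are given as lists of rows of 8-bit transmission intensities, the cellular material map is built with a simple hypothetical rule (a pixel is cellular if any band is darker than `background_level`), and the neural network is represented by an arbitrary callable `classify_pixel` supplied by the caller.

```python
def cellular_material_map(bands, background_level=200):
    """Step (b), assumed rule: a pixel is cellular if any band transmits
    less light than the (hypothetical) background level."""
    height, width = len(bands[0]), len(bands[0][0])
    return [[int(min(band[y][x] for band in bands) < background_level)
             for x in range(width)]
            for y in range(height)]


def segment(bands, classify_pixel, background_level=200):
    """Step (c): apply the classifier only where the map marks cellular material.

    classify_pixel takes a tuple of three band intensities and returns a
    truthy value for nuclear pixels; it stands in for the trained network.
    """
    cell_map = cellular_material_map(bands, background_level)
    labels = []
    for y, row in enumerate(cell_map):
        label_row = []
        for x, is_cellular in enumerate(row):
            if not is_cellular:
                label_row.append("background")
            else:
                pixel = tuple(band[y][x] for band in bands)
                label_row.append("nucleus" if classify_pixel(pixel) else "cytoplasm")
        labels.append(label_row)
    return labels
```

Passing the classifier as a parameter keeps the sketch independent of any particular network; restricting it to pixels already marked as cellular mirrors the order of steps (b) and (c).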
In a second aspect, the present invention provides a system for identifying nuclear and cytoplasmic objects in a biological specimen, said system comprising: (a) image acquisition means for acquiring a plurality of images of said biological specimen; (b) processing means for processing said images and generating a cellular material map identifying cellular material; (c) neural processor means for processing said cellular material map and including means for classifying nuclear and cytoplasmic objects from said images.
In a third aspect, the present invention provides a hardware-encoded neural processor for classifying input data, said hardware-encoded neural processor comprising: (a) a memory having a plurality of addressable storage locations; (b) said addressable storage locations containing classification information associated with the input data; (c) address generation means for generating an address from said input data for accessing the classification information stored in said memory for selected input data.
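The memory-based arrangement of elements (a) to (c) can be pictured in software as a look-up table. The sketch below is one possible reading, with several assumptions that are not in the text: each input is a triple of 8-bit band intensities, each intensity is quantized to 4 bits, the three values are packed into a 12-bit address, and the table is filled from a hypothetical list of labelled example pixels. All function names are illustrative.

```python
def quantize(value, bits):
    """Keep the `bits` most significant bits of an 8-bit intensity (0-255)."""
    return value >> (8 - bits)


def make_address(pixel, bits=4):
    """Address generation: pack three quantized band intensities into one address."""
    b1, b2, b3 = (quantize(v, bits) for v in pixel)
    return (b1 << (2 * bits)) | (b2 << bits) | b3


def build_table(labelled_pixels, bits=4):
    """Fill the addressable memory with one class label per storage location.

    labelled_pixels is an iterable of ((band1, band2, band3), label) pairs;
    locations never written keep the default label 0 (not nuclear).
    """
    table = bytearray(1 << (3 * bits))
    for pixel, label in labelled_pixels:
        table[make_address(pixel, bits)] = label
    return table


def classify(table, pixel, bits=4):
    """Classification is a single memory read at the generated address."""
    return table[make_address(pixel, bits)]
```

With this layout, classifying a pixel costs one address computation and one memory access regardless of how the stored labels were derived, which is the usual motivation for encoding a trained classifier in a table.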