1. Field of the Invention
The present invention relates to methods and systems for the digital processing of radiological images, and it more specifically relates to an automated method and system for the re-screening and detection of abnormalities, such as lung nodules in radiological chest images, using multi-resolution processing, digital image processing, fuzzy logic and artificial neural networks.
2. Background Art
Lung cancer is the leading type of cancer in both men and women worldwide. Early detection and treatment of localized lung cancer at a potentially curable stage can significantly increase the patients' survival rate. Studies have shown that approximately 68% of retrospectively detected lung cancers were detected by one reader and approximately 82% were detected with an additional reader as a "second-reader". A long-term lung cancer screening program conducted at the Mayo Clinic found that 90% of peripheral lung cancers were visible in small sizes in retrospect, in earlier radiographs.
Among the common detection techniques, such as chest X-ray, analysis of the types of cells in sputum specimens, and fiber optic examination of bronchial passages, chest radiography remains the most effective and widely used method. Although skilled pulmonary radiologists can achieve a high degree of accuracy in diagnosis, problems remain in the detection of the lung nodules in chest radiography due to errors that cannot be corrected by current methods of training even with a high level of clinical skill and experience.
An analysis of the human error in diagnosis of lung cancer revealed that about 30% of the misses were due to search errors, about 25% of the misses were due to recognition errors, and about 45% of the misses were due to decision-making errors. Reference is made to Kundel, H. L., et al., "Visual Scanning, Pattern Recognition and Decision-Making in Pulmonary Nodule Detection", Investigative Radiology, May-June 1978, pages 175-181, and Kundel, H. L., et al., "Visual Dwell Indicates Locations of False-Positive and False-Negative Decisions", Investigative Radiology, June 1989, Vol. 24, pages 472-478, which are incorporated herein by reference. The analysis suggested that the miss rates for the detection of small lung nodules could be reduced by about 55% with a computerized method. According to the article by Stitik, F. P., "Radiographic Screening in the Early Detection of Lung Cancer", Radiologic Clinics of North America, Vol. XVI, No. 3, December 1978, pages 347-366, which is incorporated herein by reference, many of the missed lesions would be classified as T1M0 lesions, the stage of non-small cell lung cancer that Mountain, C. F., "Value of the New TNM Staging System for Lung Cancer", 5th World Conference in Lung Cancer Chest, 1989, Vol. 96/1, pages 47-49, which is incorporated herein by reference, indicates has the best prognosis (42%, 5 year survival). It is this stage of lung cancer, with lesions less than 1.5 cm in diameter and located outside the hilum region, that needs to be detected by a radiologist in order to improve survival rates.
Computerized techniques, such as computer aided diagnosis (CAD), have been introduced to assist in the diagnosis of lung nodules during the stage of non-small cell lung cancer. The CAD technique requires the computer system to function as a second physician to double check all the films that a primary or first physician has examined. Reduction of false positive detection is the primary objective of the CAD technique in order to improve detection accuracy.
Several CAD techniques using digital image processing and artificial neural networks have been described in numerous publications, exemplary of which are the following, which are incorporated herein by reference:
U.S. Pat. No. 4,907,156 to Doi et al. describes a method for detecting and displaying abnormal anatomic regions existing in a digital X-ray image. A single projection digital X-ray image is processed to obtain signal-enhanced image data with a maximum signal-to-noise ratio (SNR) and is also processed to obtain signal-suppressed image data with a suppressed SNR. Then, difference image data are formed by subtraction of the signal-suppressed image data from the signal-enhanced image data to remove low-frequency structured anatomic background, which is basically the same in both the signal-suppressed and signal-enhanced image data. Once the structured background is removed, feature extraction is performed. For the detection of lung nodules, pixel thresholding is performed, followed by circularity and/or size testing of contiguous pixels surviving thresholding. Threshold levels are varied, and the effect of varying the threshold on circularity and size is used to detect nodules. For the detection of mammographic microcalcifications, pixel thresholding and contiguous pixel area thresholding are performed. Clusters of suspected abnormalities are then detected. However, the algorithm described in the Doi et al. patent seems to reduce false positive rates at the expense of missing several true nodules. This prior art is limited in its detection of nodules larger than its pre-selected size of 1.5 cm. It also reduces sensitivity by selecting fixed cumulative distribution function (CDF) thresholds (e.g., 97%, 94%, 91%, etc.), since some true nodules are eliminated during this thresholding process. The algorithm described in the Doi et al. patent utilizes a single classifier (a decision tree classifier), which has inherent performance limitations. A decision tree classifier performs classification by eliminating candidates sequentially; hence, it can easily eliminate a potential nodule at the first decision node even if the rest of the decision criteria are satisfied.
Another important drawback to this prior art is that the physician has to examine every film with both true and false positives identified by the CAD system, so the time spent on the diagnosis increases dramatically.
U.S. Pat. No. 5,463,548 to Asada et al. describes a system for computer-aided differential diagnosis of diseases, and in particular, computer-aided differential diagnosis using neural networks. A first design of the neural network distinguishes between a plurality of interstitial lung diseases on the basis of inputted clinical parameters and radiographic information. A second design distinguishes between malignant and benign mammographic cases based upon similar inputted clinical and radiographic information. The neural networks were first trained using a hypothetical database made up of hypothetical cases for each of the interstitial lung diseases and for malignant and benign cases. The performance of the neural network was evaluated using receiver operating characteristics (ROC) analysis. The decision performance of the neural network was compared to experienced radiologists and achieved a high performance comparable to that of the experienced radiologists. However, Asada's method seems limited to the detection of lung diseases but not lung cancer, which presents different symptoms.
Y. S. P. Chiou, Y. M. F. Lure, and P. A. Ligomenides, "Neural Network Image Analysis and Classification in Hybrid Lung Nodule Detection (HLND) System", Neural Networks for Processing III Proceedings of the 1993 IEEE-SP Workshop, pp. 517-526. The Chiou et al. article describes a Hybrid Lung Nodule Detection (HLND) system based on artificial neural network architectures, which is developed for improving diagnostic accuracy and speed of lung cancerous pulmonary radiology. The configuration of the HLND system includes the following processing phases: (1) pre-processing to enhance the figure-background contrast; (2) quick selection of nodule suspects based upon the most pertinent feature of nodules; and (3) complete feature space determination and neural classification of nodules. The Chiou et al. article seems to be based on U.S. Pat. No. 4,907,156 to Doi et al., but adds a neural network approach. The Chiou et al. system includes similar shortcomings to those in the Doi et al. system described in U.S. Pat. No. 4,907,156.
S. C. Lo, J. S. Lin, M. T. Freedman, and S. K. Mun, "Computer-Assisted Diagnosis of Lung Nodule Detection Using Artificial Convolution Neural Network", Proceeding of SPIE Medical Imaging VI, Vol. 1898, 1993, describes a nodule detection method using a convolutional neural network consisting of a two-dimensional connection trained with a back propagation learning algorithm, in addition to thresholding and circularity calculation, morphological operation, and a 2-D sphere profile matching technique. The use of a very complicated neural network architecture, which was originally developed for optical character recognition in binary images, the lengthy training time, and the lack of focus on the reduction of false positives render the published nodule detection methods impractical. This prior art also possesses similar drawbacks to the Doi et al. system described in U.S. Pat. No. 4,907,156.
S. C. Lo, S. L. Lou, S. Lin, M. T. Freedman, and S. K. Mun, "Artificial convolution neural network techniques for lung nodule detection", IEEE Trans. Med. Imag., Vol. 14, pp. 711-718, 1995, describes a nodule detection method using a convolutional neural network consisting of a two-dimensional connection trained with a back propagation learning algorithm, in addition to thresholding and circularity calculation, morphological operation, and a 2-D sphere profile matching technique. This prior art also possesses similar drawbacks to the Doi et al. system and the system described in Lo, et al., 1993.
J.-S. Lin, P. Ligomenides, S.-C. B. Lo, M. T. Freedman, S. K. Mun, "A Hybrid Neural-Digital Computer Aided Diagnosis System for Lung Nodule Detection on Digitized Chest Radiographs", Proc. 1994 IEEE Seventh Symp. on Computer Based Medical Systems, pp. 207-212, describes a system for the detection and classification of cancerous lung nodules utilizing image processing and neural networks. However, the system described in this article suffers from shortcomings similar to those of the system described in the Lo et al. article.
M. L. Giger, "Computerized Scheme for the Detection of Pulmonary Nodules", Image Processing VI, IEEE Engineering in Medicine and Biology Society, 11th Annual International Conference (1989), describes a computerized method to detect locations of lung nodules in digital chest images. The method is based on a difference-image approach and various feature-extraction techniques, including a growth test, a slope test, and a profile test. The aim of the detection scheme is to direct the radiologist's attention to locations in an image that may contain a pulmonary nodule, in order to improve the detection performance of the radiologist. However, the system described in this article suffers from similar shortcomings to the system described in Doi et al.
One object of this invention is to provide a novel classification method, based on fuzzy logic, soft optimization, and feature selection techniques, for automated nodule identification in computer-aided diagnosis. The invention further enables the identification of lung nodules in which classification is feature-based. The invention also can be used for other classification problems and detection of diseases, including but not limited to microcalcification clusters, masses and tumors.
Additionally, the invention may be embodied as a computer programmed to carry out the inventive method, as a storage medium storing a program for implementing the inventive method, and as a system for implementing the method.
The automated classification method of the present invention uses several advanced techniques, such as fuzzy logic, optimized linear partition, and a feature-weighted detector (FWD) network, to eliminate false positive nodules, thus greatly improving performance. Once image data is acquired from a radiological chest image, the data is subjected to a multi-phase digital image processing technique to initially identify several suspect regions. First, during the image enhancement phase, object-to-background contrast of the data is enhanced using multi-resolution matching techniques. Next, during the quick selection phase, the data is subjected to sphericity testing, involving examination of circularity parameters of each grown region in a sliced (thresholded) image obtained from a series of pixel threshold values, and segmentation of suspect object blocks to preliminarily select nodule candidates. The pixel threshold values are derived based on the desired suspect nodule area (SNA) number and size, the signal-to-noise ratio (SNR) of the image, and the CDF of the image in order to have maximal sensitivity. In the feature extraction phase, plausible features of nodules are further extracted from the corresponding region. In the classification phase, the objective of identifying a true positive nodule among suspect nodule candidates is achieved according to the present invention by using the following steps: a) normalizing feature values; b) selecting meaningful features from among the normalized features; c) pre-grouping nodules in several sub-groups according to their selected (meaningful) features; d) subjecting the objects to trained linear classifiers corresponding to their sub-groups; and e) using a trained fuzzy classifier to further discriminate true positive nodules from false positive nodules.
In the final phase, the decision making phase, the suspect nodules are analyzed using prevalence rate, risk factor, and system performance to determine the portion of the films for further review. The use of these multiple resolution techniques, multiple classifiers, and presentation of a portion of highly suspect films for a physician's final diagnosis eliminates a high number of false positives experienced using prior art techniques and increases the detection accuracy.
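The quick selection phase described above can be illustrated with a minimal Python sketch. It assumes that a pixel threshold is read directly off the image's CDF and that circularity is measured as the fraction of a region lying inside the equal-area circle centered on the region's centroid. Both definitions, and the function names `cdf_threshold` and `circularity`, are illustrative assumptions rather than the exact procedure of the invention, which also takes the SNA number and size and the SNR into account.

```python
import math


def cdf_threshold(pixels, level):
    """Return the smallest pixel value whose CDF reaches `level` (0 < level <= 1)."""
    ordered = sorted(pixels)
    index = max(0, math.ceil(level * len(ordered)) - 1)
    return ordered[index]


def circularity(region):
    """Fraction of region pixels inside the equal-area circle at the centroid.

    `region` is a non-empty list of (row, col) pixel coordinates.
    This definition is an assumption made for illustration.
    """
    area = len(region)
    center_row = sum(r for r, _ in region) / area
    center_col = sum(c for _, c in region) / area
    radius_sq = area / math.pi  # circle with the same area as the region
    inside = sum(
        1 for r, c in region
        if (r - center_row) ** 2 + (c - center_col) ** 2 <= radius_sq
    )
    return inside / area


# Hypothetical regions: a compact 5x5 blob and a 1x25 streak of equal area.
blob = [(r, c) for r in range(5) for c in range(5)]
streak = [(0, c) for c in range(25)]
```

Under this measure a compact blob scores closer to 1 than an elongated streak of the same area, which is the property that sphericity testing relies on when screening grown regions at each threshold.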
According to the method of the present invention, step a) includes employing vertical offset normalization to remove insignificant yet strong characteristics in data, thereby enhancing weak yet significant information for identifying true positives.
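The text does not spell out how vertical offset normalization is computed. As a hedged illustration only, the sketch below assumes that the "offset" is each feature's mean over all candidates and that the result is scaled by the feature's standard deviation; the function name `normalize_features` is hypothetical.

```python
from statistics import mean, pstdev


def normalize_features(rows):
    """Remove a per-feature offset and scale (assumed form of step a)).

    `rows` holds one feature vector per suspect nodule. The offset is
    assumed to be the column mean and the scale the column standard
    deviation; constant columns are left unscaled.
    """
    columns = list(zip(*rows))
    offsets = [mean(col) for col in columns]
    scales = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - offset) / scale
         for value, offset, scale in zip(row, offsets, scales)]
        for row in rows
    ]


# Hypothetical data: a feature with a large baseline next to a small one.
candidates = [[100.0, 0.1], [102.0, 0.3], [104.0, 0.2]]
normalized = normalize_features(candidates)
```

After this step the large baseline of the first feature no longer dominates the second, so both features vary on a comparable scale.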
According to the method of this invention, step b) includes developing a feature-weighted detector (FWD) network based on a semi-supervised learning scheme. Preferably, the FWD is a bidirectional connection neural network in which there are two types of connections. Memory connection trained by an unsupervised learning law represents summarization of learning results. Weight connection trained by a supervised learning law represents degree of importance of each feature. Input data presented to the input layer of the FWD are image features including effective diameter, degree of circularity, degree of irregularity, slope of the effective diameter, slope of degree of circularity, slope of degree of irregularity, average gradient, standard deviation of gradient orientation, contrast and net contrast. Each image feature applied to the input layer is normalized by using step a) to evaluate the value without effects of scale. After running, FWD gives a weight for each feature. The more meaningful a feature is, the larger its weight is; if the weight approaches zero, then the corresponding feature is meaningless.
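The way feature weights drive selection can be sketched as follows. This is not the FWD network itself: a simple Fisher-style separation score is swapped in as a stand-in for the trained weights, and the names `separation_weights` and `select_features` and the cutoff `min_weight` are assumptions made for illustration.

```python
from statistics import mean, pvariance


def separation_weights(true_rows, false_rows):
    """Stand-in for FWD weights: between-class over within-class spread."""
    weights = []
    for true_col, false_col in zip(zip(*true_rows), zip(*false_rows)):
        between = (mean(true_col) - mean(false_col)) ** 2
        within = pvariance(true_col) + pvariance(false_col)
        weights.append(between / (within + 1e-12))
    total = sum(weights) or 1.0
    return [w / total for w in weights]  # normalized to sum to 1


def select_features(names, weights, min_weight=0.05):
    """Keep features whose weight is not close to zero."""
    return [name for name, w in zip(names, weights) if w >= min_weight]


# Hypothetical normalized values for two of the listed features.
names = ["degree of circularity", "contrast"]
true_nodules = [[0.9, 0.5], [0.8, 0.4], [0.85, 0.6]]
false_nodules = [[0.2, 0.5], [0.3, 0.6], [0.25, 0.4]]
weights = separation_weights(true_nodules, false_nodules)
selected = select_features(names, weights)
```

In this toy data only the first feature separates the two groups, so it receives nearly all of the weight and the second feature is dropped as meaningless.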
Step c) includes using a Gaussian clustering method (GCM) to pre-group nodules. Preferably, this pre-grouping method is based on a self-organizing algorithm by which sub-structures in data can be found without prior knowledge. Therefore, the method of step c) of the invention enables reducing false positives while maintaining recognition rate.
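The details of the GCM are not given here, so the following is only a sketch of one way to pre-group candidates with Gaussian memberships. It assumes a fixed number of clusters, caller-supplied starting centers, and a shared width `sigma`, none of which come from the text.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def gaussian_cluster(points, centers, sigma=0.2, iterations=20):
    """Soft-cluster points with Gaussian memberships; return (centers, labels)."""
    centers = [list(c) for c in centers]
    for _ in range(iterations):
        sums = [[0.0] * len(c) for c in centers]
        totals = [0.0] * len(centers)
        for p in points:
            dists = [sq_dist(p, c) for c in centers]
            nearest = min(dists)
            # Subtracting the nearest distance keeps the sum of memberships >= 1.
            member = [math.exp(-(d - nearest) / (2 * sigma ** 2)) for d in dists]
            norm = sum(member)
            for k, m in enumerate(member):
                weight = m / norm
                totals[k] += weight
                for j, value in enumerate(p):
                    sums[k][j] += weight * value
        # Move each center to the membership-weighted mean of the points.
        for k, total in enumerate(totals):
            if total > 0:
                centers[k] = [s / total for s in sums[k]]
    labels = [
        min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
        for p in points
    ]
    return centers, labels
```

Each resulting sub-group can then be handed to its own linear classifier in step d).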
In suspect nodules (SNs) selected by the quick selection phase, the number of true positive nodules is much less than that of false positive nodules. Further, the majority of false positive nodules do not overlap, in terms of feature matching (i.e., "feature-wise"), with true positive nodules. This brings about the difficulty of identifying the true positive nodules. In the present invention, therefore, two types of classifiers are introduced. Step d) refers to the first level classification that focuses on reducing those false positive nodules that do not overlap feature-wise with true positive nodules. Step e) performs the second level classification algorithm that focuses on reducing the rest of the false positive nodules that overlap feature-wise with true positive nodules. According to the method of the invention, the first level classifier comprises an optimized linear partition (OLP) technique. The problem of OLP is defined by an unconstrained quadratic objective function. After OLP, the optimal solution to reduction of false positive nodules would be found. That is to say, the method can reduce the number of false positives without loss of true positives. Preferably, OLP also can be used for each of a number of sub-groups of suspect nodules, and the processing time is considered almost negligible compared to that of using traditional neural networks.
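One way to read the first level classifier is as a least-squares linear separator followed by a cutoff that keeps every true nodule. The sketch below makes that assumption explicit: the objective sum((w.x + b - t)^2) with targets of +1 and -1 is an illustrative unconstrained quadratic, not necessarily the OLP objective of the invention, and all function names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting.

    Assumes the matrix `a` is non-singular.
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_partition(features, is_nodule):
    """Minimize sum((w.x + b - t)^2) with t = +1 for nodules, -1 otherwise."""
    rows = [list(x) + [1.0] for x in features]  # trailing 1.0 carries the bias
    targets = [1.0 if flag else -1.0 for flag in is_nodule]
    dim = len(rows[0])
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(dim)]
            for i in range(dim)]
    rhs = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(dim)]
    return solve_linear(gram, rhs)  # normal equations of the quadratic


def score(weights, x):
    """Linear score w.x + b, with the bias stored as the last weight."""
    return sum(w * v for w, v in zip(weights, x)) + weights[-1]


def prune(features, is_nodule, weights):
    """Keep candidates scoring at least as high as the weakest true nodule."""
    scores = [score(weights, x) for x in features]
    cutoff = min(s for s, flag in zip(scores, is_nodule) if flag)
    return [i for i, s in enumerate(scores) if s >= cutoff]
```

Because the cutoff is set at the lowest-scoring true nodule, no true positive in the training set is discarded, while false positives that lie far from the true nodules in feature space fall below it. Candidates that survive are the ones left for the second level classifier.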
In step e), the second level classifier is a fuzzy structure classifier (FSC) based on a semi-supervised learning scheme. Preferably, an FSC consists of a set of rules in IF-THEN form. The premise part of each rule expresses fuzzy structures in data. The consequence part of each rule expresses a clear relationship between inputs and output. Therefore, each rule can be described in natural language.
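The IF-THEN structure can be sketched with a small rule base. The sketch assumes Gaussian membership functions for the premise and a linear function of the inputs for the consequence (a Takagi-Sugeno style arrangement); these choices, the class `FuzzyRule`, and the rule parameters are illustrative assumptions, not the trained FSC.

```python
import math
from dataclasses import dataclass


@dataclass
class FuzzyRule:
    centers: list  # premise: center of each feature's Gaussian membership
    widths: list   # premise: width of each membership function
    coeffs: list   # consequence: linear coefficients on the inputs
    bias: float    # consequence: constant term

    def strength(self, x):
        """Degree to which input x satisfies the IF part."""
        return math.exp(-0.5 * sum(
            ((v - c) / w) ** 2
            for v, c, w in zip(x, self.centers, self.widths)
        ))

    def output(self, x):
        """Value of the THEN part for input x."""
        return self.bias + sum(k * v for k, v in zip(self.coeffs, x))


def infer(rules, x):
    """Strength-weighted average of the rule outputs (0.0 if no rule fires)."""
    strengths = [rule.strength(x) for rule in rules]
    total = sum(strengths)
    if total == 0.0:
        return 0.0
    return sum(s * r.output(x) for s, r in zip(strengths, rules)) / total


# Hypothetical rules over (degree of circularity, contrast), scaled to [0, 1]:
#   IF circularity is high AND contrast is high THEN nodule score = 1
#   IF circularity is low  AND contrast is low  THEN nodule score = 0
rules = [
    FuzzyRule(centers=[0.9, 0.8], widths=[0.2, 0.2], coeffs=[0.0, 0.0], bias=1.0),
    FuzzyRule(centers=[0.3, 0.2], widths=[0.2, 0.2], coeffs=[0.0, 0.0], bias=0.0),
]
```

Because each rule pairs a readable premise with an explicit consequence, the comments above the rule list are a direct natural-language reading of the rule base.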
A "computer" refers to any apparatus that is capable of accepting a structured input, processing the structured input according to prescribed rules, and producing results of the processing as output. Examples of a computer include: a computer; a general purpose computer; a supercomputer; a mainframe; a super mini-computer; a mini-computer; a workstation; a micro-computer; a server; an interactive television; and a hybrid combination of a computer and an interactive television. A computer also refers to two or more computers connected together via a network for transmitting or receiving information between the computers. An example of such a computer includes a distributed computer system for processing information via computers linked by a network.
A "computer-readable medium" refers to any storage device used for storing data accessible by a computer. Examples of a computer-readable medium include: a magnetic hard disk; a floppy disk; an optical disk, such as a CD-ROM and a DVD; a magnetic tape; a memory chip; and a carrier wave used to carry computer-readable electronic data, such as those used in transmitting and receiving e-mail or in accessing a network.
"Software" refers to prescribed rules to operate a computer. Examples of software include: software; code segments; instructions; computer programs; and programmed logic.
A "computer system" refers to a system having a computer, where the computer includes a computer-readable medium embodying software to operate the computer.
An "information storage device" refers to an article of manufacture used to store information. An information storage device can have different forms, for example, paper form and electronic form. In paper form, the information storage device includes paper printed with the information. In electronic form, the information storage device includes a computer-readable medium storing the information as software, for example, as data.