The detection and localization (segmentation) of objects in images, both two dimensional and three dimensional, is commonly complicated by noise (both random and structured) and by partial obscuration of objects of interest. The detection of lung nodules, as well as of other pathologies and objects of interest (e.g., catheters, feeding tubes, etc.), in chest radiographs, whether based on computer aided detection (CAD) or human observers, is challenging. In portable chest x-rays in particular, such detection and/or localization is perhaps one of the most difficult interpretation tasks in radiology. These difficulties may arise due to, for example, one or more of the following factors: poor patient positioning, imaging area and patient habitus, image latitude and dynamic range, poor x-ray penetration, and, perhaps most significantly, the presence of obscuring bones. The presence of bone can lead to false diagnoses, false positives (FPs), false negatives, and/or improper positioning of catheters and feeding tubes. These difficulties may also arise from the projection of a three dimensional object onto a two dimensional image. In lung nodule detection in particular, false positives can arise from areas in the chest image where one rib crosses another rib or another linear feature; the clavicles crossing the ribs are another source of FPs. Even more significantly, overlapping bone may obscure the area underneath, a prominent source of false negatives. Furthermore, the profile of a nodule or of other relevant pathologies or structures (e.g., catheters) may be modified by an overlying rib, resulting in more difficult interpretation tasks for both machines and practitioners.
Several attempts have been made to solve this problem. In the context of CAD, the approach by Kenji Suzuki at the University of Chicago is probably the most advanced. However, it was developed in an academic environment where the algorithm parameters can be tuned to fit the characteristics of the sample set. The method is based on a pixel-based artificial neural network that calculates a subtraction value for each pixel in the image based on the degree of bone density detected by the network. The result can be noisy, and the example implementation only worked for bones away from the outer part of the lung field. Based on the information provided in a paper by Suzuki, very little can be said about the performance of the approach; however, several inferences can be made. First, the method does not use a feature extraction process, which means it may not perform well on data that does not look very similar to its training images; without feature extraction, a smooth approximation (good interpolation) is much harder to achieve. Second, the method uses a rather simplistic approach to image normalization, which again implies that it may be overly particular to its training images. This is not to suggest that the technique will altogether fail, but only that it is more difficult to be confident in later predictions. The authors frame the algorithm as subtracting a weighted version of the predicted bone image (i.e., the subtraction values discussed above) from the original image. Therefore, by making this weight ever smaller, one simply moves closer to the original posterior-anterior (PA) image rather than toward the desired soft tissue image. A final shortcoming is that the method explicitly leaves out the opaque area of the lung field.
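The weighted-subtraction framing above can be sketched as follows. This is a minimal illustration, not Suzuki's implementation: `predicted_bone` stands in for the per-pixel subtraction values that a trained pixel-based neural network would produce, and no such model is implemented here.

```python
import numpy as np

def subtract_bone(original, predicted_bone, weight=1.0):
    """Subtract a weighted predicted bone image from the original radiograph.

    `predicted_bone` is a stand-in for the per-pixel subtraction values a
    trained model would produce; this sketch only shows the subtraction step.
    """
    soft_tissue = original - weight * predicted_bone
    # Clip to keep intensities non-negative.
    return np.clip(soft_tissue, 0.0, None)

# Toy values: as weight -> 0 the result approaches the original PA image
# rather than the desired soft-tissue image, illustrating the shortcoming
# noted above.
pa = np.array([[0.8, 0.6], [0.7, 0.5]])
bone = np.array([[0.3, 0.1], [0.2, 0.0]])
print(np.allclose(subtract_bone(pa, bone, weight=0.0), pa))  # True
```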
Loog, van Ginneken, and Schilham published an approach in 2006 for suppressing bone structures based on feature extraction and local regression. The method works by first normalizing the image with an iterative application of local contrast enhancement. This is followed by a feature extraction step, where the features are Gaussian 3-jets (a set of Gaussian derivatives at multiple scales, up to order 3). This generates many features, so the authors employ a dimensionality reduction technique based on performing principal component analysis (PCA) on local regression coefficients. The authors use K-nearest neighbors regression (KNNR) to predict either the soft-tissue or the bone image, possibly with an iterative application that uses the initial prediction as an additional feature. This approach would appear to have two major shortcomings: first, the prediction phase is far too computationally intensive to be practical; second, the approach to image normalization is likely grossly inadequate. KNNR is known as a "lazy learner," meaning it uses proximity to training data to make predictions. Unfortunately, even at a coarse resolution, a few images can generate many pixels (a large training set). Therefore, for the routine to be even remotely practical, it would require a very sparse sampling of the training images. However, sparse sampling of training images could lead to problems in prediction, as nearest neighbor methods are notoriously poor interpolators. This would require a large value of K to compensate; yet too large a value of K leads to overly smoothed predictions (which appears to be the case based on the images presented in the paper). Furthermore, the approach to image normalization aims to adjust for gross global differences while preserving and enhancing local details. The authors do this by iteratively applying a local contrast enhancement step.
This step is potentially brittle in the presence of large non-anatomical artifacts (e.g., pacemakers) and allows content outside the lung field to heavily influence pixel values inside it. The latter point is important because content outside the lung field can be highly variable (e.g., the presence of tags and markers).
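The feature-extraction and regression stages described above can be sketched as follows. This is an illustrative approximation, not the published method: the scales, derivative set, K value, and the toy regression target are all assumptions, and the PCA dimensionality reduction and iterative normalization steps are omitted.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from sklearn.neighbors import KNeighborsRegressor

def gaussian_jet_features(image, sigmas=(1.0, 2.0, 4.0)):
    """Per-pixel Gaussian 3-jet: Gaussian derivatives up to total order 3
    at several scales. Scales here are illustrative, not the paper's."""
    # All derivative orders (dy, dx) with dy + dx <= 3: 10 combinations.
    orders = [(dy, dx) for dy in range(4) for dx in range(4) if dy + dx <= 3]
    feats = [gaussian_filter(image, sigma=s, order=o)
             for s in sigmas for o in orders]
    return np.stack(feats, axis=-1)  # shape (H, W, n_scales * 10)

# Toy demonstration of the per-pixel KNN regression stage. The "soft tissue"
# target is a hypothetical stand-in; real training would use dual-energy
# soft-tissue images as labels.
rng = np.random.default_rng(0)
img = rng.random((32, 32))
target = gaussian_filter(img, sigma=2.0)

X = gaussian_jet_features(img).reshape(-1, 30)  # 3 scales x 10 orders
y = target.ravel()
knn = KNeighborsRegressor(n_neighbors=5).fit(X, y)
pred = knn.predict(X).reshape(img.shape)
```

Even on this 32x32 toy image, every pixel becomes a training sample, which makes concrete why full-resolution training sets force the sparse sampling (and the interpolation problems) discussed above.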