Radiologists use radiographic images such as mammograms to detect and pinpoint suspicious lesions in a patient as early as possible, e.g., before a disease is readily detectable by other, intrusive methods. As such, there is real benefit to the radiologist being able to locate, based on imagery, extremely faint lesions and precursors. Large masses of relatively dense tissue are one signature of concern. Although some masses can appear quite prominent in a radiographic image, various factors, including occlusion or partial occlusion by other natural structures and appearance in a structurally “busy” portion of the image, sometimes coupled with radiologist fatigue, may make some masses hard to detect upon visual inspection. One feature that can help identify a suspicious mass, particularly when its central bulge is difficult to see, is a spiculation pattern surrounding the mass. The spiculation pattern can appear in a radiographic image as a pattern of tissue that appears “drawn in” toward a central point.
Computer-Aided Detection (CAD) algorithms have been developed to assist radiologists in locating potential lesions in a radiographic image. CAD algorithms operate within a computer on a digital representation of the mammogram set for a patient. The digital representation can be the original or processed sensor data, when the mammograms are captured by a digital sensor, or a scanned version of a traditional film-based mammogram set. An “image,” as used herein, is assumed to be at least two-dimensional data in a suitable digital representation for presentation to CAD algorithms, without regard to the capture mechanism originally used to capture patient information. The CAD algorithms search the image for objects matching a signature of interest, and alert the radiologist when a signature of interest is found.
Classification of anomalies may be performed using a probability density function (PDF) that describes the relative likelihood of observing any given sample value of a random variable. The integral of a PDF over all possible values is 1; the integral of a PDF over a subset of the random variable's range expresses the probability that a drawn sample of the random variable will fall within that subset.
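The two integral properties stated above can be checked numerically. The following is a minimal sketch, assuming a one-dimensional standard normal distribution as the example PDF. The function names `normal_pdf` and `integrate` are illustrative only and are not drawn from any particular CAD system.

```python
import math


def normal_pdf(x, mean=0.0, std=1.0):
    """Closed-form PDF of a normal distribution (assumed example PDF)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def integrate(f, lower, upper, steps=10_000):
    """Approximate the integral of f over [lower, upper] (trapezoidal rule)."""
    width = (upper - lower) / steps
    total = 0.5 * (f(lower) + f(upper))
    for i in range(1, steps):
        total += f(lower + i * width)
    return total * width


# Integral over (effectively) all possible values: should be close to 1.
# The interval [-10, 10] stands in for the whole real line, since the
# density outside it is negligible for a standard normal.
total_probability = integrate(normal_pdf, -10.0, 10.0)

# Integral over a subset of the range: the probability that a drawn
# sample falls within one standard deviation of the mean.
within_one_std = integrate(normal_pdf, -1.0, 1.0)

print(f"integral over all values: {total_probability:.4f}")
print(f"P(-1 <= x <= 1):          {within_one_std:.4f}")
```

Under these assumptions the first integral comes out very close to 1 and the second close to 0.68, the familiar one-standard-deviation probability for a normal distribution.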
PDFs that can be expressed by a closed-form equation are generally well understood, and many applications for such PDFs have been developed. On the other hand, the practical estimation of a PDF for a complex multidimensional random variable, particularly one with an unknown and possibly irregular distribution in each dimension and/or long, sparsely populated tails, has in large part eluded researchers. In the area of pattern and image recognition, for instance, many researchers have abandoned PDF approaches and concentrated on known, solvable alternatives, such as neural networks and linear discriminant functions, due to the practical difficulties in applying a PDF approach.