The present invention relates to techniques and a framework for classifying medical images, such as mammograms, with weakly labeled (for example, without local annotations of the findings) and imbalanced data sets, using a deep convolutional neural network (CNN) architecture.
Automated recognition of abnormalities in medical images has become an important medical diagnostic tool. Typically, automated recognition techniques are directed toward recognizing medical images containing a certain type of abnormality, and separating (classifying) them from the rest. Several conventional solutions have been proposed to assist radiologists in the diagnostic procedure. Many of these approaches consist of a two-stage process: an initial detection or localization of potentially abnormal candidates, followed by classification of each candidate as belonging to a certain class. In mammography, these machine-learning-based methods rely on finely annotated data, requiring manual labeling that shows the lesion location in the image and often delineation of the lesion boundaries. Such annotation is rarely available, as current and past radiologist workflows do not digitally record this data in the system. The alternative of re-examining the entire data set for annotation can be extremely expensive, often making this process infeasible. However, the severity score or final diagnosis based on each mammogram is typically accessible via the medical records, and can be used as a global label for the whole mammogram.
Accordingly, a need arises for techniques by which medical images, such as mammograms, may be classified (for example, as malignant versus normal or benign) in an automated manner with minimal requirement for data annotation.