The present invention, in some embodiments thereof, relates to a system and method for detection and classification of multiple findings in images and, more specifically, but not exclusively, to an automatic system and method for detection and classification of multiple findings in medical images.
In medical image analysis, an expert radiologist typically examines one or more images, for example images captured by computerized tomography, ultrasound or mammography, and analyses the one or more images to detect and classify potential abnormalities. Detection and classification of abnormalities is complicated by factors such as there being multiple categories of abnormalities and variability in the appearance of abnormalities, such as size, shape, boundaries and intensities, as well as factors such as varying viewing conditions and anatomical tissue being non-rigid. Examples of abnormality categories are lesions, calcifications, micro-calcifications and tumors.
Image classification is the task of taking an input image and outputting a class or one or more features (a cat, a dog, etc.), or a probability for each of a set of classes (features), that best describes the image. The term Deep Learning (DL) is used to refer to methods aimed at learning feature hierarchies, with features from higher levels of a hierarchy formed by composition of lower level features. For example, in computer vision, a feature may be a certain visible object or a certain visible characteristic of an object. Examples of low level features are a color, a texture, an edge and a curve. Examples of high level features are a dog, a cat, a car and a person. In an example of hierarchical features in computer vision, a dog may be identified by a composition of one or more paws, a body, a tail and a head. In turn, a head may be identified by a composition of one or two eyes, one or two ears and a snout. An ear may be identified by a certain shape of an outlining edge. Automatically learning features at multiple levels of abstraction allows a system to learn complex functions mapping an input image to an output classification directly from data, without depending completely on human-crafted features.
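As a minimal sketch of a classifier that outputs a probability of classes, the following converts raw per-class scores into probabilities with a softmax function and reports the most probable class. The softmax choice, the function names `softmax` and `classify`, and the score values are assumptions made for illustration only; the scores stand in for the output of a trained classifier.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    # Subtract the maximum score for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores, labels):
    """Return the most probable label and the full probability table."""
    probs = softmax(scores)
    table = dict(zip(labels, probs))
    best = max(table, key=table.get)
    return best, table


# Hypothetical scores, standing in for the output of a trained classifier.
labels = ["cat", "dog", "car", "person"]
scores = [1.2, 3.4, 0.3, 0.9]
best, table = classify(scores, labels)
print(best)
for label, prob in table.items():
    print(f"{label}: {prob:.3f}")
```

The classifier may thus report either the single best class or the full table of class probabilities, matching the two kinds of output described above.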
As appropriate technologies develop, computer vision techniques are employed to assist radiologists in detection and classification of abnormalities in medical images. The term neural network is commonly used to describe a computer system modeled on the human brain and nervous system. A neural network usually involves a large number of processing objects operating in parallel and arranged and connected in layers (or tiers). A first layer receives raw input information (one or more input records), analogous to optic nerves in human visual processing. Each successive layer receives an output from one or more layers preceding it, rather than from the raw input, analogous to neurons further from the optic nerve receiving signals from neurons closer to the optic nerve. A last layer produces an output of the neural network, typically one or more classes classifying the raw input information. In computer vision the raw input information is one or more images, and the output is one or more feature classifications detected in the image. In recent years, deep neural networks (DNN), specifically convolutional neural networks (CNN), have been used for image recognition. The term “deep” refers to the number of layers in such a neural network. A CNN in computer vision is a neural network having a series of convolutional layers which apply one or more convolutional filters to a digital representation of an input image. For each sub-region of the image, each of the convolutional layers performs a set of mathematical operations to produce a single value in an output feature map, representing one or more spatial features in the image. The CNN is able to perform image classification by looking for low level features such as edges and curves, and then building up to more abstract concepts through the series of convolutional layers.
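The operation of a single convolutional filter may be sketched as follows: the filter is slid over the image, and each sub-region yields one value in the output feature map. This is a simplified illustration, assuming no padding and a stride of one; the function name `conv2d_valid`, the 4x4 image and the hand-picked vertical-edge filter are hypothetical, and in a CNN the filter values would be learned rather than chosen by hand. As is common in CNN implementations, the filter is applied without flipping.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1); return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # One sub-region of the image yields one value in the feature map.
            value = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            row.append(value)
        feature_map.append(row)
    return feature_map


# 4x4 image: dark left half (0), bright right half (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked vertical-edge filter (illustrative only; a CNN learns its filters).
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d_valid(image, kernel)
for row in feature_map:
    print(row)
```

The feature map is non-zero only in the column where the filter straddles the dark-to-bright boundary, which is how a low level feature such as an edge is made visible to subsequent layers.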
Such technologies have been applied successfully in a plurality of fields, including for classifying lesions and calcifications in breast mammography images. However, commonly used methods perform either feature classification or detection localization (determining coordinates of the feature with reference to the input image) but not both.
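The distinction between classification (what a feature is) and localization (where it lies in the input image) may be illustrated by an output record that carries both. The following is a hypothetical sketch only and does not represent any particular existing method: the `Finding` record, the `strongest_finding` function, the per-class score maps and the `stride` value relating score-map cells to input-image pixels are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    label: str    # what the finding is (classification)
    row: int      # where it is in the input image (localization)
    col: int
    score: float


def strongest_finding(score_maps, stride=1):
    """Pick the highest-scoring location across all per-class score maps."""
    best = None
    for label, score_map in score_maps.items():
        for r, row in enumerate(score_map):
            for c, score in enumerate(row):
                if best is None or score > best.score:
                    # Scale score-map coordinates back to input-image coordinates.
                    best = Finding(label, r * stride, c * stride, score)
    return best


# Hypothetical per-class score maps; each cell covers a 16x16 pixel region.
score_maps = {
    "lesion": [[0.1, 0.2], [0.7, 0.1]],
    "calcification": [[0.05, 0.3], [0.2, 0.1]],
}

finding = strongest_finding(score_maps, stride=16)
print(finding)
```

A classification-only method would report just the label, and a localization-only method just the coordinates; the record above shows the two pieces of information side by side.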