1. Technical Field
The present disclosure relates to disease prediction and, more specifically, to automatic learning of image features to predict disease.
2. Discussion of Related Art
Computer-aided diagnosis (CAD) pertains to the use of artificial intelligence to process medical image data and locate one or more regions of interest within the medical image data. These regions of interest may correspond to, for example, locations that are determined to have an elevated likelihood of including an anatomical irregularity that may be associated with a disease, injury or defect. Often, CAD is used to identify regions that appear to resemble lesions.
In general, CAD may be used to identify regions of interest that may then be inspected closely by a trained medical professional such as a radiologist. By utilizing CAD, a radiologist can reduce the chances of failing to properly identify a lesion and may be able to examine a greater number of medical images in less time and with improved accuracy.
There are many approaches for performing CAD. Some of these approaches utilize complex algorithms for distinguishing suspicious regions from normal regions. These algorithms may be manually programmed at great cost in time and expense. Other approaches, however, rely on computer learning. In computer learning, a learning algorithm is provided with a set of training data that includes images in which a trained medical professional, such as a radiologist, has diagnosed a particular disease as well as images in which a radiologist has determined that the subject is free of the particular disease. By analyzing the set of images that are known to show the particular disease and the set of images that are known to not show the particular disease, the learning algorithm can determine how to differentiate between subsequent images that may or may not show the particular disease.
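By way of illustration only, the training arrangement described above may be sketched as follows. The sketch assumes that each image has already been reduced to a short numeric feature vector and uses a nearest-centroid rule as a stand-in for the learning algorithm; the function names, the two hypothetical features, and the toy values are assumptions made for this example and are not drawn from any particular CAD system.

```python
from statistics import mean

def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [mean(column) for column in zip(*vectors)]

def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train(diseased, healthy):
    """Learn one centroid per class from radiologist-labeled examples."""
    return {"diseased": centroid(diseased), "healthy": centroid(healthy)}

def predict(model, vector):
    """Return the label whose centroid lies closest to the new vector."""
    return min(model, key=lambda label: squared_distance(model[label], vector))

# Hypothetical toy data: each vector is (mean intensity, roundness).
diseased_examples = [[0.9, 0.8], [0.8, 0.9], [0.85, 0.75]]
healthy_examples = [[0.2, 0.1], [0.3, 0.2], [0.25, 0.15]]

model = train(diseased_examples, healthy_examples)
label_a = predict(model, [0.7, 0.7])  # a subsequent, unlabeled image
label_b = predict(model, [0.1, 0.3])  # another subsequent, unlabeled image
```

In this toy setting, the first subsequent vector lies nearer the centroid of the diseased examples and the second lies nearer the centroid of the healthy examples. A practical learning algorithm would generally use far more training examples and a more expressive model.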
Moreover, such computer learning techniques may be used to differentiate between regions of a medical image that may be suspected of having a particular disease and regions of a medical image that may be free of the particular disease so that precise regions of suspicion may be identified within the medical image. The radiologist may then treat each detected region of suspicion as a lesion candidate and may render a final diagnosis based on the CAD results.
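As a hedged illustration of region-level detection, the following sketch slides a fixed-size window across a small two-dimensional image and records each window position whose score meets a threshold. The scoring function here is simply the mean pixel value of the window, which stands in for whatever learned scoring a trained system might apply; the function names, window size, threshold, and image values are all assumptions made for this example.

```python
def window_mean(image, top, left, size):
    """Mean pixel value of the size-by-size window anchored at (top, left)."""
    values = [
        image[r][c]
        for r in range(top, top + size)
        for c in range(left, left + size)
    ]
    return sum(values) / len(values)

def find_candidates(image, size, threshold, score=window_mean):
    """Return (top, left) positions of windows scoring at or above threshold."""
    candidates = []
    for top in range(len(image) - size + 1):
        for left in range(len(image[0]) - size + 1):
            if score(image, top, left, size) >= threshold:
                candidates.append((top, left))
    return candidates

# Hypothetical 4x4 image with one bright 2x2 region in the center.
image = [
    [0, 0, 0, 0],
    [0, 9, 9, 0],
    [0, 9, 9, 0],
    [0, 0, 0, 0],
]

candidates = find_candidates(image, size=2, threshold=9)
```

Each returned position corresponds to a region of suspicion that could be presented to the radiologist as a lesion candidate. In this toy image, only the window anchored at row 1, column 1 covers the bright region entirely.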
One way in which learning algorithms use training data to help identify regions of suspicion in subsequent medical images is to develop a set of image features that can predict the particular disease. Accordingly, learning algorithms may determine which image features are both highly represented in instances of actual lesions and yet poorly represented in the absence of lesions. Given sufficient training data, numerous useful image features may be developed.
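One simple way to express this selection criterion, offered only as a sketch, is to score each candidate feature by the difference between the fraction of lesion examples in which it is present and the fraction of lesion-free examples in which it is present. The sketch assumes that each training example has already been reduced to a set of named binary features; the feature names, the threshold, and the scoring rule itself are assumptions made for this example rather than a description of any particular learning algorithm.

```python
def presence_rate(samples, feature):
    """Fraction of samples in which the named feature is present."""
    return sum(1 for sample in samples if feature in sample) / len(samples)

def score_features(lesion_samples, normal_samples):
    """Score each feature by presence in lesions minus presence in non-lesions."""
    features = set().union(*lesion_samples, *normal_samples)
    return {
        f: presence_rate(lesion_samples, f) - presence_rate(normal_samples, f)
        for f in features
    }

def select_features(scores, threshold):
    """Keep features whose score meets the threshold, in alphabetical order."""
    return sorted(f for f, s in scores.items() if s >= threshold)

# Hypothetical training data: each example is the set of features it exhibits.
lesion_samples = [
    {"round", "bright"},
    {"round", "bright", "textured"},
    {"round"},
    {"round", "textured"},
]
normal_samples = [
    {"textured"},
    {"bright", "textured"},
    {"textured"},
    {"round", "textured"},
]

scores = score_features(lesion_samples, normal_samples)
selected = select_features(scores, threshold=0.5)
```

With this toy data, the hypothetical "round" feature appears in every lesion example and in only one lesion-free example, so it scores highly, while "textured" is more common in the absence of lesions and receives a negative score.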
In training these learning algorithms, it is beneficial to provide a large set of training data. Insufficient training data may result in ineffective search or detection algorithms, for example, by yielding insufficient and/or ineffective image features. However, obtaining sufficient training data can be a time-consuming and expensive endeavor and may divert resources away from other important areas of development. This is because, in order to provide training data, studies must be performed and/or clinical data must be manually reviewed for each particular disease that the CAD system is to be trained to detect. For example, if it is desired that the CAD system be trained to find lung nodule candidates, clinical data must be reviewed to find medical images with confirmed instances of lung nodules and to find other medical images in which the absence of lung nodules has been confirmed. These images may then be provided to the learning algorithm as training data. As a large amount of training data must be collected to accurately train the CAD system, the training process can be very demanding. Moreover, where it is desired that the CAD system be able to detect multiple different forms of illness, the amount of training data to be identified and sorted can become enormous.