Deep learning neural network systems can increase the efficiency of medical diagnosis and decrease physician workload by providing a diagnosis of new medical images. Before they can do so, the neural network systems must first be trained or developed.
The development of deep learning neural network systems for visual regression and recognition tasks requires extremely large amounts of visual data accompanied by discrete category labels based on knowledge of specific taxonomies. Such data, however, is not readily available in a format that is conducive to neural network training and testing.
In particular, medical textbooks can contain large amounts of medical images and useful information relating to such images. Using these images for machine learning, deep learning, or the training or testing of neural networks, however, presents challenges, because the images are generally not in a format conducive to use in such applications.
For example, these medical images are often accompanied by related text, such as text-based captions that describe findings related to the medical images. This text is helpful, but only if it is processed into a form that can be used to develop the neural networks.
Furthermore, the images often have markings, text, or annotations that overlay the medical images. These image artifacts can throw off digital image analysis because they may show up as abnormalities or otherwise pollute the analysis or image recognition process.
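To make the caption problem concrete, the following is a minimal sketch of how free-text captions might be reduced to discrete category labels. It is an illustration only, not the method described in this document: the taxonomy, its keywords, and the function name `labels_from_caption` are all hypothetical, and simple keyword matching stands in for whatever text processing a real system would use.

```python
import re

# Hypothetical taxonomy (an assumption for illustration):
# category label -> keywords that suggest that category.
TAXONOMY = {
    "fracture": ("fracture", "break"),
    "pneumonia": ("pneumonia", "consolidation"),
}


def labels_from_caption(caption, taxonomy=TAXONOMY):
    """Return the sorted category labels whose keywords appear in the caption."""
    # Lowercase the caption and split it into alphabetic words.
    words = set(re.findall(r"[a-z]+", caption.lower()))
    # Keep every label that shares at least one keyword with the caption.
    return sorted(
        label
        for label, keywords in taxonomy.items()
        if words.intersection(keywords)
    )


if __name__ == "__main__":
    print(labels_from_caption("Chest radiograph showing right lower lobe consolidation."))
```

A matcher this simple ignores negation (a caption stating "no fracture" would still be labeled as a fracture) and synonyms outside the keyword list, which is part of why caption text needs careful processing before it can serve as training labels.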