Medical imaging techniques, such as computed tomography (CT) and X-ray imaging, are widely used in diagnosis, clinical studies and treatment planning. There is an emerging need for automated approaches to improve the efficiency, accuracy and cost-effectiveness of medical imaging evaluation.
Chest X-rays are among the most common radiology diagnostic tests, with millions of scans performed globally every year. While the test is frequently performed, reading chest X-rays is among the more complex radiology tasks, and is known to be highly subjective, with inter-reader agreement varying from a kappa value of 0.2 to 0.77, depending on the level of experience of the reader, the abnormality being detected and the clinical setting.
Due to their affordability, chest X-rays are used all over the world, including in areas with few or no radiologists. In many parts of the world, the availability of digital chest X-ray machines is growing more rapidly than the availability of clinicians who are sufficiently trained to read the resulting images. If automated detection can be applied in low-resource settings as a disease screening tool, the benefits to population health outcomes globally could be significant. One example of such use of chest X-rays is in tuberculosis screening, where chest X-rays, in the hands of expert readers, are more sensitive than clinical symptoms for the early detection of tuberculosis.
Over the last few years, there has been increasing interest in the use of deep learning algorithms to assist with abnormality detection on medical images. This is a natural consequence of the rapidly growing ability of machines to interpret natural images and detect objects in them. On chest X-rays in particular, there has been a series of studies describing the use of deep learning algorithms to detect various abnormalities (Shin, et al., Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2497-2506, 2016; Rajpurkar, et al., arXiv preprint arXiv:1711.05225, 2017; Li, et al., arXiv preprint arXiv:1711.06373, 2017). Most of these studies have been limited by the lack of large, high-quality datasets. The largest published work describes an algorithm trained on 112,120 X-rays, a relatively small number considering that the majority of chest X-rays are normal, so abnormal X-rays are less common and specific abnormalities are rarer still.