This disclosure relates to a method for generating a prediction of a disease classification (or equivalently, diagnosis) error generated by a machine learning classifier for a microscope slide image. The predicted error is weighted by the degree to which portions of the image are out-of-focus (“OOF”).
In the medical field of histopathology, microscopic images of human tissue samples (which are prepared onto glass slides) are used for rendering a cancer diagnosis. In classic histopathology, a tissue sample is diagnosed visually by an expert using a microscope. By contrast, in the newer sub-field of digital pathology, a high-resolution digital image of a sample is first acquired by a whole-slide scanner, and the diagnosis is made in a subsequent step at a computer screen. Alternatively, the identification of cancerous cells in a tissue image can be aided by machine learning algorithms, typically embodied as deep convolutional neural networks, which are trained to find cancer cells in magnified tissue images. Such algorithms can generate so-called “heat map” images in which areas of the slide are shown in a contrasting color, e.g., red, to indicate areas which are predicted to contain cancer cells.
Tissue images from whole-slide scanners are typically of gigapixel size (e.g., 100,000×100,000 pixels at 40× magnification). One of the main technical problems, however, is that regions of the digitized images can often be blurry and out-of-focus, rendering the respective image regions unusable for accurate diagnoses by both human pathologists and machine learning algorithms. Achieving accurate focus is particularly challenging for whole-slide scanners because (1) the depth of field is extremely thin due to the high objective power used to digitize the image, and (2) the tissue is often uneven and not in the same focus plane.
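To give a sense of scale for the example dimensions quoted above, the following minimal Python sketch works through the arithmetic. The bytes-per-pixel figure (uncompressed 8-bit RGB) and the patch size are assumptions chosen for illustration only; they are not taken from this disclosure or from any particular scanner.

```python
# Back-of-the-envelope size of a whole-slide image, using the example
# dimensions quoted in the text. Everything else is an assumption.
WIDTH_PX = 100_000
HEIGHT_PX = 100_000
BYTES_PER_PIXEL = 3   # assumption: uncompressed 8-bit RGB
PATCH_SIDE_PX = 128   # hypothetical patch size for per-region analysis


def slide_stats(width, height, bytes_per_pixel, patch_side):
    """Return pixel count, raw size, and number of patches covering the image."""
    pixels = width * height
    raw_bytes = pixels * bytes_per_pixel
    # Ceiling division: partial patches at the border still count.
    patches_x = -(-width // patch_side)
    patches_y = -(-height // patch_side)
    return {
        "gigapixels": pixels / 1e9,
        "raw_gigabytes": raw_bytes / 1e9,
        "patches": patches_x * patches_y,
    }


stats = slide_stats(WIDTH_PX, HEIGHT_PX, BYTES_PER_PIXEL, PATCH_SIDE_PX)
print(stats)
```

Under these assumptions a single slide holds 10 gigapixels and several hundred thousand patches, which is why focus quality generally has to be assessed region by region rather than for the image as a whole.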
The depth of field is reciprocal to the magnification; accordingly, the depth of field is extremely thin at high magnifications. The depth of field, also denoted as “focus range”, is often close to or even smaller than the thickness of the tissue to be captured, especially at high magnifications. Moreover, the tissue sample is usually not perfectly planar but uneven, and its thickness often varies as well. Therefore, slide scanners usually employ a local auto-focus method while capturing images in smaller stripes or tiles, which are then digitally stitched together to form a whole-slide image. None of the auto-focus solutions employed by the different scanner manufacturers is perfect; each can fail in some image regions to keep the majority of the tissue within the focus range, and thus cause out-of-focus blur of varying degrees.
The main challenge for the auto-focus algorithm is therefore to distinguish between (a) blurriness in in-focus image regions caused by tissue with a smooth appearance and (b) blurriness of any tissue pattern caused by varying degrees of out-of-focus blur. A secondary challenge is to prevent focusing on foreign particles on top of the “cover slip” (the plastic or glass slide covering the tissue sample), such as dust or debris, which usually results in the tissue being far outside the focus range.
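The ambiguity between cases (a) and (b) can be illustrated with a simple sharpness heuristic. The sketch below uses the variance of a discrete Laplacian, a commonly used focus measure; it is not asserted to be the measure used by any scanner or by this disclosure. The images are synthetic random arrays, and the repeated 3×3 box blur is only a crude stand-in for optical defocus; all function and variable names are illustrative.

```python
import numpy as np


def laplacian_variance(img):
    """Variance of a 4-neighbour discrete Laplacian over interior pixels."""
    img = np.asarray(img, dtype=np.float64)
    lap = (img[:-2, 1:-1] + img[2:, 1:-1] + img[1:-1, :-2] + img[1:-1, 2:]
           - 4.0 * img[1:-1, 1:-1])
    return float(lap.var())


def box_blur(img, passes=1):
    """Crude stand-in for defocus: repeated 3x3 mean filter with edge padding."""
    out = np.asarray(img, dtype=np.float64)
    h, w = out.shape
    for _ in range(passes):
        padded = np.pad(out, 1, mode="edge")
        acc = np.zeros((h, w))
        for dy in range(3):
            for dx in range(3):
                acc += padded[dy:dy + h, dx:dx + w]
        out = acc / 9.0
    return out


rng = np.random.default_rng(0)
textured_in_focus = rng.random((64, 64))             # synthetic textured region
smooth_in_focus = 0.5 + 0.01 * rng.random((64, 64))  # synthetic smooth region
textured_out_of_focus = box_blur(textured_in_focus, passes=5)

score_textured = laplacian_variance(textured_in_focus)
score_smooth = laplacian_variance(smooth_in_focus)
score_blurred = laplacian_variance(textured_out_of_focus)
print(score_textured, score_smooth, score_blurred)
```

In this toy setting the smooth in-focus region and the blurred textured region both score far below the sharp textured region, so a threshold on this measure alone cannot tell case (a) from case (b).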
Literature relating to the problems of quantifying the degree of out-of-focus for tissue images and related topics includes the following: G. Campanella et al., Towards machine learned quality control: A benchmark for sharpness quantification in digital pathology, Computerized Medical Imaging and Graphics (2017), https://doi.org/10.1016/j.compmedimag.2017.09.001; K. Kayser et al., How to measure image quality in tissue-based diagnosis (diagnostic surgical pathology), from 9th European Congress on Telepathology and 3rd International Congress on Virtual Microscopy, Toledo, Spain, Diagnostic Pathology 2008 3 (suppl. 1); J. Liao et al., Rapid focus map surveying for whole-slide imaging with continues [sic] sample motion, arXiv:1707.03039 [cs.CV], June 2017; S. Shakeri et al., Optical quality assessment of whole-slide imaging systems for digital pathology, Optics Express Vol. 23, Issue 2, pp. 1319-1336 (2015); X. Lopez et al., An Automated Blur Detection Method for Histological Whole-slide Imaging, PLOS ONE (Dec. 13, 2013), https://doi.org/10.1371/journal.pone.0082710; Samuel Yang et al., Assessing microscope image focus quality with deep learning, BMC Bioinformatics (2018) 19:77; and M. Gurcan et al., Histopathological Image Analysis: A Review, IEEE Rev Biomed Eng. 2009; 2: 147-171.
The present inventors have appreciated that the degree to which a slide is out-of-focus can impact the accuracy of machine learning diagnosis or cancer cell identification, and that there is a need to quantify a focus-weighted error of a machine learning disease classifier, i.e., an error that is specifically attributable to the degree to which portions of the microscope slide image are out-of-focus. This disclosure addresses this need.
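As a rough illustration of what a focus-weighted error could look like, the following minimal Python sketch averages a per-patch expected error that is looked up by each patch's out-of-focus degree. This is only one simple reading of "focus-weighted" and is not asserted to be the method of this disclosure. The discrete OOF scale (0 = sharp, 3 = heavily blurred), the lookup table values, and all names are hypothetical placeholders, not measured results.

```python
def focus_weighted_error(oof_degrees, expected_error_by_degree):
    """Mean expected classifier error over patches, keyed by OOF degree.

    oof_degrees: one OOF degree per image patch (hypothetical discrete scale).
    expected_error_by_degree: mapping from OOF degree to expected error.
    """
    if not oof_degrees:
        raise ValueError("need at least one patch")
    total = sum(expected_error_by_degree[d] for d in oof_degrees)
    return total / len(oof_degrees)


# Hypothetical calibration table: made-up placeholder values for illustration.
EXPECTED_ERROR = {0: 0.02, 1: 0.05, 2: 0.15, 3: 0.40}

# Hypothetical per-patch OOF degrees for one slide image.
patch_oof = [0, 0, 0, 1, 2, 3, 3, 0]

predicted_error = focus_weighted_error(patch_oof, EXPECTED_ERROR)
print(round(predicted_error, 3))
```

The design choice here is that patches with a higher OOF degree contribute a larger expected error, so a slide with more blurred regions receives a higher predicted error than an otherwise identical sharp slide.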