Precise detection of invasive regions of cancer on a whole-slide image (WSI) is a critical first step toward subsequent interrogation of tumor differentiation using standard grading schemes. WSIs used in histopathology are typically large. For example, a typical WSI may have a spatial resolution of 80,000 pixels by 80,000 pixels and require 20 GB to store. Furthermore, digital slide repositories, such as The Cancer Genome Atlas (TCGA), may host images acquired from thousands of cancer studies performed by different institutions, amounting to petabytes of data to be analyzed. This high volume of data requires high-throughput computational image analysis techniques to effectively utilize the data in clinical applications.
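The ~20 GB figure above is consistent with storing an uncompressed 8-bit image. A quick check of that arithmetic, under the assumption of three bytes (one per red, green, and blue channel) per pixel:

```python
# Assumption: uncompressed 8-bit RGB storage (3 bytes per pixel).
# Real WSI formats typically apply compression, so actual file sizes vary.
width = 80_000                      # pixels
height = 80_000                     # pixels
bytes_per_pixel = 3                 # one byte per R, G, B channel

wsi_bytes = width * height * bytes_per_pixel
wsi_gb = wsi_bytes / 1e9            # decimal gigabytes

print(f"{wsi_gb:.1f} GB")           # 19.2 GB, consistent with the ~20 GB cited
```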
Representation and deep learning approaches may be used, instead of other computer vision approaches, for interpretation and analysis of images, including tasks such as object detection, object recognition, and image annotation. Deep representation learning refers to a family of machine learning methods that attempt to learn multiple levels of representation to model complex relations among data, discovering more abstract features at higher levels of representation. Convolutional neural networks (CNNs) are a type of deep representation learning method that may be used for image analysis. CNNs are multilayer neural networks that combine different types of layers (e.g., convolutional, pooling, classification) and are trained in a supervised manner for image analysis and classification tasks.
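A minimal NumPy sketch of the three layer types named above (convolutional, pooling, classification) may help make the structure concrete. The tiny 16x16 input, 3x3 kernels, and feature-map counts here are illustrative choices, not values from the source, and no training step is shown:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(x, kernels):
    """Valid convolution followed by ReLU.
    x: (H, W, C_in) input; kernels: (k, k, C_in, C_out)."""
    k = kernels.shape[0]
    h, w = x.shape[0] - k + 1, x.shape[1] - k + 1
    out = np.zeros((h, w, kernels.shape[3]))
    for i in range(h):
        for j in range(w):
            patch = x[i:i + k, j:j + k, :]          # (k, k, C_in)
            out[i, j] = np.tensordot(patch, kernels, axes=3)
    return np.maximum(out, 0.0)                      # ReLU nonlinearity

def max_pool(x, s=2):
    """s-by-s max pooling over the spatial dimensions."""
    h, w, c = x.shape
    x = x[:h - h % s, :w - w % s, :]
    return x.reshape(h // s, s, w // s, s, c).max(axis=(1, 3))

def classify(x, weights):
    """Flatten feature maps and apply a softmax classification layer."""
    logits = x.reshape(-1) @ weights
    e = np.exp(logits - logits.max())
    return e / e.sum()

# Toy forward pass on a small RGB patch (not a full WSI).
image = rng.random((16, 16, 3))
k1 = rng.standard_normal((3, 3, 3, 4)) * 0.1         # 4 feature maps
features = max_pool(conv2d(image, k1))               # -> (7, 7, 4)
w_cls = rng.standard_normal((7 * 7 * 4, 2)) * 0.1    # 2 output classes
probs = classify(features, w_cls)                    # probabilities sum to 1
```

In a supervised setting, the kernel and classifier weights would be learned from labeled examples by gradient descent rather than drawn at random as here.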
Conventional approaches employing CNNs for image classification and object detection have focused on relatively small images. Some conventional approaches have applied CNNs to histopathology image analysis, including analysis of WSIs; however, these approaches have limited their analysis to small regions of interest (ROIs) within the larger WSI. The overall size of a CNN depends on the size of the input image. For example, a CNN with an input image having dimensions of 200 pixels by 200 pixels and 250 feature maps in the first convolutional layer would involve ten million hidden units. In contrast, the same CNN architecture with an input red-green-blue (RGB) color model image of 80,000 pixels by 80,000 pixels (e.g., a typical digitized WSI) would require approximately 4.8 trillion hidden units, which exceeds the computational capabilities of contemporary high-performance computing clusters by several orders of magnitude. Consequently, the direct application of conventional CNN approaches to object detection or pixel-level classification in WSIs is not tenable in clinically relevant time frames.
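The two hidden-unit figures above can be reproduced with simple arithmetic. This sketch assumes each of the 250 feature maps retains the input's spatial extent, and that the 4.8 trillion figure additionally multiplies by the three RGB color channels (the convention implied by the numbers in the text):

```python
# Hidden units in the first convolutional layer scale with input size.
# Assumption: feature maps match the input's spatial resolution, and the
# WSI figure counts the three RGB channels as a factor of 3.
feature_maps = 250

roi_units = 200 * 200 * feature_maps            # small ROI input
wsi_units = 80_000 * 80_000 * feature_maps * 3  # full RGB whole-slide image

print(f"{roi_units:,}")   # 10,000,000  (ten million)
print(f"{wsi_units:,}")   # 4,800,000,000,000  (~4.8 trillion)
print(wsi_units // roi_units)  # 480,000x larger
```

The ratio of roughly half a million between the two cases illustrates why a conventional CNN cannot simply be scaled up to ingest a full WSI directly.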
Breast cancer (BCa) is the most common type of cancer in women and the second leading cause of cancer-related death among women in developed countries. Invasive BCa refers to those breast cancers that have spread beyond the original site and that tend to have a poorer prognosis than less invasive BCa. Precise invasive tumor delineation on a pathology slide is typically the first step for subsequent interrogation of tumor differentiation. Conventional approaches to BCa grading have first required a definition of the target ROI on a WSI by an expert human pathologist. Thus, conventional approaches are limited by the availability of expert human pathologists, and by inter-reviewer subjectivity.