High-content screening (HCS) technologies that combine automated fluorescence microscopy with high-throughput biotechnology have become powerful systems for studying cell biology and for drug screening. However, these systems can produce more than 10^5 images per day, making their success dependent on automated image analysis. Traditional analysis pipelines rely heavily on hand-tuning the segmentation, feature extraction and classification steps for each assay. Although comprehensive tools have become available, they are typically optimized for mammalian cells and are not directly applicable to model organisms such as yeast and Caenorhabditis elegans. Researchers studying these organisms often classify cellular patterns manually, by eye.
Recent advances in deep learning indicate that deep neural networks trained end-to-end can learn powerful feature representations and outperform classifiers built on top of extracted features. Although object recognition models, particularly convolutional networks, have been successfully trained using images with one or a few objects of interest at the center of the image, microscopy images often contain hundreds of cells with a phenotype of interest, as well as outliers.
Fully convolutional neural networks (FCNNs) have been applied to natural images for segmentation tasks using ground truth pixel-level labels. Instead of producing a single prediction vector for the whole image, these networks output a segmentation map for each output category. For microscopy data, convolutional sparse coding blocks have also been used to extract regions of interest from spiking neurons and slices of cortical tissue without supervision. Other approaches use FCNNs to perform segmentation using weak labels. However, while these techniques aim to segment or localize regions of interest within full-resolution images, they do not classify populations of objects in images of arbitrary size when trained with weak labels only. Techniques that depend on dense pixel-level ground truth labels are further limited because such labels are expensive to generate and often arbitrary, especially for niche datasets such as microscopy images.
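The difference between a per-category output map and a single prediction vector can be illustrated with a minimal sketch. It is not taken from any of the cited approaches: the function names (`conv2d_valid`, `class_maps`), the single 3x3 convolutional layer, the four output categories and the random weights are all assumptions made for illustration. Because the layer is purely convolutional, the same weights can be applied to images of different sizes, and the output is one score map per category rather than one vector per image.

```python
import numpy as np

def conv2d_valid(image, kernels):
    """Naive 'valid' cross-correlation.
    image:   (H, W, C_in)
    kernels: (k, k, C_in, C_out)
    returns: (H - k + 1, W - k + 1, C_out)
    """
    k = kernels.shape[0]
    h, w, _ = image.shape
    out = np.empty((h - k + 1, w - k + 1, kernels.shape[-1]))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + k, j:j + k, :]
            # Contract the patch against every output kernel at once.
            out[i, j] = np.tensordot(patch, kernels, axes=([0, 1, 2], [0, 1, 2]))
    return out

def class_maps(image, kernels):
    """Per-location softmax over categories: one probability map per category."""
    scores = conv2d_valid(image, kernels)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(scores)
    return exp / exp.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
# Hypothetical layer: 3x3 kernels, 1 input channel, 4 output categories.
kernels = rng.normal(size=(3, 3, 1, 4))

# The same weights handle inputs of different spatial sizes.
small = class_maps(rng.normal(size=(16, 16, 1)), kernels)
large = class_maps(rng.normal(size=(40, 64, 1)), kernels)
print(small.shape, large.shape)
```

Training such maps directly requires a label for every output location, which is the dense pixel-level annotation burden discussed above.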
Thus, automated cellular classification systems that operate on full-resolution images are lacking. Applying deep neural networks to microscopy screens has been challenging because training data specific to cells is scarce; i.e., there are few large datasets labeled at the single-cell level.