The present invention relates to anatomical object detection using deep neural networks, and more particularly, to approximating deep neural networks for anatomical object detection.
One of the biggest challenges in machine learning and pattern recognition is the curse of dimensionality. The curse of dimensionality refers to the notion that the complexity of a learning problem grows exponentially with a linear increase in the dimensionality of the data. For this reason, data is commonly pre-processed by dimensionality reduction or feature extraction techniques in order to obtain a meaningful and compact representation of the data that can be effectively handled by machine learning classifiers. Accordingly, data representation is an important factor that affects the performance of artificial intelligence systems.
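By way of illustration only, the exponential growth described above can be seen by counting the cells of a regular grid laid over the data space. The following is a minimal sketch; the function names `grid_cells` and `cells_per_dimension` are hypothetical, and the grid model is an assumption chosen for simplicity rather than a definition taken from any particular learning method.

```python
def grid_cells(bins_per_axis, dims):
    """Number of cells when each of `dims` axes is split into `bins_per_axis` bins."""
    return bins_per_axis ** dims


def cells_per_dimension(bins_per_axis, max_dims):
    """Cell counts for dimensionalities 1..max_dims (a linear increase in dims)."""
    return [grid_cells(bins_per_axis, d) for d in range(1, max_dims + 1)]


if __name__ == "__main__":
    # With 10 bins per axis, each added dimension multiplies the cell count by 10,
    # so the number of samples needed to populate every cell grows exponentially.
    for dims, cells in enumerate(cells_per_dimension(10, 6), start=1):
        print(dims, cells)
```

Under this simplified model, covering the space at a fixed resolution requires a number of samples that grows exponentially with the dimensionality, which is one reason a compact representation is sought before classification.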
Deep learning mimics the behavior of mammalian brains in order to extract a meaningful representation from a high dimensional input. Data is passed through multiple layers of a network. The first layers extract low-level cues, such as edges and corners for natural images. Deeper layers compose simple cues from previous layers into higher-level features. In this way, powerful representations emerge at the end of the network. The gradual construction of a deep network prevents the learning from being exposed to a high complexity of data too early. Several theoretical works show that certain classes of functions (e.g., indicator functions) can be represented by a deep network, but require exponential computation in a network with insufficient depth.
Recently, deep learning has been applied with high accuracy to pattern recognition problems in images. However, the benefits of deep networks come at a high computational cost during the evaluation phase. In particular, fully connected deep networks are orders of magnitude slower than traditional machine learning classifiers, such as linear support vector machines (SVMs) and random forest classifiers. It is desirable to improve the run-time speed of deep networks to make such deep learning technology more practical for various applications, including light-computing platforms such as mobile phones and tablet devices.
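The gap in evaluation cost can be made concrete by counting multiply-accumulate operations for a single input. The following is a minimal sketch under stated assumptions: the layer sizes in `EXAMPLE_LAYERS` are hypothetical and are not taken from any network described herein, bias additions and activation functions are ignored, and a linear SVM is modeled as a single dot product with the input.

```python
def dense_macs(layer_sizes):
    """Multiply-accumulates for one forward pass through fully connected layers.

    `layer_sizes` lists the input size followed by the size of each layer,
    e.g. [1024, 1024, 512, 1]. Each layer costs n_in * n_out operations.
    """
    return sum(n_in * n_out for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))


def linear_macs(input_dim):
    """Multiply-accumulates for a linear classifier: one dot product."""
    return input_dim


# Hypothetical sizes: a 32x32 image patch flattened to 1024 inputs,
# two hidden layers, and a single output score.
EXAMPLE_LAYERS = [32 * 32, 1024, 512, 1]

if __name__ == "__main__":
    deep = dense_macs(EXAMPLE_LAYERS)
    linear = linear_macs(EXAMPLE_LAYERS[0])
    print(deep, linear, deep / linear)
```

For these assumed sizes, the fully connected network needs roughly three orders of magnitude more operations per input than the linear classifier, which is consistent with the slowdown noted above. In a detection setting this cost is incurred for every candidate location that is evaluated.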
Various approaches have been attempted to improve the computational aspects of deep learning. Graphics processing units (GPUs) have been shown to speed up training by several orders of magnitude. However, most widely used computing devices are not equipped with a powerful GPU. Another way to speed up such deep networks is to use convolutional networks, such as convolutional neural nets (CNNs) or convolutional deep belief nets. Separable filters could also be used to improve the speed of convolutional networks. However, these approaches require the data to have a tensor structure, which limits the scope of application of such deep learning technology. In addition, convolutional networks and separable filters can degrade the overall classification accuracy due to the structure that they impose on the filters of the deep network.
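The speed-up offered by separable filters, and the structure they impose, can be illustrated as follows. This is a minimal sketch using NumPy, assuming a two-dimensional filter that is exactly the outer product of a column vector and a row vector; the function names are hypothetical, and the sliding-window operation is implemented as correlation over the valid region only.

```python
import numpy as np


def correlate2d_valid(image, kernel):
    """Direct 2D sliding-window correlation over the valid region.

    Costs kh * kw multiplications per output pixel.
    """
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * image[i:i + oh, j:j + ow]
    return out


def correlate2d_separable(image, col, row):
    """Same result for kernel = outer(col, row), computed as two 1D passes.

    Costs about kh + kw multiplications per output pixel.
    """
    kh, kw = len(col), len(row)
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    # Horizontal pass: apply the row vector along each image row.
    tmp = np.zeros((image.shape[0], ow))
    for j in range(kw):
        tmp += row[j] * image[:, j:j + ow]
    # Vertical pass: apply the column vector along each column of the result.
    out = np.zeros((oh, ow))
    for i in range(kh):
        out += col[i] * tmp[i:i + oh, :]
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.standard_normal((16, 16))
    col = np.array([1.0, 2.0, 1.0])
    row = np.array([-1.0, 0.0, 1.0])
    full = correlate2d_valid(image, np.outer(col, row))
    fast = correlate2d_separable(image, col, row)
    print(np.allclose(full, fast))
```

The two-pass form is only exact when the filter has rank one. A general learned filter does not have this property, so forcing separability constrains the filters that the network can represent, which is the source of the potential accuracy loss noted above.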