Deep learning has gained wide acceptance for its superior performance in the fields of computer vision, speech recognition, natural language processing, and bioinformatics. Deep learning is a branch of machine learning that uses artificial neural networks containing more than one hidden layer. One type of artificial neural network, called convolutional neural network (CNN), has been used by deep learning over large data sets such as image data. CNNs have shown excellent results in image applications. For example, CNNs can be used in feature extraction. From raw image pixels received at the input end, a CNN can generate scores for different classes of features at the output end.
Computational workloads of CNNs are intensive. The core computation of a CNN is convolution, which involves a high-order nested loop. For feature extraction, a CNN convolves input image pixels with a set of two-dimensional (2D) filters over a set of channels (e.g., red, green and blue), followed by nonlinear computations, down-sampling computations, and class scores computations. The convolution computations have been shown to be highly resource-demanding. In addition to the CNN, convolution computations are frequently used to solve scientific and engineering problems. Therefore, there is a need for optimizing the convolution computations to achieve performance improvement.