Deep learning is a branch of machine learning and recently has become a very hot topic in both academia and industry. Convolutional Neural Network (CNN) is probably the most successful model of deep learning and has made many breakthroughs in various fields like computer vision and speech recognition.
In the past few years, Convolutional Neural Networks (CNNs) have been succeeded in various machine-learning domains. The traditional method to train CNNs has merely relied on a Graphics Processing Unit (GPU). In this scenario, the Central Processing Unit (CPU) is responsible for reading data and sending to the data to the GPU, and the GPU will finish the training process. There are several disadvantages of this de factor approach: (1) CPU computation power is wasted (2) The model sizes of CNNs are restricted because GPU has a very limited global memory.
Conventional methods for training CNNs either use the CPU or GPU, not together. Because of the high throughput of the graphic processing unit (GPU), using GPUs in training CNNs has become a standard approach. There are a lot of frameworks or tools (e.g. cuda-convnet, Caffe, Torch7) which use a GPU as the computing backend to accelerate the training of CNN models.