DNNs are attracting attention as a machine learning method. They have been applied to image recognition, speech recognition and so forth, and have been reported to outperform conventional approaches, with error rates, for example, improved by about 20 to 30% (Non-Patent Literatures 1-3).
A DNN can be regarded as a neural network having more layers than conventional neural networks. Specifically, a DNN includes an input layer, an output layer, and a plurality of hidden layers provided between the input and output layers. The input layer has a plurality of input nodes (neurons). The output layer has as many neurons as there are objects to be identified. Each hidden layer has a plurality of neurons. Information propagates from the input layer through the hidden layers one after another, and outputs are eventually provided at the output nodes. Because of this scheme, the output layer tends to include more nodes than the other layers.
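The layer-by-layer propagation described above can be illustrated with a minimal sketch. It is not taken from the cited literature: the layer sizes, the sigmoid hidden activation, the softmax output, and the names `make_layer` and `forward` are all assumptions chosen only for illustration.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Return (weights, biases) for one fully connected layer (hypothetical helper)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(values):
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def forward(layers, inputs):
    """Propagate inputs through each layer in order and return the output-node values."""
    activations = inputs
    last = len(layers) - 1
    for index, (weights, biases) in enumerate(layers):
        pre = [
            sum(w * a for w, a in zip(row, activations)) + b
            for row, b in zip(weights, biases)
        ]
        # Assumption: sigmoid in hidden layers, softmax at the output layer.
        activations = softmax(pre) if index == last else [sigmoid(z) for z in pre]
    return activations


rng = random.Random(0)
# Assumed sizes: 8 input nodes, two hidden layers of 16, and 32 output nodes
# (one per object to be identified, so the output layer is the widest).
layer_sizes = [8, 16, 16, 32]
layers = [make_layer(n_in, n_out, rng) for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]
outputs = forward(layers, [rng.random() for _ in range(layer_sizes[0])])
print(len(outputs), sum(outputs))
```

With a softmax output, each output node yields a score for one object to be identified, and the scores sum to one.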
In a DNN, the number of layers is large and the number of neurons in each layer is also large. Therefore, the amount of computation required for learning can be enormous. Previously, such computation was almost impossible. Nowadays, computers have higher computing capabilities, and distributed/parallel processing techniques and computational theory have advanced enough to make DNN learning feasible. When a huge amount of data is used for training, however, learning still takes a long time. By way of example, in an experiment described in Non-Patent Literature 4, training a DNN on 10 million images of 200×200 pixels took three days on 1,000 machines with 16 cores each.
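To give a rough sense of this scale, the figures quoted above can be turned into simple arithmetic. The following is only a sketch: the fully connected layer sizes in `hypothetical_sizes` are assumed for illustration and are not the architecture used in Non-Patent Literature 4, and counting one input value per pixel is likewise an assumption, since the number of color channels is not stated here.

```python
def count_parameters(layer_sizes):
    """Count weights and biases of a fully connected network (hypothetical helper)."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


# Figures quoted in the text for the experiment of Non-Patent Literature 4.
machines = 1_000
cores_per_machine = 16
days = 3
core_hours = machines * cores_per_machine * days * 24

images = 10_000_000
pixels_per_image = 200 * 200
total_pixels = images * pixels_per_image

# Assumed layer sizes for illustration only: one input node per pixel,
# three hidden layers of 2,048 neurons, and 10,000 output neurons.
hypothetical_sizes = [pixels_per_image, 2048, 2048, 2048, 10_000]
parameters = count_parameters(hypothetical_sizes)

print(f"core-hours consumed:      {core_hours:,}")
print(f"pixel values in the data: {total_pixels:,}")
print(f"parameters (assumed net): {parameters:,}")
```

Under these assumptions the experiment corresponds to more than one million core-hours, and even the modest assumed network has over one hundred million parameters to be updated repeatedly during learning.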