A deep learning method is a type of machine learning method based on learning representations of data by using a deep graph with multiple processing layers. Such deep learning architectures include deep neural networks. A deep neural network (DNN) provides various functions such as image classification, speech recognition, and natural language processing. For example, Google's ALPHAGO™, a computer program for playing the board game “Go,” based on a deep convolutional neural network (deep CNN or DCNN), recently defeated the human world champion of “Go,” which suggests that complex tasks which were considered to be performed only by human beings can be solved by deep neural networks.
The depth of a neural network represents the number of successive layers in feedforward networks. A deeper neural network can better represent an input feature with lower complexity as compared to a shallow neural network. However, training deep networks is difficult due to vanishing/exploding gradient problems, and existing optimization solvers often fail with an increasing number of layers. Furthermore, the increasing depth of recurrent architectures like a gated recurrent unit (GRU) and long short term memories (LSTM) makes training of recurrent neural network (RNN) architectures more difficult because these architectures already have very deep representations in a temporal domain that may further aggravate vanishing/exploding gradient issues.