A neural network can be represented as two or more layers of interconnected “neurons” (or “nodes”) that can exchange data with one another. The connections between the neurons can have numeric weights that can be tuned based on experience. Such tuning can make neural networks adaptive and capable of “learning.”
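As one non-limiting illustration of how connection weights can be tuned based on experience, the following Python sketch adjusts the weights of a single neuron so that its output moves toward a target value for one example. The function names, the squared-error update rule, the learning rate, and the example values are assumptions chosen for illustration and are not drawn from the figures.

```python
def neuron_output(weights, inputs):
    """Weighted sum of the values arriving over the neuron's connections."""
    return sum(w * x for w, x in zip(weights, inputs))


def tune_weights(weights, inputs, target, learning_rate=0.1):
    """Nudge each weight in the direction that reduces the squared error."""
    error = neuron_output(weights, inputs) - target
    return [w - learning_rate * error * x for w, x in zip(weights, inputs)]


# Hypothetical "experience": a single input/target pair presented repeatedly.
weights = [0.0, 0.0]
example_inputs, example_target = [1.0, 2.0], 1.0
for _ in range(50):
    weights = tune_weights(weights, example_inputs, example_target)

final_error = abs(neuron_output(weights, example_inputs) - example_target)
```

In this sketch, each pass shrinks the error on the example, so the tuned weights produce an output close to the target after repeated exposure.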
A deep neural network is a neural network that has one or more hidden layers of neurons between an input layer and an output layer of the neural network. Such layers between the input layer and the output layer may be referred to as “hidden” because they may not be directly observable in the normal functioning of the neural network. A deep neural network can include any number of hidden layers, and each hidden layer can include any number of neurons.
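As a non-limiting illustration, data can be passed from the input layer through each hidden layer to the output layer in the manner sketched below. The layer sizes, the fixed weight values, and the choice of a tanh activation are assumptions made only for this example; a deep neural network can use other sizes, weights, and activations.

```python
import math


def layer_forward(weights, biases, inputs):
    """Compute one layer: each neuron applies tanh to its weighted input sum."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def network_forward(layers, inputs):
    """Feed the inputs through every layer in order and return the final output."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(weights, biases, activations)
    return activations


# Hypothetical network: 2 inputs -> 3 hidden -> 2 hidden -> 1 output.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),  # hidden layer 1
    ([[0.2, -0.5, 0.3], [0.7, 0.1, -0.4]], [0.0, 0.0]),          # hidden layer 2
    ([[1.0, -1.0]], [0.0]),                                      # output layer
]
result = network_forward(layers, [1.0, 2.0])
```

Only the values supplied to the input layer and the values returned by the output layer are visible to a caller of `network_forward`; the intermediate activations of the two hidden layers stay internal to the loop.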
One type of deep neural network is a recurrent neural network (RNN), such as the RNN 1500 shown in FIG. 15. As shown, the RNN 1500 includes an input layer 1502, a hidden layer 1504, and an output layer 1506. The hidden layer 1504 has one or more feedback loops. These feedback loops can provide RNNs with a type of “memory,” in which past outputs from the hidden layer 1504 can inform future outputs from the hidden layer 1504. Specifically, each feedback loop can provide an output from the hidden layer 1504 at a previous time-step (e.g., t−1) back to the hidden layer 1504 as input for the current time-step (e.g., t0) to inform the output at the current time-step. This can enable RNNs to recurrently process sequence data (e.g., data that exists in an ordered sequence, like a sentence having a sequence of words or a video having a sequence of images) over a sequence of time-steps.
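As a non-limiting illustration of such a feedback loop, the following Python sketch computes the hidden-layer output at each time-step from both the current input and the hidden-layer output of the previous time-step. The function names, the tanh activation, the all-zero initial hidden state, and the weight values are assumptions for this example and are not taken from FIG. 15.

```python
import math


def rnn_step(x_t, h_prev, w_in, w_rec, bias):
    """One time-step: combine the current input with the previous hidden output."""
    return [
        math.tanh(
            sum(w * x for w, x in zip(w_in[i], x_t))        # input connections
            + sum(w * h for w, h in zip(w_rec[i], h_prev))  # feedback loop
            + bias[i]
        )
        for i in range(len(bias))
    ]


def run_rnn(sequence, w_in, w_rec, bias):
    """Process an ordered sequence, carrying the hidden output between time-steps."""
    h = [0.0] * len(bias)  # assumed initial hidden state
    outputs = []
    for x_t in sequence:
        h = rnn_step(x_t, h, w_in, w_rec, bias)
        outputs.append(h)
    return outputs


# The same input value is presented at two consecutive time-steps.
with_feedback = run_rnn([[1.0], [1.0]], [[0.5]], [[0.8]], [0.0])
without_feedback = run_rnn([[1.0], [1.0]], [[0.5]], [[0.0]], [0.0])
```

With a nonzero feedback weight, the two identical inputs produce different hidden outputs because the second time-step also receives the output of the first. With the feedback weight set to zero, the two outputs are identical, which shows that the “memory” in this sketch comes from the feedback loop alone.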
One type of RNN is a Long Short-Term Memory (LSTM) neural network, such as the LSTM neural network 1600 shown in FIG. 16. As shown, the LSTM neural network 1600 includes input nodes (e.g., X0-X3), a fully connected RNN, and output nodes (e.g., H0-H3). The LSTM neural network 1600 can include one or more memory cells, such as memory cell 1602. The memory cell 1602 can enable the LSTM neural network 1600 to have a longer memory than other types of RNNs. The memory cell 1602 can include one or more gates. Each gate can include a sigmoid neural-network layer (e.g., depicted in memory cell 1602 with a “σ” symbol) and/or a pointwise multiplication. Typically, the memory cell 1602 includes a self-recurrent connection, an input gate, a forget gate, an output gate, or any combination of these. Examples of these gates include input gate 1606, forget gate 1604, and output gate 1608 shown in FIG. 16. The input gate 1606 can selectively control the input to the memory cell 1602. The output gate 1608 can selectively control the output of the memory cell 1602. The forget gate 1604 can control whether the memory cell 1602 remembers information from previous time-steps when processing sequence data. For example, the forget gate 1604 can control whether the memory cell 1602 should save the previous state of the memory cell 1602 for a period of time or forget the previous state of the memory cell 1602.
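As a non-limiting illustration, the roles of the forget gate, the input gate, and the output gate can be sketched for a single memory cell with scalar values as shown below. The sketch follows a commonly used LSTM formulation, which may differ in detail from the arrangement depicted in FIG. 16. The parameter names (e.g., `wf`, `uf`, `bf`) and their values are hypothetical.

```python
import math


def sigmoid(z):
    """Squash a value into the range (0, 1) so it can act as a gate."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_cell_step(x_t, h_prev, c_prev, p):
    """One time-step of a scalar memory cell; returns (output, new cell state)."""
    forget = sigmoid(p["wf"] * x_t + p["uf"] * h_prev + p["bf"])    # forget gate
    write = sigmoid(p["wi"] * x_t + p["ui"] * h_prev + p["bi"])     # input gate
    read = sigmoid(p["wo"] * x_t + p["uo"] * h_prev + p["bo"])      # output gate
    candidate = math.tanh(p["wc"] * x_t + p["uc"] * h_prev + p["bc"])
    c_t = forget * c_prev + write * candidate  # pointwise multiplications
    h_t = read * math.tanh(c_t)
    return h_t, c_t


# Hypothetical parameters: the forget gate is held open, the input gate closed.
remember = {
    "wf": 0.0, "uf": 0.0, "bf": 20.0,
    "wi": 0.0, "ui": 0.0, "bi": -20.0,
    "wo": 0.0, "uo": 0.0, "bo": 0.0,
    "wc": 1.0, "uc": 0.0, "bc": 0.0,
}
# Same parameters, except the forget gate is held closed.
forget_all = dict(remember, bf=-20.0)

_, kept_state = lstm_cell_step(1.0, 0.0, 0.7, remember)
_, dropped_state = lstm_cell_step(1.0, 0.0, 0.7, forget_all)
```

In this sketch, a forget-gate value near one carries the previous cell state of 0.7 forward almost unchanged, while a forget-gate value near zero discards it, which corresponds to the memory cell saving or forgetting its previous state.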