Automatic speech recognition (ASR) systems often use recurrent neural networks (RNNs) as acoustic models to provide word hypothesis scores. These RNNs can become unstable, however, after executing for some period of time, resulting in decreased recognition accuracy. This instability may be associated with certain network training methodologies or may be due to other inherent numerical instabilities of the neural network. Existing systems generally handle this problem by periodically resetting or re-initializing the RNN after a pre-defined execution time interval. This solution is not optimal, however: recognition accuracy decreases if the chosen time interval is too long, and computational efficiency suffers if the interval is too short. Existing systems also generally use a pre-defined quantity of training data to perform the reset. That quantity may be too small to provide adequate context for the network to properly re-initialize, or it may be larger than necessary, resulting in slower resets.
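By way of illustration only, the fixed-interval reset strategy described above may be sketched as follows. This is a minimal toy example and not the implementation of any particular existing system: the scalar recurrence, the weights, and the names `step`, `run_fixed_interval_reset`, `reset_interval`, and `context_len` are all assumptions introduced here. The sketch shows how the two pre-defined parameters trade accuracy against cost: a shorter interval triggers more resets, and a longer context replay makes each reset more expensive.

```python
import math


def step(h, x, w_h=0.9, w_x=0.5):
    # One step of a toy scalar RNN. The weights are hypothetical
    # placeholders, not values taken from any real acoustic model.
    return math.tanh(w_h * h + w_x * x)


def run_fixed_interval_reset(frames, reset_interval, context_len):
    """Process frames, resetting the hidden state every reset_interval frames.

    On each reset the state is zeroed and then re-primed by replaying the
    most recent context_len frames. Returns the per-frame outputs, the frame
    indices at which resets occurred, and the number of extra recurrent
    steps spent on re-priming.
    """
    h = 0.0
    outputs = []
    reset_positions = []
    extra_steps = 0
    for t, x in enumerate(frames):
        if t > 0 and t % reset_interval == 0:
            # Fixed-interval reset: discard the accumulated state.
            h = 0.0
            # Re-prime with a pre-defined amount of recent context.
            for past in frames[max(0, t - context_len):t]:
                h = step(h, past)
                extra_steps += 1
            reset_positions.append(t)
        h = step(h, x)
        outputs.append(h)
    return outputs, reset_positions, extra_steps
```

For example, with ten input frames, `reset_interval=4`, and `context_len=2`, the sketch resets at frames 4 and 8 and spends four extra recurrent steps on re-priming. Halving the interval to 2 doubles both figures, which illustrates the efficiency cost of an interval that is too short.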
Although the following Detailed Description will proceed with reference being made to illustrative embodiments, many alternatives, modifications, and variations thereof will be apparent in light of this disclosure.