Long Short Term Memory (LSTM) is a specific Recurrent Neural Network (RNN) architecture, and capable of learning long-term dependencies. A conventional RNN can learn from previous information, but it fails to learn long-term dependencies. LSTM explicitly adds memory controllers to decide when to remember, forget and output. It can make the training procedure much more stable and allow the model to learn long-term dependencies.
LSTM architecture is widely used in sequence learning, such as speech recognition. State-of-the-art model have millions of connections, and are both computationally and memory intensive. Deploying such bulky model results in high power consumption given latency constraint.
It has been widely observed that deep neural networks usually have a lot of redundancy, so network pruning has been widely studied to compress neural network including LSTM. However, most hardware accelerators mainly focus on deep neural network (DNN) or uncompressed LSTM.