An automatic speech recognition (ASR) system translates a recorded audio signal into text. At its core, this is a pattern classification problem. However, the speech signal is nonstationary, and speech feature sequences vary widely in length along the temporal dimension. Classical classifiers such as Bayes and nearest-neighbor classifiers, as well as state-of-the-art classifiers such as support vector machines (SVMs), are limited to static patterns or fixed-dimension inputs, so they cannot be applied to speech in a straightforward manner.
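The fixed-dimension limitation can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and does not come from the text: the 16 kHz sampling rate, the 25 ms frame with a 10 ms hop, the hypothetical helper `frame_features`, and the use of per-frame log energy as a stand-in for real speech features. Two utterances of different durations yield feature sequences of different lengths, so they cannot be stacked into the single fixed-width input matrix that a static classifier expects.

```python
import numpy as np

def frame_features(signal, frame_len=400, hop=160):
    """Hypothetical helper: split a 1-D signal into overlapping frames and
    return one log-energy value per frame (a stand-in for real features)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] for i in range(n_frames)]
    )
    return np.log(np.sum(frames ** 2, axis=1) + 1e-10)

def can_stack(seqs):
    """Return True if the sequences fit into one fixed-dimension matrix."""
    try:
        np.stack(seqs)
        return True
    except ValueError:
        # np.stack requires all inputs to have the same shape.
        return False

rng = np.random.default_rng(0)
short_utt = rng.standard_normal(8000)    # 0.5 s at an assumed 16 kHz
long_utt = rng.standard_normal(24000)    # 1.5 s at an assumed 16 kHz

feats_short = frame_features(short_utt)
feats_long = frame_features(long_utt)

print(feats_short.shape, feats_long.shape)
print(can_stack([feats_short, feats_long]))
```

Under these assumptions the two sequences have 48 and 148 frames, and the stacking check fails. Padding or truncating to a common length would force a fixed dimension, but it ignores the temporal variation that the paragraph above identifies as the core difficulty.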