A speech signal may correspond to a voice signal that may include the pronunciation of a sequence of words. With the advent of speech signal processing, various automatic speech recognition (ASR) techniques have been developed that may enable the extraction of keywords from the speech signal, such as large-vocabulary continuous speech recognition (LVCSR), which uses a triphone acoustic model. The extracted keywords may be utilized in various application areas such as, but not limited to, speech-to-text (STT) conversion, determination of the sentiments of a person, speech analytics, and/or the like.
Usually, ASR techniques such as LVCSR require a language model of bi-grams and tri-grams of a set of words. To identify the keywords, the speech signal to be analyzed is searched against a dictionary of these words. Because the speech signal is searched against the entire dictionary, the identification of the keywords in the speech signal may be computationally expensive. Therefore, identifying the keywords from the speech signal in real time may not be feasible.
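By way of illustration only, the bi-gram and tri-gram language model mentioned above may be sketched as counts of consecutive word pairs and word triples. The toy corpus, the function name `build_ngram_counts`, and the choice of Python are assumptions made for this sketch and do not describe any particular ASR system.

```python
from collections import Counter


def build_ngram_counts(sentences, n):
    """Count n-grams (tuples of n consecutive words) across the sentences."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        # Slide a window of n words over the sentence.
        for i in range(len(words) - n + 1):
            counts[tuple(words[i:i + n])] += 1
    return counts


# Hypothetical toy corpus; a real LVCSR language model would be
# estimated from a far larger body of text.
corpus = [
    "please cancel my order",
    "please cancel my subscription",
]

bigrams = build_ngram_counts(corpus, 2)
trigrams = build_ngram_counts(corpus, 3)
```

The number of distinct n-grams generally grows with the size of the vocabulary, which is consistent with the observation above that a search against the entire dictionary may be computationally expensive.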