Modern speech recognition systems are based on principles of statistical pattern recognition and typically employ an acoustic model and a language model to decode an input sequence of observations (also referred to as acoustic events or acoustic signals) representing an input speech (e.g., a sentence or string of words) to determine the most probable sentence or word sequence given the input sequence of observations. In other words, the function of a modern speech recognizer is to search through a vast space of potential or candidate sentences and to choose the sentence or word sequence that has the highest probability of generating the input sequence of observations or acoustic events. In general, most modern speech recognition systems employ acoustic models that are based on continuous density hidden Markov models (CDHMMs). In particular, CDHMMs have been widely used in speaker-independent large vocabulary continuous speech recognition (LVCSR) because they outperform discrete HMMs and semi-continuous HMMs. In CDHMMs, the probability function of observations, or state observation distribution, is modeled by multivariate mixture Gaussians (also referred to herein as Gaussian mixtures), which can approximate the speech feature distribution more accurately.
The purpose of a language model is to provide a mechanism for estimating the probability of a word in an utterance given the preceding words. Most modern LVCSR systems employ some form of N-gram language model, which assumes that the probability of a word depends only on the preceding (N-1) words. N is usually limited to 2 (for a bi-gram model) or 3 (for a tri-gram model). In a typical LVCSR system, the size of the language model is usually very large. For example, for a Chinese dictation system having a vocabulary size of about 50,000 words, the size of a typical tri-gram language model file is about 130 Mbytes and the size of a typical bi-gram look-ahead language model file is about 109 Mbytes.
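By way of illustration only, the N-gram assumption described above may be sketched as follows. The sketch assumes plain maximum-likelihood counts with no smoothing or back-off, and the function names `train_ngram` and `ngram_probability` and the padding symbols `<s>` and `</s>` are hypothetical choices made for this example; they do not describe the language model of any particular LVCSR system.

```python
from collections import Counter


def train_ngram(sentences, n=2):
    """Count n-grams and their (n-1)-word histories from tokenized sentences."""
    ngram_counts = Counter()
    history_counts = Counter()
    for words in sentences:
        # Pad so that the first real word also has a full (n-1)-word history.
        padded = ["<s>"] * (n - 1) + list(words) + ["</s>"]
        for i in range(n - 1, len(padded)):
            history = tuple(padded[i - n + 1:i])
            ngram_counts[history + (padded[i],)] += 1
            history_counts[history] += 1
    return ngram_counts, history_counts


def ngram_probability(word, history, ngram_counts, history_counts, n=2):
    """Estimate P(word | history) using only the last (n-1) words of history."""
    if n > 1:
        context = tuple((["<s>"] * (n - 1) + list(history))[-(n - 1):])
    else:
        context = ()
    total = history_counts[context]
    if total == 0:
        return 0.0  # unseen history; a practical model would back off instead
    return ngram_counts[context + (word,)] / total


if __name__ == "__main__":
    corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
    counts, histories = train_ngram(corpus, n=2)
    print(ngram_probability("cat", ["the"], counts, histories, n=2))
```

Because one count is stored per distinct word sequence observed in training, the number of entries grows quickly with the vocabulary size and with N, which is consistent with the large file sizes noted above.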
Because the language model files are usually very large, it is difficult to load such files directly into memory, since most desktop machines have insufficient physical memory. One solution is to use a memory-mapped file format to access language model files in an LVCSR system. However, accessing the language model files through a memory-mapped file format is slower than loading them directly into memory. In addition, because an LVCSR system accesses language model files randomly, searching such a large space randomly is also time-consuming. In short, the large size of language model files in an LVCSR system can negatively impact system performance in terms of both memory requirements and run-time speed.
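The memory-mapped access described above may be sketched, for illustration only, as follows. The record layout (two 32-bit word identifiers followed by a 32-bit log-probability, sorted by word identifiers) and the function names `write_bigram_file`, `open_mapped`, and `lookup_bigram` are assumptions made for this example and do not represent the file format of any actual language model.

```python
import mmap
import struct

# Hypothetical fixed-size record: word id 1, word id 2, log-probability.
RECORD = struct.Struct("<IIf")


def write_bigram_file(path, entries):
    """Write (w1, w2, logprob) records sorted by (w1, w2)."""
    with open(path, "wb") as f:
        for w1, w2, logprob in sorted(entries):
            f.write(RECORD.pack(w1, w2, logprob))


def open_mapped(path):
    """Map the file read-only; pages are read from disk only when touched."""
    f = open(path, "rb")
    return f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


def lookup_bigram(mm, w1, w2):
    """Binary-search the mapped records for (w1, w2); return None if absent."""
    lo, hi = 0, len(mm) // RECORD.size
    while lo < hi:
        mid = (lo + hi) // 2
        a, b, logprob = RECORD.unpack_from(mm, mid * RECORD.size)
        if (a, b) == (w1, w2):
            return logprob
        if (a, b) < (w1, w2):
            lo = mid + 1
        else:
            hi = mid
    return None
```

In this sketch, each probe of the binary search may touch a different region of the mapped file, so a lookup whose pages are not already resident must wait for them to be read from disk. This is one way to picture why random access to a large mapped file tends to be slower than access to a file held entirely in physical memory.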