Contemporary mobile devices, such as smartphones, are able to perform speech-to-text recognition. In general, processing the speech locally on the device does not provide results as good as those obtained by sending the speech to a remote server for processing. This is generally because the remote server has more computing power and more model data than a mobile device. Further, in many instances, the remote server executes a more complex recognition program, such as one that not only recognizes the speech as words, but also considers the surrounding context of other words in recognizing the speech. Thus, many mobile devices are configured to use a remote server to perform the recognition.
However, as recognized by the inventor, sending voice data to a server can be relatively slow, particularly when a device is connected to a slow network. Over such a network, even moderate amounts of speech take a long time to transmit, and thus cause the overall speech recognition process to seem extremely slow. In fact, the speech recognition process sometimes times out before the recognition results are obtained. Remote speech recognition over a slow network thus results in a poor user experience. Additional considerations, such as the expense of transmitting large amounts of data for users with limited data plans, further make sending such voice data undesirable.