Computers often use speech recognition software to convert audio input into recognizable commands. For example, a mobile device may use speech recognition software to interpret a user's speech. Such speech recognition may be useful when a user is interacting with a digital assistant on a mobile device. The user's speech is received by the mobile device as an audio signal (by way of a microphone located on the mobile device, for example). The audio signal may then be processed by the mobile device or by a device communicatively coupled to the mobile device.
A Deep Neural Network (DNN) may be used to analyze multiple aspects of an audio signal. To be used in such analysis, a typical DNN requires the storage of a substantial amount of information. For example, a DNN capable of recognizing a large vocabulary usually has more than 6,000 senones (clustered triphone states) and 5-7 hidden layers, each with about 2,000 nodes. This leads to more than 30 million model parameters. Thus, some DNNs require a significant amount of computer resources, both with respect to memory to store the model parameters and processing power to perform calculations related to these model parameters. As such, it remains desirable to develop DNN technology that can reduce memory and processing requirements while maintaining an adequate level of speech recognition.
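The parameter count cited above can be checked with a short calculation. The following is a minimal sketch, assuming a fully connected feed-forward network with one bias per node; the function name `count_parameters`, the input width of 440 features, and the 4-byte parameter size are illustrative assumptions that do not come from the text.

```python
def count_parameters(input_dim, hidden_layers, hidden_nodes, output_dim):
    """Count weights and biases of a fully connected feed-forward network."""
    # Layer widths from input to output, e.g. [440, 2000, ..., 2000, 6000].
    sizes = [input_dim] + [hidden_nodes] * hidden_layers + [output_dim]
    # Each pair of adjacent layers contributes a weight matrix plus one bias
    # per node of the receiving layer.
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


if __name__ == "__main__":
    INPUT_DIM = 440       # assumption: input feature width is not given in the text
    HIDDEN_NODES = 2000   # "about 2,000 nodes" per hidden layer
    SENONES = 6000        # "more than 6,000 senones" as output classes
    BYTES_PER_PARAM = 4   # assumption: 32-bit floating-point parameters

    for layers in (5, 6, 7):
        total = count_parameters(INPUT_DIM, layers, HIDDEN_NODES, SENONES)
        megabytes = total * BYTES_PER_PARAM / 1e6
        print(f"{layers} hidden layers: {total:,} parameters, {megabytes:.1f} MB")
```

Under these assumptions, six hidden layers yield 32,898,000 parameters, in line with the figure of more than 30 million, while five layers yield 28,896,000. The exact total therefore depends on the input width and on how far the senone count exceeds 6,000.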
It is with respect to these and other general considerations that aspects of the technology have been made. Also, although relatively specific problems have been discussed, it should be understood that the embodiments should not be limited to solving the specific problems identified in the background.