Deep learning, machine learning, latent-variable models, neural networks and other matrix-based differentiable programs are used to solve a variety of problems, including natural language processing and object recognition in images. Solving these problems with deep neural networks typically requires long processing times to perform the required computation. The conventional approach to speed up deep learning algorithms has been to develop specialized hardware architectures. This is because conventional computer processors, e.g., central processing units (CPUs), which are composed of circuits including hundreds of millions of transistors to implement logical gates on bits of information represented by electrical signals, are designed for general purpose computing and are therefore not optimized for the particular patterns of data movement and computation required by the algorithms that are used in deep learning and other matrix-based differentiable programs. One conventional example of specialized hardware for use in deep learning are graphics processing units (GPUs) having a highly parallel architecture that makes them more efficient than CPUs for performing image processing and graphical manipulations. After their development for graphics processing, GPUs were found to be more efficient than CPUs for other parallelizable algorithms, such as those used in neural networks and deep learning. This realization, and the increasing popularity of artificial intelligence and deep learning, led to further research into new electronic circuit architectures that could further enhance the speed of these computations.
Deep learning using neural networks conventionally requires two stages: a training stage and an evaluation stage (sometimes referred to as “inference”). Before a deep learning algorithm can be meaningfully executed on a processor, e.g., to classify an image or speech sample, during the evaluation stage, the neural network must first be trained. The training stage can be time consuming and requires intensive computation.