Artificial intelligence (AI) can enable computers to perform various complicated tasks, such as tasks related to cognitive functions that are typically associated with humans. Several approaches to AI are prevalent, including machine learning techniques. In machine learning systems, a computer may be programmed to parse data, learn from the data, and make predictions from real-world inputs. Some machine-learning algorithms may use known data sets to train a computer to perform a task rather than explicitly programming the computer with a particular algorithm for performing the task. One machine-learning model, referred to as an artificial neural network, was inspired by the interconnections of neurons in a biological brain.
Neural networks are modeled after biological neurons, using connected layers that are analogous to connected neurons. Each layer may receive an input, process the input, and pass an output to the next layer until the final layer produces a final output. Each layer may also assign a weight to its input. For example, if a task involves identifying a particular object in an image, filter weights may be trained to correspond to a probability that the input matches the particular object. While calculations performed at these various layers may be computationally intensive, the advent of dedicated processing units has made processing these neural network layers more feasible, especially for complex tasks related to computer vision or natural language processing.
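As a non-limiting illustration of the layered processing described above, the following minimal sketch passes an input through two fully connected layers, each of which weights its input and hands its output to the next layer. The function names (dot, dense_layer, forward), the rectifier nonlinearity, and the numeric weights and biases are assumptions chosen only for illustration and are not taken from any particular system.

```python
def dot(u, v):
    # Dot product of two equal-length vectors.
    return sum(a * b for a, b in zip(u, v))


def dense_layer(inputs, weights, biases):
    # Each row of `weights` holds the weights of one output unit.
    # The rectifier max(0, x) is an assumed, commonly used nonlinearity.
    return [max(0.0, dot(row, inputs) + b) for row, b in zip(weights, biases)]


def forward(inputs, layers):
    # Pass the output of each layer to the next layer until the
    # final layer produces the final output.
    activations = inputs
    for weights, biases in layers:
        activations = dense_layer(activations, weights, biases)
    return activations


# Hypothetical two-layer network: 2 inputs -> 2 hidden units -> 1 output.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 2.0]], [0.5]),
]
output = forward([2.0, 1.0], layers)
print(output)  # [4.5]
```

In this sketch, every output unit of every layer is computed with one dot product, which is why the cost of evaluating a network grows with the number and size of its layers.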
However, even with specialized processing hardware, such as accelerators that perform the computations of each network layer, deep learning may tax existing computing systems, including those with highly efficient matrix-multiplication units. Because AI and other systems often depend heavily on vector- and matrix-multiplication operations (e.g., dot-product operations), improved systems for performing matrix-multiplication operations are needed.
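By way of example only, the dependence of matrix multiplication on dot-product operations may be sketched as follows. The function name matmul and the multiply-accumulate counter are hypothetical and are included only to show how quickly the operation count grows; the sketch does not represent any particular hardware implementation.

```python
def matmul(a, b):
    # Multiply an (m x n) matrix by an (n x p) matrix, both given as lists of rows.
    # Each output element is the dot product of one row of `a` and one column of `b`.
    rows, inner, cols = len(a), len(b), len(b[0])
    macs = 0  # number of multiply-accumulate operations performed
    result = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for k in range(inner):
                acc += a[i][k] * b[k][j]
                macs += 1
            result[i][j] = acc
    return result, macs


product, mac_count = matmul([[1.0, 2.0], [3.0, 4.0]],
                            [[5.0, 6.0], [7.0, 8.0]])
print(product)    # [[19.0, 22.0], [43.0, 50.0]]
print(mac_count)  # 8
```

Multiplying an m-by-n matrix by an n-by-p matrix in this straightforward way takes m × n × p multiply-accumulate operations, so the work grows rapidly as layer dimensions increase.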