Field of the Invention
The invention relates to data processing, and more particularly, to a method for sparse matrix by vector multiplication for use in any layer having multiple neurons with multiple inputs in an artificial neural network.
Description of the Related Art
An artificial neural network (ANN) is based on a collection of connected neurons. When processing and propagating input signals, the input values (also called “synapse values”) supplied to a neuron's synapses are each modulated by the synapses' respective weight values. The effect of this process is to pass through each synapse a portion of the synapse value that is proportional to the weight value. In this way, the weight value modulates the connection strength of the synapse. The result is then summed with the other similarly processed synapse values.
Matrix by vector multiplication (M×V) is a basic building block in artificial neural networks and deep learning applications. For instance, in a general ANN, a layer having a plurality of neurons with multiple inputs (i.e., each neuron having multiple inputs) performs a computation: b=f(Wa+v), where a is an input vector, b is an output vector, v is a bias, W is a weight matrix and f is a transfer function; thus, the layer having the neurons with multiple inputs is implemented with M×V. In convolutional neural networks, fully connected (FC) layers are implemented with M×V, and a very high percentage of the connections are occupied by FC layers; in recurrent neural networks, M×V operations are performed on the new input and the hidden state at each time step, generating a new hidden state and an output.
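By way of illustration only, the layer computation b=f(Wa+v) may be sketched as follows. The function names dense_layer and relu, the choice of a rectifier as the transfer function f, and the numeric values are assumptions made for this example and are not taken from the description above.

```python
def relu(s):
    # Assumed transfer function f; the text does not fix a particular choice.
    return s if s > 0.0 else 0.0


def dense_layer(W, a, v, f=relu):
    """Compute b = f(W a + v) for one layer, with W given as a list of rows."""
    b = []
    for row, bias in zip(W, v):
        # Each neuron sums its weight-modulated inputs, adds its bias,
        # and then applies the transfer function.
        s = sum(w * x for w, x in zip(row, a)) + bias
        b.append(f(s))
    return b


# Two neurons with three inputs each (illustrative values).
W = [[1.0, 0.0, -2.0],
     [0.0, 3.0, 0.0]]
a = [1.0, 2.0, 3.0]
v = [0.5, -1.0]
b = dense_layer(W, a, v)
print(b)  # [0.0, 5.0]
```

Each row of W holds the weight values of one neuron, so the inner sum is one row of the M×V product; every zero weight in a row still costs one multiplication and one addition in this dense form.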
In general, M×V is a complex procedure that consumes a lot of computational resources. In particular, the weight matrices that occur in a general ANN system are often very large and sparse. For example, for a typical FC layer like FC7 of VGG-16, the input vector is 4K long and the weight matrix is 4K×4K (16 M weight values). A matrix is called sparse when it contains only a small number of non-zero entries/elements. In a general ANN system, much time is spent operating on and transferring the large number of zero entries in the sparse weight matrix, and a huge and redundant storage space is required for those zero entries, which increases storage cost and reduces M×V operation efficiency.
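The cost of the zero entries can be seen by comparing a dense product with one that stores and visits only the non-zero entries. The sketch below uses the well-known compressed sparse row (CSR) layout purely as a reference point; it is not the method of the invention, and the names to_csr and csr_matvec and the numeric values are assumptions made for this example.

```python
def to_csr(W):
    """Keep only the non-zero entries of W (a list of rows) in CSR form."""
    values, col_idx, row_ptr = [], [], [0]
    for row in W:
        for j, w in enumerate(row):
            if w != 0:
                values.append(w)   # the non-zero weight value
                col_idx.append(j)  # the column (input index) it belongs to
        row_ptr.append(len(values))  # where the next row starts in `values`
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, a):
    """Multiply a CSR matrix by the vector a, touching non-zero entries only."""
    out = []
    for i in range(len(row_ptr) - 1):
        s = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            s += values[k] * a[col_idx[k]]
        out.append(s)
    return out


# A 3x4 weight matrix with 3 non-zero entries out of 12 (illustrative values).
W = [[0.0, 2.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.0],
     [1.0, 0.0, 0.0, -3.0]]
a = [1.0, 2.0, 3.0, 4.0]

values, col_idx, row_ptr = to_csr(W)
y = csr_matvec(values, col_idx, row_ptr, a)
print(values, col_idx, row_ptr)  # [2.0, 1.0, -3.0] [1, 0, 3] [0, 1, 1, 3]
print(y)                         # [4.0, 0.0, -11.0]
```

In this example the dense form stores 12 weight values and performs 12 multiplications, whereas the compressed form stores 3 weight values plus their index data and performs 3 multiplications.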
The invention is directed towards providing improved efficiency in M×V operations for facilitating data processing in a general ANN system.