This invention is concerned with computing devices, and more particularly with devices for performing neural network processing.
Neural networks are well known. In most applications, neural network processing is performed by a suitably programmed general purpose processor. Thus, most implementations of neural networks are software-based. However, software-based neural network processing fails to take advantage of the highly parallel features of typical neural network algorithms. Furthermore, a typical implementation completely processes a single input event before receiving and processing the next event. Consequently, throughput may be low.
According to a first aspect of the invention, a pipelined hardware implementation of a neural network circuit is provided. The inventive neural network circuit includes an input stage for receiving and storing input values, a first processing stage coupled to the input stage, at least one additional processing stage coupled to an upstream processing stage, and an output stage. (The upstream processing stage to which the additional processing stage is coupled may be the first processing stage.) The first processing stage includes a plurality of first processing units. Each first processing unit includes a weight store for storing a plurality of weighted values, a plurality of multipliers each for multiplying an input value by a respective weighted value, an adder for adding a product outputted from one of the multipliers with at least one product outputted from a respective multiplier of another one of the plurality of first processing units, a function circuit for receiving a sum outputted by the adder and for generating therefrom a processing unit value, and a register for storing the processing unit value generated by the function circuit. The additional processing stage includes one or more additional stage processing units. 
Each additional stage processing unit includes a weight store for storing a plurality of weighted values, a plurality of multipliers each for multiplying a processing unit value received from a processing unit of the upstream processing stage by a respective weighted value, an adder for adding a product outputted from one of the multipliers of the respective additional stage processing unit with at least one product outputted from a respective multiplier of another one of a plurality of additional stage processing units, a function circuit for receiving a sum outputted by the adder of the respective additional stage processing unit and generating therefrom a processing unit value, and a register for storing the processing unit value generated by the function circuit of the respective additional stage processing unit. The output stage is formed from output ports of the registers of the additional processing stage.
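The processing-unit structure described above can be illustrated with a short behavioral sketch in Python. It is a software model under stated assumptions, not the circuit itself. The names `ProcessingStage`, `clock`, and `sigmoid` are hypothetical. The sigmoid is only an assumed stand-in for the function circuit, since the text does not name the function. The wiring of multipliers in one unit to adders in another is abstracted into a weighted sum per processing unit.

```python
import math
from typing import Callable, List, Sequence


def sigmoid(s: float) -> float:
    """Assumed function circuit; the source text does not name the function."""
    return 1.0 / (1.0 + math.exp(-s))


class ProcessingStage:
    """Behavioral model of one processing stage (hypothetical names)."""

    def __init__(self, weights: Sequence[Sequence[float]],
                 function: Callable[[float], float] = sigmoid) -> None:
        # Weight store: weights[j][i] is the weighted value that processing
        # unit j applies to incoming value i.
        self.weights = [list(row) for row in weights]
        self.function = function
        # One register per processing unit, holding its processing unit value.
        self.registers: List[float] = [0.0] * len(self.weights)

    def clock(self, inputs: Sequence[float]) -> List[float]:
        """Compute every unit's value from `inputs` and latch the registers."""
        new_values = []
        for row in self.weights:
            if len(row) != len(inputs):
                raise ValueError("each unit needs one weighted value per input")
            products = [x * w for x, w in zip(inputs, row)]  # multipliers
            total = sum(products)                            # adder
            new_values.append(self.function(total))          # function circuit
        self.registers = new_values                          # registers
        return list(self.registers)
```

In this model the output stage corresponds simply to reading `registers` of the last stage, mirroring the statement that the output stage is formed from the output ports of the registers.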
At least one intervening processing stage may be coupled between the first processing stage and the additional processing stage.
The additional processing stage performs calculations with respect to a first set of input values at the same time that the first processing stage performs calculations with respect to a second set of input values.
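The overlap between stages can be made concrete with a small cycle-by-cycle simulation. The following Python sketch is illustrative only. The helper names `make_stage` and `run_pipeline` are hypothetical, and the single-clock register update is an assumed timing model that the text does not specify. Each stage reads the value its upstream register held at the end of the previous cycle, so two different input sets are in flight at once.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Vector = List[float]
Tagged = Tuple[int, Vector]  # (input set index, values)


def make_stage(weights: Sequence[Sequence[float]],
               function: Callable[[float], float]) -> Callable[[Vector], Vector]:
    """Return a stage that computes function(weighted sum) for each unit."""
    def compute(values: Vector) -> Vector:
        return [function(sum(x * w for x, w in zip(values, row)))
                for row in weights]
    return compute


def run_pipeline(stages: Sequence[Callable[[Vector], Vector]],
                 input_sets: Sequence[Vector]):
    """Clock the input sets through the stages, one clock per loop pass.

    Returns (outputs, activity). activity[c][k] is the index of the input
    set that stage k + 1 worked on during cycle c, or None if it was idle.
    """
    n = len(stages)
    # regs[0] models the input stage; regs[k] is the register of stage k.
    regs: List[Optional[Tagged]] = [None] * (n + 1)
    feed = list(enumerate(input_sets))
    outputs: List[Tagged] = []
    activity: List[List[Optional[int]]] = []
    while feed or any(r is not None for r in regs[:-1]):
        busy: List[Optional[int]] = []
        # Walk from the last stage to the first so that every stage sees the
        # value its upstream register held before this clock edge.
        for k in range(n, 0, -1):
            upstream = regs[k - 1]
            if upstream is None:
                regs[k] = None
                busy.append(None)
            else:
                index, values = upstream
                regs[k] = (index, stages[k - 1](values))
                busy.append(index)
        regs[0] = feed.pop(0) if feed else None
        busy.reverse()
        activity.append(busy)
        if regs[n] is not None:
            outputs.append(regs[n])
    return outputs, activity
```

With two stages and three input sets, the third entry of `activity` in this model is `[1, 0]`: the first stage is working on input set 1 while the additional stage is working on input set 0. After the pipeline fills, one result emerges per cycle.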
The neural network circuit also includes circuitry for loading the weighted values into the weight stores.
In accordance with a second aspect of the invention, a pipelined hardware implementation of a recall-only neural network circuit is provided. The inventive neural network circuit includes an input stage adapted to receive and store at least one input value, and a first processing stage coupled to the input stage. The first processing stage includes at least one processing unit having (1) a weight store adapted to store at least one weighted value; (2) at least one multiplier adapted to multiply an input value by a respective weighted value; (3) a function circuit coupled downstream from one or more of the at least one multiplier and adapted to receive a function input and to generate therefrom a processing unit value; and (4) a register adapted to store the processing unit value generated by the function circuit.
The neural network circuit also includes an additional processing stage coupled to an upstream processing stage. The additional processing stage includes at least one additional stage processing unit having (1) a weight store adapted to store at least one weighted value; (2) at least one multiplier adapted to multiply a processing unit value received from a processing unit of the upstream processing stage by a weighted value; (3) a function circuit coupled downstream from one or more of the at least one multiplier of the respective additional stage processing unit and adapted to receive a function input and to generate therefrom a processing unit value; and (4) a register adapted to store the processing unit value generated by the function circuit of the respective additional stage processing unit. The neural network circuit also includes an output stage including an output port of the register of the additional processing stage.
The neural network circuits of the present invention provide rapid and efficient processing of input data sets with high throughput.
Other objects, features and advantages of the present invention will become more fully apparent from the following detailed description of exemplary embodiments, the appended claims and the accompanying drawings.