Neural networks capable of action-selection have been well characterized in the prior art, as have those that demonstrate reinforcement-learning. However, the action-selection and reinforcement-learning algorithms of the prior art present complex solutions to the distal reward problem that are not easily amenable to hardware implementation.
Barr, D., P. Dudek, J. Chambers, and K. Gurney, in “Implementation of multi-layer leaky integrator networks on a cellular processor array,” International Joint Conference on Neural Networks (IJCNN), August 2007, pp. 1560-1565, describe a model of the basal ganglia on a neural processor array. The software neural model was capable of performing action-selection. However, Barr et al. did not describe any inherent mechanism for reinforcement-learning, and the micro-channels of the basal ganglia were predefined.
Merolla, P., J. Arthur, F. Akopyan, N. Imam, R. Manohar, and D. Modha, in “A digital neurosynaptic core using embedded crossbar memory with 45 pJ per spike in 45 nm,” IEEE Custom Integrated Circuits Conference (CICC), September 2011, pp. 1-4, describe a neuromorphic processor capable of playing a game of Pong against a human opponent. However, the network was constructed off-line and, once programmed on the hardware, remained static.
What is needed is a neural network that implements both action-selection and reinforcement-learning and that can be more readily implemented in hardware. The embodiments of the present disclosure answer these and other needs.