The present disclosure relates to scalable architectures for implementing maximization algorithms, and more specifically, to scalable architectures for implementing maximization algorithms with resistive devices.
Information maximization algorithms are algorithms for optimizing artificial neural networks and other information processing systems. These types of algorithms may be implemented as a function that maps a set of input values I to a set of output values O, where the mapping may be chosen or learned so as to maximize the average Shannon mutual information between I and O, subject to a set of specified constraints and/or noise processes. Some information maximization algorithms may be self-learning algorithms configured to optimize this process. Self-learning information maximization algorithms may self-improve without any teacher signals. The learning process may be described as setting matrix weight updates such that the output units become, statistically speaking, as independent as possible.
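By way of illustration only, one self-learning weight update of this kind may be sketched as follows. The sketch assumes a logistic output nonlinearity and the widely used natural-gradient form of the information maximization update; the present disclosure does not specify a particular update rule, and the names `infomax_step`, `matvec`, and `lr` are hypothetical.

```python
import math


def matvec(W, x):
    """Multiply the square weight matrix W (list of rows) by the input vector x."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]


def infomax_step(W, x, lr=0.01):
    """Return W after one unsupervised update on a single input sample x.

    Assumed rule (not taken from the disclosure):
        W <- W + lr * (I + (1 - 2*g(u)) u^T) W,  with u = W x and g logistic.
    No teacher signal is used; only the input sample drives the update.
    """
    n = len(W)
    u = matvec(W, x)
    # 1 - 2*g(u) equals -tanh(u/2) for logistic g; tanh avoids overflow in exp.
    y = [-math.tanh(u_i / 2.0) for u_i in u]
    # G = I + y u^T
    G = [[(1.0 if i == j else 0.0) + y[i] * u[j] for j in range(n)]
         for i in range(n)]
    # dW = G W, itself an N x N matrix product
    dW = [[sum(G[i][k] * W[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    return [[W[i][j] + lr * dW[i][j] for j in range(n)] for i in range(n)]
```

In such a sketch the update would be applied repeatedly over many input samples, and each step involves matrix operations on the full N-by-N weight matrix.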
Some exemplary applications of information maximization algorithms may be demonstrated as “blind source separation” problems, where a plurality of inputs are grouped together as a single data source, and the inputs are then separated and analyzed individually as a function of the system architecture. For example, a group of microphones randomly placed in a crowded room may pick up 10 voices with varying intensities from 10 speakers, who are all in the room and each saying something different. In a blind source separation scenario, the information source to be maximized is the audio feed having the various voices from all of the microphones. The information maximization algorithm in this scenario may take the audio feed as a mixed input, determine who is speaking in the audio feed, and determine what each speaker is actually saying. The output of this exemplary algorithm may be 10 separate signals, each identifying the speaker and having the speaker's voice isolated as an independent source.
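The microphone example may be written compactly under a common simplifying assumption, namely that each microphone records an instantaneous linear mixture of the speakers. This mixing model is an assumption made for illustration and is not stated in the present disclosure; the symbols s, x, u, A, and W are likewise illustrative.

```latex
% s(t): the N = 10 unknown speaker signals; x(t): the microphone recordings
% A: unknown mixing matrix; W: learned unmixing matrix; u(t): recovered signals
\mathbf{x}(t) = A\,\mathbf{s}(t), \qquad
\mathbf{u}(t) = W\,\mathbf{x}(t), \qquad
W \approx A^{-1} \ \text{(up to permutation and scaling of rows)}
```

Under this assumption, separating the voices amounts to learning the N-by-N matrix W from the recordings alone.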
Current methods for computation of maximization algorithms often utilize a von Neumann architecture. Accordingly, the time spent to complete a maximization task that includes matrix operations scales with N², where N is the number of original and independent sources in the problem. In a von Neumann architecture, the computation time grows quadratically with N because the matrix operations (such as vector-matrix multiplication) are computed serially by the processor. As a result, using conventional computing architecture, matrix operations on systems having a larger number of independent sources (e.g., N=1000 or more) may become computationally expensive for real-time (analog) computing applications.
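The quadratic growth may be illustrated by counting the multiply-accumulate operations in a serially computed vector-matrix multiplication. The following is a minimal sketch for illustration only; the function name `serial_vecmat` and the operation counter are hypothetical and are not part of the disclosed architecture.

```python
def serial_vecmat(x, W):
    """Compute x @ W one multiply-accumulate at a time, as a serial processor would.

    x is a list of N values and W is an N x N matrix given as a list of rows.
    Returns the output vector and the number of multiply-accumulate operations.
    """
    n_rows, n_cols = len(W), len(W[0])
    out = [0.0] * n_cols
    macs = 0
    for j in range(n_cols):        # one output element at a time
        for i in range(n_rows):    # one product at a time
            out[j] += x[i] * W[i][j]
            macs += 1
    return out, macs


# For N sources the count is N * N, so N = 1000 requires 1,000,000
# multiply-accumulates for every single vector-matrix product.
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
result, count = serial_vecmat([1.0, 2.0, 3.0], identity)
```

Doubling N in this sketch quadruples the operation count, which is the scaling behavior described above.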