Conventional processor-readable non-volatile memory, such as FLASH, uses transistors to store binary values (“0” or “1”). A deficiency of conventional non-volatile memory technologies is that a relatively high voltage is required to program elements of the memory (e.g., to transition the state of a memory element from 0 to 1 or from 1 to 0). Specifically, approximately 10V is required to program a conventional FLASH memory element. In mobile computing devices, such as laptop computing devices, tablet computing devices, mobile telephones, etc., use of this relatively high voltage to program memory acts as a drain on the battery used to power these devices. In addition, conventional non-volatile memory elements used in conventional computing devices are difficult to scale such that each memory element can have more than two states. As noted above, most conventional memory devices are binary. Some designs have been proposed that allow a memory element to have multiple states; however, the number of states is relatively limited (e.g., four states instead of two).
Furthermore, conventional non-volatile memory elements are ill-suited for utilization in neuromorphic computing applications. Neuromorphic computers are configured to perform core matrix operations for neural network algorithms in parallel and within one computing cycle. When executing neural network algorithms, neuromorphic computers can theoretically overcome efficiency bottlenecks that are inherent to digital computers by using analog memory to both process and store weights in a neural network. Some analog memory devices have been proposed for use in neuromorphic computers, for instance, Resistive Random-Access Memory (RRAM) and phase change memory (PCM). RRAM and PCM, however, require large voltages and large currents to program, and are additionally associated with non-linear programming, wherein a write operation from a first state to a second state gives one change in analog level, while a write operation from the second state to a third state gives a significantly different change in analog level. Linear programming (where the change in resistance due to a write operation is independent of initial resistance) is needed for programming accuracy to support massively parallel training via an outer product update. Nonlinear RRAM or PCM devices cannot be written “blind” and, instead, require feedback mechanisms in order to achieve accuracy, which sacrifices the underlying parallelism needed for energy-efficient training.
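The distinction between linear and non-linear programming described above can be illustrated with a minimal numerical sketch. The two device models, the function names, and the step size below are hypothetical assumptions chosen for illustration only; they are not taken from any particular RRAM or PCM device. The sketch applies the same number of write pulses "blind" (without reading the element back) from two different starting levels and compares the resulting change in analog level.

```python
# Illustrative sketch only: both device models are assumed, not measured.

def linear_pulse(g, step):
    """Assumed linear element: every write pulse changes the level by `step`,
    independent of the current level g."""
    return g + step


def nonlinear_pulse(g, step, g_max=1.0):
    """Assumed saturating element: the change per pulse shrinks as the
    level g approaches g_max, so it depends on the initial state."""
    return g + step * (1.0 - g / g_max)


def net_change(pulse, g_start, n_pulses, step):
    """Apply n_pulses with no read-back (a 'blind' write) and return the
    total change in analog level."""
    g = g_start
    for _ in range(n_pulses):
        g = pulse(g, step)
    return g - g_start


if __name__ == "__main__":
    step, n_pulses = 0.05, 8
    for g0 in (0.0, 0.5):
        lin = net_change(linear_pulse, g0, n_pulses, step)
        non = net_change(nonlinear_pulse, g0, n_pulses, step)
        print(f"start={g0:.1f}  linear change={lin:.4f}  nonlinear change={non:.4f}")
```

Under these assumptions, the linear element shows the same change from either starting level, so a blind write lands on the intended value. The saturating element shows a different change from each starting level, which is why such an element would need a read-and-correct feedback loop to reach a target level accurately.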
Using any two-terminal device (such as PCM or RRAM) for neuromorphic computing faces a fundamental challenge because read and write functions of such types of memory are coupled through the same path and, therefore, are subject to the time-voltage dilemma. The activation energy for switching must be Ea > 10 kT to preserve retention and prevent read disturb. In such case, a large, thermally activated barrier results in a super-exponential dependence of the programming current on the applied voltage, which leads to high voltages and currents that prevent scaling to arrays larger than approximately 100×100, and further contributes to nonlinear programming that reduces accuracy. Specifically, neither RRAM nor PCM memory can achieve the high impedance required for scaling to arrays larger than 100×100 elements and simultaneously achieve the accuracy necessary for computation. Crossbar arrays larger than 1000×1000 are required to balance out the costs of circuit overheads. Due to such limitations, proposed analog training accelerators composed of two-terminal devices, such as RRAM and PCM, have yet to achieve energy-efficiency gains over complementary metal oxide semiconductors (CMOS).