The present invention relates generally to integrated circuit memory devices and, more particularly, to a computational memory implemented from a content addressable memory (CAM) such as a ternary CAM (TCAM).
Existing computer designs typically provide a direct connection between a processor and its associated memory components. In conventional designs, the values exchanged between the processor and the memory components consist of load/store addresses and load/store data objects going in and out of the processor.
In order to improve the computational power of processors such as microprocessors, a processing element or arithmetic logic unit (ALU) may be positioned as close as possible to the source of the data (e.g., a memory array) to promote a high data bandwidth between the two structures. Thus, modern microprocessors commonly feature large-capacity memories next to the ALU in the form of, for example, L1, L2 and other caches. Although this added memory improves performance, it also increases the die area and thus the cost of each microprocessor chip.
Other attempts at increasing the computational speed of a processing element involve placing a one-bit SIMD (Single-Instruction Stream Multiple-Data Stream) processor within the memory circuitry, adjacent to the sense amplifiers in both SRAM (Static Random Access Memory) and DRAM (Dynamic Random Access Memory) arrays. However, for small memories, the overhead of this bit-wise ALU approach is high. In addition, the operands need to be read out one at a time, and only then can the result be computed in the ALU attached to the sense amplifier.