Memory devices are typically provided as internal, semiconductor, integrated circuits in computing systems. There are many different types of memory including volatile and non-volatile memory. Volatile memory can require power to maintain its data (e.g., host data, error data, etc.) and includes random access memory (RAM), dynamic random access memory (DRAM), static random access memory (SRAM), synchronous dynamic random access memory (SDRAM), and thyristor random access memory (TRAM), among others. Non-volatile memory can provide persistent data by retaining stored data when not powered and can include NAND flash memory, NOR flash memory, and resistance variable memory such as phase change random access memory (PCRAM), resistive random access memory (RRAM), and magnetoresistive random access memory (MRAM), such as spin torque transfer random access memory (STT RAM), among others.
Computing systems often include a number of processing resources (e.g., one or more processors), which may retrieve and execute instructions and store the results of the executed instructions to a suitable location. A processing resource (e.g., CPU) can comprise a number of functional units such as arithmetic logic unit (ALU) circuitry, floating point unit (FPU) circuitry, and/or a combinatorial logic block, for example, which can be used to execute instructions by performing logical operations such as AND, OR, NOT, NAND, NOR, and XOR, and invert (e.g., inversion) logical operations on data (e.g., one or more operands). For example, functional unit circuitry may be used to perform arithmetic operations such as addition, subtraction, multiplication, and/or division on operands via a number of logical operations.
A number of components in a computing system may be involved in providing instructions to the functional unit circuitry for execution. The instructions may be executed, for instance, by a processing resource such as a controller and/or host processor. Data (e.g., the operands on which the instructions will be executed) may be stored in a memory array that is accessible by the functional unit circuitry. The instructions and/or data may be retrieved from the memory array and sequenced and/or buffered before the functional unit circuitry begins to execute instructions on the data. Furthermore, as different types of operations may be executed in one or multiple clock cycles through the functional unit circuitry, intermediate results of the instructions and/or data may also be sequenced and/or buffered.
In many instances, the processing resources (e.g., processor and/or associated functional unit circuitry) may be external to the memory array, and data is accessed via a bus between the processing resources and the memory array to execute a set of instructions. Processing performance may be improved in a processing-in-memory (PIM) device, in which a processor may be implemented internal and/or near to a memory (e.g., directly on a same chip as the memory array). A processing-in-memory (PIM) device may save time by reducing and/or eliminating external communications and may also conserve power.
A typical cache architecture (fully associative, set associative, or direct mapped) uses part of an address generated by a processing resource to locate the placement of a block in the cache and may have some metadata (e.g., valid and dirty bits) describing the state of the cache block. A last level cache architecture may be based on 3D integrated memory, with tags and metadata being stored on-chip in SRAM and the block data in quickly accessed DRAM. In such an architecture, the matching occurs using the on-chip SRAM tags and the memory access is accelerated by the relatively fast on-package DRAM (as compared to an off-package solution).