Memory devices are typically provided as internal, semiconductor, integrated circuits in computers or other electronic systems. There are many different types of memory including volatile and non-volatile memory. Volatile memory can require power to maintain its data (e.g., host data, error data, etc.) and includes random access memory (RAM), dynamic random access memory (DRAM), static random access memory (SRAM), synchronous dynamic random access memory (SDRAM), and thyristor random access memory (TRAM), among others. Non-volatile memory can provide persistent data by retaining stored data when not powered and can include NAND flash memory, NOR flash memory, and resistance variable memory such as phase change random access memory (PCRAM), resistive random access memory (RRAM), and magnetoresistive random access memory (MRAM), such as spin torque transfer random access memory (STT RAM), among others.
Electronic systems often include a number of processing resources (e.g., one or more processors), which may retrieve and execute instructions and store the results of the executed instructions to a suitable location. A processor can comprise a number of functional units such as arithmetic logic unit (ALU) circuitry, floating point unit (FPU) circuitry, and/or a combinatorial logic block, for example, which can be used to execute instructions by performing logical operations such as AND, OR, NOT, NAND, NOR, and XOR, and invert (e.g., inversion) logical operations on data (e.g., one or more operands). For example, functional unit circuitry (FUC) may be used to perform arithmetic operations such as addition, subtraction, multiplication, and/or division on operands via a number of logical operations.
A number of components in an electronic system may be involved in providing instructions to the FUC for execution. The instructions may be generated, for instance, by a processing resource such as a controller and/or host processor. Data (e.g., the operands on which the instructions will be executed) may be stored in a memory array that is accessible by the FUC. The instructions and/or data may be retrieved from the memory array and sequenced and/or buffered before the FUC begins to execute instructions on the data. Furthermore, as different types of operations may be executed in one or multiple clock cycles through the FUC, intermediate results of the operations and/or data may also be sequenced and/or buffered.
In many instances, the processing resources (e.g., processor and/or associated FUC) may be external to the memory array, and data can be accessed via a bus between the processing resources and the memory array to execute instructions. Processing performance may be improved in a processor-in-memory (PIM) device, in which a processor may be implemented internal and/or near to a memory (e.g., directly on a same chip as the memory array), which may conserve time and power in processing. However, such PIM devices may still have various drawbacks. For example, such PIM devices may have a limited topology, which can make it difficult to shift (e.g., move) data in the memory. For instance, such PIM devices may only be able to shift bits of data one place at a time. As such, performing logical operations that involve a large amount of data shifting, such as, for instance, reduction and prefix sum operations, using such PIM devices can be a slow process.