This relates to integrated circuits and, more particularly, to processing linear functions in pipelined storage blocks in an integrated circuit.
Consider a programmable logic device (PLD) as one example of an integrated circuit. As applications for which PLDs are used increase in complexity, it has become more common to design PLDs to include specialized blocks such as storage blocks in addition to blocks of generic programmable logic.
Storage blocks are often arranged in arrays of memory elements. In a typical array, data lines are used to write data into and read data from the storage blocks. Address lines may be used to select which of the memory elements are being accessed. A storage block in a PLD is typically configurable to implement a memory of a given depth and width, whereby the maximum depth is based on the number of address lines and the maximum width on the number of data lines.
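The relationship between line counts and memory geometry can be illustrated with a minimal sketch. It assumes binary-decoded addressing, so that n address lines select up to 2^n memory elements; that assumption, the `StorageBlock` name, and the example line counts are hypothetical and are not taken from any particular device.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageBlock:
    """Hypothetical model of a configurable storage block."""
    address_lines: int  # number of address lines entering the block
    data_lines: int     # number of data lines entering the block

    @property
    def max_depth(self) -> int:
        # Assumption: addresses are binary decoded, so n lines
        # can select at most 2**n memory elements.
        return 2 ** self.address_lines

    @property
    def max_width(self) -> int:
        # One bit is transferred per data line.
        return self.data_lines

    def supports(self, depth: int, width: int) -> bool:
        """Return True if a depth x width memory fits in this block."""
        return 1 <= depth <= self.max_depth and 1 <= width <= self.max_width


# Illustrative values only.
block = StorageBlock(address_lines=9, data_lines=36)
```

Under these assumptions, `block.supports(512, 36)` holds, while a request for a depth of 1024 is rejected because it would need a tenth address line.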
Many common memory operations are executed inefficiently using these storage blocks. For example, read-modify-write operations, in which data is retrieved from memory, modified, and written back to memory, may require several clock cycles to complete. Caching circuitry that keeps track of recent operations and ensures that only up-to-date data is used in subsequent operations is commonly used to work around the multi-cycle problem, at the expense of increased circuit area. However, some applications demand even higher speed and bandwidth, and current caching circuitry does not scale to meet those demands.
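The multi-cycle problem and the caching workaround can be sketched with a small behavioral model. It assumes that one increment operation is issued per clock cycle and that each modified value is written back a fixed number of cycles after its read; the function name, the latency value, and the list-based forwarding search are hypothetical and are not a description of any actual caching circuit.

```python
from collections import deque


def run_increments(addresses, latency=2, use_forwarding=False, size=4):
    """Issue one read-modify-write (increment) per cycle.

    Each write-back lands `latency` cycles after its read. With
    use_forwarding=True, a read first checks the pending write-backs,
    which stands in for caching circuitry tracking recent operations.
    """
    memory = [0] * size
    in_flight = deque()  # (commit_cycle, address, value), oldest first

    for cycle, addr in enumerate(addresses):
        # Commit every write-back whose latency has elapsed.
        while in_flight and in_flight[0][0] <= cycle:
            _, a, v = in_flight.popleft()
            memory[a] = v

        value = memory[addr]  # may be stale if a write is still pending
        if use_forwarding:
            # Scan oldest to newest so the newest pending value wins.
            for _, a, v in in_flight:
                if a == addr:
                    value = v

        in_flight.append((cycle + latency, addr, value + 1))

    # Drain the remaining write-backs after the last issue cycle.
    for _, a, v in in_flight:
        memory[a] = v
    return memory


ops = [0, 0, 0, 1]  # three back-to-back increments of address 0
stale = run_increments(ops)
forwarded = run_increments(ops, use_forwarding=True)
```

In this model, `stale[0]` ends at 2 because the second increment reads address 0 before the first write-back has landed, whereas `forwarded[0]` ends at the expected 3. The forwarding search compares the incoming address against every pending write-back, so the comparison logic in this sketch grows with the write-back latency, which is one way to picture the area cost mentioned above.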