Modern computer systems usually include one or more processors, one or more memory systems, and input/output mechanisms. A processor typically needs to access the memory system for instructions and data to perform operations. When evaluating the performance of a computer system, it is important to factor in the number of cycles needed to access instructions and data from memory, i.e., latency, because this latency can directly affect the total number of cycles it takes to execute an instruction.
To alleviate the adverse impact on performance from the long latency associated with accessing main memory (e.g., random access memory (RAM)), a memory system can include multiple levels of cache memory, e.g., L1, L2, L3, etc. A cache memory is a smaller but faster memory that stores copies of data from frequently accessed main memory locations. The L1 or Level-1 cache is the component of the memory system closest to the processor and can typically return data for a memory access in one cycle. The lower levels of memory (e.g., L2, L3, main memory, and so forth) typically have longer latency than the L1 cache. When the processor can fetch instructions or data from cache memory instead of from main memory for at least some of the memory accesses, the overall average latency to access memory can be significantly decreased and the performance of the computer system can be improved.
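The effect of a cache hierarchy on average latency can be illustrated with a small calculation. The following is a minimal sketch, not part of the disclosure: the function name, the hit rates, and the cycle counts are hypothetical values chosen only for illustration, and the model assumes that each level's latency is paid whenever the access reaches that level.

```python
def average_access_cycles(levels, memory_cycles):
    """Expected cycles per memory access for a simple cache hierarchy.

    levels: list of (hit_rate, latency_cycles) tuples ordered from L1 outward.
    memory_cycles: latency of main memory, paid when every cache level misses.
    All values are illustrative assumptions, not measurements.
    """
    expected = 0.0
    reach_probability = 1.0  # probability that an access gets as far as this level
    for hit_rate, latency in levels:
        expected += reach_probability * latency
        reach_probability *= 1.0 - hit_rate  # only misses continue outward
    expected += reach_probability * memory_cycles
    return expected


# Hypothetical hierarchy: L1 hits 90% of the time in 1 cycle,
# L2 hits 80% of the remaining accesses in 10 cycles, main memory takes 100 cycles.
with_caches = average_access_cycles([(0.9, 1), (0.8, 10)], 100)
without_caches = average_access_cycles([], 100)
print(with_caches, without_caches)
```

Under these assumed numbers the average falls from 100 cycles per access to roughly 4, which is the kind of reduction the paragraph above describes.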
Overview
Digital signal processors often operate on two operands per instruction. For the best performance, it is desirable for the processor to retrieve both operands in one cycle. A conventional cache for a digital signal processor connects to the processor over two buses and, internally, typically uses two or more memory banks to store cache lines. The allocation of cache lines to specific banks is based on the address of the data with which each cache line is associated. In other words, the address of an operand ultimately determines the bank within the cache from which it must be fetched. With this architecture, the cache is able to service two access requests in a single cycle in many, but not all, cases. If the two memory accesses map to the same memory bank, fetching the operands incurs extra latency because the accesses are serialized. The present disclosure describes an improved bank organization for providing conflict-free dual-data cache access: a bus-based data cache system having two data buses and two memory banks. Each memory bank works as a default memory bank for the corresponding data bus. As long as the two values of data being accessed belong to two separate data sets assigned to the two respective data buses, memory bank conflicts can be avoided.
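The difference between address-based and bus-based bank selection can be sketched with a toy cycle count. The model below is an illustrative assumption rather than the disclosed implementation: the line size, the bus names X and Y, the function names, and the rule that a line is placed in the default bank of the bus that first touches it are all hypothetical. Cache misses are ignored, so only bank serialization is counted.

```python
LINE_SIZE = 64                  # bytes per cache line (assumed)
NUM_BANKS = 2
DEFAULT_BANK = {"X": 0, "Y": 1}  # hypothetical bus names and their default banks


def count_cycles_address_banked(pairs):
    """Conventional scheme: the bank is chosen from the line address."""
    cycles = 0
    for addr_x, addr_y in pairs:
        bank_x = (addr_x // LINE_SIZE) % NUM_BANKS
        bank_y = (addr_y // LINE_SIZE) % NUM_BANKS
        cycles += 2 if bank_x == bank_y else 1  # same bank: accesses serialize
    return cycles


def count_cycles_bus_banked(pairs):
    """Bus-based scheme: a line lives in the default bank of the bus that
    first touched it (an assumed allocation rule for this sketch)."""
    line_bank = {}  # line index -> bank holding that line
    cycles = 0
    for addr_x, addr_y in pairs:
        bank_x = line_bank.setdefault(addr_x // LINE_SIZE, DEFAULT_BANK["X"])
        bank_y = line_bank.setdefault(addr_y // LINE_SIZE, DEFAULT_BANK["Y"])
        cycles += 2 if bank_x == bank_y else 1
    return cycles


# Two 256-element arrays of 4-byte values, one read on each bus per iteration.
# The base addresses are chosen so that a[i] and b[i] always fall in the same
# address-selected bank.
pairs = [(0x1000 + 4 * i, 0x2000 + 4 * i) for i in range(256)]
print(count_cycles_address_banked(pairs))  # 512: every pair conflicts
print(count_cycles_bus_banked(pairs))      # 256: each array stays in its bus's bank
```

In this sketch the bus-based count remains at one cycle per pair only while the two buses access separate data sets. Feeding the same array to both buses places its lines in a single bank, and the accesses serialize again.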