1. Background Field
The present invention relates to memory systems and, in particular, to caches and memory hierarchies.
2. Relevant Background
Processors, such as microprocessors, digital signal processors, and microcontrollers, are generally divided into many systems and sub-systems, such as a memory system, a processing unit, and load store units. The load store unit transfers data between the processing units and the memory system. Specifically, the load store unit reads (i.e. loads) data from the memory system and writes (i.e. stores) data to the memory system. To improve performance, memory systems generally have a memory hierarchy using one or more levels of caching.
FIG. 1 shows a simplified block diagram of a load store unit 110 coupled to a memory system 140. Load store unit 110 includes an instruction decoder 111, a load scheduler 113, a load pipeline 115, a store scheduler 117, and a store pipeline 119. Memory system 140 includes a level one cache 142, a level two cache 143, and a level three memory sub-system 144. In various embodiments of memory system 140, level three memory sub-system 144 may include additional cache levels in addition to the main memory. In some processors, instruction decoder 111 may be part of another subsystem. Instruction decoder 111 decodes the program instructions and sends load transactions to load scheduler 113 and store transactions to store scheduler 117. Other types of instructions are sent to appropriate execution units, such as a floating point execution unit or an integer execution unit. In most systems with multiple processing units, each processing unit includes a separate load store unit. Store scheduler 117 schedules the store transactions and issues store transactions to store pipeline 119. Store pipeline 119 executes the store transactions, which typically store data into memory system 140. Load scheduler 113 schedules the load transactions and issues load transactions to load pipeline 115 for execution. Load pipeline 115 executes the load transactions and reads the requested data from memory system 140.
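The routing of transactions described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the circuitry of FIG. 1: the names `Transaction`, `LoadStoreUnitModel`, `decode`, and `issue` are hypothetical, the schedulers are modeled as simple first-in, first-out queues, and ordering hazards between loads and stores are not modeled.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Transaction:
    kind: str                   # "load", "store", or any other instruction type
    address: int = 0
    data: Optional[int] = None  # only meaningful for stores


class LoadStoreUnitModel:
    """Toy model of a decoder feeding a load and a store scheduler (hypothetical)."""

    def __init__(self):
        self.load_scheduler = deque()   # stands in for load scheduler 113
        self.store_scheduler = deque()  # stands in for store scheduler 117
        self.other_units = []           # e.g. integer or floating point units

    def decode(self, txn):
        # Route each decoded transaction to the matching scheduler.
        if txn.kind == "load":
            self.load_scheduler.append(txn)
        elif txn.kind == "store":
            self.store_scheduler.append(txn)
        else:
            self.other_units.append(txn)

    def issue(self, memory):
        """Issue at most one store and one load; return the loaded values."""
        loaded = []
        if self.store_scheduler:
            store = self.store_scheduler.popleft()
            memory[store.address] = store.data       # store pipeline writes
        if self.load_scheduler:
            load = self.load_scheduler.popleft()
            loaded.append(memory.get(load.address))  # load pipeline reads
        return loaded
```

In this sketch the memory system is reduced to a plain dictionary mapping addresses to values; a store followed by a load of the same address returns the stored value.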
Generally, load store unit 110 communicates directly with level one cache 142, and memory system 140 controls the data flow between level one cache 142, level two cache 143, and level three memory sub-system 144. Level one cache 142 and level two cache 143 are used to improve overall memory throughput of memory system 140. For example, level three memory sub-system 144 would generally include a large memory unit that is typically made with high density memory devices that have slow access times. Level one cache 142 and level two cache 143 are made with faster memory devices that require larger area or are of greater cost than the high density memory devices used in level three memory sub-system 144.
When load store unit 110 requests data at a location that is stored or “cached” in level one cache 142, i.e. a level one cache hit, or in level two cache 143, i.e. a level two cache hit, the data can be supplied to load store unit 110 very rapidly because access to high density memory devices is not required. Data in level one cache 142 would be available even faster than data in level two cache 143. In most embodiments of memory system 140, when load store unit 110 writes data to a memory location in memory system 140, the data can be written directly to level one cache 142 whether or not the memory location is currently cached in level one cache 142. Specifically, if the memory location is cached, then the data is simply stored in the appropriate cache location. If the memory location is not cached, space in level one cache 142 will be allocated for the memory location. Once data is written into level one cache 142, memory system 140 will eventually transfer the data to level two cache 143 and level three memory sub-system 144.
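The hit, miss, and allocate-on-write behavior described above can be sketched in software as follows. This is a minimal illustration under stated assumptions: the class `LevelOneCacheModel`, its methods, and the 32-byte line size are hypothetical, a newly allocated line is simply zero-filled, and eviction is not modeled.

```python
class LevelOneCacheModel:
    """Toy write-allocate cache keyed by line address (an assumed organization)."""

    def __init__(self, line_size=32):
        self.line_size = line_size  # assumed line width in bytes
        self.lines = {}             # line address -> bytearray of line_size bytes
        self.dirty = set()          # lines not yet transferred to lower levels

    def _split(self, address):
        # Split an address into its line address and the byte offset in the line.
        offset = address % self.line_size
        return address - offset, offset

    def read(self, address):
        """Return (hit, value); value is None on a miss."""
        line_addr, offset = self._split(address)
        if line_addr in self.lines:
            return True, self.lines[line_addr][offset]
        return False, None

    def write(self, address, value):
        """Store one byte; return True on a hit, False if a line was allocated."""
        line_addr, offset = self._split(address)
        hit = line_addr in self.lines
        if not hit:
            # Not cached: allocate space for the memory location (zero-filled here).
            self.lines[line_addr] = bytearray(self.line_size)
        self.lines[line_addr][offset] = value
        self.dirty.add(line_addr)
        return hit

    def drain(self):
        """Return dirty lines for transfer to the lower levels, then mark them clean."""
        pending = {addr: bytes(self.lines[addr]) for addr in sorted(self.dirty)}
        self.dirty.clear()
        return pending
```

In this model, `drain` stands in for the eventual transfer of written data to level two cache 143 and level three memory sub-system 144; when and how that transfer happens in a real memory system is not specified here.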
Generally, level one cache 142 has a first cache width (i.e. the size of a cache line) and level two cache 143 has a second cache width that is larger than the first cache width of level one cache 142. The transfer of data from level one cache 142 to level two cache 143 and level three memory sub-system 144 greatly burdens the throughput of memory system 140. Hence, there is a need for a method and system to improve the transfer of data between memory levels in a multi-level memory system.
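The relationship between the two cache widths can be made concrete with a short calculation. The sketch below assumes, purely for illustration, a 32-byte level one cache line and a 64-byte level two cache line with the wider width an exact multiple of the narrower one; these values and the function names are hypothetical and are not taken from the description above.

```python
def lines_per_wider_line(l1_width, l2_width):
    """Number of level one lines that fit in one level two line."""
    if l2_width <= l1_width or l2_width % l1_width != 0:
        raise ValueError("expected l2_width to be a larger multiple of l1_width")
    return l2_width // l1_width


def l2_placement(l1_line_address, l1_width, l2_width):
    """Return (l2_line_address, byte_offset) where a level one line lands."""
    if l1_line_address % l1_width != 0:
        raise ValueError("l1_line_address must be aligned to l1_width")
    offset = l1_line_address % l2_width
    return l1_line_address - offset, offset


def group_by_l2_line(l1_line_addresses, l1_width, l2_width):
    """Group level one line addresses by the level two line that contains them."""
    groups = {}
    for addr in sorted(l1_line_addresses):
        l2_addr, offset = l2_placement(addr, l1_width, l2_width)
        groups.setdefault(l2_addr, []).append(offset)
    return groups
```

Under these assumed widths, `lines_per_wider_line(32, 64)` is 2, so each level one line covers only half of a level two line, and a transfer of a single level one line updates only part of the wider line.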