1. Field of the Invention
The present invention generally relates to cache memory and verification and, more particularly, to a cache memory architecture for digital signal processors.
2. Background Description
State-of-the-art modems, such as those used in network servers or in home or laptop computers, require high-performance signal processing, typically provided by a digital signal processor (DSP). Consequently, these high-performance DSPs require high-performance static random access memory (SRAM). These SRAMs must have a very short, single-cycle access capability so that a given piece of code may be executed within a specific time period, i.e., meet a deadline. It is desirable that the duration of execution be predictable and consistent from one execution to the next, i.e., time invariant.
Typically, the worst case execution time of a particular piece of code must also have a known limit, so that the worst case time may be determined in advance. This property is referred to as determinism. Although achieving determinism (i.e., this time invariant behavior) is possible using single cycle SRAM, doing so is very expensive, especially because SRAM is expensive compared to dynamic random access memory (DRAM).
Attempts to reduce modem memory cost for these high performance DSPs by using a hierarchical memory, e.g., by caching slower, cheaper memories such as DRAM, have resulted in processor stalls. Consequently, timing and frequent processor stalls are fundamental problems that must be addressed when trying to achieve time invariant real time system behavior using cheaper memory.
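The effect of stalls on execution time can be illustrated with a minimal sketch. It assumes a simplified cost model that is not taken from the source: every memory access takes one cycle, and every cache miss adds a fixed stall penalty. The function names, the access count, and the penalty value are hypothetical and chosen only for illustration.

```python
def execution_cycles(accesses: int, misses: int, miss_penalty: int) -> int:
    """Total cycles under the assumed model: one cycle per access,
    plus a fixed stall of miss_penalty cycles for each miss."""
    if not 0 <= misses <= accesses:
        raise ValueError("misses must be between 0 and accesses")
    return accesses + misses * miss_penalty


def cycle_bounds(accesses: int, miss_penalty: int) -> tuple[int, int]:
    """Best case (no misses) and worst case (every access misses)."""
    best = execution_cycles(accesses, 0, miss_penalty)
    worst = execution_cycles(accesses, accesses, miss_penalty)
    return best, worst


# Hypothetical figures, not from the source.
ACCESSES = 1000
MISS_PENALTY = 10

best, worst = cycle_bounds(ACCESSES, MISS_PENALTY)
print(f"best case: {best} cycles, worst case: {worst} cycles")
```

Under these assumed figures the gap between the best and worst case is an order of magnitude, which is why a deadline that is met in one execution may be missed in another when the number of misses is not repeatable.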
Further, if these processor stalls were predictable and occurred repeatably from one task load and execution to the next, then system programmers could plan for them. Code placement could then be optimized in the system's effective address space in order to achieve repeatable behavior from one main memory task load to the next.
Unfortunately, typical state of the art modems have functions, each of which may include several tasks. These tasks are segmented into data and instruction task segments. With these typical state of the art modems, task execution and, therefore, system performance is dependent upon the alignment of the task segments of a multi-segment task loaded into a conventional cache memory. Optimum performance is achieved when the segments of one task load and execution are aligned such that they are contained in and distributed throughout the cache, completely fitting in the cache.
Thus, for example, one task load and execution may load optimally into the cache in a first instance. However, in a second load, all segments might instead align in the bottom sets of the cache, resulting in what is known in the art as thrashing. As a consequence of thrashing, additional cycles are required to execute the same task or function.
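The dependence on segment alignment described above can be sketched with a small simulation. It is an illustrative model only, under assumptions not stated in the source: a 2-way set-associative cache with 8 sets and least-recently-used replacement, addresses expressed in units of cache lines, and a task consisting of three 4-line segments that are traversed in order on every pass. The function name and all parameter values are hypothetical.

```python
from collections import OrderedDict


def count_misses(segment_bases, segment_lines, num_sets, ways, passes):
    """Count cache misses for a task that walks each segment in order.

    segment_bases: starting line number of each segment (assumed placement)
    segment_lines: number of cache lines in each segment
    num_sets, ways: geometry of the assumed set-associative cache
    passes: number of times the whole task is executed
    """
    # One LRU-ordered dictionary per set; the oldest entry is first.
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for _ in range(passes):
        for base in segment_bases:
            for line in range(base, base + segment_lines):
                cache_set = sets[line % num_sets]
                if line in cache_set:
                    cache_set.move_to_end(line)  # hit: mark most recently used
                else:
                    misses += 1
                    if len(cache_set) >= ways:
                        cache_set.popitem(last=False)  # evict least recently used
                    cache_set[line] = True
    return misses


# First load: segments spread across the sets, so all 12 lines fit.
spread = count_misses([0, 4, 8], segment_lines=4, num_sets=8, ways=2, passes=10)

# Second load: all three segments map to sets 0-3, exceeding the 2 ways.
bunched = count_misses([0, 8, 16], segment_lines=4, num_sets=8, ways=2, passes=10)

print(f"spread placement:  {spread} misses")
print(f"bunched placement: {bunched} misses")
```

In this model the same 12 lines of code and data fit in a 16-line cache in both cases. Only the placement differs, yet the bunched placement misses on every access because three lines compete for two ways in each of the bottom four sets.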
In a real time system, these extra cycles may cause missed deadlines. Consequently, the missed deadlines make predicting individual task performance very complex and difficult, which is an intolerable situation.
Thus, there is a need for a low cost deterministic memory hierarchy and architecture.