1. Field of the Invention
The invention relates generally to computer systems, and more specifically to a system and method of caching in which finely granular, programmable regions of a cache can be locked down to prevent eviction, thereby providing, among other things, a fast scratchpad memory work area of general application, while the remaining cache regions provide standard caching functions.
2. Description of Related Art
A cache is a relatively small but fast buffer disposed near the processor for the purpose of reducing the latency associated with processor accesses to relatively slow system memory. The cache "shadows" selected portions of the system memory containing temporally or spatially related information acted upon by the processor. Generally speaking, caches can be broadly categorized into two types, namely, hardware managed arrays and software managed arrays.
The hardware managed type can be generally characterized as an n-way set associative array (where n ranges from one, i.e., direct mapped, up to the number of entries, i.e., fully associative) that replaces entries without any substantial interaction by the operating system or application program software, typically based on the least recently used ("LRU") or most recently used ("MRU") status of the entries. The software managed type typically employs a small random access memory (RAM) that is managed by the operating system or application program software for entry replacement, which requires specific knowledge of the behavior or flow of the data or code stored in the cache. Most general purpose processors employ the first type, while a large number of digital signal processors (DSPs) employ the second type.
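By way of illustration only, the hardware managed replacement described above can be modeled in a few lines of software. The following is a minimal sketch of a single set of an n-way set associative array with LRU replacement; the class name `LRUSet`, the method `access`, and the use of string tags are assumptions made for the example and do not correspond to any particular processor.

```python
from collections import OrderedDict


class LRUSet:
    """One set of an n-way set associative cache with LRU replacement.

    Hypothetical model for illustration; real hardware tracks LRU status
    with a few bits per set rather than an ordered dictionary.
    """

    def __init__(self, ways):
        self.ways = ways
        # Keys are tags, ordered from least recently used to most recently used.
        self.lines = OrderedDict()

    def access(self, tag):
        """Return True on a hit. On a miss, fill the line and return False."""
        if tag in self.lines:
            self.lines.move_to_end(tag)      # mark as most recently used
            return True
        if len(self.lines) >= self.ways:
            self.lines.popitem(last=False)   # evict the least recently used tag
        self.lines[tag] = None
        return False
```

Note that replacement here proceeds without any involvement of the program whose accesses are being cached, which is the defining property of the first type.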
The principal advantage of the first type over the second type is the independence of the cache line replacement policy from the program executed by the processor, which is particularly useful when executing large application programs or operating systems that tend to jump among various blocks of the address space. A drawback of the hardware managed array, however, is that the cache line replacement technique, such as LRU, is not well attuned to certain real-time programming contexts in which, for example, a series of instructions is iteratively executed many times in a time-critical fashion.
The software managed array type of cache is appropriately used by most DSPs, since execution of real-time programming typically requires program code to stay at or near the top of the memory hierarchy and hence resident in the fastest RAM. DSP-style caches do not suffer from the same "swap-out" problems as the first type of cache, but are best suited to constrained environments where the operating system is relatively simple and the software programs to be executed are known. This restriction arises because software has to explicitly manage the contents of the cache. Thus, the operating system and perhaps even the application program running on the DSP have to be customized for the particular system and software configuration.
An amalgam of both types of cache can be found, inter alia, in U.S. Pat. No. 5,493,667 to Huck et al., wherein an LRU replacement method and a temporal lock-down technique are used to prohibit eviction or invalidation of cache entries in a portion of an instruction cache based on execution of a special "block" instruction. The temporal lock-down technique is intended to hold time critical instruction code (e.g., real-time repetitive/recursive routines) in the instruction cache irrespective of usage status such as LRU. More specifically, the so-called "block" instruction is executed shortly after an initial execution of the time critical instruction code, freezing the replacement status (e.g., LRU) and thereby forcing subsequent cache replacements into cache entries other than those occupied by the time critical instruction routine. After the time critical routine is completed, a special so-called "unblock" instruction is executed to release the instruction cache for full general purpose utilization.
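The effect of such a block/unblock sequence can be sketched in software as follows. This is a simplified model and not the mechanism of the cited patent: it assumes that blocking pins the most recently used way of a set and that LRU ordering continues among the remaining ways. The names `LockableSet`, `block`, and `unblock` are hypothetical.

```python
class LockableSet:
    """One cache set in which the most recently used way can be pinned.

    Simplified, hypothetical model of a temporal lock-down: while blocked,
    the pinned way is never chosen as a replacement victim.
    """

    def __init__(self, ways):
        if ways < 2:
            raise ValueError("need at least two ways to lock one of them")
        self.tags = [None] * ways        # tag currently held by each way
        self.order = list(range(ways))   # way indices, LRU first, MRU last
        self.locked_way = None

    def block(self):
        # Assumption: called just after the time critical code has run,
        # so that code occupies the most recently used way.
        self.locked_way = self.order[-1]

    def unblock(self):
        self.locked_way = None

    def access(self, tag):
        """Return True on a hit. On a miss, replace an unlocked way."""
        hit = tag in self.tags
        if hit:
            way = self.tags.index(tag)
        else:
            # Victim is the least recently used way that is not locked.
            way = next(w for w in self.order if w != self.locked_way)
            self.tags[way] = tag
        self.order.remove(way)
        self.order.append(way)           # mark as most recently used
        return hit
```

In this model, a routine loaded before `block()` keeps hitting no matter how many other tags are accessed afterward, and becomes an ordinary replacement candidate again once `unblock()` is called.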
A drawback with this approach is the "coarseness" of the defined locked-down regions. For example, if the instruction cache is organized as two-way set associative, executing the special block instruction would "block" (freeze) fifty percent of the instruction cache, leaving only the other fifty percent available for general instruction caching functions. Similarly, if the cache is organized as four-way set associative, then executing the special block instruction would effectively convert the instruction cache to a three-way set associative instruction cache, leaving only seventy-five percent of the instruction cache available for general instruction cache purposes, regardless of the size of the time critical routine.
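The arithmetic behind this coarseness can be made explicit with a short sketch. The function names and the example cache and routine sizes below are illustrative assumptions only; the second function further assumes that the time critical routine fits within a single way.

```python
from fractions import Fraction


def general_purpose_fraction(ways, locked_ways=1):
    """Fraction of the cache still available when whole ways are locked."""
    if not 0 <= locked_ways <= ways:
        raise ValueError("locked_ways must be between 0 and ways")
    return Fraction(ways - locked_ways, ways)


def locked_bytes_unused(cache_bytes, ways, routine_bytes):
    """Bytes of a locked way not occupied by the routine.

    Assumes one way is locked and the routine fits within that way.
    """
    way_bytes = cache_bytes // ways
    return max(way_bytes - routine_bytes, 0)


# Two-way: half the cache remains; four-way: three quarters remain.
two_way = general_purpose_fraction(2)
four_way = general_purpose_fraction(4)

# Hypothetical sizes: a 1 KiB routine locked into one way of a 16 KiB,
# two-way cache leaves 7 KiB of that way locked but unused.
wasted = locked_bytes_unused(16 * 1024, 2, 1024)
```

Under these assumptions, the locked fraction depends only on the associativity and not on the size of the routine, which is the coarseness referred to above.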
Another drawback with this lock-down approach is that it is specific to locking the most recently used instruction cache entries and does not extend to locking data in the instruction cache based on the address (as opposed to the past temporal use) of that data.
It can be seen, therefore, that there is a need for a cache with finely granular locked-down regions that is suitable for use with general operating system and application programs and that also effectively accommodates repetitive (recursive) specialized programs.