Cache memory is conventionally used to speed up access to data from a main memory. Each cache memory location can be used as a placeholder for a selectable one of a number of main memory locations. When a main memory address is issued to access data, the cache memory determines the cache memory location that acts as placeholder for the main memory location that is addressed by the main memory address. This cache memory location is subsequently used instead of the main memory location.
The determination of the cache memory location from the main memory address will be called address mapping. The type of address mapping used in the cache memory is an important design parameter of the cache memory. Different types of address mapping are used in fully associative caches, set-associative caches and direct-mapped caches. The fully associative cache contains an address mapping unit that can map each main memory address (or memory block address) to any cache location (or block of locations). This requires an address-mapping unit that, given a main memory address, determines which, if any, of the cache locations is currently associated with that main memory address. For this purpose the mapping unit has to perform a costly associative memory function, which potentially slows down access speed.
In a direct-mapped cache, on the other hand, there is no choice as to which cache memory location corresponds to a main memory address: a given main memory address can map only to a single cache memory location. Thus an address-mapping unit in the direct-mapped cache can easily determine the required cache memory location. However, when the cache memory location is already in use for another main memory address, the data for that other main memory address will have to be removed from the cache memory, even if that data is still useful. This is a disadvantage compared with the fully associative cache, which can store data for any combination of main memory addresses.
A set-associative cache is a compromise between the direct mapped cache and the fully associative cache. The set-associative cache uses sets of cache memory locations (a set being smaller than the whole cache memory). Each memory address corresponds to a set. The main memory address can be associated with any cache location in its corresponding set of cache memory locations. Thus, on one hand it is easier to determine a cache memory location from a main memory address, because the associative memory function has to be performed only within the set corresponding to that main memory address. On the other hand, data has to be removed from the cache less often than in a direct mapped cache, because there is a choice of memory locations within the set, so that data for combinations of main memory addresses can be stored in the set.
The fully associative cache, a set-associative cache and the direct mapped cache differ in the number of alternative cache memory locations that can be used for the same memory address. This number is a fixed design parameter and it is successively smaller for fully associative caches, set-associative caches and direct-mapped caches. The larger this number, the less frequently it will be necessary to remove useful data from the cache to make room for new data. However, making this number larger also increases the complexity of the cache and it may decrease its access speed. Thus, the choice of this number represents an important tradeoff in cache design.
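The three conventional mappings can be illustrated with a minimal sketch in which a single parameter, the number of alternative locations per address, selects the cache type. The function name `candidate_locations`, the modulo indexing and the contiguous numbering of locations within a set are assumptions made for illustration only; they are not taken from the description above.

```python
def candidate_locations(block_addr, num_locations, ways):
    """Return the cache locations that may hold `block_addr`.

    Hypothetical model: ways == 1 gives a direct-mapped cache,
    ways == num_locations gives a fully associative cache, and
    anything in between gives a set-associative cache.
    """
    if ways < 1 or num_locations % ways != 0:
        raise ValueError("ways must divide num_locations")
    num_sets = num_locations // ways
    # Assumed indexing: the set is chosen by the low part of the address.
    set_index = block_addr % num_sets
    # Locations of a set are numbered contiguously in this sketch.
    return [set_index * ways + w for w in range(ways)]


direct = candidate_locations(13, num_locations=8, ways=1)
two_way = candidate_locations(13, num_locations=8, ways=2)
fully = candidate_locations(13, num_locations=8, ways=8)
```

The longer the returned list, the more freedom there is to keep useful data, and the more locations must be searched associatively on each access.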
Amongst others, it is an object of the invention to improve the efficiency of address mapping in a cache memory.
Amongst others, it is another object of the invention to allow for better optimization in the trade-off involved in choosing the number of alternative cache memory locations that can be used for the same memory address.
Amongst others, it is another object of the invention to allow for real-time use of a cache memory without blocking other use. In particular, it is a further object of the invention that general purpose caching schemes can be combined with stream caching mechanisms in the same cache memory, with a guarantee of real-time behavior of stream caching.
The data processor according to the invention is described in claim 1. As in set-associative mapping, this data processor limits the number of associations between main memory addresses and cache memory locations that need to be consulted to determine a cache memory location that is to be accessed. The search for an association is limited to a group of associations, as it is limited to a set in set-associative mapping. Within each group, associative mapping is possible to any cache memory location assigned to the group. But in contrast to set-associative mapping, the cache memory locations are dynamically assigned to the groups. Thus, the number of assigned memory locations can be made dependent on the needs of the program or program part that is executed. This number may vary at least between zero and a non-zero number of cache memory locations.
By varying the size of the groups, the mapping can be adapted to the needs of the program without an overhead in cache memory locations. The program executed by the processor selects dynamically (that is, during execution of a program) which and how many memory addresses are involved in associative relations of each group. Thus, a high cache performance can be guaranteed for those memory addresses. For example, if the number of main memory addresses that can be associated simultaneously in one group is equal to the number of addresses needed for a certain real-time task, real-time response of the cache can be guaranteed. During one program (part) main memory addresses can be mapped to certain cache memory locations in one group and in another program main memory addresses can be mapped to those cache memory locations in another group.
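The dynamic assignment of cache memory locations to groups might be modelled as follows. This is only a sketch under stated assumptions: the class `GroupedCache`, its method names and the use of a free list are hypothetical and are not the claimed implementation. It shows that a lookup searches only the associations of one group, and that the number of locations in a group can be changed at run time, including down to zero.

```python
class GroupedCache:
    """Hypothetical model of groups with dynamically assigned locations."""

    def __init__(self, num_locations):
        self.free = list(range(num_locations))  # locations not in any group
        self.groups = {}  # group id -> {cache location: main address or None}

    def assign(self, group, count):
        """Move `count` free locations into `group` during execution."""
        if count > len(self.free):
            raise ValueError("not enough free cache locations")
        locs = self.groups.setdefault(group, {})
        for _ in range(count):
            locs[self.free.pop()] = None

    def release(self, group):
        """Shrink a group to zero locations; they become free again."""
        self.free.extend(self.groups.pop(group, {}))

    def lookup(self, group, addr):
        """Search only the associations of one group."""
        for loc, mapped in self.groups.get(group, {}).items():
            if mapped == addr:
                return loc
        return None

    def fill(self, group, addr):
        """Associate `addr` with an unused location of the group, if any."""
        locs = self.groups[group]
        for loc, mapped in locs.items():
            if mapped is None:
                locs[loc] = addr
                return loc
        return None  # group is full; a replacement policy would pick a victim


cache = GroupedCache(8)
cache.assign("stream", 3)
cache.assign("other", 2)
loc = cache.fill("stream", 0x100)
```

In this model the same address can be present in one group and absent from another, because each group keeps its own associations.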
In an embodiment of the processor according to the invention the processor is arranged to prefetch or write one or more streams of data from iteratively computed main memory addresses (separated for example by a fixed address step size). At any point in time only a window of the addresses in the stream that have been used before will be used later as well. The addresses for these streams are mapped using the first group, whereas other addresses are mapped with other groups. By assigning a number of cache memory locations to the first group that is sufficient to retain associations for all the addresses in the window, it is possible to eliminate the need to replace data from the streams that can be reused. The remaining groups may use set-associative mapping, i.e. the main memory address may be mapped directly on one of the remaining groups.
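The effect of sizing the stream group to the window can be checked with a small counting model. The function `reuse_misses`, the access pattern (each iteration loads one new stream address and then re-reads the current window) and the first-in-first-out behaviour of the group are assumptions chosen for illustration; the description does not prescribe them.

```python
from collections import deque


def reuse_misses(start, step, count, window, group_size):
    """Count re-reads of window addresses that are no longer in the group.

    Hypothetical model: addresses are start, start + step, ... and the
    group keeps the `group_size` most recently loaded stream addresses.
    """
    group = deque(maxlen=group_size)  # the oldest entry is dropped when full
    misses = 0
    for i in range(count):
        group.append(start + i * step)  # load the next stream address
        first = max(0, i - window + 1)
        for j in range(first, i + 1):   # re-read every address in the window
            if start + j * step not in group:
                misses += 1
    return misses


enough = reuse_misses(0x1000, step=4, count=20, window=5, group_size=5)
too_small = reuse_misses(0x1000, step=4, count=20, window=5, group_size=3)
```

When the group holds at least as many locations as the window has addresses, no reusable stream data is displaced in this model; with a smaller group, some re-reads miss.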
When the processor prefetches or writes multiple streams of data items from iteratively computed main memory addresses, each stream may have its own group of associations of addresses from the stream with cache memory locations assigned to that group. The remaining cache memory locations are accessed with set-associative mapping. Thus, cache memory locations can be assigned to different streams on an "as needed" basis and the remaining cache memory locations can be used for non-stream addresses.
The group to be used for a memory access instruction can be selected on an instruction by instruction basis, the instructions indicating explicitly what type of address mapping should be used to access their own operands or results, i.e. by indicating a stream. Alternatively, an instruction may indicate the type of address mapping for use by other instructions, for example by specifying a range or set of main memory addresses for which a specific type of address mapping is to be used. The former is more flexible; the latter requires less instruction overhead.
Cache memory locations used for one group will generally not be available for use in another group. When set-associative mapping is used for main memory addresses that do not belong to any stream, removal of a cache address from a set reduces the effectiveness of that set. Preferably, the cache memory addresses that are used for a group that supports a stream are selected evenly from different sets used in set-associative mapping. This avoids a situation in which some sets have far fewer available alternative cache memory addresses than other sets. This minimizes the need for cache replacement using set-associative mapping.
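One possible way to select stream locations evenly from the sets is a round-robin pass over the sets, sketched below. The helper names `pick_evenly` and `ways_left`, and the numbering of locations as `set * ways + way`, are assumptions for illustration only.

```python
def pick_evenly(num_sets, ways, count):
    """Choose `count` cache locations, taking one per set in turn."""
    if count > num_sets * ways:
        raise ValueError("more locations requested than the cache holds")
    chosen = []
    for k in range(count):
        set_index = k % num_sets   # visit the sets round-robin
        way = k // num_sets        # take the next way on each full pass
        chosen.append(set_index * ways + way)
    return chosen


def ways_left(num_sets, ways, chosen):
    """Alternatives remaining in each set for set-associative mapping."""
    left = [ways] * num_sets
    for loc in chosen:
        left[loc // ways] -= 1
    return left


stream_locations = pick_evenly(num_sets=4, ways=4, count=6)
remaining = ways_left(4, 4, stream_locations)
```

With this choice the number of alternatives left in any two sets differs by at most one, so no single set is depleted in favour of the stream group.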
In another embodiment of the processor according to the invention separate cache replacement decisions are taken for the cache memory locations involved in different groups. Replacement is needed if a new main memory address has to be mapped to a cache memory location, but no free cache memory location is available. In this case a cache memory location must be selected for reuse by the new main memory address, at the expense of the old main memory address that maps to the cache memory location. A well-known strategy is to reuse the cache memory location that has been least recently used.
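Separate replacement decisions per group can be sketched by giving each group its own least-recently-used bookkeeping. The class `GroupLRU` and its `access` method are hypothetical names introduced for illustration; only the least-recently-used strategy itself comes from the text above.

```python
from collections import OrderedDict


class GroupLRU:
    """Hypothetical per-group replacement state using least recently used."""

    def __init__(self, locations):
        self.free = list(locations)  # locations of this group not yet used
        self.map = OrderedDict()     # main address -> location, oldest first

    def access(self, addr):
        """Return (cache location, hit flag) for a main memory address."""
        if addr in self.map:
            self.map.move_to_end(addr)  # mark as most recently used
            return self.map[addr], True
        if self.free:
            loc = self.free.pop(0)
        else:
            # Reuse the least recently used location of this group only.
            _, loc = self.map.popitem(last=False)
        self.map[addr] = loc
        return loc, False


stream = GroupLRU([0, 1])
other = GroupLRU([2, 3])
stream.access(10)
stream.access(20)
other.access(10)
other.access(30)
evicting = other.access(40)    # replaces address 10 inside `other` only
still_hit = stream.access(10)  # the stream group is unaffected
```

Because each group selects victims only among its own locations, heavy replacement traffic in one group does not displace data held for another group in this model.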