The present invention relates to an architecture for a set-associative cache. In particular, the invention is directed to a cache which is selectively configurable as either a unified cache or a split cache.
FIG. 1 shows schematically a conventional configurable-architecture cache 10. The cache can either be configured as a single (unified) cache 12 for treating both data and instructions in the same cache area (FIG. 1a), or the cache can be configured as two (split) caches 14 and 16 for treating data and instructions in different cache areas (FIG. 1b). The optimum configuration depends on the way in which data and instructions are organised, in particular on the instruction code structure and the data accesses a program performs. Since this is software specific, the cache is controllable so that either the unified mode or the split mode can be selected by the software, by writing a control value to an appropriate control register in the cache.
Caches may also be classified into various types according to their address mapping. In an associative cache, there are a plurality of internal addresses in the cache's memory which can be accessed to map to an external address. In a fully associative cache, data from any external address can be stored at any location within the cache's memory. While a fully associative cache could provide the best cache performance, it involves huge amounts of control logic, and results in increased power consumption.
A direct mapped cache uses a fixed address mapping scheme, such that each external address is mapped to a fixed internal address in the cache's memory. Since the cache memory is typically several orders of magnitude smaller than the overall external address range, certain bit positions in the external address are normally selected to define the mapped address in the cache memory. External addresses which have the same bits in the selected bit positions therefore map to the same internal address, and form a so-called addressing “set” in the cache. A direct mapped cache is relatively easy to implement with low gate count, and has only a small power consumption. However, the cache performance is lower, since subsequent accesses to the memory locations which map onto the same set will always overwrite currently buffered data.
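The conflict behaviour of a direct mapped cache can be illustrated with a minimal sketch. The line size, the number of lines and the names `NUM_LINES`, `LINE_BYTES`, `direct_mapped_index` and `fill` are assumptions chosen for illustration only and do not appear in the figures.

```python
NUM_LINES = 256    # assumed number of cache lines (hypothetical)
LINE_BYTES = 16    # assumed bytes per cache line (hypothetical)

def direct_mapped_index(addr):
    """Return the single internal line index to which an external address maps."""
    return (addr // LINE_BYTES) % NUM_LINES

def fill(lines, addr):
    """Buffer the line containing addr, overwriting whatever occupied that index."""
    index = direct_mapped_index(addr)
    evicted = lines.get(index)           # previously buffered line address, if any
    lines[index] = addr // LINE_BYTES    # store the line address as a stand-in for data
    return evicted

lines = {}
fill(lines, 0x00001230)
evicted = fill(lines, 0x00011230)   # same selected bits, so the first line is overwritten
```

Both addresses share the selected bit positions, so the second fill displaces the first even though the rest of the cache is empty.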
A so-called set-associative cache combines elements of association and direct mapping, and is often used as a compromise between the amount of control logic and the power consumption on the one hand, and cache performance on the other. In a set-associative cache, direct mapping is used so that external addresses map to a set according to certain bits of the address. However, within each set, there are a plurality of possible internal addresses (or “ways”) which can be used for the external address. The particular way to be allocated for an external address depends on whether any ways in that set are currently unallocated; if not, then a replacement method is used to select which currently allocated way is to be overwritten (i.e., newly allocated).
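Way allocation within a single set may be sketched as follows. The text does not specify the replacement method; round-robin is assumed here purely for illustration, and the class name `CacheSet` and its members are hypothetical.

```python
class CacheSet:
    """One set of a set-associative cache, holding a tag per way."""

    def __init__(self, num_ways):
        self.tags = [None] * num_ways   # None marks an unallocated way
        self.next_victim = 0            # round-robin pointer (assumed policy)

    def lookup(self, tag):
        """Return the way holding tag, or None on a miss."""
        return self.tags.index(tag) if tag in self.tags else None

    def allocate(self, tag):
        """Use an unallocated way if one exists, otherwise overwrite a victim."""
        if None in self.tags:
            way = self.tags.index(None)
        else:
            way = self.next_victim
            self.next_victim = (self.next_victim + 1) % len(self.tags)
        self.tags[way] = tag
        return way

two_way = CacheSet(2)
first = two_way.allocate(10)    # takes unallocated way 0
second = two_way.allocate(11)   # takes unallocated way 1
third = two_way.allocate(12)    # no free way, so way 0 is overwritten
```

Unlike the direct mapped case, the second address mapped to the set does not displace the first; replacement only occurs once every way in the set is allocated.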
FIG. 2 illustrates schematically an address area 20 of the cache memory divided into “n” sets 22, each set including a plurality of ways 24 (0 . . . k) for storing data mapped to that set. Each way 24 is defined as a cache line 26 for grouping a plurality of words 28 of bytes, so that each cache line 26 actually maps to a plurality of consecutive external address locations.
FIG. 3 shows how an external address location 30 is decoded to map a byte represented by the external address to the cache memory. The external address 30 has a width of b+w+s+t bits. From the address, certain bits 32 (s bits) define the set to which the external address is fixedly mapped. The least significant bits 34 are used as an index to define the location of the byte in a cache line 26 of the set. The least significant bits 34 are divided into two groups 36 (w bits) and 38 (b bits), the bits 36 representing the location in the cache line of a word containing the byte, and the bits 38 representing the location of the byte within that word. The most significant bits 40 (t bits) are not used to map the external address, but instead are saved as a tag 42 (FIG. 2) associated with the cache line 26, so that the full addresses represented by each cache line are known. Referring to FIG. 2, each cache line 26 also includes valid (or “validation”) bits 44 for indicating whether the words 28 in the cache line actually contain valid data.
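The decoding of FIG. 3 can be expressed as a short sketch. The field widths b, w, s and t follow the description above; the function name `decode_address` and the example widths (b=2, w=2, s=8, t=20) are assumptions for illustration only.

```python
def decode_address(addr, b, w, s, t):
    """Split an external address of b+w+s+t bits into its four fields.

    Returns (tag, set_index, word_index, byte_index), ordered from the
    most significant field to the least significant field.
    """
    byte_index = addr & ((1 << b) - 1)               # bits 38: byte within the word
    word_index = (addr >> b) & ((1 << w) - 1)        # bits 36: word within the cache line
    set_index = (addr >> (b + w)) & ((1 << s) - 1)   # bits 32: set selection
    tag = (addr >> (b + w + s)) & ((1 << t) - 1)     # bits 40: stored as the tag
    return tag, set_index, word_index, byte_index

# Example with assumed widths for a 32-bit address.
fields = decode_address(0x12345678, b=2, w=2, s=8, t=20)
```

With these assumed widths, the address 0x12345678 yields tag 0x12345, set 0x67, word 2 and byte 0.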
When a set-associative cache is used in a configurable unified/split mode architecture, a conventional approach for implementing the split mode is to split the sets into two groups. Typically, half the sets are used for the data cache area, and half the sets are used for the instruction or code cache area. For example, in FIG. 2, the sets 0 . . . (n/2−1) would be used to define a data area 46, and the other sets n/2 . . . n−1 would be used to define an instruction or code area 48.
Although this seems an eminently logical approach, a resulting complication is that the number of available sets to which an external address is mapped varies in dependence on the operating mode. In the unified mode, the address is mapped to n sets. In the split mode, the same address range (assuming that both data and instructions can lie anywhere in the address range) has to be mapped to only n/2 sets. FIG. 4 illustrates how the address range is mapped in the case of a split mode. It can be seen that since the number of available sets is reduced to only half, the number of bits s′ to define the set 32 is reduced by one bit (s′=s−1). Similarly, the number of bits t′ to define the tag 40 has to be increased by one bit (t′=t+1), in order to accommodate the same address range. This results in variable length set and tag fields 32 and 40, depending on whether the split or unified cache mode is selected. Additional logic is therefore required to handle the variable length fields, such as that illustrated schematically in FIG. 5.
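The effect of the variable length fields can be seen in a minimal sketch of the conventional decoding. The function name `conventional_fields` and the example widths (b=2, w=2, s=8, t=20) are assumptions for illustration; the sketch models only the field extraction, not the multiplexer logic of FIG. 5.

```python
def conventional_fields(addr, b, w, s, t, split_mode):
    """Return (tag, set_index) under the conventional set-splitting scheme."""
    if split_mode:
        # Half as many sets are available: s' = s - 1 and t' = t + 1.
        s, t = s - 1, t + 1
    set_index = (addr >> (b + w)) & ((1 << s) - 1)
    tag = (addr >> (b + w + s)) & ((1 << t) - 1)
    return tag, set_index

addr = 0x12345F78
unified = conventional_fields(addr, 2, 2, 8, 20, split_mode=False)
split = conventional_fields(addr, 2, 2, 8, 20, split_mode=True)
```

For this address the unified mode gives tag 0x12345 and set 0xF7, whereas the split mode gives tag 0x2468B and set 0x77: one address bit has moved from the set field to the tag field, so the same address is filed under a different set and tag in each mode.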
Referring to FIG. 5, the additional logic overhead consists of a first multiplexer 50, a second multiplexer 52 and a gate 54, for each way 24 defined in a set, in order to decode (map) an external address in either the split mode or the unified mode. Essentially, the multiplexers 50 and 52 and the gate 54 are all required to accommodate one bit 58 of the address which may either be part of the set field 32 or part of the tag field 40, depending on whether the cache is operating in its unified or split mode.
A further disadvantage is that it is not possible to dynamically switch the cache between its unified and split modes while in use, because the address mapping is different in either mode. Therefore, if a switch is required, it is necessary to flush the entire contents of the cache, since data mapped in the cache in one mode is not compatible with the other mode.
A yet further disadvantage is that the tag memory is not used efficiently in this implementation as one bit remains unused in unified mode. The tag memory is memory which is reserved for storing the tag information, and each memory location has to be sufficiently long to accommodate the largest field, even though this only occurs in the split mode.
The present invention concerns a set-associative cache having a selectively configurable split/unified mode. The cache may comprise a memory and control logic. The memory may be configured for storing data buffered by the cache. The control logic may be configured for controlling the writing and reading of data to and from the memory. The control logic may organise the memory as a plurality of storage sets, each set being mapped to a respective plurality of external addresses such that data from any of said respective external addresses maps to that set. The control logic may comprise allocation logic for associating a plurality of ways uniquely with each set, the plurality of ways representing respective plural locations for storing data mapped to that set. In the unified mode, the control logic may assign a first plurality of ways to each set to define a single cache region. In the split mode, the control logic may partition the first plurality of ways to define a first and a second sub-group of ways assigned to each set, to define a respective first and second cache region.
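The partitioning of ways may be sketched as follows. The summary above does not fix the sizes of the two sub-groups or which sub-group serves which cache region; equal halves, with the first sub-group used for data and the second for instructions, are assumed here for illustration only, and the function name `candidate_ways` is hypothetical.

```python
def candidate_ways(num_ways, split_mode, is_instruction=False):
    """Return the ways of a set that an access may use in the given mode."""
    if not split_mode:
        # Unified mode: every way of the set belongs to the single cache region.
        return list(range(num_ways))
    half = num_ways // 2   # assumed equal partition of the ways
    if is_instruction:
        return list(range(half, num_ways))   # second sub-group (assumed: instructions)
    return list(range(half))                 # first sub-group (assumed: data)

unified_ways = candidate_ways(4, split_mode=False)
data_ways = candidate_ways(4, split_mode=True)
code_ways = candidate_ways(4, split_mode=True, is_instruction=True)
```

Because only the choice of ways depends on the mode, the number of sets, and hence the set and tag fields of the external address, is the same in either mode.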
The objects, features and advantages of the invention include providing a set-associative, configurable split/unified mode cache that may (i) use the same number of sets to map an external address range irrespective of the split or unified mode, (ii) simplify the decoding logic required to decode an external address in either the split or unified mode, (iii) enable dynamic switching of the cache between the split and unified modes while preserving the cached contents (i.e., without having to flush the cache), and/or (iv) avoid redundancy in the tag memory.