1. Field
Embodiments relate to multi-core processors. In particular, embodiments relate to maintaining data coherence in multi-core processors.
2. Background Information
Chip multi-processors (CMPs), multi-core devices, and other multi-processor apparatus have a number of cores or processors on a single integrated circuit die or chip. Each core generally has associated therewith one or more corresponding local caches which are operable to cache copies of data from one or more shared memories. The cores are generally coupled together and are operable to share the data stored in their local caches with one another.
It is generally important to maintain coherence, or a consistent view of the data, across all of the cores. All-core sharing map-based hardware coherence directories are among the hardware-based coherence mechanisms commonly used in present-day general-purpose processors to help maintain coherence of data across all of the cores. These directories are hardware structures that are operable to track data cached in the local cache(s) of all of the cores, as well as which of the cores are sharing the data. All-core hardware coherence tags are typically stored in the entries of the directories and indicate the sharing of the data.
FIG. 1 is a block diagram of a known all-core hardware coherence tag 100. As the name implies, the all-core hardware coherence tag has a scope of all of the cores and is operable to indicate sharing of data among any or all of the cores. The all-core hardware coherence tag includes an address field 102, a state field 104, and an all-core sharing map field 106. The address field may indicate an address (e.g., of a cache line caching a copy of data from memory and/or the memory address of the data). By way of example, the address field may have a length of 33-bits. The state field may indicate a state of the corresponding data or entry in the directory (e.g., whether the data or entry is modified, exclusive, shared or invalid). For example, the state field may have a length of 2-bits. The 2-bits may indicate any of four different states.
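By way of illustration only, the tag layout described above can be sketched in software. The field widths (33, 2, and 32 bits) come from the example lengths given in the text. The bit ordering within the packed word, the numeric state encoding, and the names `pack_tag` and `unpack_tag` are assumptions made for this sketch and are not taken from FIG. 1.

```python
# Field widths from the example lengths in the text.
ADDR_BITS = 33
STATE_BITS = 2
MAP_BITS = 32
TAG_BITS = ADDR_BITS + STATE_BITS + MAP_BITS

# Hypothetical numeric encoding of the four states; the text only says
# that 2 bits may indicate any of four different states.
STATE_INVALID, STATE_SHARED, STATE_EXCLUSIVE, STATE_MODIFIED = range(4)


def pack_tag(address, state, sharing_map):
    """Pack the three fields into one integer.

    Assumed ordering: address in the high bits, then state, then the
    all-core sharing map in the low bits.
    """
    if not 0 <= address < (1 << ADDR_BITS):
        raise ValueError("address does not fit in the address field")
    if not 0 <= state < (1 << STATE_BITS):
        raise ValueError("state does not fit in the state field")
    if not 0 <= sharing_map < (1 << MAP_BITS):
        raise ValueError("sharing map does not fit in the sharing map field")
    return (address << (STATE_BITS + MAP_BITS)) | (state << MAP_BITS) | sharing_map


def unpack_tag(tag):
    """Split a packed tag back into (address, state, sharing_map)."""
    sharing_map = tag & ((1 << MAP_BITS) - 1)
    state = (tag >> MAP_BITS) & ((1 << STATE_BITS) - 1)
    address = tag >> (STATE_BITS + MAP_BITS)
    return address, state, sharing_map
```

With these example widths, one tag occupies 33 + 2 + 32 = 67 bits, of which the sharing map accounts for nearly half.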
The all-core sharing map field 106 may indicate which of the cores of a device are caching a copy of the data corresponding to the address field. The all-core sharing map field generally includes 1 bit for each of the cores. As shown in the illustration, the all-core sharing map field has a length of 32 bits, or 1 bit for each of 32 cores. The bit corresponding to a given core is operable to indicate whether or not the given core is caching a copy of the data. According to one possible convention, a binary value of 1 (i.e., the bit being set) may be used to indicate that the given core is caching a copy of the data, whereas a binary value of 0 (i.e., the bit being cleared) may be used to indicate that the given core is not caching a copy of the data. For example, in the illustrated embodiment, bits [0:5] having the respective values 0 1 1 0 0 1 may indicate that, for the corresponding address, core 0 is not caching, cores 1 and 2 are caching, cores 3 and 4 are not caching, and core 5 is caching.
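The one-bit-per-core convention described above may be sketched as follows. The sketch assumes that core i corresponds to bit i of an integer, with bit 0 as the least significant bit; the helper names are hypothetical and chosen only for this illustration.

```python
NUM_CORES = 32  # one bit per core, as in the 32-core example


def set_sharer(sharing_map, core):
    """Set the bit for `core`: the core is caching a copy of the data."""
    return sharing_map | (1 << core)


def clear_sharer(sharing_map, core):
    """Clear the bit for `core`: the core is not caching a copy."""
    return sharing_map & ~(1 << core)


def is_sharer(sharing_map, core):
    """Return True if the bit for `core` is set."""
    return bool((sharing_map >> core) & 1)


def sharers(sharing_map):
    """List every core whose bit is set."""
    return [core for core in range(NUM_CORES) if (sharing_map >> core) & 1]


# Bits [0:5] = 0 1 1 0 0 1 from the example: cores 1, 2, and 5 are caching.
example_map = 0
for core in (1, 2, 5):
    example_map = set_sharer(example_map, core)

print(sharers(example_map))  # [1, 2, 5]
```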
FIG. 2 is a block diagram of a known all-core sharing map-based hardware coherence directory 210. The directory is set associative and includes a 4-way set associative tag array 212 and a 4-way set associative all-core sharing map array 214. There is a one-to-one correspondence between ways in the tag array and ways in the all-core sharing map array. The tag array 212 is arranged as (k+1) sets, labeled set[0] through set[k], and four ways, labeled way[0] through way[3]. The address and state fields are typically included in the tag array. As shown, set[1] includes address 102 and state 104 fields in each of way[1] and way[2]. The all-core sharing map array 214 is also arranged as (k+1) sets, labeled set[0] through set[k], and four ways, labeled way[0] through way[3]. The all-core sharing map fields are typically included in the all-core sharing map array. As shown, set[1] includes all-core sharing map fields 106 in each of way[1] and way[2]. Typically, the number of tags in the directory equals the total number of tags in local/private caches of all cores to enable tracking distinct cache lines.
During operation, when it is desired to know which cores are caching data for a given address, the all-core sharing map-based hardware coherence directory may be consulted. The directory includes tag comparison logic 216. The tag comparison logic may compare four addresses, each stored within a different one of the four ways of a set, with a given address. The four addresses may be read out on tag array readout lines 218. Either none of the four addresses matches the given address, or exactly one address in a single way matches the given address. Assuming a single address in a single way matches the given address, a way select signal 220, for example a 2-bit way select signal for a 4-way set associative array, may be output from the tag comparison logic to way selection logic 222. The way select signal may indicate the single way having the matching address. Four all-core sharing map fields, each in one of four different ways of the corresponding set, may be read out on all-core sharing map array readout lines 224 and provided to the way selection logic. The way selection logic may select the single all-core sharing map field on the single way indicated by the way select signal. For example, if the way select signal indicates way[2] (e.g., has a value of binary 10), then the all-core sharing map field in way[2] may be selected and output as a selected all-core sharing map 206. The output all-core sharing map field indicates which of the cores are sharing the data.
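The lookup sequence described above (tag array readout, tag comparison, way select, and sharing map selection) may be modeled in software roughly as follows. This is a behavioral sketch only, not a description of the hardware of FIG. 2. The class name `Directory`, the `install` helper, storing the full address as the tag, and deriving the set index as the address modulo the number of sets are all assumptions introduced for illustration.

```python
NUM_WAYS = 4  # 4-way set associative, as in the described directory


class Directory:
    def __init__(self, num_sets):
        self.num_sets = num_sets
        # Tag array: each way holds an (address, state) pair, or None if empty.
        self.tag_array = [[None] * NUM_WAYS for _ in range(num_sets)]
        # Sharing map array: same sets and ways, one-to-one with the tag array.
        self.map_array = [[0] * NUM_WAYS for _ in range(num_sets)]

    def set_index(self, address):
        # Assumed indexing scheme; the text does not specify one.
        return address % self.num_sets

    def install(self, way, address, state, sharing_map):
        """Hypothetical helper that fills one way of the addressed set."""
        index = self.set_index(address)
        self.tag_array[index][way] = (address, state)
        self.map_array[index][way] = sharing_map

    def lookup(self, address):
        """Return (way_select, sharing_map) on a match, else None."""
        index = self.set_index(address)
        tags = self.tag_array[index]  # four tags read out of the set
        maps = self.map_array[index]  # four sharing maps read out of the set
        # Tag comparison: at most one way is expected to match.
        matches = [way for way, entry in enumerate(tags)
                   if entry is not None and entry[0] == address]
        if not matches:
            return None
        way_select = matches[0]
        # Way selection: pick the sharing map on the matching way.
        return way_select, maps[way_select]


directory = Directory(num_sets=8)
directory.install(way=2, address=0x41, state="shared", sharing_map=0b100110)
print(directory.lookup(0x41))  # (2, 38): way[2], cores 1, 2, and 5 sharing
print(directory.lookup(0x49))  # None: same set, but no way matches
```

In this model both arrays are read for the whole set and the selection happens afterward, mirroring the description in which all four sharing map fields are provided to the way selection logic before one is chosen.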