1. Field of the Invention
The present invention relates to a tag configuration of a system in which a processor bus has a plurality of CPU cores and a system controller has a snoop tag as a copy of cache in a CPU.
2. Description of the Related Art
Generally, cache memory is used as a means for improving the throughput in accessing main storage, which is slower than a CPU. The cache memory is normally located between the CPU and the main storage, and is normally provided in the CPU.
When the cache memory is compared with a storage device (external memory) used in the main storage, the cache memory is higher in access speed, but has a smaller capacity. Therefore, the data stored in the cache memory is a part of all data held in the external memory.
If the cache memory stores the data to be read, the data can be read at a high speed. However, if the data to be read is not in the cache memory, the data is read by accessing the main storage, and is therefore read at a lower speed.
The cache memory is configured by cache (data area, or cache data) storing a part of data stored in the main storage and tag memory (tag area, or cache tag) storing a part (tag) of the address of the data stored in the cache.
The processor is informed whether or not data required for execution is stored in the cache by comparing the address of the data with the tag in the tag memory. Unless necessary data is stored in the cache, the data is loaded into the cache from the main memory, and a part of the address of the data is loaded as a new tag into the tag memory. To load the new tag into the tag memory, it is necessary to expel the tag which is not required or is considered unnecessary.
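The comparison of an address with a stored tag can be sketched as follows. This is a minimal illustration only, assuming a direct-mapped tag memory with one tag per line; the line size, the number of lines, and the names `split_address` and `DirectMappedTagMemory` are hypothetical and are not taken from the related art described here.

```python
LINE_BYTES = 64   # assumed size of one cache line in bytes
NUM_LINES = 256   # assumed number of lines in the tag memory


def split_address(addr):
    """Split an address into (tag, index, offset) for the assumed geometry."""
    offset = addr % LINE_BYTES
    index = (addr // LINE_BYTES) % NUM_LINES
    tag = addr // (LINE_BYTES * NUM_LINES)
    return tag, index, offset


class DirectMappedTagMemory:
    """Tag memory holding one tag per line (None means the line is empty)."""

    def __init__(self):
        self.tags = [None] * NUM_LINES

    def lookup(self, addr):
        """Return True when the tag stored for this line matches the address."""
        tag, index, _ = split_address(addr)
        return self.tags[index] == tag

    def fill(self, addr):
        """Load the tag of addr as a new tag and return the expelled tag."""
        tag, index, _ = split_address(addr)
        expelled = self.tags[index]
        self.tags[index] = tag
        return expelled
```

In this sketch, `lookup` corresponds to the hit determination and `fill` to loading a new tag, which expels whatever tag previously occupied the same line.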
The tag memory is formed by a plurality of ways (WAYs). To determine the tags to be expelled (to update the tags) before loading a new tag, the tag is assigned LRU (least recently used) information in advance.
For example, when a tag is written, an LRU information generation circuit generates the LRU information indicating the tag to be next replaced, and the information is written corresponding to the tag to the tag memory. Therefore, a line as a unit of a read/write to the tag memory stores a tag and its LRU information. When a new tag is loaded into the tag memory, a supplemented WAY information generation circuit determines which tag is to be supplemented to which way using the LRU information added to each tag.
A match determination circuit compares (a part of) the address of the data required by the processor with a predetermined tag stored in the tag memory, and determines whether or not the data is stored in the cache.
After the determination, (the line of) the tag read from the tag memory is written again to the tag memory. This tag rewriting cycle is executed after the determination because the stored LRU information does not reflect the result of the determination until the LRU information about the line of the tag is rewritten to the tag memory. Based on the hit/miss result for the line, the LRU information generation circuit generates new LRU information and stores it in the tag memory. Thus, the LRU control of writing a new tag using the LRU information is performed.
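The LRU control described above can be sketched for a single line of a four-WAY tag memory. This is an illustrative model only: the class name `LruTagSet` is hypothetical, and a software recency list stands in for the LRU information generation circuit, whose actual encoding of the LRU information is not specified here.

```python
class LruTagSet:
    """One line of a tag memory with several WAYs and LRU information."""

    def __init__(self, num_ways=4):
        self.tags = [None] * num_ways  # None means the WAY is empty
        self.lru = []                  # WAY numbers, least recently used first

    def _touch(self, way):
        """Regenerate the LRU information after a hit or a fill."""
        if way in self.lru:
            self.lru.remove(way)
        self.lru.append(way)

    def access(self, tag):
        """Return (hit, way, expelled_tag) and update the LRU information."""
        if tag in self.tags:
            way = self.tags.index(tag)
            self._touch(way)
            return True, way, None
        # Miss: use an empty WAY if one exists, otherwise the LRU WAY.
        if None in self.tags:
            way = self.tags.index(None)
        else:
            way = self.lru[0]
        expelled = self.tags[way]
        self.tags[way] = tag
        self._touch(way)
        return False, way, expelled
```

For example, after tags 0, 1, 2, and 3 are loaded and tag 0 is hit again, loading tag 5 expels tag 1, because WAY 1 is then the least recently used WAY.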
FIG. 1 shows the entire information processing system (chip set) provided with a common multiprocessor configuration. In FIG. 1, the information processing system mainly includes a system board 10, an input/output control unit 15, a data cross bar 17, and an address cross bar 16. The system board 10 includes a system controller 1, a firmware hub 11, a CPU 2, a memory controller 12, memory 13, a CPU bus 6, and a firmware hub bus 7.
The CPU bus 6 connects the system controller 1 to the CPU 2. The firmware hub bus 7 connects the system controller 1 to the firmware hub 11. The data cross bar 17 is a bus for transmitting data to or receiving data from the system board 10. The address cross bar 16 is a bus for transmitting an address to or receiving an address from the system board 10.
The system controller 1 is a device for controlling transmission/reception of data between the CPU 2 and the memory 13. The firmware hub 11 stores firmware. The memory controller 12 controls the operation of the memory 13.
FIG. 2 shows the tag in the information processing system. In FIG. 2, the system controller 1 is connected to the CPUs 2a, 2b, 2c, and 2d via the CPU buses 6a, 6b, 6c, and 6d. Each CPU 2 (2a, 2b, 2c, and 2d) is provided with cache memory. The cache memory of each CPU 2 is configured by a cache tag 3 (3a, 3b, 3c, and 3d) and cache data 4 (4a, 4b, 4c, and 4d).
The system controller 1 is provided with a snoop tag 5 (5a, 5b, 5c, and 5d) corresponding to each cache tag 3 (3a, 3b, 3c, and 3d).
If, for example, the CPU 2a issues a read request when a cache miss occurs, the system controller 1 confirms whether or not the data as a target of the read request is held by the other snoop tags 5b, 5c, and 5d. When the data as a target of the read request is held by any of the other snoop tags 5b, 5c, and 5d, the system controller 1 acquires the replacement information corresponding to the read request from the snoop tag 5 holding the tag, and passes it to the CPU 2a. However, if no snoop tag 5 holds the data as a target of the read request, the system controller 1 acquires the replacement information corresponding to the read request from the main storage, and passes it to the CPU 2a.
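The decision made by the system controller 1 for a read request can be sketched as a lookup over the snoop tags of the other CPUs. The function name `find_data_source` and the representation of each snoop tag as a set of addresses are assumptions made for illustration only.

```python
def find_data_source(snoop_tags, requester, addr):
    """Decide where the data for a read request is obtained.

    snoop_tags maps a CPU name to the set of addresses recorded in the
    snoop tag for that CPU (a simplified, hypothetical representation).
    """
    for cpu, addresses in snoop_tags.items():
        # Only the snoop tags of CPUs other than the requester are checked.
        if cpu != requester and addr in addresses:
            return ("snoop_tag", cpu)
    # No other snoop tag holds the address: fall back to the main storage.
    return ("main_storage", None)


snoop_tags = {
    "2a": {0, 1, 2, 3},
    "2b": {5, 7},
    "2c": {8},
    "2d": set(),
}
source_for_5 = find_data_source(snoop_tags, "2a", 5)
source_for_9 = find_data_source(snoop_tags, "2a", 9)
```

Here `source_for_5` identifies the snoop tag for the CPU 2b as the holder, while `source_for_9` falls back to the main storage because no snoop tag records the address 9.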
FIG. 3 shows an example of the configuration of a conventional tag. Mainly the CPU 2a shown in FIG. 3 is explained below. The cache tag 3a in the CPU 2a is formed by, for example, four WAYs (3a-0, 3a-1, 3a-2, and 3a-3).
On the other hand, the snoop tag 5a of the system controller 1 corresponding to the cache tag 3a is also formed by four WAYs (5a-0, 5a-1, 5a-2, and 5a-3). Thus, in the conventional configuration, the cache tag 3 of the CPU 2 has the same number of WAYs as the snoop tag 5 of the system controller 1. In this case, the following event can occur.
First, if the cache tag 3 is full (that is, all WAYs forming the cache tag are in the write state) when a read request is issued (6a-Aa) after a cache miss from the CPU 2 to the system controller 1, then the CPU 2 expels the address from any of the WAYs forming the cache tag 3, and replaces it with the address of the data as a target of the read request (replacement information).
In the above-mentioned system, there is no means for the CPU 2 to notify the system controller 1 of the replacement information indicating which WAY's address has been expelled, or, even if such means exists, the replacement information is not always notified.
Therefore, in a system having no means for the CPU 2 to notify the system controller 1 of the replacement information, or in a system whose protocol does not guarantee that the delete information is always given even though such means exists, the system controller 1 determines on its own the snoop tag to be replaced.
FIGS. 4A and 4B show an update example of a tag when a cache miss occurs in the conventional technology. Each of the four WAYs (3a-0, 3a-1, 3a-2, and 3a-3) forming the cache tag 3a in the CPU 2a stores an address (0, 1, 2, and 3) (the cache tag 3a in the CPU 2a is in the full state).
First, a read request is issued in the CPU 2a after a cache miss. For example, a read request for an address 5 is issued (step 101; hereinafter, “step” is abbreviated as “S”).
Then, the CPU 2a determines, for example, the WAY[3a-0] of the cache tag 3a to be replaced, and deletes the address information (address 0) stored in the WAY [3a-0] (S102).
The CPU 2a issues to the system controller 1 a read request for the address information (address 5) generated in S101 (S103).
In this example, since addresses are written to all WAYs of the snoop tag 5a (full state), it is necessary to delete any WAY as a target to be replaced.
However, in the system receiving no replace request from the CPU 2a, the system controller 1 must determine on its own which WAY of the snoop tag 5a is to be replaced.
Then, the system controller 1 determines, for example, the WAY [5a-2] to be replaced, and deletes the address information (address 2) stored in the WAY [5a-2].
The address stored in the WAY [5a-2] to be replaced is also a target to be replaced in the cache tag 3a in the CPU 2a. Therefore, the system controller 1 requests (eviction request) the CPU 2a to delete the address (address 2) to be replaced (S104).
Upon receipt of the eviction request, the CPU 2a analyzes the command, and deletes the address information (address 2) about the WAY [3a-2] of the cache tag 3a (S105).
The system controller 1 updates the WAY [5a-2] of the snoop tag 5a (the address 5 is stored in the WAY [5a-2]). Then, the system controller 1 transmits the data corresponding to the address information to the CPU 2a (S106).
The CPU 2a receives the data from the system controller 1, writes the data to the cache data 4a, and updates the WAY [3a-0] of the cache tag 3a by the address corresponding to the data (S107).
A read request is issued after a cache miss in the CPU 2a again. For example, a read request for an address 6 is issued (S108). At this time, since there is available space in the cache tag 3 (WAY [3a-2]), the deleting process described in S102 is not performed, but the CPU 2a issues a read request for the address 6 to the system controller 1 (S109).
Then, since the snoop tag 5a is full, the system controller 1 deletes the address information (address 1) stored in the WAY [5a-1] when, for example, the WAY [5a-1] is determined as a replacement target.
The address stored in the WAY [5a-1] to be replaced is also a replacement target in the cache tag 3a. Therefore, the system controller 1 requests the CPU 2a to delete the address (address 1) to be replaced (eviction request) (S110).
Upon receipt of the eviction request, the CPU 2a analyzes the command, and deletes the address information (address 1) stored in the WAY [3a-1] of the cache tag 3a (S111).
The system controller 1 updates the WAY [5a-1] of the snoop tag 5a (the address 6 is stored in the WAY [5a-1]). Then, the system controller 1 transmits the data corresponding to the address to the CPU 2a (S112).
The CPU 2a receives the data from the system controller 1, writes it to the cache data 4a, and updates the available WAY [3a-2] of the cache tag 3a by the address corresponding to the data (S113).
Thus, with the conventional configuration, the replacement target of the CPU 2 does not always match that of the system controller 1. Accordingly, excess replacement occurs in the CPU 2. As a result, there can be a plurality of available WAYs in the cache tag 3, thereby possibly raising the cache miss rate.
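The sequence of S101 through S113 can be traced with a small model. This is a sketch under stated assumptions, not the conventional circuit itself: each tag is modeled as a list of four WAYs (None meaning an empty WAY), the function name `read_after_miss` is hypothetical, the victim WAYs are passed in to match the example (WAY [3a-0] and WAY [5a-2] for the address 5, WAY [5a-1] for the address 6), and a refill is assumed to go into the first empty WAY of the cache tag when one exists.

```python
def read_after_miss(cache_tag, snoop_tag, addr, cpu_victim, sc_victim):
    """Model one read request after a cache miss, without replace notification."""
    # CPU side: expel a WAY only when the cache tag is full.
    if None in cache_tag:
        fill_way = cache_tag.index(None)
    else:
        cache_tag[cpu_victim] = None
        fill_way = cpu_victim
    # System controller side: it chooses its own WAY to replace, because it
    # is not told which WAY the CPU expelled.
    if None in snoop_tag:
        snoop_way = snoop_tag.index(None)
    else:
        snoop_way = sc_victim
        evicted = snoop_tag[snoop_way]
        # Eviction request: the CPU must delete the same address as well.
        if evicted in cache_tag:
            cache_tag[cache_tag.index(evicted)] = None
    snoop_tag[snoop_way] = addr   # the snoop tag is updated
    cache_tag[fill_way] = addr    # the CPU stores the returned address


cache_tag_3a = [0, 1, 2, 3]
snoop_tag_5a = [0, 1, 2, 3]
read_after_miss(cache_tag_3a, snoop_tag_5a, 5, cpu_victim=0, sc_victim=2)
state_after_first_read = list(cache_tag_3a)
read_after_miss(cache_tag_3a, snoop_tag_5a, 6, cpu_victim=None, sc_victim=1)
```

In this model, the cache tag holds [5, 1, None, 3] after the first read and [5, None, 6, 3] after the second, so one WAY of the CPU 2a stays empty after each read even though only one new address was requested each time. The snoop tag ends as [0, 6, 5, 3] and still records the address 0, which the cache tag no longer holds.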
Documents relating to the related art in this technical field include the Japanese Published Patent Application No. H5-204869, the Japanese Published Patent Application No. H7-311711, the Japanese Published Patent Application No. H10-214222, and the Japanese Published Patent Application No. H5-265970.