A central processing unit (CPU) needs to read data from a main memory during an operation. Because a read/write speed of the main memory is much lower than an operating speed of the CPU, a processing capability of the CPU cannot be fully used. To mitigate this mismatch between the operating speed of the CPU and the read/write speed of the main memory, a cache memory is generally disposed between the CPU and the main memory.
Data exchange between the cache memory and the main memory is performed in a unit of a cache line, where the cache line may also be referred to as a cache block. When reading data or an instruction, the CPU stores the data or instruction obtained by reading into a cache line. When the CPU needs to read the same or similar data for the second time, the CPU may obtain the data from the corresponding cache line. Because a speed at which the CPU accesses a cache is much greater than a speed at which the main memory is accessed, overall system performance is greatly improved.
The cache memory includes a tag random access memory (Tag RAM) and a data random access memory (Data RAM). The Tag RAM is configured to store an index address of a cache line in a cache, and the Data RAM is configured to store data of the cache line in the cache. A workflow in which the CPU accesses a cache is first accessing the Tag RAM to determine whether a cache line is in the cache; and if hit, directly obtaining data from the Data RAM, or if missed, obtaining data from the main memory.
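The access workflow described above may be illustrated by the following minimal sketch. It is not part of the described design: the direct-mapped organization, the class name `SimpleCache`, the line size, and the dictionary used as main memory are assumptions chosen only to keep the example short.

```python
class SimpleCache:
    """Toy direct-mapped cache with separate tag and data arrays (illustrative only)."""

    def __init__(self, num_lines=4, line_size=64):
        self.num_lines = num_lines
        self.line_size = line_size
        self.tag_ram = [None] * num_lines   # one stored tag per cache line
        self.data_ram = [None] * num_lines  # one stored data block per cache line

    def read(self, address, main_memory):
        block = address // self.line_size   # memory block containing the address
        index = block % self.num_lines      # cache line slot to check
        tag = block // self.num_lines       # identifies the block within that slot
        if self.tag_ram[index] == tag:      # first, consult the Tag RAM
            return "hit", self.data_ram[index]   # hit: data comes from the Data RAM
        data = main_memory[block]           # miss: data comes from main memory
        self.tag_ram[index] = tag           # fill the line for later accesses
        self.data_ram[index] = data
        return "miss", data


# Hypothetical main memory, keyed by block number.
memory = {0: "block0", 4: "block4"}
cache = SimpleCache()
print(cache.read(0, memory))   # first access to block 0 misses
print(cache.read(8, memory))   # address 8 is in the same block, so it hits
```

The second read hits because both addresses fall within the same 64-byte line, which is the reuse effect that makes the cache worthwhile.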
Coherent Hub Interface (CHI) is a bus interconnection protocol used to connect multiple systems on chip (SoC), and is an extensible network structure. In the CHI protocol, statuses of a cache line are classified into five types, which are an invalid (I) state, a unique clean (UC) state, a unique dirty (UD) state, a shared clean (SC) state, and a shared dirty (SD) state. The I state is used to indicate that no data exists in the cache line. The UC state is used to indicate that the cache line exists only in one cache and the cache line includes clean data, where the clean data means that the data is not modified after being read from the main memory and remains consistent with the data in the main memory. The UD state is used to indicate that the cache line exists only in one cache and the cache line includes dirty data, where the dirty data means that the data is modified after being read from the main memory and is inconsistent with the data in the main memory. The SC state is used to indicate that the cache line exists in multiple caches and the cache line includes clean data. The SD state is used to indicate that the cache line exists in multiple caches and the cache line includes dirty data.
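The five statuses may be summarized along two axes, unique versus shared and clean versus dirty, as in the following sketch. The enumeration and helper names are hypothetical and are not taken from the CHI specification; they only restate the definitions given above.

```python
from enum import Enum


class LineState(Enum):
    """Cache line statuses as described in the text above."""
    I = "invalid"
    UC = "unique clean"
    UD = "unique dirty"
    SC = "shared clean"
    SD = "shared dirty"


def is_valid(state):
    # A line in the I state holds no data.
    return state is not LineState.I


def is_unique(state):
    # The line exists in only one cache.
    return state in (LineState.UC, LineState.UD)


def is_dirty(state):
    # The data was modified and differs from main memory.
    return state in (LineState.UD, LineState.SD)
```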
Further, the CHI protocol defines operations that a requesting party may perform on a cache line when the cache line is in each of the foregoing statuses. The requesting party is generally a cache at a specific level. Using access to a cache line in a level 2 (L2) cache as an example, these operations include the following.
(1) When the cache line in the L2 cache is in the I state, data of the cache line in the L2 cache cannot be accessed.
(2) When the cache line in the L2 cache is in the UC state, if the requesting party requests to access the data of the cache line in the L2 cache, the L2 cache may selectively return the data of the cache line to the requesting party, that is, may or may not return the data of the cache line to the requesting party.
(3) When the cache line in the L2 cache is in the UD state, if the requesting party requests to access the data of the cache line in the L2 cache, the L2 cache must return the data of the cache line to the requesting party.
(4) The data of the cache line in the SC or SD state cannot be modified, unless the cache line in the SC or SD state changes into another status, and according to a data consistency principle, data of a cache line in the SC or SD state in a cache at any level is the latest.
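Items (1) to (4) above may be restated as a small lookup, shown here as a sketch only. The function names and the returned labels are hypothetical. The passage does not state a data return rule for the SC and SD states, so the sketch reports that case as "not stated" rather than guessing.

```python
def data_return_policy(state):
    """Map a cache line status to the data return rule given in items (1)-(3)."""
    if state == "I":
        return "inaccessible"   # item (1): the data cannot be accessed
    if state == "UC":
        return "optional"       # item (2): the L2 cache may or may not return data
    if state == "UD":
        return "mandatory"      # item (3): the L2 cache must return data
    if state in ("SC", "SD"):
        return "not stated"     # assumption: no return rule is given in this passage
    raise ValueError(f"unknown state: {state}")


def is_read_only(state):
    """Item (4): SC and SD data cannot be modified until the status changes."""
    return state in ("SC", "SD")


for s in ("I", "UC", "UD", "SC", "SD"):
    print(s, data_return_policy(s), is_read_only(s))
```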
In a multi-core communications processing chip, a multi-level cache structure is generally used, that is, caches are classified into multiple levels. Typically, the caches are classified into three levels. FIG. 1 is a schematic structural diagram of three levels of caches. Access speeds of a level 1 cache (L1 cache), a level 2 cache (L2 cache), and a level 3 cache (L3 cache) decrease successively, and their capacities increase successively. The L1 cache includes an L1 cache 1, an L1 cache 2, an L1 cache 3, and an L1 cache 4, which may be separately accessed by four CPUs. The L2 cache includes an L2 cache A and an L2 cache B. The L1 cache is an upper level cache relative to the L2 cache, and the L2 cache is an upper level cache relative to the L3 cache.
In other approaches, multiple levels of caches with an exclusive structural design are provided. A feature of the multiple levels of caches is that there is no intersection between different levels of caches, that is, the two levels of caches store data of different cache lines, so as to prevent data of a same cache line from being stored in both levels of caches, thereby maximizing a cache capacity. Using the L2 cache and the L3 cache as an example, if data of a cache line is stored in the L2 cache, the data of the cache line is no longer stored in the L3 cache. In the multi-core communications processing chip, it is assumed that data of a cache line is stored in the L2 cache A and the L2 cache B needs to access the data of the cache line; after the L2 cache B sends a request to the L3 cache, because the data of the cache line is not stored in the L3 cache, the L3 cache needs to send a request for accessing the cache line to the L2 cache A, so as to obtain the data of the cache line from the L2 cache A.
However, according to the CHI protocol, the L2 cache A is allowed to skip returning data or return partial data after receiving an access request. For example, when a cache line in the L2 cache A is in the UC state, the L2 cache A may not return data of the cache line to a requesting party after receiving a request for accessing the cache line. Therefore, the L2 cache B needs to read the data of the cache line from the main memory, and a delay caused by reading the data from the main memory is significantly large. Therefore, based on the CHI protocol, a cache memory system with the exclusive structural design reduces cache system performance.
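The exclusive-structure scenario above may be traced with the following sketch. All names are hypothetical, the caches are modeled as plain dictionaries, and only the UC and UD statuses in the peer L2 cache are modeled. The flag `peer_returns_clean` stands for the choice that the L2 cache A is allowed to make when the line is in the UC state.

```python
def exclusive_read(line, l2_peer, l3, main_memory, peer_returns_clean=False):
    """Return (source, data) for a read that reaches the L3 cache from another L2 cache."""
    if line in l3:
        return "L3", l3[line]
    if line in l2_peer:
        state, data = l2_peer[line]
        if state not in ("UC", "UD"):
            raise ValueError("only the UC and UD statuses are modeled in this sketch")
        # A UD line must be returned; a UC line is returned only if the peer chooses to.
        if state == "UD" or peer_returns_clean:
            return "peer L2", data
    # The peer declined (or does not hold the line): fall back to main memory.
    return "main memory", main_memory[line]


main_memory = {0x40: "value"}
l2_cache_a = {0x40: ("UC", "value")}   # L2 cache A holds the line, clean
l3_cache = {}                          # exclusive design: L3 holds no copy

print(exclusive_read(0x40, l2_cache_a, l3_cache, main_memory))
print(exclusive_read(0x40, l2_cache_a, l3_cache, main_memory, peer_returns_clean=True))
```

With the default flag the read is served from main memory, which is the slow path described above.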
In the other approaches, multiple levels of caches with an inclusive structural design are further provided. A feature of the multiple levels of caches is that data, of all cache lines, stored in an upper level cache is backed up and stored in a lower level cache, so as to ensure that the data, of the cache lines, stored in the upper level cache has a backup in the lower level cache. Likewise, using the L2 cache and the L3 cache as an example, data, of a cache line, stored in the L2 cache is necessarily stored in the L3 cache. In the multi-core communications processing chip, it is assumed that data of a cache line is stored in the L2 cache A and the L2 cache B needs to access the data of the cache line; after the L2 cache B sends a request for accessing the cache line to the L3 cache, if the data of the cache line in the L3 cache is the latest, the cache line is directly read from the L3 cache without a need to read the data of the cache line from the L2 cache A; or if the data of the cache line in the L3 cache is not the latest, a request for accessing the cache line needs to be sent to the L2 cache A, so as to obtain the data of the cache line from the L2 cache A.
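The inclusive-structure flow may be sketched in the same way. The names are hypothetical, and the test for whether the L3 copy is the latest is an assumption of this sketch: the L3 copy is treated as stale only when the L2 cache A holds the line in the UD state.

```python
def inclusive_read(line, l2_peer, l3):
    """Return (source, data) for a read that reaches the L3 cache from another L2 cache."""
    peer_entry = l2_peer.get(line)
    # Assumption: a UD copy in the peer L2 cache means the L3 backup is not the latest.
    if peer_entry is not None and peer_entry[0] == "UD":
        return "peer L2", peer_entry[1]
    # Otherwise the backup in the L3 cache is the latest and is read directly.
    return "L3", l3[line]


l3_cache = {0x40: "old", 0x80: "clean"}
l2_cache_a = {0x40: ("UD", "new"), 0x80: ("UC", "clean")}

print(inclusive_read(0x40, l2_cache_a, l3_cache))   # served by the L2 cache A
print(inclusive_read(0x80, l2_cache_a, l3_cache))   # served by the L3 cache
```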
According to the CHI protocol, the multiple levels of caches with the inclusive structural design may ensure that the L2 cache B can obtain data of a cache line from the L3 cache if the L2 cache A does not return the data or returns partial data. However, an inclusive structure results in a significant capacity waste, because data, of all cache lines, stored in an upper level cache is also stored in a lower level cache. Especially when a quantity of CPU cores is large, a requirement for a capacity of the lower level cache is extremely high.
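The capacity trade-off may be made concrete with simple arithmetic over the L2 and L3 levels only. The cache sizes below are hypothetical figures chosen for illustration, and the function names are not taken from any source.

```python
def effective_capacity(l2_sizes_kib, l3_size_kib, inclusive):
    """Amount of distinct cache line data the L2 and L3 levels can hold together."""
    if inclusive:
        # Every L2 line is duplicated in the L3 cache, so the L2 caches add nothing.
        return l3_size_kib
    # Exclusive: no line is stored at both levels, so the capacities add up.
    return l3_size_kib + sum(l2_sizes_kib)


def min_inclusive_l3(l2_sizes_kib):
    """Smallest L3 cache able to back up every line held in all L2 caches."""
    return sum(l2_sizes_kib)


l2_sizes = [512, 512]   # hypothetical: two L2 caches of 512 KiB each
print(effective_capacity(l2_sizes, 4096, inclusive=True))
print(effective_capacity(l2_sizes, 4096, inclusive=False))
print(min_inclusive_l3([512] * 8))   # the L3 requirement grows with the number of L2 caches
```

In this example the inclusive design gives up 1024 KiB of distinct capacity, and the minimum L3 size scales linearly with the number of L2 caches.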