With expansion of a data set (that is, a set of data) and an increase in a quantity of processor cores, a translation lookaside buffer (TLB) and a cache are facing an increasingly severe challenge.
A problem of contention for a TLB arises because, in most computer architectures, a page table needs to be queried first during each memory access to translate a virtual address (VA) into a physical address (PA), and the PA is then used as an index to search a cache in order to find the data required by the memory access. A page table generally occupies a large amount of memory and is stored in a memory in a tiered manner. The TLB serves as a buffer for the page table and temporarily stores a few frequently used page table entries at a location very near a central processing unit (CPU) core. In this way, translation between a VA and a PA can be greatly accelerated if the mapping relationship to be queried between the VA and the PA is stored in the TLB, that is, if a TLB access hit occurs. However, if a TLB access miss occurs, the memory still needs to be searched in a tiered manner for the page table to obtain the corresponding page table entry, which leads to a long access delay. As data sets keep expanding in the big data era, contention for the TLB becomes increasingly fierce, which causes more TLB access misses and severely affects performance.
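The hit and miss paths described above can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not a description of any real TLB: the class name `ToyTLB`, the 4 KB page size, the least-recently-used replacement policy, and the flat dictionary standing in for the tiered page table are all hypothetical choices made for illustration.

```python
from collections import OrderedDict

PAGE_OFFSET_BITS = 12  # assumption: 4 KB pages


class ToyTLB:
    """Toy TLB: a small cache of VPN -> PPN mappings in front of a page table."""

    def __init__(self, page_table, capacity=4):
        self.page_table = page_table  # stands in for the tiered page table in memory
        self.capacity = capacity      # assumption: tiny capacity, LRU replacement
        self.entries = OrderedDict()  # cached mappings, least recently used first
        self.hits = 0
        self.misses = 0

    def translate(self, va):
        vpn = va >> PAGE_OFFSET_BITS
        offset = va & ((1 << PAGE_OFFSET_BITS) - 1)
        if vpn in self.entries:
            self.hits += 1                            # fast path: TLB access hit
            self.entries.move_to_end(vpn)
        else:
            self.misses += 1                          # slow path: query the page table
            self.entries[vpn] = self.page_table[vpn]
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)      # evict the least recently used entry
        return (self.entries[vpn] << PAGE_OFFSET_BITS) | offset


# Hypothetical page table mapping virtual page n to physical page n + 100.
tlb = ToyTLB({vpn: vpn + 100 for vpn in range(8)}, capacity=2)
for va in (0x0000, 0x0004, 0x1000, 0x2000, 0x0008):
    tlb.translate(va)
print(tlb.hits, tlb.misses)
```

With a capacity of only two entries, touching three distinct pages already evicts a mapping that is needed again shortly afterwards, which is the contention effect the paragraph describes.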
A problem of contention for a cache arises because, in most multi-core architectures, a last level cache (LLC) is shared by multiple cores, which causes LLC contention between cores and causes cache replacement between processes, thereby reducing cache utilization. In particular, some programs have relatively poor locality but access memory frequently and have a quite large working set, so that they occupy a relatively large share of the LLC capacity and seriously affect performance of other processes. As a quantity of cores increases, the problem of contention for an LLC becomes increasingly serious.
In the prior art, a huge page technology and a page-coloring based cache partition technology are generally used to optimize performance. A quantity of page table entries required by a process is a size of a working set (a working set is the memory required by a process during a specific period) divided by a memory size of a page. The working set of a process keeps expanding with application requirements. In this case, the quantity of page table entries required by the process can be remarkably reduced by increasing the memory size of a page. For example, a memory size of an ordinary page is 4 kilobytes (KB), and using a huge page whose memory size is 2 megabytes (MB) reduces the quantity of page table entries required by a process by a factor of 512, which greatly relieves TLB contention pressure and reduces TLB misses, thereby improving performance.
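The factor of 512 follows directly from dividing the working-set size by the page size. The following minimal sketch works through the arithmetic; the 1 GB working set and the function name `entries_needed` are assumptions chosen only for illustration.

```python
KB = 1024
MB = 1024 * KB


def entries_needed(working_set_bytes, page_bytes):
    """Page table entries needed to map the working set (ceiling division)."""
    return -(-working_set_bytes // page_bytes)


working_set = 1024 * MB  # assumption: a 1 GB working set
ordinary = entries_needed(working_set, 4 * KB)
huge = entries_needed(working_set, 2 * MB)
print(ordinary, huge, ordinary // huge)
```

The ratio does not depend on the chosen working-set size as long as it is a multiple of 2 MB, because 2 MB / 4 KB = 512.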
A method for reducing contention for an LLC includes allocating an LLC to different cores or processes statically or dynamically, to isolate the cores or processes from each other so that no contention occurs. This technology is referred to as cache partition. Page-coloring is a method for implementing cache partition by means of software, which has advantages of being easy to use and requiring no hardware modification. FIG. 1 is a schematic diagram of a principle of page-coloring based cache partition. As shown in FIG. 1, from a perspective of an operating system, a PA may be divided into two parts, a physical page number (PPN) and a page offset. From a perspective of a cache, a PA may be divided into three parts, a cache tag, a cache set index, and a cache block offset. The operating system can control a PPN, but cannot control a page offset. Assuming that a quantity of bits of a page offset is N, a memory size of a page is 2^N. An intersection between a PPN and a cache set index is referred to as a color bit. The operating system can map an address to a specified cache set by controlling a PPN (that is, controlling a color bit). In this way, different color bits are allocated to different processes, so that addresses of different processes are mapped to different cache sets, which implements mutual isolation.
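The two views of a PA and the resulting color bits can be sketched as follows. The cache geometry here (64-byte cache blocks and 8192 cache sets) and the helper names are assumptions for illustration only; they are not taken from FIG. 1.

```python
PAGE_OFFSET_BITS = 12   # N: 4 KB pages, 2**12 bytes
BLOCK_OFFSET_BITS = 6   # assumption: 64-byte cache blocks
SET_INDEX_BITS = 13     # assumption: 8192 cache sets

# Color bits: set-index bits that lie above the page offset, i.e. inside the PPN.
COLOR_BITS = BLOCK_OFFSET_BITS + SET_INDEX_BITS - PAGE_OFFSET_BITS


def os_view(pa):
    """Operating system view: (PPN, page offset)."""
    return pa >> PAGE_OFFSET_BITS, pa & ((1 << PAGE_OFFSET_BITS) - 1)


def cache_view(pa):
    """Cache view: (cache tag, cache set index, cache block offset)."""
    block_offset = pa & ((1 << BLOCK_OFFSET_BITS) - 1)
    set_index = (pa >> BLOCK_OFFSET_BITS) & ((1 << SET_INDEX_BITS) - 1)
    tag = pa >> (BLOCK_OFFSET_BITS + SET_INDEX_BITS)
    return tag, set_index, block_offset


def page_color(pa):
    """Color of the page containing pa: the low COLOR_BITS bits of the PPN."""
    ppn, _ = os_view(pa)
    return ppn & ((1 << COLOR_BITS) - 1)


pa = 0x12345678
print(os_view(pa), cache_view(pa), page_color(pa))
```

Under these assumptions there are 7 color bits, so an operating system that hands pages of different colors to different processes can split the cache sets into up to 128 disjoint groups.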
FIG. 2 is a schematic diagram of a contradiction between a huge page technology and a page-coloring based cache partition technology. As shown in FIG. 2, a huge page has a larger quantity of bits in the page offset area (because a page has a larger memory size, more bits are required to indicate a page offset) and a smaller quantity of bits in the PPN area, so that the PPN area no longer intersects with the cache set index. Because there is no color bit, an operating system can no longer control a cache set index by controlling a PPN. Therefore, in an existing hardware architecture, the huge page technology and the page-coloring based cache partition technology contradict each other and cannot be used at the same time.
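The loss of color bits can be checked by counting how many cache set index bits lie above the page offset for each page size. This is a minimal sketch; the default cache geometry (64-byte cache blocks, 8192 cache sets) and the function name `color_bit_count` are assumptions for illustration and are not taken from FIG. 2.

```python
def color_bit_count(page_offset_bits, block_offset_bits=6, set_index_bits=13):
    """Number of cache set index bits that fall inside the PPN."""
    set_index_top = block_offset_bits + set_index_bits        # first bit above the set index
    lowest_ppn_bit = max(page_offset_bits, block_offset_bits)
    return max(0, set_index_top - lowest_ppn_bit)


# 4 KB pages have a 12-bit page offset; 2 MB pages have a 21-bit page offset.
for name, offset_bits in (("4 KB", 12), ("2 MB", 21)):
    bits = color_bit_count(offset_bits)
    print(name, "color bits:", bits, "colors:", 2 ** bits)
```

With the assumed geometry the set index ends at bit 18, below the 21-bit page offset of a 2 MB page, so the count drops to zero and only a single color remains.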