The x86 architecture, like most microprocessor architectures, provides a means for a programmer to specify debug breakpoints on an access to one or more address ranges. In particular, the breakpoint address ranges are virtual address ranges (referred to as linear addresses in x86 parlance). In some processors, the load unit performs the check to determine whether a load address accesses a breakpoint range.
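By way of illustration only, the breakpoint range check described above can be sketched as an interval-overlap test. The function names and the representation of a breakpoint range as a (start, length) pair of linear addresses are assumptions made for this sketch; they are not taken from any particular processor implementation.

```python
def overlaps(load_addr, load_size, bp_start, bp_length):
    """Return True if the load's byte range intersects the breakpoint range.

    Both ranges are treated as half-open intervals [start, start + length).
    """
    load_end = load_addr + load_size
    bp_end = bp_start + bp_length
    return load_addr < bp_end and bp_start < load_end


def hits_breakpoint(load_addr, load_size, breakpoint_ranges):
    """Check a load's virtual (linear) address range against every breakpoint range.

    breakpoint_ranges is a hypothetical list of (start, length) tuples.
    """
    return any(
        overlaps(load_addr, load_size, start, length)
        for start, length in breakpoint_ranges
    )


# Example: one 4-byte breakpoint range starting at linear address 0x1040.
ranges = [(0x1040, 4)]
touches = hits_breakpoint(0x103C, 8, ranges)   # load covers 0x103C..0x1043
misses = hits_breakpoint(0x1038, 4, ranges)    # load covers 0x1038..0x103B
```

Note that the check operates on the virtual address of the load, which is why the virtual address must still be available at the time the check is performed.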
Sometimes a load spans two cache lines and must be broken up into two pieces, as shown in FIG. 4, each of which is sent down the load pipeline to access the data cache. The first piece is sent down the load pipeline with the initial load address and a first size (the number of bytes implicated by the first piece) in order to obtain the data from the first implicated cache line. Subsequently, the second piece is sent down the load pipeline with an incremented version of the initial load address and a second size (the number of bytes implicated by the second piece) in order to obtain the data from the second implicated cache line. This case makes the breakpoint check more complex, as discussed in more detail below, because the load unit must check each piece against the breakpoint ranges.
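The splitting of a cache-line-spanning load into two pieces may be sketched as follows. The 64-byte cache line size and the name `split_load` are assumptions chosen for illustration; the description above does not specify a line size.

```python
CACHE_LINE_SIZE = 64  # assumed line size in bytes, for illustration only


def split_load(load_addr, load_size, line_size=CACHE_LINE_SIZE):
    """Return the (address, size) pieces needed to service a load.

    A load contained in one cache line yields a single piece. A load that
    spans two lines yields a first piece with the initial address and a
    second piece whose address is incremented to the start of the next line.
    """
    bytes_left_in_line = line_size - (load_addr % line_size)
    if load_size <= bytes_left_in_line:
        return [(load_addr, load_size)]
    first_size = bytes_left_in_line
    second_addr = load_addr + first_size
    second_size = load_size - first_size
    return [(load_addr, first_size), (second_addr, second_size)]


# An 8-byte load at 0x103C begins 4 bytes before a 64-byte line boundary.
pieces = split_load(0x103C, 8)
```

In this example the first piece covers 4 bytes at 0x103C and the second piece covers 4 bytes at 0x1040, so each piece implicates a different cache line and each piece must be checked against the breakpoint ranges.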
Some further background information is helpful. Each load queue entry includes storage space for an address field. Initially, the load unit loads the load virtual address into the address field. The load unit subsequently translates the load virtual address into a load physical address (in order to access the data cache) and then replaces the virtual address with the physical address in the address field. Having a single address field in each load queue entry minimizes the storage requirements of the load queue and therefore saves die real estate and reduces power consumption. However, the single address field causes a problem in the cache-line-spanning, two-piece load case: when the load unit pipeline processes the first piece, it clobbers the virtual address, such that the second piece no longer has the virtual address with which to perform the breakpoint checking.
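The clobbering problem can be illustrated with a toy model of a load queue entry having a single address field. The class, the `translate` helper, the 4 KiB page size, and the page table contents are all hypothetical and serve only to show the overwrite; they do not describe an actual translation mechanism.

```python
PAGE_SHIFT = 12                      # assumed 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1


class LoadQueueEntry:
    """Toy load queue entry with one shared address field."""

    def __init__(self, virtual_address):
        self.address = virtual_address   # initially holds the virtual address
        self.holds_physical = False


def translate(virtual_address, page_table):
    """Hypothetical translation: map the virtual page, keep the page offset."""
    frame = page_table[virtual_address >> PAGE_SHIFT]
    return (frame << PAGE_SHIFT) | (virtual_address & PAGE_MASK)


def process_first_piece(entry, page_table):
    """First pass: the physical address overwrites the virtual address."""
    entry.address = translate(entry.address, page_table)
    entry.holds_physical = True


page_table = {0x7F001: 0x00042}          # made-up mapping for the example
entry = LoadQueueEntry(0x7F00103C)
process_first_piece(entry, page_table)
# entry.address is now the physical address; the virtual address 0x7F00103C
# is no longer stored anywhere in the entry for the second piece to use.
```

After the first pass, the only address remaining in the entry is physical, so a breakpoint check for the second piece, which requires the virtual address, can no longer be performed from the entry alone.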
One solution to the problem is to include space in each load queue entry for two addresses. That way, the physical address can be written to the second address field to avoid clobbering the virtual address, or each address field can be associated with a different piece so that each piece has its own virtual address when it needs it and can clobber its own virtual address without affecting the other piece. However, this solution is undesirable because the additional storage space associated with the second address field consumes a significant additional amount of die real estate and power.
Another solution that avoids the additional address storage space is to perform additional passes through the load pipeline. That is, the first piece is sent down the load pipeline to perform the breakpoint checking, then the second piece is sent down the load pipeline to perform the breakpoint checking, then the first piece is sent down the load pipeline to generate the physical address and access the cache, and finally the second piece is sent down the load pipeline to generate the physical address and access the cache. This solution is undesirable because it is slower: the load makes four passes through the load pipeline rather than two.
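The cost of the additional-pass approach can be made concrete by listing the pipeline passes it requires next to the two passes of an ordinary two-piece load. The function names and the tuple representation are assumptions made for this sketch.

```python
def two_pass_schedule():
    """Ordinary two-piece load: one pass per piece."""
    return [
        ("piece 1", "translate and access cache"),
        ("piece 2", "translate and access cache"),
    ]


def extra_pass_schedule():
    """Additional-pass approach: breakpoint checks run before any translation,
    so the virtual address is still intact when both pieces are checked."""
    return [
        ("piece 1", "breakpoint check"),
        ("piece 2", "breakpoint check"),
        ("piece 1", "translate and access cache"),
        ("piece 2", "translate and access cache"),
    ]


extra_passes = len(extra_pass_schedule()) - len(two_pass_schedule())
```

Under these assumptions, the approach doubles the number of load pipeline passes for every cache-line-spanning load.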