Within a data processing system, when a master device wishes to perform read or write operations, the master device will typically issue an access request specifying a virtual address for the data item to be read or written. This virtual address then needs to be translated into a physical address within a memory device in order to identify the actual physical location in memory from which the data item is to be read or to which the data item is to be written.
There will typically be various components residing in the path between the master device and the memory device, for example various levels of cache, various interconnect structures, etc., and typically the address translation is performed by a memory management unit residing in close proximity to the master device along the path between the master device and the memory device.
Such a memory management unit (MMU) will typically include a translation lookaside buffer (TLB) structure for holding descriptor information obtained from page tables residing in the memory device, each descriptor providing information used to translate a portion of the virtual address to a corresponding portion of the physical address. If, for a particular portion of a virtual address under consideration, there is no corresponding descriptor stored within the TLB, then page table walk circuitry within the MMU is typically used to perform a page table walk process in order to obtain the required descriptor from the memory device, so that the address translation process can be performed.
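The TLB lookup with fallback to a page table walk can be illustrated with a minimal software sketch. It is a model for explanation only, not a description of any particular MMU: the class name `SimpleMMU`, the 4 KiB page size, and the use of a dictionary to stand in for the page tables held in the memory device are all assumptions.

```python
PAGE_SHIFT = 12  # assumed 4 KiB pages; the page size is not specified in the text


class SimpleMMU:
    """Hypothetical model: a TLB in front of a single flat page table."""

    def __init__(self, page_table):
        # page_table stands in for the page tables residing in the memory
        # device: virtual page number -> physical frame number.
        self.page_table = page_table
        self.tlb = {}   # cached descriptors
        self.walks = 0  # number of page table walks performed

    def _page_table_walk(self, vpn):
        # Models the (slow) retrieval of a descriptor from the memory device.
        self.walks += 1
        return self.page_table[vpn]

    def translate(self, vaddr):
        vpn = vaddr >> PAGE_SHIFT
        offset = vaddr & ((1 << PAGE_SHIFT) - 1)
        if vpn not in self.tlb:
            # TLB miss: obtain the descriptor via a walk and cache it.
            self.tlb[vpn] = self._page_table_walk(vpn)
        return (self.tlb[vpn] << PAGE_SHIFT) | offset
```

In this model, a second access to the same page is served from the TLB, so the `walks` counter does not increase.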
In association with a master device's MMU, it is known to implement prefetching mechanisms that seek to detect patterns in the access requests being issued by the master device and, based on those patterns, to prefetch descriptor information into the TLB. The aim is to avoid the latency that is incurred when a descriptor is not available in the TLB for a future access request and hence needs to be retrieved via the page table walk process. However, whilst such pattern recognition based prefetching mechanisms are useful, and can help to reduce latency, there are still other aspects of the virtual to physical address translation process that can introduce latency when processing any individual access request.
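As one illustration of pattern-based descriptor prefetching, the sketch below detects a constant stride between successive virtual page numbers and prefetches the descriptor for the next page in the sequence. The text does not say which patterns known mechanisms detect, so the stride heuristic, the class name `StridePrefetcher`, and the dictionary used as a page table are all assumptions.

```python
class StridePrefetcher:
    """Hypothetical TLB model with a simple constant-stride prefetcher."""

    def __init__(self, page_table):
        self.page_table = page_table  # virtual page number -> descriptor
        self.tlb = {}
        self.last_vpn = None
        self.last_stride = None

    def access(self, vpn):
        """Record an access to a virtual page; return True on a TLB hit."""
        hit = vpn in self.tlb
        if not hit:
            # Miss: the descriptor is fetched on demand (a page table walk).
            self.tlb[vpn] = self.page_table[vpn]
        if self.last_vpn is not None:
            stride = vpn - self.last_vpn
            # The same non-zero stride twice in a row is treated as a pattern.
            if stride != 0 and stride == self.last_stride:
                predicted = vpn + stride
                if predicted in self.page_table:
                    self.tlb.setdefault(predicted, self.page_table[predicted])
            self.last_stride = stride
        self.last_vpn = vpn
        return hit
```

For a sequential run of pages, the first three accesses miss while the stride is being established, and later accesses hit on prefetched descriptors. Each miss still incurs the full cost of the walk, which is the residual latency the passage above refers to.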
In particular, considering an individual access request, a portion of the specified virtual address will typically be used in combination with a page table base address to identify a physical address for a descriptor that will be needed as part of the address translation process. At a minimum, once that descriptor has been obtained (via a page table walk process if necessary), then that descriptor will need to be used in combination with another portion of the virtual address to identify the actual physical address of the data item that is to be read or written. Accordingly, even in this simple case, there may be a need to access the memory device twice in order to process the read or write operation, once to retrieve the descriptor via a page table walk process, and once to actually access the data item.
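The two memory accesses in this simple case can be made concrete with a short sketch. The `Memory` class, the 4 KiB page size, the 8-byte descriptor size, and the assumption that a descriptor holds only the physical base address of a page are illustrative choices that do not come from the text.

```python
PAGE_SHIFT = 12  # assumed 4 KiB pages
DESC_SIZE = 8    # assumed size of one descriptor in bytes


class Memory:
    """Hypothetical memory device that counts how often it is read."""

    def __init__(self, contents):
        self.contents = dict(contents)  # physical address -> stored value
        self.reads = 0

    def read(self, paddr):
        self.reads += 1
        return self.contents[paddr]


def read_single_level(mem, table_base, vaddr):
    """Read the data item at vaddr using a single-level page table."""
    index = vaddr >> PAGE_SHIFT               # upper portion of the virtual address
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)  # lower portion of the virtual address
    # Access 1: fetch the descriptor, located from the base address plus index.
    page_base = mem.read(table_base + index * DESC_SIZE)
    # Access 2: fetch the data item, located from the descriptor plus offset.
    return mem.read(page_base + offset)
```

With a page table at the assumed base address `0x1000`, reading virtual address `0x2010` selects descriptor index 2 and offset `0x10`, and the `reads` counter ends at 2 when the descriptor is not already cached.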
In modern data processing systems, the number of accesses to memory that may be required when processing a single access request can be significantly higher than in the simple case referred to above. As the size of the memory device continues to grow, it is known to use multiple levels of page tables when performing the address translation process. At a first page table level, a portion of the virtual address may be combined with a page table base address to identify a physical address of a descriptor that is required as part of the address translation process. Once that descriptor has been obtained, it is used in combination with another portion of the virtual address to identify a descriptor in an additional page table at a further page table level. This process can be repeated multiple times before a final level of the page table hierarchy is reached, with the descriptor obtained from that final page table level then being combined with another virtual address portion in order to identify the physical address of the data item to be accessed.
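The multi-level process can be sketched as a loop in which each fetched descriptor supplies the base address of the table used at the next level. The field widths (9 index bits per level, a 12-bit page offset), the 8-byte descriptor size, and the function name `walk` are assumptions chosen for illustration. Memory is again modelled as a dictionary from physical address to stored value.

```python
PAGE_SHIFT = 12  # assumed width of the page offset in bits
INDEX_BITS = 9   # assumed number of virtual address bits consumed per level
DESC_SIZE = 8    # assumed size of one descriptor in bytes


def walk(memory, base, vaddr, levels):
    """Return (physical address, number of descriptor fetches) for vaddr."""
    accesses = 0
    table = base  # page table base address for the first level
    for level in range(levels):
        # Select the portion of the virtual address that indexes this level.
        shift = PAGE_SHIFT + INDEX_BITS * (levels - 1 - level)
        index = (vaddr >> shift) & ((1 << INDEX_BITS) - 1)
        # Each level costs one memory access; the descriptor read here is
        # the base of the next-level table or, at the final level, the page.
        table = memory[table + index * DESC_SIZE]
        accesses += 1
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    return table + offset, accesses
```

Under these assumptions a three-level hierarchy needs three dependent descriptor fetches before the data item itself can be accessed, giving four memory accesses in total for one request when none of the descriptors is cached.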
Thus, it will be appreciated that even when considering a single access request, the address translation process may require the memory device to be accessed multiple times, and this can give rise to significant latency issues. Accordingly, it would be desirable to provide a mechanism that can alleviate the latency issues associated with the multiple stages of address translation required when processing each individual memory access request.