1. Field of the Invention
The present invention generally relates to virtual memory management, and more particularly to a method and system for tracking accesses to virtual addresses in graphics contexts.
2. Description of the Related Art
Unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
Modern graphics processing units (GPUs) incorporate high-speed processing units, such as shader engines and texture units, which are capable of performing multiple tasks on significant amounts of data in parallel. To access and operate on such data, virtual memory management schemes have been developed for GPUs to handle extensive memory accesses.
Traditionally, virtual memory management was implemented for central processing units (CPUs). With virtual memory management, a process can indirectly access physical pages, which store data in a physical memory, via “virtual” addresses. To access a physical page, a memory management mechanism keeps and updates translations from virtual addresses, or more precisely from “virtual” page numbers derived from the virtual addresses, to physical addresses that point to physical pages in the physical memory. If a physical page corresponding to a virtual address does not currently reside in the physical memory, then the operating system performs the required operations to load the missing page from an auxiliary storage device (such as a hard disk) without needing to know which process requested access to the physical page. Though the aforementioned virtual memory mechanism has been extensively used in CPUs, it is not directly applicable to certain GPU-specific needs.
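By way of illustration only, the translation and demand-loading behavior described above may be sketched as follows. The 4 KB page size, the PageTable class, and its method names are hypothetical and chosen for the example; they are not taken from any particular CPU or operating system.

```python
PAGE_SIZE = 4096  # assumed page size in bytes (hypothetical)


class PageTable:
    """Toy page table: maps virtual page numbers to physical frame numbers."""

    def __init__(self):
        self.entries = {}      # virtual page number -> physical frame number
        self.faults = 0        # number of page faults serviced so far
        self._next_frame = 0   # next free physical frame (no eviction modeled)

    def _load_page(self, vpn):
        # Stand-in for the operating system loading the missing page
        # from auxiliary storage into a free physical frame.
        frame = self._next_frame
        self._next_frame += 1
        self.entries[vpn] = frame
        return frame

    def translate(self, vaddr):
        # Split the virtual address into a virtual page number and an offset.
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        frame = self.entries.get(vpn)
        if frame is None:
            # Page is not resident: count a fault and load the page.
            self.faults += 1
            frame = self._load_page(vpn)
        return frame * PAGE_SIZE + offset
```

In this sketch, a first access to any address within a page incurs one fault, and later accesses to the same page are translated without further loading.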
FIG. 1 illustrates one prior art virtual memory management approach for a GPU. Each task or process performed by the GPU corresponds to a graphics context. “Graphics context” as used herein means all the states, including memory states, needed for the GPU to perform one process. In this prior art implementation, a “surface residency” model is adopted, which requires the physical presence of an entire surface, such as the texture surface 108, in the physical memory 106 before a graphics context is executed. With reference to FIG. 1, suppose the list of graphics contexts run by the GPU includes a first graphics context C1 and a second graphics context C2. In the first graphics context C1, for example, a texture mapping operation is applied to a first texture TEXTURE#1, while in the second graphics context C2, another texture mapping operation is applied to a second texture TEXTURE#2. The first graphics context C1 is further associated with a virtual memory space 102 in which certain virtual addresses are allocated for TEXTURE#1. Similarly, the second graphics context C2 is associated with a virtual memory space 104 in which certain virtual addresses are allocated for TEXTURE#2. To access TEXTURE#1 or TEXTURE#2, the entire texture surface needs to be resident in the physical memory 106. If the surface is not resident in the physical memory, as shown for TEXTURE#2, a surface fault is generated, which in turn causes an interrupt. Then, the driver program, along with the operating system, usually takes over and performs the necessary operations to swap in the missing surface.
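The surface residency model described above may be sketched, purely as an illustration, in the following form. The SurfaceFault exception and the execute_context and run_with_swap_in functions are hypothetical names introduced for the example, and the swap-in step is reduced to adding the surface to a set of resident surfaces.

```python
class SurfaceFault(Exception):
    """Raised when a required surface is not resident in physical memory."""

    def __init__(self, surface):
        super().__init__(f"surface fault: {surface}")
        self.surface = surface


def execute_context(name, required_surfaces, resident):
    # Under surface residency, every required surface must be wholly
    # resident before the graphics context is allowed to execute.
    for surface in required_surfaces:
        if surface not in resident:
            raise SurfaceFault(surface)
    return f"{name} executed"


def run_with_swap_in(name, required_surfaces, resident):
    # Stand-in for the driver and operating system handling the interrupt:
    # each faulting surface is swapped in whole, then execution is retried.
    swapped_in = []
    while True:
        try:
            return execute_context(name, required_surfaces, resident), swapped_in
        except SurfaceFault as fault:
            resident.add(fault.surface)
            swapped_in.append(fault.surface)
```

With TEXTURE#1 resident and TEXTURE#2 absent, as in FIG. 1, running context C1 in this sketch proceeds directly, whereas running context C2 first triggers a surface fault and a swap-in of the whole of TEXTURE#2.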
The aforementioned surface fault model has a number of disadvantages. First, because it requires the residency of an entire surface, any time a surface fault occurs, the entire surface needs to be swapped in, which may cause undesirable thrashing due to the constant transferring of surfaces between the physical memory 106 and the auxiliary storage. Moreover, suppose only a particular portion of a texture is requested, such as portion 110. Under the surface fault model, memory locations sufficient to hold the entire texture surface still need to be allocated in the physical memory 106. Furthermore, referring again to FIG. 1, with the graphics contexts C1 and C2 operating concurrently, implementing a replacement policy for the physical memory 106 at the coarse granularity of a surface (e.g., deciding which surface can be evicted) is likely to be problematic and inefficient.
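The allocation cost noted above can be made concrete with a short calculation, offered only as a sketch. The 4 KB page size, the function names, and the texture dimensions used in the usage note are assumptions made for the example and do not appear in FIG. 1.

```python
PAGE_SIZE = 4096  # assumed page size in bytes (hypothetical)


def resident_bytes_surface(surface_bytes):
    # Surface residency: the whole surface must be resident,
    # no matter how small the requested portion is.
    return surface_bytes


def resident_bytes_paged(requested_offset, requested_bytes):
    # Page granularity: only the pages overlapping the requested
    # byte range need to be resident.
    if requested_bytes <= 0:
        raise ValueError("requested_bytes must be positive")
    first_page = requested_offset // PAGE_SIZE
    last_page = (requested_offset + requested_bytes - 1) // PAGE_SIZE
    return (last_page - first_page + 1) * PAGE_SIZE
```

For example, assuming a 1024 x 1024 texture at 4 bytes per texel (4,194,304 bytes) of which a 65,536-byte portion starting at offset 0 is requested, resident_bytes_surface(4194304) returns 4194304 while resident_bytes_paged(0, 65536) returns 65536, i.e., one sixty-fourth of the surface.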
As the foregoing illustrates, what is needed in the art is a mechanism that can track accesses to virtual addresses in graphics contexts at a finer granularity and that addresses at least the problems set forth above.