Field of the Invention
Embodiments of the present invention relate generally to computer processing and, more specifically, to an approach to adaptive allocation of shared resources in computer systems.
Description of the Related Art
Computer systems in general, and graphics processing units (GPUs) in particular, often include multiple clients that operate in parallel. The clients could be hardware clients such as processing engines, or software clients such as parallel threads, among other possibilities. Generally, parallel clients rely on shared resources that the computer system provides. Some examples of shared resources include memory, interconnect, cache bandwidth, and memory bandwidth. Clients typically must compete for access to shared resources, and, thus, the performance of each client and the overall computing system depends on the availability of those shared resources.
For example, in a computer system where access to a shared memory is limited, the performance of clients within that computer system could depend on the ability of those clients to read data from and write data to the shared memory. If a first client issues a read request to the shared memory, then the shared memory could be occupied for a period of time servicing the read request. If a second client issues a write request while the shared memory is busy servicing the read request, then the second client must wait until the shared memory has finished servicing the read request before the write request can be serviced. Consequently, the second client may stall, which would degrade the performance of that client. As a general matter, if multiple clients issue access requests to a shared resource concurrently, then, in many practical cases, the shared resource can only process those requests sequentially, potentially causing those clients to underperform.
One problem with the approach described above is that clients exhibit many different types of behavior, some of which serve to monopolize a shared resource. For example, an “aggressive” client could issue a large number of access requests to the shared resource in a short amount of time, thereby “hogging” the shared resource (as is known in the art). A less aggressive client could issue only a few access requests, and would thus be forced to wait until the requests of the aggressive client have been serviced. If the less aggressive client were latency-sensitive, then that client could crash altogether due to the behavior of the aggressive client. In short, conventional computer systems make no provisions for allocating shared resources based on the requirements of the different clients that are active within the computer system.
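By way of illustration only, the monopolizing behavior described above can be sketched as a small simulation. The sketch below assumes a shared resource that services requests strictly in arrival order, one at a time; the function name simulate_fifo, the fixed service time, and the example workload are hypothetical and are not taken from any particular computer system.

```python
from collections import deque

def simulate_fifo(requests, service_time=1):
    """Service requests one at a time, in arrival order.

    requests: list of (arrival_time, client) tuples, sorted by arrival_time.
    Returns a dict mapping each client to the list of its wait times.
    """
    queue = deque(requests)
    clock = 0          # time at which the shared resource next becomes free
    waits = {}
    while queue:
        arrival, client = queue.popleft()
        start = max(clock, arrival)            # cannot start before arrival
        waits.setdefault(client, []).append(start - arrival)
        clock = start + service_time           # resource is busy until then
    return waits

# Hypothetical workload: aggressive client "A" issues 8 requests at time 0,
# and less aggressive client "B" issues a single request at time 1.
requests = [(0, "A")] * 8 + [(1, "B")]
waits = simulate_fifo(requests)
print(waits["B"])
```

Under these assumptions, the single request of client "B" is not serviced until all eight requests of client "A" have completed, even though client "B" would experience no wait if it were the only active client.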
As the foregoing illustrates, what is needed in the art is a more effective approach to allocating shared resources in computer systems that institutes more fairness across clients.