Technical Field
The present disclosure generally relates to resource management, and more specifically to methods and apparatus for auto-throttling encapsulated compute tasks.
Description of the Related Art
In highly parallel processor architectures, such as the architectures of typical graphics processing units (GPUs), a software driver and/or a hardware resource manager is responsible for allocating various processor resources to each of a plurality of threads executing in parallel on the processor. For example, a driver may allocate a portion of local memory to each of hundreds of concurrent threads executing on the processor, because every thread must be allocated a certain amount of local memory. In processors that are capable of executing hundreds or thousands of threads concurrently, the total amount of physical memory required to satisfy these per-thread allocations may become very large.
One limitation of conventional memory allocation techniques is that the amount of physical memory available to the processor limits the number of threads that can be executed concurrently by the processor. For example, each thread is allocated the same amount of local memory, the size of the allocated space being determined by the worst-case requirement of all the threads, so the number of concurrent threads cannot exceed the available physical memory divided by that worst-case size. Another limitation of conventional memory allocation techniques is that memory is used inefficiently. Some threads may utilize a large percentage of their allocated memory while other threads utilize only a small portion of their allocated memory.
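The two limitations described above can be illustrated with a short arithmetic sketch. The sketch is illustrative only: the function names, the 1 GiB memory pool, and the per-thread requirements are hypothetical values chosen for the example and do not appear in the disclosure.

```python
# Illustrative sketch only; all names and sizes below are hypothetical assumptions.

KIB = 1024
GIB = 1024 ** 3


def max_concurrent_threads(physical_bytes, per_thread_bytes):
    """Upper bound on concurrent threads when every thread receives the
    same fixed allocation of per_thread_bytes."""
    return physical_bytes // per_thread_bytes


def utilization(used_bytes, allocated_bytes_per_thread):
    """Fraction of the allocated memory that the threads actually use."""
    total_allocated = allocated_bytes_per_thread * len(used_bytes)
    return sum(used_bytes) / total_allocated


# Hypothetical workload: fifteen threads need 4 KiB each, one needs 64 KiB.
used = [4 * KIB] * 15 + [64 * KIB]
worst_case = max(used)  # conventional sizing: worst case over all threads

# Thread count is bounded by physical memory divided by the worst-case size.
limit = max_concurrent_threads(1 * GIB, worst_case)

# Most of the uniformly sized allocation goes unused by the small threads.
fraction_used = utilization(used, worst_case)

print(limit)
print(fraction_used)
```

Under these assumed numbers, sizing every allocation to the 64 KiB worst case bounds the processor at 16,384 concurrent threads, whereas the same pool could hold far more threads if each were sized to its actual need, and only about 12 percent of the allocated memory is actually used.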
Accordingly, what is needed in the art is a system and method for auto-throttling encapsulated compute tasks for efficient memory allocation in parallel processors.