A graphics processing unit (GPU) generally supports numerous functions simultaneously. When a GPU contains multiple autonomous engines, those engines must all access system memory through a common memory interface. This requirement results from the GPU having a single memory interface that is shared among all of the GPU's needs.
The number of autonomous engines in a given GPU may differ depending on the GPU's overall needs. The memory interface architecture of the GPU therefore may need to be scalable, allowing support for engines to be added or removed, while still guaranteeing quality of service (QoS) for each engine and controlling hardware growth as new engines are added.
Conventional solutions, however, have addressed scaling of the memory interface either by maintaining the same resources at the cost of reduced QoS (for example, one engine's data traffic interfering with another engine's data traffic), or by replicating the structures in the memory interface for each supported engine, which maintains QoS at the cost of greatly increased hardware area.
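The tradeoff described above can be illustrated with a minimal simulation. The sketch below is not drawn from the source; the engine names ("blit", "video") and the arbitration policies are hypothetical, chosen only to show how a single shared queue lets one engine's burst delay another, while replicated per-engine queues drained round-robin preserve per-engine QoS at the cost of one queue per engine.

```python
from collections import deque

def shared_fifo_latency(bursts):
    """Single shared queue: requests are served strictly in arrival
    order, so a large burst from one engine delays every other engine.
    Returns the completion time of each engine's last request."""
    queue = deque()
    for engine, reqs in bursts:
        queue.extend([engine] * reqs)
    finish = {}
    for t, engine in enumerate(queue, start=1):  # one request per cycle
        finish[engine] = t
    return finish

def per_engine_rr_latency(bursts):
    """Replicated per-engine queues drained round-robin: each engine
    receives a guaranteed share of memory bandwidth, at the cost of
    duplicating the queue structure for every supported engine."""
    queues = {engine: deque(range(reqs)) for engine, reqs in bursts}
    finish, t = {}, 0
    while any(queues.values()):
        for engine, q in queues.items():
            if q:
                t += 1          # one request served this cycle
                q.popleft()
                finish[engine] = t
    return finish

# Hypothetical workload: a long blit burst arrives just before a
# short latency-sensitive video request stream.
bursts = [("blit", 8), ("video", 2)]
print(shared_fifo_latency(bursts))    # {'blit': 8, 'video': 10}
print(per_engine_rr_latency(bursts))  # {'blit': 10, 'video': 4}
```

With the shared queue, the video engine cannot finish until the entire blit burst drains; with per-engine queues, the video engine completes early, but the hardware must carry one queue per engine, which is the area growth the passage describes.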