In current graphics systems, the number and processing speed of memory clients have increased enough to make memory access latency a barrier to achieving high performance. FIG. 1 illustrates a Graphics Processing Unit (“GPU”) 100 of the prior art. The GPU 100 includes a graphics pipeline 102, which includes a set of memory clients 104, 106, 108, and 110. The graphics pipeline 102 is connected to a memory controller 112, which serves as an interface between the set of memory clients 104, 106, 108, and 110 and a memory (not illustrated). When a particular memory client makes a request for data, the memory controller 112 retrieves the data from the memory for that memory client. When multiple memory clients make requests for data substantially simultaneously, the memory controller 112 can determine which of the multiple memory clients is allowed access to the memory at a particular time. Typically, at least one of the set of memory clients 104, 106, 108, and 110 corresponds to an isochronous memory client, which is one that expects data to be delivered in a periodic manner or in accordance with a baseline rate. As can be appreciated, untimely delivery of data to such an isochronous memory client can lead to a stall and degradation of a visual output.
As illustrated in FIG. 1, the GPU 100 also includes an array of dedicated buffers 114, 116, 118, and 120, which are connected between the memory controller 112 and the graphics pipeline 102. The array of dedicated buffers 114, 116, 118, and 120 is configured to store data retrieved by the memory controller 112 and to deliver the data to the set of memory clients 104, 106, 108, and 110. The array of dedicated buffers 114, 116, 118, and 120 can serve to reduce memory access latency by storing an advance supply of data to be processed by the set of memory clients 104, 106, 108, and 110. Each dedicated buffer has a fixed buffering space that is dedicated to store data for a particular memory client. Thus, for example, the dedicated buffer 114 is dedicated to store data for the memory client 104, while the dedicated buffer 116 is dedicated to store data for the memory client 106. As illustrated in FIG. 1, the GPU 100 also includes an array of dedicated Read/Write controllers (“R/W controllers”) 122, 124, 126, and 128, which are connected to respective ones of the array of dedicated buffers 114, 116, 118, and 120.
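By way of illustration only, the dedicated buffering arrangement described above can be modeled in software as one fixed-capacity first-in, first-out queue per memory client. The following is a minimal sketch and not a description of any actual hardware. The class name DedicatedBuffer, the capacity of four entries, and the pairing of the buffers 118 and 120 with the memory clients 108 and 110 are assumptions introduced for this example, the last being inferred from the pattern stated for the buffers 114 and 116.

```python
from collections import deque


class DedicatedBuffer:
    """Fixed-capacity FIFO reserved for exactly one memory client (hypothetical model)."""

    def __init__(self, client_id, capacity):
        self.client_id = client_id
        self.capacity = capacity  # assumed unit: buffer entries, fixed at design time
        self._fifo = deque()

    def free_entries(self):
        """Number of entries currently unoccupied in this client's buffer."""
        return self.capacity - len(self._fifo)

    def write(self, data):
        """Store data fetched by the memory controller; return False if the buffer is full."""
        if len(self._fifo) >= self.capacity:
            return False
        self._fifo.append(data)
        return True

    def read(self):
        """Deliver the oldest entry to the client; return None if the buffer is empty."""
        return self._fifo.popleft() if self._fifo else None


# Assumed pairing of memory clients to dedicated buffers (the last two are inferred).
CLIENT_TO_BUFFER = {104: 114, 106: 116, 108: 118, 110: 120}

# One buffer per client; the capacity of 4 entries is an arbitrary illustrative value.
buffers = {client: DedicatedBuffer(client, capacity=4) for client in CLIENT_TO_BUFFER}

# Store an advance supply of data for the memory client 104 only.
for word in ("a", "b", "c"):
    buffers[104].write(word)
```

In this sketch, free space in the buffer of one memory client cannot be used on behalf of any other memory client, because each queue is reachable only through its own client identifier.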
A significant drawback of the buffering implementation illustrated in FIG. 1 is that use of the array of dedicated buffers 114, 116, 118, and 120 and the array of dedicated R/W controllers 122, 124, 126, and 128 can lead to inefficiencies when a subset of the memory clients 104, 106, 108, and 110 is inactive or does not require all of its dedicated buffering space. For example, whenever a particular memory client is inactive or is operating with reduced memory access requirements, its dedicated buffering space is unused or underused. Such unused or underused buffering space translates into wasted buffering space that otherwise could have been assigned to another memory client that is active or is operating with enhanced memory access requirements. Also, such unused or underused buffering space translates into inefficient use of valuable die area on a chip.
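The inefficiency described above can be made concrete with a short calculation. The following sketch is illustrative only: the function name wasted_entries, the capacity of four entries per buffer, and the demand figures are hypothetical values chosen for this example and are not taken from any measurement.

```python
def wasted_entries(capacity_per_client, demand_per_client):
    """Return (unused, unmet) entry counts under fixed per-client partitioning.

    unused: entries left idle in buffers whose client needs less than its share.
    unmet:  entries a client needs beyond its fixed share and cannot obtain.
    """
    unused = sum(max(capacity_per_client - d, 0) for d in demand_per_client.values())
    unmet = sum(max(d - capacity_per_client, 0) for d in demand_per_client.values())
    return unused, unmet


# Hypothetical demand, in buffer entries, keyed by memory client:
# client 108 is inactive, client 106 has reduced requirements,
# and client 110 has enhanced requirements.
demand = {104: 4, 106: 2, 108: 0, 110: 9}

unused, unmet = wasted_entries(4, demand)
print(unused, unmet)  # prints "6 5"
```

Under these assumed figures, the total demand of 15 entries is less than the total capacity of 16 entries, yet the memory client 110 is still short by 5 entries while 6 entries sit idle in the buffers of the memory clients 106 and 108.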
It is against this background that a need arose to develop the apparatus, system, and method described herein.