1. Field of the Invention
The invention relates generally to computer systems, and in particular, to computer systems that employ a method for minimizing central processing unit (CPU) memory latency while transferring streaming data.
2. Background Information
Computer systems that employ a CPU often utilize a memory controller and a graphics controller. The memory controller controls access by the CPU and other agents to a system memory. The graphics controller controls the display of data provided by the CPU onto a display screen, such as a cathode ray tube (CRT), using a frame buffer. Both the system memory and the frame buffer are typically implemented using arrays of Dynamic Random Access Memory (DRAM). In some computer systems, the frame buffer and the system memory are unified into a single shared memory, known as a Unified Memory Architecture (UMA).
Computer systems such as these have traditionally processed all requests for access to memory, including requests involving graphics data, as asynchronous requests. Asynchronous requests generally arrive at a non-deterministic (e.g., random) rate. An asynchronous request is generated, for example, when an action from an input/output (I/O) device, such as a mouse click or a keystroke, causes an interrupt. In response to the interrupt, the CPU makes one or more asynchronous requests to access memory in order to store the state of its current operation and to locate instructions associated with servicing the interrupt.
The time associated with accessing memory, retrieving the requested data from memory, and making the retrieved data available to a requesting agent is sometimes referred to as “latency.” Asynchronous requests are generally latency-sensitive. That is, the quality of service degrades as the length of time to access the memory and to process the request increases. For example, it is undesirable for computer users to wait an inordinate amount of time before their mouse click results in activity. Accordingly, conventional computer systems attempt to reduce latency as much as possible by granting asynchronous requests from the CPU priority over other memory requests.
Isochronous memory requests have become increasingly common in recent years. Examples of isochronous transactions include audio, video, or other real-time data transfers to or from I/O devices that use “streaming” technology such that the data is processed as a steady and continuous stream. Streaming technology is commonly used with the Internet, for example, where audio or video is played as the streaming data is downloaded, which is in contrast to some computer systems where an entire file has to be completely downloaded before being played.
Isochronous requests, in contrast to asynchronous requests, are deterministic. That is, the amount of information needed in a given period of time, or the rate at which information is transferred in a given period of time, is generally known. For instance, when writing a video image to a display screen from a frame buffer, it is known that the video frames are sent to the display screen at a rate of 30 frames per second, and so the number of lines per second, bits per line, bytes per pixel, etc. are known. Isochronous requests are generally more tolerant of a specific latency value but are very sensitive to extreme variations in latency, even if these extremes occur infrequently. Once an isochronous stream begins, continuous data transfer becomes important and must be maintained. Therefore, the measure of quality in isochronous data transfer is defined by the amount of data that can be lost without significantly affecting the audio or video quality. Lost data is directly related to extreme latency variations: if the data cannot be accessed in time, it is no longer useful.
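The determinism described above can be illustrated with a short calculation. The following is a minimal sketch, not part of the described system: the 30 frames-per-second rate comes from the example above, while the 640-pixel width, 480-line height, and 2 bytes per pixel are assumed values chosen only for illustration, and the function name is hypothetical.

```python
def isochronous_rate(width_px, height_lines, bytes_per_pixel, frames_per_second=30):
    """Derive the fixed transfer rates of a video stream from its format.

    All parameters other than the 30 frames-per-second default are
    illustrative assumptions, not values taken from the specification.
    """
    bytes_per_line = width_px * bytes_per_pixel
    lines_per_second = height_lines * frames_per_second
    bytes_per_second = bytes_per_line * lines_per_second
    return {
        "bytes_per_line": bytes_per_line,
        "lines_per_second": lines_per_second,
        "bytes_per_second": bytes_per_second,
    }


# Assumed display format: 640 x 480 pixels, 2 bytes per pixel.
rates = isochronous_rate(width_px=640, height_lines=480, bytes_per_pixel=2)
print(rates)
```

Because every input is fixed once the stream format is chosen, the required memory bandwidth is known in advance, which is what distinguishes this traffic from asynchronous requests.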
Traditional computer systems have relied on various forms of priority-based memory arbitration, including priority, round-robin sequencing, time slice limits, high watermarks, etc., to determine the order in which an agent requesting access to memory should be serviced. While these kinds of arbitration schemes do function to reduce CPU memory latency, audio, video, and other streaming I/O memory traffic is typically given lower priority, which can cause a streaming agent to be “starved out” or delayed long enough in accessing memory to result in lost data. Assigning higher priority to streaming I/O memory traffic improves latency for the streaming data, but does so at the expense of increased CPU memory latency. Accordingly, improvements are needed in the scheduling and processing of mixtures of asynchronous and isochronous memory requests.
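The starvation problem can be pictured with a toy model. The sketch below is an illustrative assumption rather than a description of any particular arbiter: it assumes a strict fixed-priority scheme in which a pending CPU request always wins, a streaming agent that always has a request pending, and one grant per cycle. The function name and the sample request pattern are hypothetical.

```python
def stream_wait_times(cpu_busy):
    """Return how many cycles the streaming agent waited before each grant.

    cpu_busy is a list of booleans, one per cycle, that is True when the CPU
    has a request pending. Assumption: the CPU always wins arbitration and
    the streaming agent always has a request pending.
    """
    waits = []
    waiting = 0
    for busy in cpu_busy:
        if busy:
            waiting += 1           # CPU is granted; the streaming agent keeps waiting
        else:
            waits.append(waiting)  # streaming agent is finally granted
            waiting = 0
    return waits


# Hypothetical pattern: a short CPU access followed by a four-cycle burst.
pattern = [False, True, False, True, True, True, True, False]
waits = stream_wait_times(pattern)
print(waits, "worst case:", max(waits))
```

In this model the average wait is small, but a single CPU burst produces a long worst-case wait, which is the kind of infrequent latency extreme that causes a stream to lose data.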
An embodiment of the present invention provides a memory arbiter having a first counter to decrement a service period associated with a memory request of a first type and a second counter to decrement a service period associated with a memory request of a second type. The memory arbiter also has a scheduler logic circuit coupled to outputs of the first and second counters. The outputs of the first and second counters are indicative of time remaining in corresponding service periods for the first and second types of memory requests. The scheduler logic circuit has inputs to receive memory requests of the first and second types, and generates a grant signal to service a received memory request of the second type if the output of the second counter indicates that time remains in the service period associated with the memory request of the second type or if there are no pending memory requests of the first type.
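The grant condition described above can be modeled in software as follows. This is a minimal behavioral sketch under stated assumptions, not the circuit itself: the class and function names are hypothetical, the counter is assumed to decrement once per tick and to stop at zero, and how a service period is reloaded is not modeled because the passage above does not specify it.

```python
class ServiceCounter:
    """Counts down the time remaining in a service period (assumed behavior)."""

    def __init__(self, period):
        self.remaining = period

    def tick(self):
        # Assumption: one decrement per tick, stopping at zero.
        if self.remaining > 0:
            self.remaining -= 1

    def time_remains(self):
        return self.remaining > 0


def grant_second_type(second_counter, pending_first, pending_second):
    """Decide whether a received second-type request is granted.

    Follows the stated condition: grant if time remains in the second type's
    service period, or if no first-type requests are pending.
    """
    if not pending_second:
        return False  # nothing of the second type to service
    return second_counter.time_remains() or not pending_first


counter = ServiceCounter(period=2)
print(grant_second_type(counter, pending_first=True, pending_second=True))
counter.tick()
counter.tick()  # the second type's service period has now elapsed
print(grant_second_type(counter, pending_first=True, pending_second=True))
print(grant_second_type(counter, pending_first=False, pending_second=True))
```

A counter for the first request type would be modeled the same way; only the second-type grant condition is sketched here because it is the one stated explicitly.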