3D graphics rendering systems, such as gaming PCs and gaming devices, follow a standard architecture that typically includes:
1. A CPU subsystem, which includes the main processor, memory and storage.
2. A graphics subsystem, which includes the graphics processor (GPU) and associated memory.
3. A Display subsystem that is connected to the GPU.
The CPU subsystem and the GPU subsystem are typically connected through a high-speed bus, such as PCI, AGP or PCI-Express. The GPU subsystem is typically connected to the Display through another high-speed interface such as HDMI, DVI, or DisplayPort. The roles of these components can be summarized as follows: the CPU is responsible for describing the content at an abstract level, the GPU is responsible for rendering the content in pixel form, and the Display is responsible for visually displaying the pixels to the user.
Typically, the main program generating the graphics, such as a game program, runs on the CPU, where it listens for user input from a keyboard or game pad. The game program executes the game logic and then sends commands to the GPU telling the GPU how to create a picture (also called a frame or image) that will be shown on the Display. This process is repeated several times every second to create an appearance of smooth motion on the Display. Typically it is repeated 30 times a second. This figure is also known as the frame rate.
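The per-frame cycle described above (read input, run game logic, send commands to the GPU, repeat about 30 times a second) can be sketched as a simple loop. This is a minimal illustration only: `poll_input`, `update_game_state`, `build_gpu_commands` and the `submit` callback are hypothetical stand-ins, not part of any real graphics API.

```python
import time

TARGET_FPS = 30                      # repetition rate quoted in the text
FRAME_BUDGET = 1.0 / TARGET_FPS      # seconds available per frame


def poll_input(frame_index):
    """Hypothetical stand-in for reading the keyboard or game pad."""
    return {"move_right": frame_index % 2 == 0}


def update_game_state(state, user_input):
    """Hypothetical game logic: advance the player when the key is down."""
    if user_input["move_right"]:
        state["player_x"] += 1
    return state


def build_gpu_commands(state):
    """Describe the frame abstractly; a real program would call a graphics API."""
    return [("clear", (0, 0, 0)), ("draw_sprite", "player", state["player_x"], 0)]


def run_frames(num_frames, submit, sleep=time.sleep, clock=time.monotonic):
    """Run the CPU-side loop, handing each frame description to `submit`."""
    state = {"player_x": 0}
    for frame_index in range(num_frames):
        start = clock()
        user_input = poll_input(frame_index)
        state = update_game_state(state, user_input)
        submit(build_gpu_commands(state))      # hand the frame to the "GPU"
        remaining = FRAME_BUDGET - (clock() - start)
        if remaining > 0:
            sleep(remaining)                   # pace the loop to the target rate
    return state
```

Passing `submit=print` shows the command list produced for each frame. In a real system the callback would be the graphics driver's submission path.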
It is the GPU's job to execute the commands sent by the CPU. Commands can be roughly categorized as “simple commands” that the GPU can execute by itself, “indirect commands” that refer to data residing in the CPU's memory (known as System Memory), or commands that read data generated by the GPU.
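The three command categories can be modeled with a small data structure. The sketch below is an assumption-laden illustration: the names `CommandKind`, `Command`, `payload_bytes`, `bytes_to_gpu` and `bytes_to_cpu` are invented here, and real GPU command formats are considerably more involved.

```python
from dataclasses import dataclass
from enum import Enum, auto


class CommandKind(Enum):
    SIMPLE = auto()      # the GPU can execute it by itself
    INDIRECT = auto()    # refers to data residing in System Memory
    READBACK = auto()    # reads data generated by the GPU


@dataclass(frozen=True)
class Command:
    name: str
    kind: CommandKind
    payload_bytes: int = 0   # bytes moved across the bus for this command


def bytes_to_gpu(commands):
    """Bytes fetched from System Memory on behalf of INDIRECT commands."""
    return sum(c.payload_bytes for c in commands if c.kind is CommandKind.INDIRECT)


def bytes_to_cpu(commands):
    """Bytes returned to the CPU by READBACK commands."""
    return sum(c.payload_bytes for c in commands if c.kind is CommandKind.READBACK)
```

Tagging each command this way makes it straightforward to measure how much traffic flows in each direction for a given frame.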
Typically the volume of data going from the CPU to the GPU, and from the system memory to the GPU, far outweighs the data going from the GPU to the CPU. The performance of the GPU, and therefore the quality of the gaming experience, is largely determined by the number of frames the GPU can process per second. Thus, the data transfer bandwidth between the CPU/System Memory and the GPU plays a crucial role in this performance. If the interface between the CPU and GPU is constrained, this data transfer can become a bottleneck that hurts performance. The pace of innovation in this interface (ISA, PCI, AGP, PCIE 1.0, PCIE 2.0, PCIE 3.0) has been brisk. A typical gaming system today has bandwidth of up to 4 Gbytes/second.
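The bottleneck argument can be made concrete with a back-of-the-envelope bound: if every frame requires a certain number of bytes to cross the CPU-GPU interface, the interface bandwidth caps the achievable frame rate. The function name and the 50 MB-per-frame figure below are illustrative assumptions, not measurements.

```python
def max_frames_per_second(bus_bytes_per_second, bytes_per_frame):
    """Upper bound on frame rate imposed by the CPU-GPU interface alone."""
    if bytes_per_frame <= 0:
        raise ValueError("bytes_per_frame must be positive")
    return bus_bytes_per_second / bytes_per_frame


# 4 Gbytes/second bus (figure from the text), assumed 50 MB of data per frame.
print(max_frames_per_second(4_000_000_000, 50_000_000))  # 80.0
```

Halving the available bandwidth halves this ceiling, which is why a constrained interface shows up directly as a lower frame rate.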
The nature of the CPU-GPU and the GPU-Display interfaces has required that the CPU, GPU and Display be part of the same system to guarantee the best performance. This limitation has implications for system design, such as power consumption, size, portability, cooling requirements and noise. For these and other reasons, there is interest in the graphics community in finding ways to physically separate the CPU, GPU and Display in a way that does not require rewriting applications. Possible solutions range from physical separation at the electrical level to software solutions that operate at higher levels.
An example solution involves housing the GPU in a separate chassis from the CPU, while continuing to use the PCIE interface to form a connection between the CPU and GPU. This allows the GPU to be scaled independently of the CPU. The drawback, however, is that the electrical requirements of this interface are such that the cable that connects the CPU to GPU cannot be longer than a few feet.
Another possible solution works as follows. In this approach, a portion of the graphics processing takes place at a server system and the remainder of the processing takes place at a client system. A thin software layer on the server accesses the server GPU's Frame Buffer (the memory where the pixels reside). The pixels in the server GPU's Frame Buffer are then compressed and sent to the client system. The compression is typically lossy because the bandwidth requirement for lossless compression is too high. At the client system, a piece of client software puts the pixels in the client GPU's Frame Buffer. The client Display then displays these pixels. This approach is known as Frame Buffer Remoting. The obvious disadvantage of this approach is the loss of visual quality that results from lossy compression.
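To see why Frame Buffer Remoting leans on lossy compression, it helps to compute the raw pixel rate and to look at a toy lossy pipeline. The sketch below is an assumption, not the method of any particular product: it drops the low-order bits of each byte and then applies deflate, standing in for a real video codec. The function names and the 1920x1080, 4-bytes-per-pixel, 30-frames-per-second figures are illustrative.

```python
import zlib


def raw_stream_bits_per_second(width, height, bytes_per_pixel, fps):
    """Bandwidth needed to ship every Frame Buffer pixel uncompressed."""
    return width * height * bytes_per_pixel * fps * 8


def server_send(pixels: bytes, keep_bits: int = 4) -> bytes:
    """Toy lossy compression: discard low-order bits, then deflate."""
    mask = (0xFF << (8 - keep_bits)) & 0xFF
    quantized = bytes(b & mask for b in pixels)
    return zlib.compress(quantized)


def client_receive(blob: bytes) -> bytes:
    """Recover the (degraded) pixels for the client Frame Buffer."""
    return zlib.decompress(blob)


print(raw_stream_bits_per_second(1920, 1080, 4, 30))  # 1990656000
```

Under these assumptions the uncompressed stream is about 1.99 Gbit/s. The pixels returned by `client_receive` have the same length as the originals but differ in value, which is the visual-quality loss the text describes.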
Yet another solution delivers better visual fidelity than Frame Buffer Remoting. In this approach, a portion of the graphics processing again takes place at a server system and the remainder of the processing again takes place at a client system. A thin software layer on the server intercepts graphics commands that go from the server CPU to the server GPU. The graphics command stream is optionally compressed. The graphics commands are then sent over an IP network to the client system. A client software layer retrieves the graphics commands, decompressing them if necessary, and sends them to a client GPU. The client GPU then executes these commands and displays the picture on a client display. One disadvantage of this approach is that the bandwidth requirement can become very high, because the entire command stream, including the data the commands refer to, must be transferred from the server to the client. Another drawback of this approach is that the client CPU is mostly wasted, because the server CPU is doing most of the work.
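The command-forwarding approach can be sketched as an encode/decode pair around the network hop. This is a minimal illustration under stated assumptions: commands are represented as JSON-serializable lists, a one-byte flag marks whether the optional compression was applied, and `encode_command_stream`, `decode_command_stream` and `replay` are hypothetical names. The actual network transport is omitted.

```python
import json
import zlib

COMPRESSED = b"\x01"
UNCOMPRESSED = b"\x00"


def encode_command_stream(commands, compress=True):
    """Server side: serialize intercepted commands, optionally compressing."""
    raw = json.dumps(commands).encode("utf-8")
    if compress:
        return COMPRESSED + zlib.compress(raw)
    return UNCOMPRESSED + raw


def decode_command_stream(packet):
    """Client side: undo the optional compression and parse the commands."""
    flag, body = packet[:1], packet[1:]
    raw = zlib.decompress(body) if flag == COMPRESSED else body
    return json.loads(raw.decode("utf-8"))


def replay(commands, gpu_execute):
    """Feed each decoded command to the client GPU (here, any callable)."""
    for command in commands:
        gpu_execute(command)
```

Because the compression here is lossless, the client GPU executes exactly the commands the server issued, which is why this approach preserves visual fidelity. The cost is that every command and its referenced data must cross the network.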
Taken together, these approaches cannot satisfy both the visual quality and the low bandwidth constraints simultaneously. That makes it infeasible to deploy them for demanding applications such as games, especially on today's broadband networks where the bandwidth is limited. Even where it is possible to deploy such approaches without bandwidth constraints, they waste the CPU capacity of the client system.