Historic advances in computer technology have made it economical for individual users to have their own computing system, which caused the proliferation of the Personal Computer (PC). Continued advances of this computer technology have made these personal computers very powerful but also complex and difficult to manage. For this and other reasons there is a desire in many workplace environments to separate the user interface devices, including the display and keyboard, from the application processing parts of the computing system. In this preferred configuration, the user interface devices are physically located at the desktop, while the processing and storage components of the computer are placed in a central location. The user interface devices are then connected to the processor and storage components with some method of communication.
There are various methods for communicating the display image from a data processor across a standard network to a remote display. These methods, described below, suffer significant shortcomings.
Drawing Command Transfer Method
FIG. 1 shows the architecture for a data processing system that supports a remote display by transferring drawing commands across a network. As illustrated, central processing unit (CPU) 100 of the data processor is connected to various devices such as system memory 102 and a network interface 104 by a chipset 106.
CPU 100 uses a graphics application interface (G-API) such as OpenGL, GDI or others to draw a display image in the normal way, but rather than being issued to a local graphics processing unit (GPU), drawing processor or function, the drawing commands are captured by software on the CPU and transmitted across network 108 to remote drawing processor 110. Remote drawing processor 110 renders the display image in remote framebuffer 112. Remote display controller 114 then accesses the image in the framebuffer and provides a rasterized video signal for remote display 116. In a typical implementation, remote drawing processor 110 is supported by a remote CPU, operating system and graphics drivers. In this case, the drawing commands are issued to the remote CPU, which then draws the image using its local drawing capabilities and remote framebuffer 112.
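The command flow described above can be illustrated with a minimal sketch. The command names (`fill_rect`, `set_pixel`), the length-prefixed JSON wire format and the `RemoteDrawingProcessor` class are hypothetical and chosen only for illustration; they are not taken from OpenGL, GDI or any product discussed here, and the network is replaced by an in-memory byte string.

```python
import json
import struct

# Hypothetical command names and wire format; not taken from any real G-API.
def encode_command(name, *args):
    """Serialize one captured drawing command as a length-prefixed JSON record."""
    payload = json.dumps({"op": name, "args": args}).encode("utf-8")
    return struct.pack("!I", len(payload)) + payload

def decode_commands(stream):
    """Split a received byte stream back into (op, args) pairs."""
    offset = 0
    while offset < len(stream):
        (length,) = struct.unpack_from("!I", stream, offset)
        offset += 4
        record = json.loads(stream[offset:offset + length].decode("utf-8"))
        offset += length
        yield record["op"], record["args"]

class RemoteDrawingProcessor:
    """Stands in for remote drawing processor 110 and remote framebuffer 112."""
    def __init__(self, width, height):
        self.width, self.height = width, height
        self.framebuffer = [[0] * width for _ in range(height)]

    def execute(self, op, args):
        if op == "fill_rect":
            x, y, w, h, color = args
            # Clip the rectangle to the framebuffer bounds while filling.
            for row in range(max(y, 0), min(y + h, self.height)):
                for col in range(max(x, 0), min(x + w, self.width)):
                    self.framebuffer[row][col] = color
        elif op == "set_pixel":
            x, y, color = args
            if 0 <= x < self.width and 0 <= y < self.height:
                self.framebuffer[y][x] = color
        else:
            raise ValueError(f"unknown drawing command: {op}")

# Data processor side: commands are captured and serialized instead of
# being issued to a local GPU.
wire = encode_command("fill_rect", 1, 1, 2, 2, 7) + encode_command("set_pixel", 0, 0, 9)

# Remote side: commands are decoded and rendered into the remote framebuffer.
remote = RemoteDrawingProcessor(4, 4)
for op, args in decode_commands(wire):
    remote.execute(op, args)
```

Even in this reduced form, the remote side needs an interpreter for the command stream, which corresponds to the remote CPU and software that the method typically requires.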
Variations on the drawing command transfer method include the transmission of different abstractions of drawing commands. X Windows is one example that captures and transfers high level drawing commands while RDP is another example that converts most of the drawing commands to simple low-level primitives before transferring them to the remote system. Regardless of the level of abstraction, a CPU sub-system is usually required at the remote system as an interface between the drawing commands and the remote drawing function.
One problem with the use of low-level commands with simple remote hardware is that the system graphics capabilities are constrained by the low-complexity graphics capabilities of the remote system. This is because the high-level drawing commands that leverage graphics hardware acceleration functions in a typical computing platform are no longer available in the simplified command set. In order to draw complex images using simple commands, the number of commands increases significantly, which increases the network traffic and system latency.
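The growth in command count can be made concrete with a small sketch. It assumes a hypothetical high-level `fill_circle` command and a hypothetical low-level `hline` (horizontal span) primitive; neither name comes from X Windows, RDP or any real protocol.

```python
def filled_circle_as_spans(cx, cy, radius):
    """Decompose one filled-circle command into horizontal span primitives."""
    spans = []
    for dy in range(-radius, radius + 1):
        # Half-width of the circle at this scanline, truncated to whole pixels.
        half_width = int((radius * radius - dy * dy) ** 0.5)
        spans.append(("hline", cx - half_width, cx + half_width, cy + dy))
    return spans

# One high-level command on the data processor...
high_level = [("fill_circle", 100, 100, 50)]

# ...becomes one low-level command per scanline covered by the circle.
low_level = filled_circle_as_spans(100, 100, 50)
```

Under these assumptions, a single command for a circle of radius 50 becomes 101 span commands, each of which must be generated by the data processor and carried across the network.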
Another problem with drawing command transfer methods is that drawing commands may relate to the rendering of structures outside of the viewable area of the display. In these cases where drawing commands don't immediately change the displayed region of an image, unnecessary network traffic is generated to accomplish the remote rendering.
A third problem is that converting commands to simple commands is performed by the data processor and is a processing intensive function. The result is that the conversion process slows down the data processor and reduces the performance of applications running on the data processor.
The problem with systems that support complex drawing commands is that these systems require increased complexity of the remote computing system (i.e., O/S, graphics driver and hardware). The result is an increase in cost, maintenance and support requirements for the remote user equipment, which is in direct conflict with the original motivation for centralization, i.e., reduced support of the remote display system.
Framebuffer Copy Method
Another method for separating the user interface from the data processor is the framebuffer copy method. This method solves the drawing performance problem described above by using the operating system, graphics driver and optional graphics drawing hardware features of the data processing system to first draw the image in a framebuffer on the data processor side of the network before transferring it.
FIG. 2 shows the architecture for a data processing system that supports a remote display by copying either compressed or uncompressed bitmaps from a framebuffer across a network. In the diagram, the CPU of data processor 200 is connected to various peripheral devices including system memory 202, network interface 204 and optional dedicated GPU or drawing processor 206 by chipset 208. As above, the CPU uses a G-API to draw an image. Drawing commands are issued to drawing processor 206 which renders the image in framebuffer 210. Alternatively, the drawing processor might not be a dedicated device but rather a function of the CPU or chipset and the image may be drawn in an area of system memory 202.
Once an image has been rendered in the framebuffer, a software application on the CPU or a peripheral hardware component accesses the framebuffer and copies partial or complete frames across network 211 to remote framebuffer 213. In cases where the framebuffer data is compressed prior to transmission, it is decompressed by software- or hardware-based remote decoder 212 before being stored in remote framebuffer 213. Remote display controller 214 accesses the image, generates a raster signal and displays the image on remote display 216.
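A minimal sketch of the partial-frame copy step follows. It assumes a one-byte-per-pixel framebuffer, a hypothetical division of the frame into bands of rows, and zlib as a stand-in for whatever compression a real implementation uses; the network is replaced by a list passed between two functions.

```python
import zlib

WIDTH, HEIGHT, BAND_ROWS = 16, 8, 2  # hypothetical sizes, one byte per pixel

def encode_update(previous, current):
    """Return (band_index, compressed_bytes) for each band of rows that changed."""
    band_bytes = WIDTH * BAND_ROWS
    update = []
    for band in range(HEIGHT // BAND_ROWS):
        start = band * band_bytes
        chunk = bytes(current[start:start + band_bytes])
        if chunk != bytes(previous[start:start + band_bytes]):
            update.append((band, zlib.compress(chunk)))
    return update

def apply_update(remote_framebuffer, update):
    """Remote decoder: decompress each band and store it in the remote framebuffer."""
    band_bytes = WIDTH * BAND_ROWS
    for band, compressed in update:
        start = band * band_bytes
        remote_framebuffer[start:start + band_bytes] = zlib.decompress(compressed)

source = bytearray(WIDTH * HEIGHT)   # framebuffer on the data processor side
remote = bytearray(WIDTH * HEIGHT)   # remote framebuffer
last_sent = bytes(source)            # snapshot of what the remote side holds

# The drawing processor changes two pixels, in rows 2 and 3 (band 1).
source[2 * WIDTH + 5] = 200
source[3 * WIDTH + 6] = 201

update = encode_update(last_sent, source)
apply_update(remote, update)
last_sent = bytes(source)
```

Only the changed band crosses the network here, but the comparison and compression both run on the data processor.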
Neither of the methods discussed above supports a direct network connection between the framebuffer and the network interface. Consequently, various methods exist to overcome the problem of transferring the image from the framebuffer of the data processor to the remote framebuffer. For example, VNC is a software product that uses a software application at each end of a network. An encoding application on the data processor reads the framebuffer, encodes the image and then sends it to the decoder application at the remote user interface, where it is decoded and written into the remote framebuffer.
The most serious shortcoming of this technique arises during times of complex image generation. Given that encoder software runs on the same processor as the drawing application, the processor becomes overloaded with both encoding and drawing operations, which slows down the drawing speed and degrades the user experience.
A second problem arises because the host and remote framebuffers are asynchronous and the application does not precisely track all screen changes or catch all events on the data processor, as it might if every refresh of the framebuffer were captured. As a result, the image viewed at the remote display differs from the intended image whenever areas of the remote framebuffer are updated out of synchronization with the source framebuffer at the data processor.
OpenGL VizServer from Silicon Graphics is another product that uses software applications at each end of the network. Unlike VNC, VizServer is capable of capturing every updated framebuffer by reading the viewable region of every frame into the system memory of the CPU once it has been rendered in the framebuffer. This is achieved by monitoring the G-API for framebuffer refresh commands such as glFlush( ). Once in system memory, the frames are encoded and transmitted across the network to a remote system that requires a minimum of a thin client decoder with drawing capabilities. One problem with this method is that it is CPU intensive. For example, VizServer optimally requires one dedicated CPU for reading the framebuffer, one for managing the network interface and two more dedicated processors to support the compression of the image in system memory. A second problem is that this method uses a software approach to image compression. General purpose CPUs are not optimized around pixel-level image decomposition or compression but are limited to generic, block-based color reduction or difference calculation techniques that result in both lower compression ratios and poorer image quality at the remote display. A third problem with CPU-based encoding systems is that they use the network interface of the data processing system for the transmission of display image data. In cases where the same network interface is also used for connectivity of other real-time traffic streams with the remote system (e.g. audio and USB traffic) and other CPU-bound traffic, the network interface becomes a system bottleneck, packets are either delayed or dropped and the user experience at the remote system is significantly degraded.
A variation on the software-based framebuffer copy approaches such as VNC and OpenGL VizServer is a screen scraper hardware solution disclosed under U.S. Pat. No. 6,664,969 to Emerson, et al. entitled “Operating system independent method and apparatus for graphical remote access.” This method uses a separate hardware module to read the framebuffer, compress the image and send it to an application at the remote user interface. This approach removes the encoding software load, but also consumes the system bus of the data processing sub-system each time the framebuffer is read. In cases where real-time frame updates are required, the load on the system bus directly compromises the performance of the data processor and slows down the application. As with the VNC software method, this method has display continuity problems associated with synchronizing multiple framebuffers or pointers.
Hybrid Variations
There are also variations on the above methods that provide a combination of drawing commands and bitmap transfer functions to enable the remote display of computer display images. One such variation is disclosed by Duursma et al. in U.S. Pat. Application 20030177172 entitled “Method and system for generating a graphical display for a remote terminal session.” In this approach, an application on the data processor is capable of recognizing screen image components as being either drawing commands or bitmaps. Drawing commands are handled similarly to the drawing command transfer method described above. However, when a bitmap is identified, a compressed data format of the bitmap is retrieved and transmitted to the remote terminal session in place of the original bitmap. While this feature adds bitmap capabilities to the command transfer method, the command processing overheads persist, so little overall improvement to the drawing command processing is realized.
None of the remote display methods described above evaluates the encoding of the image in the context of network availability or of other data streams that share the network. For example, if the display image incorporates a video frame in one region only, there is no attempt by the framebuffer encoder or the drawing command parser to optimize encoding for that region based either on other traffic priorities or external network conditions.
GPUs as Encoding Processors
It has been suggested that the programmable section of a GPU be used to perform limited image encoding methods such as color cell compression or fractal compression described below. In one example, it was proposed that the GPU perform color cell compression encoding as a method for supporting remote display capabilities. One problem with this method is that color cell compression provides a limited compression ratio when compared with other compression methods available for computer display compression. As described above, the GPU's floating point vector processing engines are unsuitable for these pixel-oriented image processing methods.
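For reference, the core of color cell compression as it is commonly described can be sketched as follows: each 4x4 block of pixels is reduced to a 16-bit mask and two representative colors. The luma weights and helper names below are assumptions made for this sketch, and the further step of quantizing the representative colors to a shared palette is omitted. It is a CPU-side illustration of the algorithm only, not of any GPU implementation.

```python
def luminance(pixel):
    """Approximate brightness of an (r, g, b) pixel; the weights are an assumption."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b

def encode_block(block):
    """Encode 16 RGB pixels (one 4x4 block) as (mask, low_color, high_color)."""
    lums = [luminance(p) for p in block]
    mean = sum(lums) / len(lums)
    mask = 0
    groups = {0: [], 1: []}
    for i, (pixel, lum) in enumerate(zip(block, lums)):
        bit = 1 if lum > mean else 0      # 1 marks a pixel brighter than the block mean
        mask |= bit << i
        groups[bit].append(pixel)

    def average(pixels, fallback):
        # Per-channel integer mean; fall back when a group is empty.
        if not pixels:
            return fallback
        n = len(pixels)
        return tuple(sum(p[k] for p in pixels) // n for k in range(3))

    low = average(groups[0], block[0])
    high = average(groups[1], low)
    return mask, low, high

def decode_block(mask, low, high):
    """Rebuild 16 pixels, each taking one of the two representative colors."""
    return [high if (mask >> i) & 1 else low for i in range(16)]

black, white = (0, 0, 0), (255, 255, 255)
block = [black] * 8 + [white] * 8
mask, low, high = encode_block(block)
restored = decode_block(mask, low, high)
```

In this form, a block of sixteen 24-bit pixels (384 bits) is stored as a 16-bit mask plus two 24-bit colors (64 bits). Every pixel in a block is forced to one of two colors, so fine detail with more than two colors per block cannot be reproduced exactly.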
A second problem with this approach lies in the dataflow through the graphic pipeline. To prevent data loop back, the back end of the GPU pipeline must be modified by replacing the standard video interface with an interface such as a network or system bus interface suitable for the compressed data stream. While the image encoder also requires a similar network connection, the data structures that interface with the network interface logic are optimized for compressed image data.
In another example, it was proposed that the GPU perform fractal compression, a lossy compression technique that exploits self-similarity in images. This approach shows that the GPU offers performance advantages over a general purpose CPU for some components of the fractal algorithm. While suitable for video or still image compression, fractal compression does not meet the high-quality compression requirements of high-detail computer image information such as text and icons.
In summary, existing methods incur significant software and hardware processing overheads, are unable to ensure synchronization between the data processor and remote systems, and require a CPU and software at the remote user interface. A better method of accessing the framebuffer that does not impact the system drawing architecture is required.