Historic advances in computer technology have made it economical for individual users to have their own computing system, which caused the proliferation of the Personal Computer (PC). Continued advances of this computer technology have made these personal computers very powerful but also complex and difficult to manage. For this and other reasons there is a desire in many workplace environments to separate the user interface devices, including the display and keyboard, from the application processing parts of the computing system. In this preferred configuration, the user interface devices are physically located at the desktop, while the processing and storage components of the computer are placed in a central location. The user interface devices are then connected to the processor and storage components with some method of communication.
There are various methods for communicating the display image from a data processor across a standard network to a remote display. These methods are described below, together with the problems of each method that are solved by this invention.
FIG. 1 shows the architecture for a data processing system that supports a remote display by transferring graphic commands across a network. In the diagram, central processing unit (CPU) 100 of the data processor is connected to various devices such as system memory 102 and network interface 104 by chipset 106.
CPU 100 uses a graphics application interface (G-API) such as OpenGL, GDI or others to draw a display image in the normal way but rather than being issued to a local Graphics Processing Unit (GPU), drawing processor or function, the graphics commands are trapped by software on the CPU and transmitted across network 108 to remote drawing processor 110.
Remote drawing processor 110 renders the display image in remote frame buffer 112. Remote display controller 114 then accesses the image in the frame buffer and provides a rasterized video signal for remote display 116. In a typical implementation, remote drawing processor 110 may be supported by a remote CPU, operating system and graphics drivers. In this case, the graphics commands are issued to the remote CPU, which then draws the image using its local drawing capabilities and remote frame buffer 112 as described. Variations on the graphic command transfer method include the transmission of different abstractions of graphic commands. X Windows is one example that captures and transfers high-level graphics commands, while RDP is another example that converts most of the graphics commands to simple low-level primitives before transferring them to the remote system. Regardless of the level of abstraction, a CPU sub-system is usually required at the remote system as an interface between the commands and the remote drawing function.
One problem with the use of low-level commands with simple remote hardware is that the system graphics capabilities are constrained by the low-complexity graphics capabilities of the remote system. This is because the high-level graphic commands that leverage graphics hardware acceleration functions in a typical computing platform are no longer available in the simplified command set. In order to draw complex images using simple commands, the number of commands increases significantly, which increases the network traffic and system latency.
Another problem with graphic command transfer methods is that the graphic commands may relate to the rendering of structures outside of the viewable area of the display. In these cases where graphic commands don't immediately change the displayed region of an image, unnecessary network traffic is generated to accomplish the remote rendering. A third problem is that converting commands to simple commands is performed by the data processor and is a processing intensive function. The result is that the conversion process slows down the data processor and reduces the performance of applications running on the data processor.
To avoid out-of-order problems associated with sequentially dependent drawing commands, graphic commands need to be transmitted using a reliable network protocol (e.g. TCP/IP). A shortcoming of systems that support complex graphics commands is that these systems require increased complexity of the remote computing system (i.e. O/S, graphics driver and hardware). The result is an increase in cost, maintenance and support requirements for the remote user equipment, which is in direct conflict with the original motivation for centralization, i.e. reduced burden in supporting the remote display system.
The second method for separating the user interface from the data processor is the frame buffer copy method. This method solves the drawing performance problem described above by using the operating system, graphics driver and optional graphics drawing hardware features of the data processing system to first draw the image in a frame buffer on the data processor side of the network before transferring it. Frame buffer copy methods also have the advantage of being able to use faster best-effort transfer methods (e.g. UDP/IP) rather than slower, more reliable methods such as TCP/IP. This is because the data does not have the sequential dependence of graphic commands and it is easy to recover from occasional errors caused by lost data packets.
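The difference in loss tolerance can be illustrated with a small sketch. Both models below are assumptions made for illustration: the command stream is modeled as relative pen movements, so each result depends on every earlier command, and the frame buffer is modeled as a list of tiles addressed by absolute index.

```python
def replay_relative(commands, start=0):
    """Each command moves a pen by a relative step and marks the new position."""
    position, marked = start, []
    for step in commands:
        position += step
        marked.append(position)
    return marked

def apply_tiles(previous, updates):
    """Each update carries an absolute tile index, so updates are independent."""
    frame = list(previous)
    for index, value in updates:
        frame[index] = value
    return frame

# Command stream: losing the second command shifts every later position.
intact_marks = replay_relative([1, 2, 3, 4])
lossy_marks = replay_relative([1, 3, 4])

# Tile updates: losing the second update leaves only that one tile stale.
previous = ["old"] * 4
updates = [(i, "new") for i in range(4)]
lossy_frame = apply_tiles(previous, [u for u in updates if u[0] != 1])
```

In the lossy command replay every position after the missing command is wrong, whereas in the lossy tile replay only the tile whose update was dropped keeps its old content, and a later update to that tile repairs it.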
FIG. 2 shows the architecture for a data processing system that supports a remote display by copying either compressed or uncompressed bitmaps from a frame buffer across a network. In the diagram, the CPU of the data processor 200 is connected to various peripheral devices including system memory 202, network interface 204 and optional dedicated GPU or drawing processor 206 by chipset 208. As above, the CPU uses a G-API to draw an image. Graphic commands are issued to drawing processor 206 that renders the image in frame buffer 210. Alternatively, the drawing processor might not be a dedicated device but rather a function of the CPU or chipset and the image may be drawn in an area of system memory 202.
Once an image has been rendered in the frame buffer, a software application on the CPU or a peripheral hardware component accesses the frame buffer and copies partial or complete frames across network 211 to remote frame buffer 213. In cases where the frame buffer data is compressed prior to transmission, it is decompressed by software or hardware-based remote decoder 212 before being stored in remote frame buffer 213.
The remote display controller 214 accesses the image, generates a raster signal and displays the image on remote display 216.
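The copy path of FIG. 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the behavior of any particular product: the frame buffer is modeled as a flat byte string, the tile size `TILE` is hypothetical, and `zlib` stands in for whatever compression an actual encoder would use.

```python
import zlib

TILE = 4  # hypothetical tile length in bytes for a one-dimensional frame buffer

def encode_changes(previous, current):
    """Host side: return (offset, compressed bytes) for each tile that changed."""
    updates = []
    for offset in range(0, len(current), TILE):
        tile = current[offset:offset + TILE]
        if tile != previous[offset:offset + TILE]:
            updates.append((offset, zlib.compress(tile)))
    return updates

def decode_changes(remote, updates):
    """Remote side: decompress each tile and write it into the remote frame buffer."""
    for offset, payload in updates:
        tile = zlib.decompress(payload)
        remote[offset:offset + len(tile)] = tile

host_previous = bytes(16)
host_current = bytes(8) + b"\xff\xff\xff\xff" + bytes(4)
remote_frame_buffer = bytearray(host_previous)

updates = encode_changes(host_previous, host_current)
decode_changes(remote_frame_buffer, updates)
```

Only the one changed tile crosses the network here, which corresponds to copying a partial frame; sending every tile unconditionally would correspond to copying a complete frame.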
In the graphic command transfer method of FIG. 1 and the frame buffer copy method of FIG. 2, remote display controllers 114 and 214 which generate the raster signal that drives the display are controlled by a remote entity. Specifically, remote drawing processor (110 in FIG. 1) or remote decoder (212 in FIG. 2) provides display setup and configuration commands, usually with the support of a remote operating system and graphics driver.
Neither of the methods discussed above supports a direct network connection between the frame buffer and the network interface. Consequently, various methods exist to overcome the problem of transferring the image from the frame buffer of the data processor to the remote frame buffer.
For example, VNC is a software product that uses a software application at each end of the network. An encoding application on the data processor reads the frame buffer, encodes the image and then sends it to the decoder application at the remote user interface where it is decoded by the VNC application and written into the remote frame buffer.
A major shortcoming of this technique arises during times of complex image generation. Given that the encoder software runs on the same processor as the drawing application, the processor becomes loaded with both encoding and drawing operations, which slows down the drawing speed and degrades the user experience. A second shortcoming of this method arises as a result of asynchronous host and remote frame buffers and the fact that the application does not precisely track all screen changes and catch all events on the data processor, as might be the case if every refresh of the frame buffer were captured. As a result, the image viewed at the remote display becomes different from the intended image whenever areas of the remote frame buffer are updated out of synchronization with the source frame buffer at the data processor.
OpenGL VizServer™ from Silicon Graphics is another product that uses software applications at each end of the network. Unlike VNC, VizServer is capable of capturing every updated frame buffer by reading the viewable region of every frame into the system memory of the CPU once it has been rendered in the frame buffer. This is achieved by monitoring the G-API for frame buffer refresh commands such as glFlush( ). Once in system memory, the frames are encoded and transmitted across the network to a remote system that requires a minimum of a thin client decoder with drawing capabilities. One problem with this method is that it is CPU intensive. For example, VizServer optimally requires one dedicated CPU for reading the frame buffer, one for managing the network interface and two more dedicated processors to support the compression of the image in system memory. A second problem is that this method uses a software approach to image compression. General purpose CPUs are not optimized around pixel-level image decomposition or compression but are limited to generic block-based color reduction or difference calculation techniques, which results in both lower compression ratios and poorer image quality at the remote display. A third problem with CPU-based encoding systems is that they use the network interface of the data processing system for the transmission of display image data. In cases where the same network interface is also used for connectivity of other real-time traffic streams with the remote system (e.g. peripheral traffic such as audio, USB or IEEE 1394 data) and other CPU-bound traffic, the network interface becomes a system bottleneck, packets are either delayed or dropped and the user experience at the remote system is degraded.
A variation on the software-based frame buffer copy approaches such as VNC and OpenGL VizServer described is a screen scraper hardware solution disclosed under U.S. Pat. No. 6,664,969 to Emerson, et al. entitled “Operating System Independent Method and Apparatus for Graphical Remote Access.” This method uses a separate hardware module to read the frame buffer, compress the image and send it to an application at the remote user interface. This approach removes the encoding software load, but also consumes the system bus of the data processing sub-system each time the frame buffer is read. In cases where real-time frame updates are required, the load on the system bus directly compromises the performance of the data processor and slows down the application. As with the VNC software method, this method has display continuity problems associated with synchronizing multiple frame buffers or pointers.
There are also variations on the above methods that provide a combination of graphic commands and bitmap transfer functions to enable the remote display of computer display images. One such variation is disclosed by Duursma et al. in U.S. Pat. Application 20030177172 entitled “Method and System for Generating a Graphical Display for a Remote Terminal Session.” In this approach, an application on the data processor is capable of recognizing screen image components as being either graphic commands or bitmaps. Graphic commands are handled similarly to the graphic command transfer method described above. However, when a bitmap is identified, a compressed data format of the bitmap is retrieved and transmitted to the remote terminal session in place of the original bitmap. While this feature adds bitmap capabilities to the command transfer method, the command processing overheads persist, so little overall improvement to the graphics command processing is realized.
None of the remote display methods described above evaluate the encoding of the image in the context of other data streams that share the network. For example, if the display image incorporates a video frame in one region only, there is no attempt by the frame buffer encoder or the graphics command parser to optimize encoding for that region based on other traffic priorities.
The methods for providing synchronization between a data processor and a remote display system that have been introduced above fall into two categories with respect to display update policies. Push models use the host display timing to determine when to send screen updates from the host to the client. There are both graphic command protocols (such as RDP and Citrix ICA) and frame buffer copy methods (such as Sun Ray) that synchronize display updates to the host timing. Of these, some (such as RDP and ICA) queue the display updates and then send them at regular intervals. Lai and Neih in “Limits of Thin Client Computing” found these methods to deliver poor quality for video display sessions. Others, such as Sun Ray and X, send the updates as soon as the server system issues a window system command and have been found to deliver better video performance. As discussed previously, all of these methods require a client system capable of interpreting the graphic commands and rendering the display image.
Client-pull protocols such as the VNC frame buffer copy method use the client to provide the synchronization, and the client requests display updates from the host as needed. The client-pull model has the advantage of being capable of adjusting to network bandwidth availability and client processing ability. If the client or network causes a delay, updates are only requested when the client is ready and the overall performance is inherently scaled. The major downside of this method, as described by Lai and Neih, is that a noticeable delay is incurred when real-time applications such as video are executed across the network, to the point of significantly degrading the video quality. The reason for the performance degradation is that a 66 ms request delay is inserted every time a new frame is requested. In applications such as video where large changes occur in each requested frame, the request mechanism and network delays cause a bottleneck in the streaming process and quality is degraded.
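The effect of the per-frame request delay can be made concrete with a short calculation. The sketch below assumes that requests are not pipelined, so that each frame waits for its own request before transfer begins; the function name and the 34 ms transfer time are hypothetical values chosen for illustration.

```python
def max_pull_frame_rate(request_delay_s, transfer_time_s=0.0):
    """Upper bound on frames per second when each frame waits for its own request."""
    return 1.0 / (request_delay_s + transfer_time_s)

# A 66 ms request delay alone caps the rate at roughly 15 frames per second.
delay_only_bound = max_pull_frame_rate(0.066)

# Adding a hypothetical 34 ms transfer time per frame lowers the cap to about 10.
with_transfer_bound = max_pull_frame_rate(0.066, 0.034)
```

Under these assumptions the request delay by itself keeps the update rate below common video frame rates such as 24 or 30 frames per second, before any encoding or network transfer time is counted.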
In summary, existing methods incur significant software and hardware processing overheads and are unable to ensure synchronization between the data processor and remote systems. Those methods that first queue commands at the host before sending them do not support video well. Those that immediately send rendering instructions and provide adequate video quality require a CPU and software at the remote user interface. Those methods that copy frame buffers across a network do not perform well when there are significant changes in the displayed image and large datasets need to be transmitted after each frame is requested. Therefore, there is a need for a better method of accessing the frame buffer that does not impact the system drawing architecture and does not require a client CPU.