Historic advances in computer technology have made it economical for individual users to have their own computing system, which caused the proliferation of the personal computer (PC). Continued advances of this computer technology have made PCs very powerful but also complex and difficult to manage. For this and other reasons, there is a desire in many workplace environments to separate user interface devices, including the display and keyboard, from the application processing parts of the computing system. In this preferred configuration, user interface devices are physically located at the desktop, while processing and storage components of the computer are placed in a central location. The user interface devices are then connected to the processor and storage components over a computer network.
A number of methods and devices have been developed to physically separate the user interface from the data processor using either proprietary transmission links (e.g. fiber) or dedicated digital links (e.g. standard CAT5 data cabling independent from the corporate LAN cabling). Examples of these methods include disclosures by Green et al. in U.S. Patent Application 20030208779 and Thornton in U.S. Pat. No. 6,633,934. Furthermore, commercial keyboard, video and mouse (KVM) systems offered by Avocent and others provide similar capabilities. All of these physical layer extension methods are incompatible with existing workplace networks and therefore do not fulfill the primary objective of lowering infrastructure and maintenance costs by supporting remote users over the LAN infrastructure.
In order to understand the impact that a separated user interface may have on human perception, it is helpful to understand the system behavior for each existing method from the perspectives of system latency and quality of the visual experience. Shortcomings in prior art techniques are identified through an explanation of how these methods manage the communications of the graphics data to the remote display.
FIG. 1 illustrates the graphics path of a traditional PC. A graphical image is displayed after a sequence of events occurs. First, a software application running on data processor 10 sends graphic or drawing commands to operating system (O/S) 11. O/S 11, a graphics driver and graphics hardware then process these commands and generate an image that is stored in frame buffer 12. Display controller 13 monitors frame buffer 12 and sends the image to display 14 as a raster sequence. This raster sequence is periodically refreshed. Whenever the contents of frame buffer 12 are changed by the application, the modified image is sent to display 14 the next time display controller 13 reads that part of frame buffer 12. To create smoother image transitions, an application or O/S 11 may synchronize its image updates with the raster timing of display controller 13.
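The raster readout described above can be illustrated with a minimal sketch. It assumes a frame buffer modeled as a list of pixel rows and a hypothetical function `scan_out` standing in for display controller 13; neither name comes from FIG. 1. The sketch shows why an update that lands in the middle of a raster pass produces a mixed image, and why an update synchronized with the start of the raster does not.

```python
from typing import List, Optional

Frame = List[List[int]]  # rows of pixel values


def scan_out(frame_buffer: Frame,
             write_at_row: Optional[int] = None,
             new_frame: Optional[Frame] = None) -> Frame:
    """Read the frame buffer one row at a time, as a display controller might.

    If write_at_row is given, the application overwrites the whole buffer
    with new_frame just before that row is read (a hypothetical timing).
    """
    displayed: Frame = []
    for y in range(len(frame_buffer)):
        if write_at_row == y and new_frame is not None:
            # The application changes the buffer while the raster is in progress.
            for i, row in enumerate(new_frame):
                frame_buffer[i] = row[:]
        # The controller only sees whatever is in the buffer when it reads row y.
        displayed.append(frame_buffer[y][:])
    return displayed
```

With a four-row buffer, an update at row 2 yields two rows of the old image followed by two rows of the new one, whereas an update at row 0 (aligned with the start of the raster) yields the new image in full.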
There are three basic methods of communicating the display image from a data processor across a standard network to a remote display. The first method is referred to herein as the graphics command transfer method. Rather than being executed on the data processor, the graphics commands from an application are transferred over a network to a user interface served by a remote computing system. The remote computing system's O/S, graphics driver and hardware execute the graphics commands to create an image that is stored in a remote frame buffer. The remote display controller reads the image from the remote frame buffer and sends it as a raster to the remote display. There are a number of variations on this technique. The X Windows™ interface is one example of an application that acquires graphics commands at a high level and transfers them to a remote user interface. A second example is Remote Desktop Protocol (RDP), which converts most of the graphics commands to simple low-level primitives before transferring them to the remote user interface.
In the case of a simple remote computing system, the graphics command transfer method works adequately for transferring simple images. A few simple graphics commands are communicated across the network and the resultant network traffic is low. However, for complex images, the number of commands needed increases significantly, which increases network traffic and system latency. The commands themselves also become more complex, which increases the required complexity of the remote computing system (i.e. O/S, graphics driver and hardware). This in turn increases the cost, maintenance and support requirements for the remote user interface, in direct conflict with the original motivation for centralization, namely reduced support requirements for remote computers.
To accommodate low-complexity remote user interfaces, an alternate graphic command transfer method may be used. This method converts the graphics commands to simple commands before transferring them. One problem with this method is that overall graphics capabilities are severely constrained by the low-complexity graphics capabilities of the remote system. This is because the high-level graphic commands that leverage graphics hardware acceleration functions in typical computing platforms are no longer available in the simplified command set.
A second problem is that converting commands to simple commands is performed by the data processor and is a processing intensive function. The result is that the conversion process slows down the data processor and reduces the performance of the system.
A variation on the graphic command transfer method is disclosed by Duursma et al. in U.S. Patent Application 20030177172 entitled “Method and System for Generating a Graphical Display For a Remote Terminal Session.” In this approach, an application is capable of recognizing screen image components as being either graphic commands or bitmaps. Graphic commands are handled similarly to the method described above. However, when a bitmap is identified, a compressed data format of the bitmap is retrieved and transmitted to the remote terminal session in place of the original bitmap. While this feature adds bitmap capabilities to the command transfer method, the command processing overheads persist, so little overall improvement to the graphics command processing is realized.
The second method for separating the user interface from the data processor is referred to herein as the frame buffer copy method. This method solves the drawing performance problem described above by using the O/S, graphics driver and hardware acceleration to draw the image into the frame buffer on the data processor side of the network. The image is then copied to a remote frame buffer at the remote user interface. This frame buffer is then read by the remote display controller and sent as a raster to the remote display.
Given that there is no direct network connection to the frame buffer, various methods exist to overcome the problem of transferring the image from the source frame buffer to the remote frame buffer. For example, virtual network computing (VNC) provides a solution that uses a software application at each end. An application on the data processor side reads the frame buffer, encodes the image and then sends it to the decoder application at the remote user interface where it is decoded by the VNC application and written into the remote frame buffer.
To reduce latency associated with updating image changes, the encoder continuously monitors the frame buffer and immediately sends any updates to the decoder. The biggest problem with this technique arises during times of complex image generation. Given that the encoder software runs on the same processor as the drawing application, the processor becomes loaded with both encoding and drawing operations, which slows down the drawing speed and degrades the user experience.
A second problem with this method arises as a result of multiple, asynchronous frame buffers. When areas of the remote frame buffer are updated out of synchronization with the source frame buffer, the image viewed at the remote display differs from the image that would be seen if the display were connected directly to the source frame buffer.
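The frame buffer copy method and its synchronization problem can be illustrated with a minimal sketch. It assumes that the encoder compares fixed-size tiles of the source frame buffer against the previously sent copy and transmits only the tiles that changed; the tile size, the function names `changed_tiles` and `apply_tile`, and the tile-based comparison itself are illustrative assumptions, not a description of the VNC encoding.

```python
from typing import Dict, List, Tuple

Frame = List[List[int]]   # rows of pixel values
Tile = Tuple[int, int]    # (tile_row, tile_col)

TILE = 2  # hypothetical tile edge length in pixels


def changed_tiles(prev: Frame, curr: Frame, tile: int = TILE) -> Dict[Tile, Frame]:
    """Encoder side: return the pixels of every tile that differs between frames."""
    updates: Dict[Tile, Frame] = {}
    for ty in range(0, len(curr), tile):
        for tx in range(0, len(curr[0]), tile):
            old = [row[tx:tx + tile] for row in prev[ty:ty + tile]]
            new = [row[tx:tx + tile] for row in curr[ty:ty + tile]]
            if old != new:
                updates[(ty // tile, tx // tile)] = new
    return updates


def apply_tile(remote: Frame, pos: Tile, pixels: Frame, tile: int = TILE) -> None:
    """Decoder side: write one received tile into the remote frame buffer."""
    ty, tx = pos[0] * tile, pos[1] * tile
    for dy, row in enumerate(pixels):
        remote[ty + dy][tx:tx + len(row)] = row
```

If the source image changes in two tiles and the remote display is refreshed after only one of the two updates has been applied, the remote frame buffer matches neither the previous source image nor the current one, which is the discontinuity described above.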
A variation on the VNC software method is a server management product disclosed under U.S. Pat. No. 6,664,969 to Emerson, et al., entitled “Operating System Independent Method and Apparatus for Graphical Remote Access.” Emerson uses a separate hardware module to read the frame buffer, compress and send the image to an application at the remote user interface. This variation removes the encoding software load, but consumes the system bus of the data processing sub-system each time the frame buffer is read. In cases where real-time frame updates are required, the load on the system bus directly compromises the performance of the data processor and slows down the application. As with the VNC software method, this method has display continuity problems associated with synchronizing multiple frame buffers or pointers.
To provide a positive user experience, PC architecture has been designed for a well-timed image interface at the display controller output. Therefore, the best existing techniques are derived from the image being captured at this point. By obtaining the image at the display controller output, these implementations solve the problem of loading the processor, but also introduce additional problems.
A third method for separating the user interface from the data processor is referred to herein as the frame capture and transfer method. In this approach, the display controller of the data processor outputs a standard analog video signal. A frame capture circuit samples the video signal and captures the image into a capture frame buffer, one frame at a time. The image in the capture frame buffer is then transferred over the network to the remote frame buffer. The transfer operation is performed by an image-encoding application that accesses the capture frame buffer and compresses the image before sending it over the network. An application at the remote end decompresses the image and writes it into the remote frame buffer. This solution is suitable for applications where not all frames need to be processed or when the processing is allowed to take multiple frames.
However, a significant shortcoming of this approach is the high delay and bandwidth consumption introduced into the display path when every frame image is captured and then processed out of the capture frame buffer memory. Another problem with this method is that frame-capture circuits lack the ability to detect the image characteristics, such as sampling frequency, image size and timing. Rather, these frame capture circuits have predefined capture timing and do not adapt to the changeable image stream characteristics defined by the display controller.
Yet another shortcoming of this solution is that frames are dropped or repeated and images may be torn where the display shows half of one frame and half of a previous frame. These undesirable effects are a result of two independent display controllers operating from different clocks. Even when set to the same frequency, clock variations result in the display controllers on each side of the network running at slightly different refresh rates.
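The effect of the clock mismatch can be estimated with simple arithmetic. The sketch below assumes two free-running display controllers with a nominal refresh rate and a clock tolerance expressed in parts per million; the function names and all numeric values are illustrative assumptions rather than figures from any cited system.

```python
def ppm_to_hz(nominal_hz: float, ppm: float) -> float:
    """Actual refresh rate of a controller whose clock is off by ppm parts per million."""
    return nominal_hz * (1.0 + ppm * 1e-6)


def seconds_between_slips(source_hz: float, remote_hz: float) -> float:
    """Time for the two controllers to drift apart by one whole frame.

    Each time this interval elapses, one frame must be dropped (source
    faster than remote) or repeated (source slower than remote).
    """
    drift = abs(source_hz - remote_hz)  # frames of drift accumulated per second
    if drift == 0:
        return float("inf")
    return 1.0 / drift
```

For example, two nominally 60 Hz controllers whose clocks are off by +50 ppm and -50 ppm differ by 0.006 Hz, so a frame is dropped roughly every 167 seconds. A 60 Hz source feeding a 59.94 Hz display would drop a frame about every 16.7 seconds.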
Also, current implementations of this approach use analog connections, which are subject to sampling errors and create noise. The noise introduces significantly more data to be transferred across the network, further increasing system latency.
A major shortcoming shared by this approach and others described above is that the image cannot be optimized to the capabilities of the display. Specifically, given there is no bi-directional connection between the display controller of the data processing system and the remote display, the display controller is unable to detect the display's capabilities.
The disadvantages of the prior art limit the usefulness of the frame capture and transfer method to low performance support or administrative computer systems with low-resolution displays.
Finally, another prior approach is to use a video encoder to capture every frame of a video stream and compress it for storage or transmission. The first problem with this video encoder method is that it does not provide the same spectrum and resolution as a computer display because of its incompatibility with the RGB signals used in a computer display. Rather, the video encoder works with video luminance and chrominance signals.
The second problem with the method is that it uses lossy video compression techniques, which are not suitable for some of the images found in a computer display. The third problem with this method is that the compression methods introduce multiple frame delays, whereas the delay introduced by an encoding system must only be a fraction of a frame period. The fourth problem with video encoders is that they do not support the full range of image sizes available for a display controller or a computer monitor. Rather, they work with predefined image sizes. The fifth problem with this method is similar to a problem described for other methods. It is a broadcast method that captures a defined image size and frame rate and therefore does not allow the display controller to query the remote display and adjust its timing, frame rate or size to match the display.
In addition to the shortcomings described above, the physical network impacts the data transfer and consequently the user experience. Standard corporate LANs use packet-based methods for data communications, which add performance constraints when applied to high-data video and graphics applications. There are two protocols available for data communications over a standard packet network. The first is the connection-oriented TCP protocol, which guarantees the delivery of the information at the expense of data transfer performance. The second protocol is the connectionless UDP protocol, which is preferable for real-time communications due to higher throughput. However, UDP does not guarantee the delivery of data across the network. The frame transfer methods described above inherently lack real-time performance and therefore can afford the additional latency associated with using the TCP protocol for graphic transfers.
The communication of real-time images over packet networks using connectionless protocols such as UDP is common practice in video communications including video conferencing and video content delivery across the Internet. H.261 and MPEG4 are examples of standards for supporting these applications. To minimize the effects of network-induced data loss, these protocols incorporate forward error correction methods and redundancy mechanisms. These protocols also incorporate data reduction methods. For example, rather than transmitting every frame associated with a video sequence, the MPEG protocol transmits an intra-coded reference image known as an I-Frame followed by a series of change vectors based on a future predicted frame. The decoder then builds a series of sequential frames using much less data than if each frame were transferred independently across the network.
The greatest shortcoming of this approach is that erroneous data may be used as the baseline for future frames, causing errors to be propagated into these future frames and resulting in the display of distracting artifacts. The lossy compression techniques used are suitable for video but not for the high-definition graphics of computer displays. A second problem is that some encoding schemes introduce further communication latency that further degrades the user experience for a computer remote user interface.
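The propagation of errors through predicted frames can be illustrated with a deliberately simplified sketch. It assumes a stream made of one reference frame followed by per-pixel differences, which stands in for the motion-compensated prediction actually used by MPEG; the function names and the policy of treating a lost difference as "no change" are illustrative assumptions.

```python
from typing import List, Optional

Frame = List[int]  # a frame flattened to a list of pixel values


def encode(frames: List[Frame]) -> List[Frame]:
    """Emit the first frame as a reference, then per-pixel differences."""
    stream = [frames[0][:]]
    for prev, curr in zip(frames, frames[1:]):
        stream.append([c - p for p, c in zip(prev, curr)])
    return stream


def decode(stream: List[Optional[Frame]]) -> List[Frame]:
    """Rebuild frames from the stream; a lost difference (None) means 'no change'.

    The reference frame stream[0] is assumed to arrive intact.
    """
    out = [list(stream[0])]
    for delta in stream[1:]:
        prev = out[-1]
        if delta is None:
            out.append(prev[:])  # conceal the loss by repeating the last frame
        else:
            out.append([p + d for p, d in zip(prev, delta)])
    return out
```

Decoding the frames `[[0, 0], [1, 0], [1, 2], [3, 2]]` with the first difference lost yields `[[0, 0], [0, 0], [0, 2], [2, 2]]`: every frame after the loss is wrong, not only the one whose data was missing, and the error persists until a new reference frame arrives.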
One approach to limiting residual error propagation involves transmitting the graphics image as a sequence of sub-frames and using a feedback command channel to authorize the retransmission of corrupt or lost sub-frames. Such a technique is disclosed by Ran in U.S. Pat. No. 5,768,533 entitled “Video Coding Using Segmented Frames And Retransmission to Overcome Channel Errors.” In this approach, the receiver waits for all of the sub-frames that make up an image to be correctly received and then displays the frame. This may be an adequate method for some wireless video applications but the lack of display synchronization between the original video signal and the display frames introduces jitter and adds variable latency which makes it unsuitable for remote computer displays.
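The variable latency of the sub-frame retransmission approach can be illustrated with a minimal sketch. It assumes that each sub-frame carries a CRC-32 checksum, that the receiver requests retransmission of any sub-frame that fails the check, and that the frame is displayed only once every sub-frame is valid. The function names, the use of CRC-32, and the `corrupted` set that simulates channel damage are illustrative assumptions and are not taken from the Ran patent.

```python
import zlib
from typing import Dict, List, Set, Tuple


def split(frame: bytes, n: int) -> List[bytes]:
    """Divide a frame into at most n sub-frames of roughly equal size."""
    size = max(1, -(-len(frame) // n))  # ceiling division, at least one byte
    return [frame[i:i + size] for i in range(0, len(frame), size)]


def receive_frame(subframes: List[bytes],
                  corrupted: Set[Tuple[int, int]]) -> Tuple[bytes, int]:
    """Collect all sub-frames, retransmitting damaged ones.

    corrupted holds (round, index) pairs that the simulated channel damages.
    Returns the reassembled frame and the number of transmission rounds used.
    """
    received: Dict[int, bytes] = {}
    pending = list(range(len(subframes)))
    rounds = 0
    while pending:
        rounds += 1
        still_missing = []
        for idx in pending:
            payload = subframes[idx]
            crc = zlib.crc32(payload)  # checksum computed by the sender
            if (rounds, idx) in corrupted:
                # Simulated channel damage: invert the first byte.
                payload = bytes([payload[0] ^ 0xFF]) + payload[1:]
            if zlib.crc32(payload) == crc:
                received[idx] = payload
            else:
                still_missing.append(idx)  # request retransmission
        pending = still_missing
    return b"".join(received[i] for i in range(len(subframes))), rounds
```

A frame whose sub-frames all arrive intact is complete after one round, while a frame with one sub-frame damaged twice in succession needs three rounds. The time at which a complete frame becomes available therefore varies from frame to frame, which is the source of the jitter described above.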