1. Field of the Invention
The invention relates generally to methods for encoding a video signal for communication across a transmission medium. More particularly, the invention relates to a method for identifying and encoding persistent regions of a digital rasterized computer display stream for transmission to a remote user interface across a standard computer network.
2. Description of the Related Art
Historic advances in computer technology have made it economical for individual users to have their own computing system, which caused the proliferation of the Personal Computer (PC). Continued advances in this technology have made these personal computers very powerful but also complex and difficult to manage. For this and other reasons, there is a desire in many workplace environments to separate the display from the application processing parts, or data processor, of the computing system. In this preferred configuration, the display is physically located at the user's desktop, while the processing and storage components of the computer are placed in a central location. The display is then connected to the data processor and storage components with some method of communication. Methods for still image and generic video encoding are highly sophisticated and well published. However, the content and refresh characteristics of a computer display differ from those of video and still image transmission systems, leading to opportunities for improved encoding methods.
Still images such as photographs may be encoded using transform domain techniques that enable the progressive build of image bit planes at the client end of the network. Progressive image transfer (PIT) is a standard feature of the JPEG2000 specification and enables the early display of a reasonable quality image approximation at the client side of the network by first displaying the low spatial frequency components of the image, followed by a progressive build to a lossless image over a series of build frames. This approach lowers the peak bandwidth requirements for the image transfer compared with sending the whole image in a single frame. However, a fundamental shortcoming is a lack of support for dynamic images. Another shortcoming is the lack of encoding support for compound images composed of text, pictures, background and high-definition icon types.
Video transmission methods are tailored to the transmission of highly dynamic images at fixed frame rates and limited bandwidth. They are relatively insensitive to encode/decode delays and typically use encoding methods unrelated to this discussion. Hybrid variations such as M-JPEG transmit a series of independent JPEG images without applying the inter-frame prediction methods typical of other video encoding methods such as MPEG-2 or H.264. Consequently, these hybrid methods offer limited compression and tend to consume high network bandwidth in applications that mandate high frame rates. They therefore remain best suited to specialized applications like broadcast resolution video editing or surveillance systems where the frame rate is low.
A few techniques have been developed specifically to support the transmission of display signals over standard networks. These methods attempt to address the problem of transmitting high bandwidth display signals from the processing components to the remote desktop in various ways. The simplest method is to periodically send copies of frame buffer information from the data processor. This is impractical for sending a normal resolution display image at a reasonable refresh rate. For example, a single uncompressed SXGA image frame of 1280×1024 pixels at 24-bit resolution would take approximately 0.3 seconds to transfer over a dedicated 100BASE-T LAN, making perception-free communication of display information impossible.
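The SXGA figure above can be checked with a short calculation. The following Python sketch is illustrative only: it assumes an ideal 100 Mbit/s link with no protocol overhead, and the function and constant names are hypothetical.

```python
# Assumed figures: one uncompressed SXGA frame, 24 bits per pixel,
# sent over an ideal 100 Mbit/s link with no protocol overhead.
WIDTH, HEIGHT = 1280, 1024
BITS_PER_PIXEL = 24
LINK_BITS_PER_SECOND = 100_000_000


def frame_transfer_seconds(width, height, bits_per_pixel, link_bps):
    """Return the time in seconds to send one uncompressed frame."""
    return width * height * bits_per_pixel / link_bps


seconds = frame_transfer_seconds(WIDTH, HEIGHT, BITS_PER_PIXEL, LINK_BITS_PER_SECOND)
max_frames_per_second = 1 / seconds

print(f"{seconds:.3f} s per frame, at most {max_frames_per_second:.1f} frames/s")
```

Under these assumptions the link sustains only about three full frames per second, well below typical display refresh rates.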
An alternative approach is to intercept graphics instructions on the data processor and communicate these across the network. However, this method is intrusive on the host system, which requires operating system dependent graphic command routing software. Moreover, a processor and software capable of interpreting the graphics commands are required at the remote user interface, which restricts the broad compatibility of the method, adds cost and increases the complexity of the remote installation.
In another approach, the data processor compares the previously transferred frame with the current frame and transfers only the changes between them. This decreases the overall amount of data, especially for a computer display in which much of the display may be static from frame to frame. However, this approach is expensive to implement because the data processor requires at least two frame buffers, namely a first containing a copy of the previously communicated frame and a second containing the present frame. Given that the previous frame must be compared with the present frame one pixel at a time, possibly requiring an additional temporary delta-buffer, this approach is both memory and computationally intensive. There is a noticeable decrease in the performance of applications running on the data processor, especially during applications such as video clips that involve significant screen refresh activity. This is caused by each screen refresh requiring the movement and copying of graphics information between the frame buffers across the local system bus of the data processor.
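As an illustration of the frame comparison approach, the following Python sketch compares two frame buffers one pixel at a time and collects the differences in a delta list. It is a minimal sketch under stated assumptions, not a description of any particular prior system: frames are modeled as flat lists of pixel values, and the function names are hypothetical.

```python
def changed_pixels(previous, current):
    """Compare two frames pixel by pixel and return (index, new_value) pairs.

    Both frames must be held in memory at once, which is the two-buffer
    cost described above; the returned list plays the role of a delta-buffer.
    """
    if len(previous) != len(current):
        raise ValueError("frames must have the same number of pixels")
    return [
        (index, new)
        for index, (old, new) in enumerate(zip(previous, current))
        if old != new
    ]


def apply_changes(frame, changes):
    """Rebuild the current frame at the receiver from the previous frame."""
    updated = list(frame)
    for index, value in changes:
        updated[index] = value
    return updated


previous_frame = [0, 0, 0, 0]
current_frame = [0, 5, 0, 7]
delta = changed_pixels(previous_frame, current_frame)
```

Only the entries in `delta` need to cross the network, but every pixel of both buffers is still read on each comparison, which is the source of the memory and bus load noted above.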
A variation of the frame comparison method reduces the overall data processor memory requirement by segmenting the frame buffer into tiles and maintaining a list of signatures for the tiles. The new frame is tiled and the signature for each new tile is compared with the signature in the list to determine if the tile should be transferred. These tiling and list methods are limited. They require hardware or application-based frame buffers tightly coupled with the data processing architecture. System performance is impacted by the copying of pixels and signatures, which loads the system bus. Software approaches interrupt the operating system so that background tasks can manage the activity. This further reduces the performance of the data processor. Existing tiled change-detection methods are also limited in sophistication. Typically, an operation is only performed when the image has changed, in which case the operation is to send the new image.
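The tile signature comparison can be sketched as follows. This Python example is illustrative only and rests on several assumptions not taken from the text: the frame is a row-major byte string with one byte per pixel, tiles are square, and CRC32 (via the standard `zlib` module) stands in for the unspecified signature function. All function names are hypothetical.

```python
import zlib


def tile_signatures(frame, width, height, tile):
    """Map (tile_column, tile_row) to a CRC32 signature of that tile.

    frame is assumed to be a row-major bytes object, one byte per pixel.
    """
    signatures = {}
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            right = min(left + tile, width)
            rows = [
                frame[y * width + left : y * width + right]
                for y in range(top, min(top + tile, height))
            ]
            signatures[(left // tile, top // tile)] = zlib.crc32(b"".join(rows))
    return signatures


def changed_tiles(old_signatures, frame, width, height, tile):
    """Return the tiles whose signature differs, plus the updated list."""
    new_signatures = tile_signatures(frame, width, height, tile)
    changed = sorted(
        key
        for key, signature in new_signatures.items()
        if old_signatures.get(key) != signature
    )
    return changed, new_signatures


# A 4x4 frame split into 2x2 tiles; one pixel changes in the last tile.
first_frame = bytes(16)
second_frame = bytes(15) + b"\x01"
stored = tile_signatures(first_frame, 4, 4, 2)
changed, stored = changed_tiles(stored, second_frame, 4, 4, 2)
```

Only the signature list is retained between frames rather than a full copy of the previous frame, which is the memory saving described above. A signature collision could cause a changed tile to be missed, so the choice of signature function is a design trade-off.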
In summary, existing still image and video compression techniques are not optimized for the high-quality and low-latency encoding requirements of dynamic computer display images. Other methods developed specifically to transfer computer display images require intrusive components or a complex remote display system. This results in higher equipment and maintenance costs and lower performance. Therefore, a better method for encoding computer display images that takes advantage of the characteristics of the environment is needed.