Current operating systems typically include a graphical drawing interface layer that is accessed by applications in order to render drawings on a display, such as a monitor. The graphical drawing interface layer provides applications an application programming interface (API) for drawings and converts drawing requests by such applications into a set of drawing commands that it then provides to a video adapter driver. The video adapter driver, in turn, receives the drawing commands, translates them into video adapter specific drawing primitives and forwards them to a video adapter (e.g., graphics card, integrated video chipset, etc.). The video adapter receives the drawing primitives and immediately processes them, or alternatively, stores them in a First In First Out (FIFO) buffer for sequential execution, to update a framebuffer in the video adapter that is used to generate and transmit a video signal to a coupled external display.
Client-server based solutions are becoming increasingly widespread. In such solutions, the client typically has limited resources, e.g., limited processing power, limited memory, or a limited power supply. The communication channel between the server and the client device also tends to be constrained, having limited bandwidth or high latency.
Client-server based solutions include remote desktop management and thin client applications, as well as applications that stream a video signal to a client device such as a smartphone or a set-top box (STB), including for Internet TV or Internet Protocol TV (IPTV). Such applications include TV programming, video on demand (VOD), video games, video communication solutions, video surveillance solutions, etc.
While mainframe-terminal solutions have been around since the very beginning of computers, dedicated lines were used to connect the mainframe to the monitor. Today the challenge is that an Internet Protocol (IP) based connection channel is often used to carry the video signal between server and client. Applications are hosted on remote servers (or virtual machines running thereon) in a data centre. A thin client application installed on a user's terminal connects to a remote desktop server that transmits a graphical user interface (GUI) of an operating system session for rendering on the display of the user's terminal. One example of such a remote desktop server system is Virtual Network Computing (VNC), which utilizes the Remote Framebuffer (RFB) protocol to transmit framebuffers (which contain the values for every pixel to be displayed on a screen) from the remote desktop server to the client. While, from the viewpoint of client device requirements, the goal is to perform all image processing on the server side, doing so would require extremely large amounts of raw data to be transmitted from the server to the client device (e.g., an image resolution of 1920×1200 with a color depth of 24 bits per pixel at a rate of 60 frames per second would require transmitting about 3.32×10^9 bits per second, i.e., 3.09 Gb/s when a gigabit is taken as 2^30 bits). One approach is to use spatial compression, i.e., each frame is compressed using lossless or lossy encoding, such as encoding based on the Discrete Cosine Transform (DCT).
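The raw data rate quoted above can be checked with a short calculation. The following is a minimal sketch; the function name is hypothetical and is used only for illustration.

```python
def raw_bitrate_bits_per_second(width, height, bits_per_pixel, frames_per_second):
    """Bits per second needed to send every pixel of every frame uncompressed."""
    return width * height * bits_per_pixel * frames_per_second


bps = raw_bitrate_bits_per_second(1920, 1200, 24, 60)
print(bps)                     # 3317760000
print(round(bps / 10**9, 2))   # 3.32 (decimal gigabits per second)
print(round(bps / 2**30, 2))   # 3.09 (gigabits per second with 1 Gb = 2**30 bits)
```

The 3.09 figure therefore corresponds to binary units; in decimal units the same stream is about 3.32 gigabits per second.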
Additionally, only those parts of the frame that have changed compared to the previous frame should be transmitted from the server to the client. For that purpose, the frame is divided into subparts and, on the server side, each subpart is compared with the corresponding subpart of the previous frame. For example, the current frame is saved in a primary buffer and the previous frame is saved in a secondary buffer. Solutions exist where only those areas (blocks) that have changed are copied from the primary buffer to the secondary buffer.
Both spatial and temporal compression are required on the server side, which creates the need for the client side to decode the signal easily, without overburdening the client device.
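The block-wise comparison of a primary and a secondary buffer may be illustrated as follows. This is only a sketch under stated assumptions: frames are modelled as lists of pixel rows, and the function name, the block size and the 8×8 test frame are hypothetical and not taken from any particular solution.

```python
def changed_blocks(primary, secondary, block_size):
    """Return (block_row, block_col) of every block that differs.

    Changed blocks are also copied from the primary (current) buffer into
    the secondary (previous) buffer, so the buffers match afterwards.
    """
    height, width = len(primary), len(primary[0])
    changed = []
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            rows = range(top, min(top + block_size, height))
            cols = slice(left, min(left + block_size, width))
            if any(primary[r][cols] != secondary[r][cols] for r in rows):
                changed.append((top // block_size, left // block_size))
                for r in rows:
                    secondary[r][cols] = primary[r][cols]
    return changed


previous = [[0] * 8 for _ in range(8)]   # secondary buffer (previous frame)
current = [row[:] for row in previous]   # primary buffer (current frame)
current[5][6] = 255                      # one pixel changes
first = changed_blocks(current, previous, 4)
print(first)                             # [(1, 1)]
```

Only the block containing the changed pixel is reported, and a second call with the same buffers reports nothing.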
Encoding and decoding are widely used in video transmission. MPEG2 and MPEG4 (H.264) are widely used industry standards. The video signal is first encoded using both spatial and temporal compression. Spatial compression is performed within a frame, similarly to the compression used for JPEG, and is based on the DCT, which describes a set of pixels by a set of superimposed cosine waveforms. The DCT is applied to 8×8 pixel blocks. Additionally, temporal compression is used. MPEG2 uses three types of frames: I, P and B. An I frame is a fully encoded frame. A P frame is a predicted frame, based on a preceding I or P frame, and can be decoded only after that reference frame has been decoded. A B frame is a bi-directionally predicted frame, based on both a preceding and a following reference frame (I or P). Further, in addition to the three types of frames, each frame comprises blocks that can be of I, B or P type. I frames contain only I type blocks, P frames contain I or P type blocks, and B frames contain I, B or P type blocks. Additionally, each macroblock (16×16 pixels) can carry a motion vector, useful, e.g., for camera panning. Most client devices, such as set-top boxes, smartphones or mobile phones, thin clients, etc., include an MPEG2 and/or MPEG4 decoder.
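The DCT-based spatial step may be illustrated by a direct, unoptimized two-dimensional DCT-II of one 8×8 block. This is a sketch only: it uses the orthonormal scaling, which is an assumption made for illustration, and it omits the quantization and entropy coding that a real encoder applies afterwards. The function name is hypothetical.

```python
import math

N = 8  # block edge length in pixels


def dct_2d(block):
    """Orthonormal 2D DCT-II of an N x N block (direct form, O(N^4))."""
    def alpha(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out


flat = [[100] * N for _ in range(N)]  # a uniformly grey block
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 6))         # 800.0
```

For a uniform block, all of the energy ends up in the single coefficient at position (0, 0) and the remaining 63 coefficients are zero up to rounding error, which is why smooth image regions compress well.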
Known is U.S. Pat. No. 7,649,937 (published as US2006/0020710), disclosing a method for delivering real-time video data over the Internet. A streaming processor receives raw video data from a video source; the video data is compressed by grouping pixels into blocks and comparing blocks of adjacent (i.e., consecutive in time) frames of video data to identify any changes. Only blocks that have been changed are transmitted. In addition, if a block has changed to a previously transmitted block, then only an identification index for the block is transmitted. The actual content of the block can then be recreated by comparing the index to a list of previously received blocks. This method requires storing on a server at least two adjacent (consecutive) frames and comparing such frames pixel by pixel, or block by block, to identify any changes.
Known is GB2318956, disclosing a display screen duplication system and method for maintaining duplicate copies of all or a portion of a display screen at two or more computer systems. The display screens are segmented into a two-dimensional matrix of blocks. A value, e.g., a CRC, is computed for each of the blocks and stored with a pointer to the corresponding block of the display screen. Changes in the display screen are detected by repeatedly calculating the values and comparing them with the previously stored values for the corresponding blocks. When the values are different, the pointers are temporarily stored until a predetermined period of time has elapsed or all the blocks have been checked. When at least one of these criteria is met, adjacent blocks are transmitted as a group, preferably using compression. This method requires repeatedly comparing consecutive display screens block by block.
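The idea of sending only an identification index for a block that has been transmitted before may be sketched as follows. This is an illustration of the general principle and not the implementation of the cited patent; the message layout and all names are hypothetical.

```python
def encode_blocks(blocks, sent_index):
    """Return ('ref', idx) for blocks sent before, else ('data', idx, block)."""
    messages = []
    for block in blocks:
        if block in sent_index:
            messages.append(("ref", sent_index[block]))
        else:
            idx = len(sent_index)          # next free index
            sent_index[block] = idx
            messages.append(("data", idx, block))
    return messages


def decode_blocks(messages, received):
    """Rebuild block contents from the list of previously received blocks."""
    out = []
    for message in messages:
        if message[0] == "data":
            _, idx, block = message
            received[idx] = block
        out.append(received[message[1]])
    return out


blocks = [b"AAAA", b"BBBB", b"AAAA"]       # third block repeats the first
messages = encode_blocks(blocks, {})
print(messages)  # [('data', 0, b'AAAA'), ('data', 1, b'BBBB'), ('ref', 0)]
```

The repeated block costs only an index on the wire, and the receiver recovers its content from its own list of earlier blocks.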
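Change detection by per-block check values may be sketched as follows. The sketch assumes CRC-32 from the Python standard library as the check value and, for brevity, treats the blocks as a one-dimensional sequence rather than a two-dimensional matrix; the function names are hypothetical.

```python
import zlib


def detect_changed(blocks, stored_crcs):
    """Return indices of blocks whose CRC-32 differs from the stored value."""
    changed = []
    for index, block in enumerate(blocks):
        crc = zlib.crc32(block)
        if stored_crcs.get(index) != crc:
            stored_crcs[index] = crc       # remember the new value
            changed.append(index)
    return changed


def group_adjacent(indices):
    """Merge consecutive indices into (first, last) runs for grouped sending."""
    runs = []
    for i in indices:
        if runs and i == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], i)
        else:
            runs.append((i, i))
    return runs


screen = [b"aaaa", b"bbbb", b"cccc", b"dddd"]
crcs = {}
initial = detect_changed(screen, crcs)     # every block is new on the first pass
screen[1], screen[2] = b"bbbX", b"cccX"    # two neighbouring blocks change
update = detect_changed(screen, crcs)
print(update, group_adjacent(update))      # [1, 2] [(1, 2)]
```

Only the check values are retained between passes, so the previous screen content itself need not be kept; each pass nevertheless recomputes a value for every block.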
Known is U.S. Pat. No. 4,823,108, describing a method for displaying information in overlapping windows on a video display of a computer controlled video display system independent of the operating system of the computer. The computer program output display data can be written within windows on the video display without substantial modification of the application program by writing such data to a pseudo screen buffer for temporary storage. The contents of the pseudo screen buffer are then compared with the contents of a previous image buffer at selected, timer-controlled intervals. At memory locations where the data differs, the differing data are written into the previous image buffer. As display data is thereby identified and periodically updated, it is displayed in selected windows. This method requires comparing consecutive image buffers and updating the image buffer accordingly.
Known is WO00/65464, disclosing a system and method for controlling information displayed on a first processor-based system, from a second processor-based system. The apparatus comprises a memory to store instruction sequences by which the second processor-based system is processed, and a processor coupled to the memory. The stored instruction sequences cause the processor to: (a) examine, at a predetermined interval, a location of a currently displayed image; (b) compare the location with a corresponding location of a previously displayed image to determine if the previously displayed image has changed; (c) transmit location information representing the change; and (d) store the changed information on the first processor-based system. Specifically, the CPU keeps a record of the locations of the most recent changes and examines those locations more frequently. This technique is based on the assumption that a change will very likely occur close to the location of a most recent change involving an input/output device activity.
Known is U.S. Pat. No. 5,241,625, disclosing a system for remotely controlling information displayed on a computer screen by intercepting output events such as graphics calls. Graphics commands which drive a computer window system are captured and saved as a stored record or sent to other computers. A message translation program translates the captured messages for playback on a designated computer.
Known is U.S. Pat. No. 5,796,566, disclosing a system in which sequences of video screens forwarded from a host CPU to a video controller are stored and subsequently retrieved by a terminal located remote from the host CPU. In particular, display data is captured in a local frame buffer which stores the display data frame by frame. A previous frame or screen of display data is compared with a current frame or screen of display data to determine if a change has occurred.
Known is U.S. Pat. No. 7,882,172 (published as US2007/268824), disclosing a thin client system for a high-quality picture reproduction method for using a thin client terminal as a TV phone terminal and a TV conference terminal. The method (FIG. 4 of the patent) comprises, on the screen data transmission side: initializing the screen block table, e.g., by setting each block data of the table to a default value; if an update is detected in the screen information, entering a loop to transmit the differential data to the remote controller, in which screen information is read sequentially from the VRAM from the upper-left block to the lower-right block; in the first screen monitor loop, obtaining screen block data corresponding to the block number designated by the VRAM; next, comparing the screen block data with the data of the associated block number stored in the screen block table; if the data matches the data stored in the table, determining that the screen has not been updated and returning to acquire the next block data; if the comparison shows that the data does not match, recognizing that the screen has been updated and storing the obtained screen block data as the value of the associated block number; compressing the screen block data; and sending the compressed block data together with the block number to the remote controller. The sequence of processing steps is repeated at a predetermined interval of time. On the screen data reception side, the screen block data is received and written into the VRAM-CL to display an updated screen on the display of the terminal: first, the block number and the screen block data are received; the compressed screen block data is expanded (decompressed); and the decompressed data is written into the area of the VRAM-CL corresponding to the block number. As a result, the contents of the screen update are presented on the display. Finally, the sequence of processing steps is repeated until the process is terminated. It is thus possible to efficiently transmit only the blocks of the screen in which a change takes place.
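The transmission and reception sides of such a block-table scheme may be sketched as follows. This is an illustration under stated assumptions and not the implementation of the cited patent: zlib is assumed as the compressor, the VRAM is modelled as a list of per-block byte strings, and all names are hypothetical.

```python
import zlib


def send_updates(vram, block_table):
    """Scan blocks in order; return (block_number, compressed_data) packets
    for blocks that differ from the screen block table, updating the table."""
    packets = []
    for number, data in enumerate(vram):
        if block_table[number] != data:    # block has been updated
            block_table[number] = data
            packets.append((number, zlib.compress(data)))
    return packets


def receive_updates(packets, vram_cl):
    """Decompress each received block and write it into the client VRAM."""
    for number, payload in packets:
        vram_cl[number] = zlib.decompress(payload)


vram = [bytes(16) for _ in range(4)]       # server screen: four blank blocks
block_table = [None] * 4                   # table initialized to a default value
vram_cl = [None] * 4                       # client-side VRAM

first_pass = send_updates(vram, block_table)
receive_updates(first_pass, vram_cl)

vram[2] = b"\xff" * 16                     # one block of the screen changes
second_pass = send_updates(vram, block_table)
receive_updates(second_pass, vram_cl)
print([number for number, _ in second_pass])   # [2]
```

In this sketch the client must decompress every received block and write it into its own frame memory before anything is displayed.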
This may be considered the closest known solution. However, according to this method, the received blocks are first expanded or decompressed and then stored in a VRAM-CL. Such a method cannot be used, or offers no advantage, if the thin client device is equipped with an industry standard video decoder such as MPEG2 or H.264.