Video Compression
Video is typically captured as a sequence of frames, where each frame is—before compression—a separate still image. Uncompressed video requires considerable bandwidth for transmission because each pixel of each frame must be transmitted; this can require very expensive transmission facilities. It is therefore desirable to compress video for transmission from a compressing system to a decompressing system.
Many common video compression algorithms, including many variants of Moving Picture Experts Group (MPEG) video compression, operate by breaking a video stream into a sequence of I- and P-frames.
An intra-coded frame (I-frame), sometimes also known as a key frame, is a full image that has been captured, compressed, and transmitted by the compressing system. A predicted frame (P-frame) is an image that has been compressed by determining differences between the current frame and a prior frame of the video held in a frame buffer, typically an I-frame or a previous P-frame; these differences are then compressed, coded, and transmitted. Since only a small portion of each image changes from frame to frame in a typical video sequence, P-frames can typically be encoded with far fewer bits than an I-frame.
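The difference-based encoding of a P-frame can be illustrated with a minimal sketch. It assumes small grayscale frames held as lists of rows and records a plain per-pixel difference; real codecs such as MPEG instead use block-based motion prediction followed by transform and entropy coding. The names `encode_p_frame` and `decode_p_frame` are hypothetical and chosen only for this illustration.

```python
from typing import Dict, List, Tuple

Frame = List[List[int]]  # rows of 8-bit grayscale pixel values (assumed layout)


def encode_p_frame(reference: Frame, current: Frame) -> Dict[Tuple[int, int], int]:
    """Return only the pixels that differ from the reference, as (row, col) -> delta."""
    deltas = {}
    for r, (ref_row, cur_row) in enumerate(zip(reference, current)):
        for c, (ref_px, cur_px) in enumerate(zip(ref_row, cur_row)):
            if cur_px != ref_px:
                deltas[(r, c)] = cur_px - ref_px
    return deltas


def decode_p_frame(reference: Frame, deltas: Dict[Tuple[int, int], int]) -> Frame:
    """Rebuild the current frame by applying the deltas to a copy of the reference."""
    frame = [row[:] for row in reference]
    for (r, c), delta in deltas.items():
        frame[r][c] += delta
    return frame


# A 4x4 frame in which a single pixel changes between frames.
i_frame = [[10] * 4 for _ in range(4)]
next_frame = [row[:] for row in i_frame]
next_frame[2][1] = 200

p_frame = encode_p_frame(i_frame, next_frame)
restored = decode_p_frame(i_frame, p_frame)
```

In this example the I-frame carries all 16 pixels, while the P-frame carries a single difference entry, which is why a decoder that holds the reference in its frame buffer needs far less data for the second frame.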
A bidirectionally predicted frame (B-frame) is an image that is compressed by encoding differences from both a previous and a following frame. Some variants of the MPEG standard call for compressing video into a repeating sequence of I-frames followed by a sequence of alternating B- and P-frames.
Encoded B- and P-frames are typically much smaller than encoded I-frames. A video stream compressed as a sequence of I-, P- and B-frames therefore typically requires far fewer bits than does a video compressed as a sequence of I-frames of similar quality.
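The saving can be made concrete with a short arithmetic sketch. The per-frame sizes in `ASSUMED_BITS` are hypothetical round numbers chosen only for illustration, not measurements from any codec, and the function names are likewise assumptions.

```python
def frame_pattern(group_length):
    """One I-frame followed by alternating B- and P-frames: I B P B P ..."""
    return ["I"] + ["B" if i % 2 == 0 else "P" for i in range(group_length - 1)]


# Hypothetical encoded sizes in bits per frame type (illustrative assumption).
ASSUMED_BITS = {"I": 100_000, "P": 20_000, "B": 10_000}


def stream_bits(frame_types):
    """Total bits needed for a list of frame types under the assumed sizes."""
    return sum(ASSUMED_BITS[t] for t in frame_types)


group = frame_pattern(9)              # I B P B P B P B P
mixed_bits = stream_bits(group)       # one I-frame, four B-frames, four P-frames
all_i_bits = stream_bits(["I"] * 9)   # the same nine frames sent as I-frames
```

Under these assumed sizes the mixed sequence needs 220,000 bits against 900,000 bits for nine I-frames.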
Video Conferencing
Video conferencing has become increasingly popular in recent years for both educational and business applications. Video conferencing generally requires that both a video and an audio stream be transmitted in realtime between locations that can be many miles apart. Both unidirectional and bidirectional video conferencing systems are known. Since high-bandwidth connections are not always available between locations at reasonable cost, it is desirable to minimize the bandwidth required for video transmission. It is therefore desirable to minimize the number of I-frames that must be transmitted.
During decompression of a video, should a frame be corrupted or missing, such as when packets are dropped during transmission or when a new viewer first joins a videoconference and has no prior frame, the following B- and P-frames will also be corrupted. Further, this corruption will continue until the corrupt data is replaced in the frame buffer, such as when an I-frame is received.
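How corruption persists until the next I-frame can be sketched with a simplified model. It assumes a stream of I- and P-frames only, with each P-frame depending on the decoder's frame buffer; B-frames, which also depend on a following frame, are left out for brevity. The function name `corrupted_frames` is hypothetical.

```python
def corrupted_frames(frame_types, lost_index):
    """Return the indices of frames that decode incorrectly when one frame is lost.

    Simplified model: the frame buffer stays invalid from the lost frame
    until the next I-frame replaces its contents.
    """
    bad = []
    buffer_ok = True
    for i, frame_type in enumerate(frame_types):
        if i == lost_index:
            buffer_ok = False      # the lost frame leaves stale data in the buffer
        elif frame_type == "I":
            buffer_ok = True       # a full image replaces the stale data
        if not buffer_ok:
            bad.append(i)
    return bad


stream = list("IPPPPIPPP")
damaged = corrupted_frames(stream, lost_index=2)
```

Here losing frame 2 damages frames 2 through 4, and decoding recovers only when the I-frame at index 5 arrives.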
Video conferencing systems are known wherein the video stream is examined for points where large differences occur between frames, such as at scene changes, and I-frames are transmitted only at these points. With systems of this type, I-frames may occur rarely; they may be separated by hundreds of B- and P-frames. Since video conference transmissions are also often sent at low frame rates, image corruption may persist for tens of seconds.
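One way such difference-triggered I-frame placement might work is sketched below. The mean-absolute-difference metric, the threshold value, and the function names are assumptions made for illustration; the systems referred to above may use other measures.

```python
def mean_abs_diff(a, b):
    """Mean absolute per-pixel difference between two equally sized frames."""
    total = sum(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))
    count = sum(len(row) for row in a)
    return total / count


def choose_frame_types(frames, threshold):
    """Send an I-frame first, then an I-frame only where the change exceeds threshold."""
    types = ["I"]
    for prev, cur in zip(frames, frames[1:]):
        types.append("I" if mean_abs_diff(prev, cur) > threshold else "P")
    return types


# Two nearly identical dark frames, then a cut to two nearly identical bright frames.
dark_a = [[10, 10], [10, 10]]
dark_b = [[10, 12], [10, 10]]
bright_a = [[200, 200], [200, 200]]
bright_b = [[200, 201], [200, 200]]

types = choose_frame_types([dark_a, dark_b, bright_a, bright_b], threshold=30)
```

Only the cut from dark to bright exceeds the assumed threshold, so the second I-frame lands at the scene change and nowhere else.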
Since an I-frame requires many more bits than a typical B- or P-frame, inserting I-frames into a low-bitrate realtime video transmission causes bursts of data. These bursts can disrupt transmission, interfering with audio and causing visible artifacts such as momentary freezing of parts of the screen. It is desirable to minimize these bursts while transmitting video.
In “Robust H.263 Video Coding for Transmission over the Internet,” Willebeek-LeMair, et al., INFOCOM 1998: 225-232, available at http://www.ieee-infocom.org/1998/papers/02c—4.pdf, it is proposed that a sequence of macroblocks (herein I-blocks) be transmitted instead of complete I-frames. The H.263 referenced in this title is the H.263 specification for transmission of compressed video in videoconferencing applications published by the International Telecommunication Union. Each I-block encodes a portion of a frame in full, while the remaining portions of the frame are typically encoded as P-blocks based upon previous frames. Successive I-blocks encode differing portions of the frame in full, such that as successive frames are transmitted an entire frame buffer is updated. In Willebeek-LeMair, it is proposed that I-blocks be inserted into a video stream based upon their impact on future frames. The mechanism of Willebeek-LeMair is applicable to unidirectional videoconference systems. The system of Willebeek-LeMair poses difficulties in realtime or bidirectional video conference systems because future frames are not always known in these realtime systems.
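The idea of refreshing a frame buffer piecewise with I-blocks can be sketched with a simple round-robin schedule. This is not the selection method of Willebeek-LeMair, which chooses I-blocks by their impact on future frames; the fixed rotation below is a simpler stand-in that needs no knowledge of future frames, and the function names and block counts are assumptions for illustration.

```python
def block_types(frame_number, blocks_per_frame, i_blocks_per_frame):
    """Coding type of each block in one frame: a rotating window of I-blocks,
    with every other block coded as a P-block."""
    start = (frame_number * i_blocks_per_frame) % blocks_per_frame
    refreshed = {(start + k) % blocks_per_frame for k in range(i_blocks_per_frame)}
    return ["I" if b in refreshed else "P" for b in range(blocks_per_frame)]


def frames_to_full_refresh(blocks_per_frame, i_blocks_per_frame):
    """Number of consecutive frames needed to refresh every block once."""
    return -(-blocks_per_frame // i_blocks_per_frame)  # ceiling division


# Six blocks per frame, two of them sent as I-blocks in each frame.
schedule = [block_types(n, 6, 2) for n in range(3)]
refresh_period = frames_to_full_refresh(6, 2)
```

With six blocks and two I-blocks per frame, every block has been sent in full after three frames, so each frame carries a small, constant amount of intra-coded data rather than one large burst.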
Many videoconference systems operate by capturing video in a compression device, then compressing and transmitting the captured video.
Computer Displays
Videoconference systems often transmit computer display information as compressed video to a remote decompression system. This computer display information may take the form of a remote desktop. The computer display information may include graphics, and may include video in a window.
Specialized products for compression and transmission of an image of part or all of a computer display to at least one other computer system have also been marketed.
Windows Screen Paint
Programs operating under Microsoft Windows typically use screen paint system messages to control refreshing or writing to windows on the display. These WM_PAINT and WM_NCPAINT messages may be passed by the operating system to a program to instruct the program to refresh part or all of its portion of the screen in display memory. Programs may also call procedures for invalidating portions of a screen that in turn cause a WM_PAINT or WM_NCPAINT message to be sent to themselves.