(1) Field of the Invention
The present invention relates to a method and apparatus for adaptively compressing and transmitting streaming video over a network. As used herein, the terms “streaming video” and “video stream” are used to refer to video media made up of a continuous sequence of individual image frames that are transmitted over a network from a source to a destination such that the image frames are received at the destination at approximately the same frame rate they are transmitted by the source (i.e. in “real time”).
(2) Background of the Invention
Transmitting streaming video over a network, such as an intranet or the internet, presents numerous challenges. A main limiting factor is the bandwidth (commonly measured in bits/second) available to a particular video stream. Typically, the amount of data contained in an uncompressed, high definition video stream (sometimes referred to as the “bit rate,” quantified, like bandwidth, in bits/second) exceeds the bandwidth available for transmission of that video stream over a particular network. For successful transmission, the effective bit rate of the transmitted video stream must be reduced to fit the available bandwidth.
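As a rough illustration of the mismatch described above, the following sketch computes the bit rate of an uncompressed stream and the factor by which it would have to be reduced to fit a link. The frame size, color depth, frame rate, and link bandwidth are assumed values chosen only for the example, and the function names are hypothetical.

```python
def uncompressed_bit_rate(width, height, bits_per_pixel, frames_per_second):
    """Return the raw bit rate of a video stream in bits/second."""
    return width * height * bits_per_pixel * frames_per_second


def required_reduction(bit_rate, bandwidth):
    """Return the factor by which the bit rate must shrink to fit the bandwidth."""
    return bit_rate / bandwidth


# Assumed example values: 1920x1080 frames, 24 bits per pixel, 30 frames/second,
# sent over a link that offers 10,000,000 bits/second to this stream.
raw_bit_rate = uncompressed_bit_rate(1920, 1080, 24, 30)
link_bandwidth = 10_000_000

print(raw_bit_rate)                                             # 1492992000
print(round(required_reduction(raw_bit_rate, link_bandwidth), 1))  # 149.3
```

Under these assumed numbers the stream would need to be reduced by a factor of roughly 150 before it could be carried by the link.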
Various encoding and compression methods have been devised to reduce the amount of data of a transmitted video stream. As used herein, the term “encoding” means converting the data representing a video stream (i.e. the individual image frames) from one representation form to another without necessarily changing the amount of data representing the video stream, while “compressing” means reducing the amount of data representing the video stream. For example, one form of encoding is to convert an image from one “color space” (e.g. RGB) to another (e.g. YUV). Such encoding does not itself reduce the size of the video media, but can result in a form that lends itself more readily to compression.
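The color space example above can be sketched as follows. The coefficients are the commonly used BT.601 luma weights with the classic analog U/V scale factors; they are an assumption made for illustration, since the text does not specify a particular RGB-to-YUV conversion. The sketch shows that a pixel has three components before and after the conversion, and that the conversion can be reversed.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to YUV (assumed BT.601-style coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luma: weighted brightness
    u = 0.492 * (b - y)                     # chroma: blue difference
    v = 0.877 * (r - y)                     # chroma: red difference
    return y, u, v


def yuv_to_rgb(y, u, v):
    """Invert rgb_to_yuv by solving the same equations for r, g, b."""
    r = y + v / 0.877
    b = y + u / 0.492
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b


pixel = (200, 120, 40)                      # arbitrary example pixel
yuv = rgb_to_yuv(*pixel)
restored = yuv_to_rgb(*yuv)

print(len(pixel), len(yuv))                 # same number of components
print([round(c) for c in restored])         # original pixel, up to rounding
```

The conversion by itself saves nothing, but separating brightness from color is what later allows a codec to treat the two kinds of information differently.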
Compression/encoding methods are typically implemented by a hardware and/or software device referred to as a “codec.” A particular codec may implement a single compression or encoding method or may combine several compression and encoding methods. Characteristics of a codec include compression ratio and compression quality. Compression ratio refers to the ratio of the original size of the media to its size after compression, so that a higher compression ratio corresponds to a smaller compressed size. Compression quality refers to how accurately the decompressed destination media recreates the source media. “Lossless” codecs compress the source media at the source and decompress the received compressed media at the destination such that the decompressed media at the destination is an exact copy of the source media with no loss of information or quality. “Lossy” codecs achieve greater compression ratios than lossless codecs but cause some loss of information in the received, decompressed media. The human eye is more sensitive to some types of image attributes than others. Lossy codecs attempt to preserve the information to which the human eye is most sensitive and to confine the lost information to information whose absence is less noticeable. Codecs commonly have settings or parameters that can be varied to achieve different compression ratios, with higher compression ratios generally resulting in lower compression quality.
The degree to which an image can be compressed to achieve a given compression quality depends on the amount of detailed information in the image, sometimes referred to as the “entropy” of the image. Images, or regions of images, that have little texture and little variation in color (such as, for example, blue cloudless sky) have a low entropy. Image regions that have many color variations and texture (such as, for example, a meadow with multi-colored flowers) have a high entropy. Low entropy images (or image regions) can be compressed to a greater degree than high entropy images (or image regions) at a given compression quality.
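One way to make the notion of image entropy concrete is the Shannon entropy of the pixel-value histogram, sketched below. This particular measure is an assumption made for illustration; the text uses “entropy” in a general sense and does not prescribe a formula. The two synthetic regions, and the use of the lossless zlib compressor as a stand-in for a video codec, are likewise only illustrative.

```python
import math
import random
import zlib
from collections import Counter


def shannon_entropy(samples):
    """Return the entropy, in bits per sample, of the sample-value histogram."""
    total = len(samples)
    return sum((n / total) * math.log2(total / n)
               for n in Counter(samples).values())


rng = random.Random(0)  # fixed seed so the example is repeatable

# Stand-in for a cloudless sky: every pixel has the same value.
sky = bytes([200]) * 4096
# Stand-in for a flowery meadow: pixel values vary widely.
meadow = bytes(rng.randrange(256) for _ in range(4096))

for name, region in (("sky", sky), ("meadow", meadow)):
    compressed_size = len(zlib.compress(region))
    print(name, round(shannon_entropy(region), 2), compressed_size)
```

The uniform region has zero entropy and shrinks to a few dozen bytes, while the highly varied region has an entropy near 8 bits per sample and barely shrinks at all.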
Because the entropies of the individual images in video streams vary both within a given video stream and between different video streams, the compression ratio that will be ultimately achieved using a given codec with given settings will vary from video stream to video stream. Without prior knowledge of the entropy of each of the images of a particular video stream, it is difficult to choose the compression methodology and parameters needed to achieve a desired compressed bit rate for that video stream.
If the video stream is created from a previously recorded video media file, a “two pass” compression/encoding procedure can be used. During a first pass, the video media file is analyzed to gather information about the entropy of each of the images in the file. During a second pass, that information is used to select the encoding/compression parameters that will produce the desired compression ratio while maximizing the compression quality.
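The two pass procedure can be sketched, under simplifying assumptions, as follows. The first pass measures a per-frame entropy; the second pass divides a fixed total bit budget among the frames in proportion to that entropy. Proportional allocation, the histogram entropy measure, and the `min_weight` floor are hypothetical choices made for this example and are not taken from any particular codec.

```python
import math
from collections import Counter


def frame_entropy(pixels):
    """Return the entropy, in bits per pixel, of a frame's value histogram."""
    total = len(pixels)
    return sum((n / total) * math.log2(total / n)
               for n in Counter(pixels).values())


def first_pass(frames):
    """Pass 1: analyze every frame and record its entropy."""
    return [frame_entropy(frame) for frame in frames]


def second_pass(entropies, total_bits, min_weight=0.1):
    """Pass 2: split the bit budget among frames in proportion to entropy.

    min_weight (hypothetical) keeps zero-entropy frames from getting no bits.
    """
    weights = [max(e, min_weight) for e in entropies]
    scale = total_bits / sum(weights)
    return [w * scale for w in weights]


# Three toy "frames": flat, highly varied, and two-valued.
frames = [[200] * 64, list(range(64)), [0, 255] * 32]

entropies = first_pass(frames)
budgets = second_pass(entropies, total_bits=60_000)

for entropy, budget in zip(entropies, budgets):
    print(round(entropy, 2), round(budget))
```

Because the whole file is analyzed before any frame is compressed, the high entropy frame can be given most of the budget without the total exceeding the target.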
Where the source of the video stream is not a recorded file but a live video feed, two pass encoding cannot be used. Instead, encoding/compression parameters must be chosen based on predicted entropy characteristics without actual knowledge of the true entropy characteristics of the video stream. As a result, optimum encoding/compression to meet a specified or available bandwidth is difficult to achieve. Instead, a video stream is likely to be either overcompressed (resulting in a reduction in video quality) or undercompressed (leading to dropped frames as a result of exceeding the available network bandwidth).
Beyond bandwidth limitations, latency and network congestion present further challenges for transmitting video streams over a network. Data is typically sent over a network in the form of data packets. Source data is divided into individual packets, which may vary in size, each of which contains addressing and other information needed for the network to convey the packet from the source to the destination. Network latency generally refers to the delay between the time that a data packet is transmitted at a source and the time the data packet is received at the destination. In the context of streaming video, for example when the video represents a live event captured by a video camera, latency may refer to the delay from the time that the live event occurs to the time the video stream portion showing that event is visible to a viewer receiving the video stream. Latency in the context of streaming video may be expressed, for example, in terms of time (e.g. seconds or milliseconds) or in terms of frames.
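The two ways of expressing latency are related through the frame rate, as the following sketch shows. The function names and the example values are assumptions chosen for illustration.

```python
def latency_in_frames(latency_seconds, frames_per_second):
    """Convert a latency expressed in seconds to a latency in frames."""
    return latency_seconds * frames_per_second


def latency_in_seconds(latency_frames, frames_per_second):
    """Convert a latency expressed in frames to a latency in seconds."""
    return latency_frames / frames_per_second


print(latency_in_frames(0.5, 30))   # 15.0 frames at 30 frames/second
print(latency_in_seconds(6, 60))    # 0.1 seconds at 60 frames/second
```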
What is needed is a video stream encoding/decoding method and apparatus that adaptively adjusts to the changing entropy of images in a video stream to optimize the quality of the video stream when transmitted over a network at a given transport bandwidth.