Video signals, such as high-definition television (HDTV) signals, are increasingly being transmitted over networks that have no quality-of-service guarantees, such as Internet Protocol networks. Consequently, information is lost as video streams across these networks.
In a movie theater, a film reel rolls in front of a light at (usually) 24 frames per second. If someone were to look at any single frame in the film, he or she would see a complete picture. Film is like a “flip-book” in which each drawing is slightly different from the one before it. Digital video works differently.
Digital video is compressed in order to deliver the highest video quality with the smallest possible amount of data. Frequently, video compression systems work by sending a single “frame” of a video as a complete image and then, instead of sending the entire image of the next frame, sending only the difference between that next image and the image preceding it. Because only the difference between the two images is sent, less data is required to transmit the next image than would be required to send a whole image.
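For example and without limitation, difference-based transmission can be sketched as follows. This is a minimal illustration only: frames are modeled as flat lists of pixel values, and the names `encode_delta` and `apply_delta` are hypothetical and do not correspond to any particular codec.

```python
def encode_delta(prev_frame, next_frame):
    """Return (index, new_value) pairs for only the pixels that changed."""
    return [
        (i, new)
        for i, (old, new) in enumerate(zip(prev_frame, next_frame))
        if old != new
    ]


def apply_delta(prev_frame, delta):
    """Rebuild the next frame from the previous frame plus the changes."""
    frame = list(prev_frame)
    for i, value in delta:
        frame[i] = value
    return frame


# A bright two-pixel object moves one position to the right.
frame0 = [10, 10, 10, 10, 200, 200, 10, 10]
frame1 = [10, 10, 10, 10, 10, 200, 200, 10]

delta = encode_delta(frame0, frame1)        # only 2 of 8 pixels are sent
reconstructed = apply_delta(frame0, delta)  # identical to frame1
```

In this illustration only two entries are transmitted for the second frame instead of eight pixel values, and the receiver can rebuild the second frame only if it holds the first.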
When packets of data are lost in a network, the instructions for taking one image and adding the differences between images are often lost as well. This leads to images being rendered improperly. If the amount of data loss is small, then the difference between the original image and the rendered image that a viewer sees may be too small for an individual to notice. Even if the differences are noticeable, a person would probably be unable to quantify the differences between the two images.
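For example and without limitation, the effect of a lost difference can be sketched as follows. The sketch assumes signed per-pixel differences and a decoder that simply repeats the previous image when a difference is missing; both are illustrative assumptions, and the name `decode` is hypothetical.

```python
def decode(key_frame, deltas):
    """Rebuild frames from a complete first image and per-frame differences.

    A delta of None models a lost packet: the decoder has nothing to add,
    so it reuses the previous image unchanged.
    """
    frames = [list(key_frame)]
    for delta in deltas:
        prev = frames[-1]
        if delta is None:
            frames.append(list(prev))
        else:
            frames.append([p + d for p, d in zip(prev, delta)])
    return frames


key_frame = [10, 10, 10, 10]
deltas = [[5, 0, 0, 0], [0, 5, 0, 0], [0, 0, 5, 0]]

intact = decode(key_frame, deltas)
lossy = decode(key_frame, [deltas[0], None, deltas[2]])  # second delta lost

# Sum of absolute pixel errors for each rendered frame.
errors = [
    sum(abs(a - b) for a, b in zip(good, bad))
    for good, bad in zip(intact, lossy)
]
```

Under these assumptions the error introduced by the single lost difference persists in the following frame as well, because every later image is built on the improperly rendered one.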
For the purpose of this specification, the term “fidelity” is defined as the degree to which a network path accurately reproduces the sound or image of its input information.
For example and without limitation, the differences between the original image and the rendered image are a result of losses of fidelity. Loss of fidelity may result from various issues, for example and without limitation, the quality of service along a network path, other network conditions, losses inherent to compression, problems with compressing the information, etc. The data may be, for example and without limitation, video, images, telephony, streaming audio, any time-sensitive data, any quality-of-service sensitive data, etc.
For the purposes of this specification, a “network path” is defined as a pair of source and destination nodes in a network.
The service provided by a network path is characterized by its “quality of service,” which, for the purposes of this specification, is defined as a function of the bandwidth, error rate, and latency, and their time derivatives, of the network path.
For the purposes of this specification, the “bandwidth” from one node to another is defined as an indication of the amount of information per unit time that can be transported from the first node to the second. Typically, bandwidth is measured in bits or bytes per second. For the purposes of this specification, the “error rate” from one node to another is defined as an indication of the amount of information that is corrupted as it travels from the first node to the second. Typically, error rate is measured in bit errors per bit or packets lost per packet. For the purposes of this specification, the “latency” from one node to another is defined as an indication of how much time is required to transport information from one node to another. Typically, latency is measured in seconds.
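For example and without limitation, the three quantities defined above, and a time derivative of one of them, might be estimated from simple measurements as sketched below. The function names and the sample figures are hypothetical and are chosen only for illustration.

```python
def bandwidth_bps(bits_delivered, interval_s):
    """Amount of information transported per unit time, in bits per second."""
    return bits_delivered / interval_s


def packet_loss_rate(packets_sent, packets_received):
    """Error rate expressed as packets lost per packet sent."""
    return (packets_sent - packets_received) / packets_sent


def mean_latency_s(send_times, receive_times):
    """Average time, in seconds, to transport a packet between two nodes."""
    delays = [r - s for s, r in zip(send_times, receive_times)]
    return sum(delays) / len(delays)


def rate_of_change(current, previous, interval_s):
    """Finite-difference estimate of a time derivative."""
    return (current - previous) / interval_s


# 1000 packets of 1500 bytes are sent; 990 arrive within 2 seconds.
bw = bandwidth_bps(990 * 1500 * 8, 2.0)
loss = packet_loss_rate(1000, 990)
latency = mean_latency_s([0.0, 0.5, 1.0], [0.25, 0.75, 1.25])
# Bandwidth was 5,000,000 bits per second in the previous 2-second window.
bw_trend = rate_of_change(bw, 5_000_000.0, 2.0)
```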
There exist several methods to determine fidelity. There are subjective techniques, for example and without limitation, asking an individual, asking a group and taking an average (“mean opinion score”), etc. There are also objective techniques. These techniques fall into three main categories: full-reference, reduced-reference, and no-reference techniques.
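For example and without limitation, a mean opinion score can be computed as the arithmetic mean of the individual ratings. The sketch below assumes a five-point rating scale (1 being the worst and 5 the best), which is an illustrative assumption; the ratings shown are invented for the illustration.

```python
def mean_opinion_score(ratings):
    """Average the ratings given by a group of viewers."""
    if not ratings:
        raise ValueError("at least one rating is required")
    return sum(ratings) / len(ratings)


# Five hypothetical viewers rate the same rendered video.
viewer_ratings = [4, 5, 3, 4, 4]
mos = mean_opinion_score(viewer_ratings)
```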
In full-reference techniques, the original image is compared to the received image using image processing techniques. Hence, full-reference techniques require access to both the original transmitted and the received video sequences. These measurements are computationally intensive.
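For example and without limitation, one widely used full-reference comparison is the peak signal-to-noise ratio (PSNR), which is derived from the mean squared error between the original and received images. PSNR is named here only as an illustration of the category. The sketch assumes 8-bit pixel values (a peak of 255) held in flat lists of equal length.

```python
import math


def mean_squared_error(original, received):
    """Average squared per-pixel difference between two images."""
    return sum((a - b) ** 2 for a, b in zip(original, received)) / len(original)


def psnr_db(original, received, peak=255):
    """Peak signal-to-noise ratio in decibels; infinite for identical images."""
    err = mean_squared_error(original, received)
    if err == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / err)


original = [100, 100, 100, 100]
received = [100, 104, 100, 96]

mse = mean_squared_error(original, received)
quality = psnr_db(original, received)
```

Every pixel of both images is visited, which illustrates why the computation grows with the resolution and frame rate of the video.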
Reduced-reference techniques extract various features from both the original and the distorted video sequences and compare the extracted features of the original and the distorted images to each other. While the comparison of only the extracted features reduces the computational overhead, it may still be computationally intensive to extract the features from the source video. Additionally, the extracted features of the original image would still need to be sent across the network and synchronized to the received image. For off-line network assessment, a pre-defined set of images can be used.
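For example and without limitation, a reduced-reference comparison can be sketched with two simple features per image, the mean and the variance of the pixel values. The choice of features and the names `extract_features` and `feature_distance` are assumptions made only for illustration; practical techniques typically use richer features.

```python
def extract_features(frame):
    """Summarize an image by the mean and variance of its pixel values."""
    n = len(frame)
    mean = sum(frame) / n
    variance = sum((p - mean) ** 2 for p in frame) / n
    return (mean, variance)


def feature_distance(reference_features, received_features):
    """Sum of absolute differences between corresponding features."""
    return sum(abs(a - b) for a, b in zip(reference_features, received_features))


original = [100, 120, 100, 120]  # textured source image
received = [110, 110, 110, 110]  # texture lost in transmission

# Only the two reference features, not the image, cross the network.
reference_features = extract_features(original)
received_features = extract_features(received)
distance = feature_distance(reference_features, received_features)
```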
No-reference techniques use only the received distorted image. These techniques can be pixel-based or bitstream-based and are more suitable for both in-service monitoring and off-line network assessment for video quality. Pixel-based techniques look for known distortions in the images to assess quality. However, they cannot handle video sequences with unanticipated distortions. Bitstream-based techniques are computationally lighter since they do not require decoding. With these techniques, the assessment of video quality itself is computationally simple. However, additional upfront computation is required.
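For example and without limitation, a pixel-based no-reference technique might look for one known distortion, blockiness, by comparing the pixel steps that fall on block boundaries with the steps inside blocks. The sketch operates on a single row of pixels and assumes a block width of 8, which is an illustrative assumption; the name `blockiness` is hypothetical.

```python
def blockiness(row, block=8):
    """Mean step across block boundaries minus mean step inside blocks.

    A large positive value suggests visible block edges; a value near
    zero suggests none. No original image is consulted.
    """
    boundary, interior = [], []
    for i in range(1, len(row)):
        step = abs(row[i] - row[i - 1])
        if i % block == 0:
            boundary.append(step)
        else:
            interior.append(step)
    if not boundary or not interior:
        return 0.0
    return sum(boundary) / len(boundary) - sum(interior) / len(interior)


blocky_row = [50] * 8 + [90] * 8   # flat blocks with a sharp edge between them
smooth_row = list(range(50, 66))   # gentle gradient, no block structure

blocky_score = blockiness(blocky_row)
smooth_score = blockiness(smooth_row)
```

A distortion that does not align with the assumed block grid would go undetected by this measure, which illustrates the limitation regarding unanticipated distortions.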
All of the above methods require large amounts of computation. Consequently, they can be used only sparingly without unduly burdening processing resources.