Video (and audio) has moved into the digital age, wherein still pictures and video are now imaged by CCD solid state devices instead of “analog” chemical films, and processed by microprocessors. One side effect of the digitization process is that very large amounts of data are produced, which may be larger than the memory capacity of the digital imaging devices. Lossless and lossy compression algorithms have been developed to reduce these large amounts of data to a manageable size. Unfortunately, completely lossless compression reduces the size of files by only about 50% or less. Much higher compression ratios are needed, which require lossy compression methods. It is the goal of such methods that the resulting losses would not be noticeable to the human visual system. While several standards have been developed with these goals in mind, some unintended artifacts may be produced in the resulting images, which may be visually noticeable. Many of today's video encoders attempt to reduce or remove distorting elements that would likely be missed by a human viewer, but this is a matter of subjective judgment.
It is desirable to quantify the degree of distortion. Compression distortion is often quantified using a “before and after” technique wherein a computer compares frames before and after encoding. This requires access to the original frames. Others have created “single-ended, reference free” techniques, but these techniques cannot discern small distortions, which renders them unsuitable for professional work. They are also sensitive to the underlying video content. Test signals such as “color bars” and “multiburst” have been in use for many years, but were designed to test degradations of analog signals, and are thus unsuitable for testing digital compression systems.
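As an illustration of the “before and after” approach, the following Python sketch computes the peak signal-to-noise ratio (PSNR) between an original frame and its decoded counterpart. PSNR is one commonly used full-reference measure and is chosen here as an assumption; the text above does not name a specific metric. The function name, the flat-list frame representation, and the 8-bit peak value of 255 are likewise illustrative.

```python
import math


def psnr(reference, decoded, peak=255):
    """Return the PSNR in dB between two equally sized frames.

    Frames are given as flat sequences of sample values (an
    illustrative simplification of a real two-dimensional frame).
    """
    if len(reference) != len(decoded):
        raise ValueError("frames must contain the same number of samples")
    if not reference:
        raise ValueError("frames must not be empty")

    # Mean squared error between the original and the decoded samples.
    mse = sum((a - b) ** 2 for a, b in zip(reference, decoded)) / len(reference)

    # Identical frames have no error, so the ratio is unbounded.
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    original = [100, 100, 100, 100]
    after_codec = [101, 99, 101, 99]  # every sample is off by one code value
    print(round(psnr(original, after_codec), 2))
```

Note that the function needs both frames, which is exactly the limitation described above: the original frames must be available at the point of measurement.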
Another class of distortions involves “transformations” and “reverse transformations” of colors in images using incompatible transmission standards. For example, each colored pixel in a video image is represented as a set of three values corresponding to red, green, and blue (RGB). It is common practice to perform a reversible matrix transformation on a first set of red, green, and blue values to obtain a different set of values, e.g., Y′CbCr, prior to transmitting or encoding the Y′CbCr values, and then to transform the Y′CbCr values back to RGB values. One advantage of performing this type of transformation is that the resolution of the two color difference arrays, Cb and Cr, can be reduced as a form of compression. There exists more than one reversible transform for this purpose. For many years the “Rec.601” standard (an abbreviation for ITU-R Recommendation BT.601) was used for standard definition video, but with the introduction of high definition television, a new transform was defined for high definition video, known in the art as “Rec.709” (an abbreviation for ITU-R Recommendation BT.709). Unfortunately, these two standards are incompatible, i.e., encoding with one standard while decoding with the other standard introduces color errors. It would be desirable to detect when such a transform mismatch has occurred.
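The color error caused by such a mismatch can be illustrated numerically. The following Python sketch is not part of the original disclosure. It assumes normalized, non-quantized RGB values in the range 0 to 1 and uses the published luma coefficients of Rec.601 and Rec.709; the function and constant names are illustrative only.

```python
# Luma coefficients (Kr, Kb) of each recommendation; Kg = 1 - Kr - Kb.
REC601 = (0.299, 0.114)
REC709 = (0.2126, 0.0722)


def rgb_to_ycbcr(rgb, coeffs):
    """Forward transform: normalized RGB -> (Y', Cb, Cr)."""
    kr, kb = coeffs
    kg = 1.0 - kr - kb
    r, g, b = rgb
    y = kr * r + kg * g + kb * b
    cb = (b - y) / (2.0 * (1.0 - kb))
    cr = (r - y) / (2.0 * (1.0 - kr))
    return (y, cb, cr)


def ycbcr_to_rgb(ycc, coeffs):
    """Reverse transform: (Y', Cb, Cr) -> normalized RGB."""
    kr, kb = coeffs
    kg = 1.0 - kr - kb
    y, cb, cr = ycc
    r = y + 2.0 * (1.0 - kr) * cr
    b = y + 2.0 * (1.0 - kb) * cb
    g = (y - kr * r - kb * b) / kg
    return (r, g, b)


def max_abs_error(a, b):
    """Largest per-channel difference between two RGB triples."""
    return max(abs(x - y) for x, y in zip(a, b))


if __name__ == "__main__":
    green = (0.0, 1.0, 0.0)
    encoded = rgb_to_ycbcr(green, REC601)

    matched = ycbcr_to_rgb(encoded, REC601)     # same standard both ways
    mismatched = ycbcr_to_rgb(encoded, REC709)  # decoder assumes Rec.709

    print("matched error:   ", max_abs_error(green, matched))
    print("mismatched error:", max_abs_error(green, mismatched))
```

In this sketch a saturated green survives a matched round trip essentially unchanged, whereas decoding with the other standard shifts all three channels. Neutral grays have zero Cb and Cr and are therefore unaffected by the mismatch, which suggests that a pattern intended to reveal the mismatch needs saturated colors.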
Another issue related to the transformation between RGB and Y′CbCr is the creation of test patterns in the Y′CbCr domain for analyzing the performance of equipment that processes video in that domain. Further, it is desirable to test the Cb and Cr channels independently from the Y′ channel. For example, a test pattern may be synthesized so that the Y′ “luma” component has a constant value, while the Cb and Cr components are varied in some way that will challenge the equipment under test. Such a pattern may be termed an “iso-luma” pattern. An example is shown in FIG. 1, which is from a Snell and Wilcox test pattern. FIG. 1 shows the alternating green 4 and purple 6 bands typical of a signal applied equally to both the Cb and Cr channels.
Unfortunately, if an “iso-luma” pattern were transcoded from, say, the “Rec.601” domain to the “Rec.709” domain (or vice versa), the luma component would likely not remain constant, and the iso-luma property of the pattern would be lost in the transcoding operation. It would be desirable for a test pattern to remain iso-luminant despite having been transcoded from one domain to another domain.
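The loss of the iso-luma property can be sketched in a few lines of Python. This example is an illustration under stated assumptions, not the pattern of FIG. 1: it builds one row of a hypothetical pattern with constant Rec.601 luma and a sinusoid applied equally to Cb and Cr, transcodes it to Rec.709 by way of normalized RGB, and reports the resulting luma values. All names, the amplitude, and the period are chosen for illustration, and quantization and clipping are ignored.

```python
import math

# Luma coefficients (Kr, Kb) of each recommendation; Kg = 1 - Kr - Kb.
REC601 = (0.299, 0.114)
REC709 = (0.2126, 0.0722)


def iso_luma_row(luma, amplitude, width, period=8):
    """One row of (Y', Cb, Cr) samples with constant Y' and Cb == Cr."""
    row = []
    for x in range(width):
        c = amplitude * math.sin(2.0 * math.pi * x / period)
        row.append((luma, c, c))
    return row


def transcode(ycc, src, dst):
    """Convert one (Y', Cb, Cr) sample from the src to the dst transform."""
    kr, kb = src
    y, cb, cr = ycc
    # Reverse transform to RGB using the source coefficients.
    r = y + 2.0 * (1.0 - kr) * cr
    b = y + 2.0 * (1.0 - kb) * cb
    g = (y - kr * r - kb * b) / (1.0 - kr - kb)
    # Forward transform from RGB using the destination coefficients.
    kr, kb = dst
    y2 = kr * r + (1.0 - kr - kb) * g + kb * b
    cb2 = (b - y2) / (2.0 * (1.0 - kb))
    cr2 = (r - y2) / (2.0 * (1.0 - kr))
    return (y2, cb2, cr2)


def luma_spread(row):
    """Difference between the largest and smallest Y' in a row."""
    lumas = [sample[0] for sample in row]
    return max(lumas) - min(lumas)


if __name__ == "__main__":
    row601 = iso_luma_row(luma=0.5, amplitude=0.1, width=16)
    row709 = [transcode(s, REC601, REC709) for s in row601]
    print("Rec.601 luma spread:", luma_spread(row601))
    print("Rec.709 luma spread:", luma_spread(row709))
```

Under these assumptions the Rec.601 row has zero luma spread by construction, while the transcoded row shows a luma ripple that follows the chroma sinusoid, so the pattern is no longer iso-luminant in the Rec.709 domain.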
Since the introduction of “talkies” (movies with sound), synchronizing the image sequence and the sound has been an issue. The film sound pickup head was located beyond the lower loop, and a loop that was too small would result in the sound being played too early. The error is most easily detected in scenes where people are talking, because their lips are out of synchronization with the sound. Thus, a timing offset between the video and audio streams, called “lipsync error,” would result.
Analog television systems did not suffer much from lipsync errors until the advent of video processing techniques that used one or more frames of delay, such as a frame synchronizer or a digital video effects unit (DVE). As the cost of electronic storage has dropped, more and more frames of delay are included in the processing path of the video. Since audio is usually handled in a separate signal path during production, the equipment that introduced the video delay usually has no means of correcting for the video delay in the audio path. Video compression systems use 0.5 to 1.0 second or more of buffering in the process of encoding and decoding video. Although compression standards are quite clear about how to avoid lipsync errors, mistakes creep in. Lipsync errors can accumulate as the video and audio progress through an equipment chain.
It is desirable to be able to quantify the lipsync error. Further, it is desirable to quantify this error simply by observing a video/audio test sequence, without resorting to companion equipment.
One method in the prior art for indicating lipsync error is a test sequence wherein a rotating “clock hand” passes through the vertical position at the same time an audio event, such as a tone or click, is heard. This pattern helps determine that lipsync error has occurred, but does not quantify the error. Other products have been developed which include a frame synchronizer with an integrated audio delay, and have included a built-in test sequence to facilitate adjustment of the delay. Other proposed test sequences include a video element that flashes at the time of a tone; some equipment manufacturers have proposed companion test equipment (a light sensor and a microphone) to facilitate error measurement. But such companion equipment is, at times, inconvenient and depends primarily on a trial-and-error approach to correct lipsync error.
Accordingly, what would be desirable, but has not yet been provided, is a method for generating more numerically and visually discriminative test patterns for effectively and automatically quantifying losses in at least one of video and audio equipment.