Video quality testing has been performed for many years. Prior to the advent of digital compression techniques for video, formal subjective testing had been used with a relatively stable set of standard methods. In brief, a number of non-expert observers are selected, tested for their visual capabilities, shown a series of test scenes for about 10 to 30 minutes in a controlled environment, and asked to score the quality of the scenes in one of a variety of manners. Usually, for full reference testing, the reference sequence is shown first, followed by the sequence under test, and the viewer is asked to rate the sequence under test with respect to the reference sequence. Further details of subjective measurements can be found in the relevant standard ITU-R BT.500 “Methodology for the Subjective Assessment of the Quality of Television Pictures”. This standard was first issued in 1974 and was formerly known as CCIR Rec.500, and version 7 of this document covers the past proposed methods for subjective testing.
Subjective testing using human viewers has some advantages, in that valid results may be produced for both conventional and compressed television systems, and it can work well over a wide range of still and motion picture applications. However, there are clear disadvantages in that the precise set-up of the test can affect the result obtained, that meticulous set-up and control are required, and that in order to obtain statistically significant results a great many human viewers must be selected and screened. These disadvantages render subjective testing complex and time-consuming, with the result that whilst subjective tests may be applicable for development purposes, they do not lend themselves to operational monitoring, production line testing, or the like.
In order to get around the disadvantages of human subjective testing as described above, therefore, it is also known in the art to provide for the automatic assessment of video quality, using automated, and usually computer based, video comparison techniques. A prior art system which performs automatic picture quality analysis is the PQA 300 system from Tektronix Inc of 14200 SW Karl Braun, P.O. Box 500, Beaverton, Oreg. 97077 USA. The PQA 300 works by measuring a two second portion of a five second video test sequence. The video test sequences may be downloaded from CD ROM or recorded from video, and played out to the system under test. The output of the system under test is then stored and analysis thereof performed with DSP accelerated hardware on the two second sequence. The measurement results in a single numeric value of picture quality called the “picture quality rating”. The PQA 300 employs a human vision system model known as JND Metrix and performs three different types of analysis of the video information, being spatial analysis, temporal analysis, and full colour analysis, in order to generate the picture quality rating. Additionally, the PQA 300 provides PSNR values which are displayed in the form of an animated map whose intensity is related to the PSNR differences between the reference and the test images. In summary therefore, the PQA 300 is able to analyse test and reference video sequences in order to generate a video quality value, as well as PSNR measurements.
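By way of illustration only, the PSNR measure referred to above is conventionally derived from the mean squared error between corresponding pixels of the reference and test images. The following is a minimal sketch of that conventional calculation and is not a description of how the PQA 300 computes its values. The function names `mse` and `psnr`, the 8-bit peak value of 255, and the representation of a frame as a list of rows of luminance values are all assumptions made for the example.

```python
import math


def mse(reference, test):
    """Mean squared error between two equally sized greyscale frames.

    Each frame is assumed to be a non-empty list of rows of pixel values.
    """
    if not reference or len(reference) != len(test):
        raise ValueError("frames must be non-empty and have the same height")
    total = 0
    count = 0
    for ref_row, test_row in zip(reference, test):
        if len(ref_row) != len(test_row):
            raise ValueError("frames must have the same width")
        for r, t in zip(ref_row, test_row):
            total += (r - t) ** 2
            count += 1
    if count == 0:
        raise ValueError("frames must contain at least one pixel")
    return total / count


def psnr(reference, test, peak=255):
    """Peak signal-to-noise ratio in dB (peak=255 assumes 8-bit samples)."""
    error = mse(reference, test)
    if error == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / error)


# Hypothetical 2x2 frames: two pixels differ by 10 grey levels.
reference = [[100, 100], [100, 100]]
degraded = [[100, 110], [100, 90]]
print(mse(reference, degraded))   # mean of 0, 100, 0, 100
print(psnr(reference, degraded))  # value in dB
```

A per-pixel or per-block version of the same squared difference could be rendered as an intensity map of the kind described above, although the exact mapping used by any particular instrument is not specified here.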
Problems can arise, however, with straightforward comparisons of test and reference sequences to generate the quality metrics mentioned above. For example, spatial or temporal misalignment between the whole or parts of the reference and the test sequence can greatly affect such measurements, but may be perceptually insignificant to a human viewer. Such misalignments must be handled if difference measures are to contribute to reliable and practical full reference assessments.
Constant spatial and temporal misalignments are commonly encountered in full reference test situations, and can be countered by a “one-off” alignment applied to the whole reference or degraded sequence. Examples of prior art documents which deal with such one-off alignments are U.S. Pat. Nos. 6,483,538, 6,259,477, 5,894,324, 6,295,083, and 6,271,879. Additionally, field-based spatial or temporal jitter, where misalignments might vary between fields, can be handled by similar techniques applied on a field by field basis. However, more complex, but equally imperceptible, misalignments may also occur within a field or frame, where different regions of a video field or frame might be subject to different shifts, scaling, or delay. For example, spatial warping, missing lines, or frozen blocks can occur through video processing, and need to be taken into account if a picture quality assessment metric is to be produced automatically which can be used in place of human subjective testing results.
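The effect of a constant spatial misalignment, and one generic way of countering it, can be sketched as follows. This is an illustrative exhaustive search over small integer offsets and is not taken from any of the patents cited above. The function names `shifted_mse` and `find_constant_offset`, the search range `max_shift`, and the synthetic 8x8 frame are assumptions made for the example.

```python
def shifted_mse(reference, degraded, dx, dy):
    """MSE over the overlapping area when degraded pixel (x, y) is
    compared with reference pixel (x + dx, y + dy)."""
    height, width = len(reference), len(reference[0])
    total = 0
    count = 0
    for y in range(height):
        ry = y + dy
        if not 0 <= ry < height:
            continue
        for x in range(width):
            rx = x + dx
            if not 0 <= rx < width:
                continue
            diff = reference[ry][rx] - degraded[y][x]
            total += diff * diff
            count += 1
    return total / count if count else float("inf")


def find_constant_offset(reference, degraded, max_shift=2):
    """Try every integer offset within +/- max_shift pixels and return
    (dx, dy, error) for the offset giving the lowest MSE."""
    best = None
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            error = shifted_mse(reference, degraded, dx, dy)
            if best is None or error < best[0]:
                best = (error, dx, dy)
    return best[1], best[2], best[0]


# Hypothetical 8x8 reference frame with a non-repeating pattern.
reference = [[x * x + 10 * y for x in range(8)] for y in range(8)]
# Degraded frame: the same picture moved one pixel to the right,
# with the left-hand edge column replicated.
degraded = [[row[max(x - 1, 0)] for x in range(len(row))] for row in reference]

unaligned_error = shifted_mse(reference, degraded, 0, 0)
dx, dy, aligned_error = find_constant_offset(reference, degraded)
print(unaligned_error, (dx, dy), aligned_error)
```

In this synthetic case the one-pixel shift produces a substantial unaligned error even though the picture content is unchanged, and the error over the overlapping area falls to zero once the offset is found. A single offset of this kind cannot, by construction, account for misalignments that differ between regions of the same field or frame.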