The present invention relates to a video content processing system.
Many applications require quality evaluation of video images. Such evaluations can be subjective or objective. Subjective quality evaluation techniques for video images are specified in ITU-R Recommendation BT.500. The Recommendation provides a methodology for numerical indication of the perceived quality, from the users' perspective, of received media after compression and/or transmission. The score is typically expressed as a single number in the range 1 to 5, where 1 is the lowest perceived quality and 5 is the highest.
Currently there are two main types of objective video degradation measurement processes:
1. Full reference methods (FR), where the whole original video signal is available
2. No-reference methods (NR), where the original video is not available at all
Devices and processes of both types can be used in off-line systems (file-based environment) as well as in on-line systems (live video transmission).
The most widely used FR video quality metric during the last 20 years is Peak Signal-to-Noise Ratio (PSNR). PSNR is used in approximately 99% of scientific papers, but only in 20% of marketing materials. The validity of the PSNR metric is limited and often disputed. This also applies to all PSNR derivatives, such as Structural Similarity (SSIM) and many others.
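For illustration only (this sketch is not taken from any of the cited tools), the PSNR computation for two same-size 8-bit images can be expressed as:

```python
import numpy as np

def psnr(reference: np.ndarray, distorted: np.ndarray, peak: float = 255.0) -> float:
    """Peak Signal-to-Noise Ratio in dB; an FR metric, so both full
    images must be available and have identical dimensions."""
    mse = np.mean((reference.astype(np.float64) - distorted.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images: no distortion
    return 10.0 * np.log10(peak ** 2 / mse)
```

Note that the pixel-wise difference is exactly why an FR metric of this kind cannot be applied when the transcoder rescales the picture: the two arrays no longer align.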
Among NR metrics the best known objective metric is the PAR (Picture Appraisal Rating) algorithm, jointly developed by the BBC and Snell & Wilcox Ltd, UK. PAR is a single-ended (No-Reference) algorithm calculating the weighted sum of DCT quantization errors. PAR is calculated in the transform domain; the complete reconstruction of decompressed images is not required. The PAR error values are taken directly from the compressed stream data, and the measurement results correlate with the subjective scores. Mapping a PAR value (expressed in dB) to a subjective 1-to-5 score is very simple:
PAR=50 dB is equivalent to subjective score 5 (excellent quality or imperceptible impairment), and PAR=30 dB is equivalent to subjective score 3 (fair quality or slightly annoying impairment).
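Taking the two anchor points above and assuming (illustratively) a linear fit between them, the mapping can be sketched as:

```python
def par_to_score(par_db: float) -> float:
    """Map a PAR value in dB to the 1..5 subjective scale.

    Hypothetical linear interpolation through the two published anchors:
    50 dB -> score 5.0 and 30 dB -> score 3.0, clamped to [1, 5].
    """
    score = 3.0 + (par_db - 30.0) * (5.0 - 3.0) / (50.0 - 30.0)
    return max(1.0, min(5.0, score))
```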
Some commercially available software tools, such as Snell & Wilcox “Mosalina”, utilize PAR for the objective measurement of compression artifacts.
DVD-making facilities have worked with PAR for quite a long time. Tektronix, Inc. (USA) produces "Cerify". This PC-based product combines several off-line tools, including a compression quality meter similar to "Mosalina". "VQMA" by VideoQ, Inc., the assignee of the present invention, is a stand-alone NR-class video quality analyzer requiring the presence of a special test pattern at the video processing chain input. VQMA is suitable for measurement of scaling and color space conversion artifacts: image geometry, image aspect ratio, levels of color components, frequency response, aliasing levels, etc. The weighted sum of VQMA measurement results can be mapped to a composite distortion score expressed in dBqi. VideoQ also produces the "StresTracker" dynamic test pattern, suitable for objective measurement of compression artifacts by FR and NR meters.
One drawback of all abovementioned tools is that they provide only external connectivity and a-posteriori analysis. These tools are not integrated with the transcoding engine, so they cannot provide early warning messages and feedback signals during the transcoding process. They only flag the degree of picture quality loss already introduced by the transcoding engine; the user is thus forced to answer a difficult question: does it make sense to re-start the whole file encoding/transcoding process from the very beginning?
Some mezzanine (post-production) encoders are capable of outputting metadata in a "texting" format, i.e. in the form of a short video fragment inserted before or after the main video clip (media-info leader/trailer). A single alphanumeric page allows the operator (compressionist) to see on a standard video monitor screen all details of the last codec settings, thus eliminating the need for any special analyzers or metadata readers. However, these auxiliary data describe only the details of the last codec settings, not the complete processing history of the particular piece of content.
Certain software can provide built-in means of compression quality logging and on-the-fly reporting. For example, FFmpeg is open-source software (GNU General Public License) providing libraries and programs for handling multimedia data. FFmpeg reports on-line the quantization scales (Q values) used for each MPEG slice, thus providing some degree of operational feedback, but it does not report the amount of damage (summary of DCT coefficient errors) caused by the application of these scales. Moreover, FFmpeg's built-in quality meter uses the same unreliable PSNR values (an FR metric) for objective quality loss reporting; being full-reference, the built-in PSNR meter cannot be used in scaling mode, when the output picture size differs from the input picture size. Thus, conventional picture quality loss measurement technologies cover only some aspects of the problem.
On the other hand, long-term efforts of the best experts in analog broadcast TV resulted in the development of quite sophisticated algorithms for objective measurements closely correlated with subjective assessments. For instance, the widely used K-factor (measured on the Pulse & Bar test pattern) represents the maximum of 6 partial measurements. The K-factor covers luminance sharpness, overshoots, ringing (ghost images) and white level non-uniformity (line tilt). The factor represents picture quality degradation as a single value, closely related to perceived picture quality. Mapped to the 5-point subjective picture quality scale, K=0% is equivalent to subjective score 5 (excellent quality) and K=4% is equivalent to subjective score 3 (fair quality). This highly successful technique can be re-used with application to modern digital transcoding technologies.
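The K-factor scheme, i.e. taking the worst of several partial measurements and mapping it to the subjective scale through the anchors stated above, can be sketched as follows (the linear mapping between the anchors is an illustrative assumption):

```python
def k_factor_score(partial_measurements: list[float]) -> float:
    """Combine K-factor partial measurements (each in %) into a 1..5 score.

    Sketch: the K-factor is the maximum (worst case) of the partial
    measurements, mapped linearly through the published anchor points
    0% -> score 5 and 4% -> score 3, clamped to [1, 5].
    """
    k = max(partial_measurements)          # worst of the partial results
    score = 5.0 - k * (5.0 - 3.0) / 4.0    # 0% -> 5.0, 4% -> 3.0
    return max(1.0, min(5.0, score))
```

The same general approach, one worst-case or weighted score combined from several partial measurements, is what the text proposes to re-engineer for digital transcoding QC.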
Of course, the actual partial measurement algorithms and the scaling coefficients used to combine partial results and map them into one final score should be re-engineered, but the general approach and overall QC scheme may be inherited.
FIG. 1 illustrates a prior art video content processing system block diagram. It should be noted that prior art systems typically use external sources of test patterns and external devices to measure the quality loss due to the transcoding of video content.
Referring initially to FIG. 1, the input video content package 102 (often referred to as a "container") typically contains descriptive metadata 104 as well as main video content data 106 of at least one video stream; e.g. in the MXF format it contains metadata and multiple streams wrapped together.
In test mode this input package 102 is replaced by the test stream 108, which may represent a static or dynamic test pattern, or even a short video clip, the so-called "reference video".
Via the input selector 110, input data is fed to the video transcoder 112, typically consisting of several cascaded blocks, such as a decoder 114 for decompressing video data to the baseband YUV/RGB format, a scaler 116 for allowing desired modification of the video frame size and/or video frame rate, a pre-processor 118 for removing picture components undesirable for the current encoding mode, and a compression encoder 120 for compressing and formatting video data into a standard video stream, e.g. into an MPEG2 transport stream or MXF stream/file.
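The cascaded structure of transcoder 112 can be sketched conceptually as a composition of stages (the `Frame` type and stage names are hypothetical placeholders, not part of any cited implementation):

```python
from typing import Callable

Frame = dict                      # stand-in for a baseband video frame
Stage = Callable[[Frame], Frame]  # each block maps a frame to a frame

def make_transcoder(decoder: Stage, scaler: Stage,
                    pre_processor: Stage, encoder: Stage) -> Stage:
    """Compose the cascaded blocks (cf. blocks 114-120 of FIG. 1)
    into a single callable, applied in fixed order."""
    def transcode(frame: Frame) -> Frame:
        for stage in (decoder, scaler, pre_processor, encoder):
            frame = stage(frame)
        return frame
    return transcode
```

The fixed ordering mirrors the figure: decode to baseband, scale, pre-process, then re-encode.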
Output video content package 122 also contains descriptive metadata 124 as well as processed video content 126, which in turn consists of at least one video stream.
In test mode the output package 122 is replaced by the processed test stream 128, which may represent a static or dynamic test pattern, or a reference video clip.
The transcoder 112 operates under the control and supervision of an automated Media Asset Management System 130; in some applications the robotic part of the MAM System 130 is combined with or replaced by a human Operator (Compressionist). In any case, the transcoder usually works as a slave, receiving from outside the Transcoding Preset 132 (a set of the most important controls).
Quality Control in prior art system is available in two non-exclusive forms:
(a) Visual check and visual comparison of input image vs. output image on the screens of two video monitors 134 and 136, connected to the corresponding points of the transcoder 112,
(b) Instrumental analysis of the discrepancies between the input image and output image, performed by the Analyzer 138, containing a Picture Distortions Meter 140, which outputs the QC Report 142; in some embodiment variants of the prior art system the Analyzer 138 may check only the input stream quality or only the output stream quality, without any comparison of the two streams.
Neither of the two abovementioned QC tools is suitable for closed-loop transcoder settings optimization or automated distortion history logging/tracking. They provide "post-factum" analysis, suitable only for the initiation of a re-work request in case of serious image quality loss.
In modern Content Delivery Networks (CDN), the video stream 126 of the output package 122 feeds, via the CDN "cloud" 144, the video player 146 connected to the final destination display 148.
In some (advanced) embodiments of the prior art system the transcoder 112 via Media Assets Management System 130 may receive additional data from the destination player 146.
For example, if the network conditions (e.g. the instant transmission capacity, sometimes called "instant bandwidth") temporarily worsen, the transcoder may compensate for such network congestion by performing a short-term reduction of the outgoing stream bitrate. This creates an optional feedback control loop, shown in FIG. 1, which works as described below.
Current network conditions (Quality of Service) are continuously assessed by the player 146. Through a back-end communication channel 150 and the Media Assets Management System 130, the player assessment data are transferred to the transcoder 112, thus closing the performance optimization loop.
The implementation of the abovementioned optional optimization loop of the prior art systems does not involve any assessment of the actual content features or picture quality loss due to the transcoder operation. It is limited to the selection of a single item within a short list of pre-defined transcoder presets. Each such preset is pre-assigned (mapped) to a single item within the short list of expected network conditions.
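The limitation of that prior-art loop can be illustrated by a minimal sketch; the preset names and bitrate thresholds below are entirely hypothetical:

```python
# Hypothetical lookup table: each expected network condition (minimum
# available bitrate, kbit/s) is pre-assigned one transcoder preset.
PRESETS = [
    (5000, "HD_high_quality"),
    (2500, "HD_low_bitrate"),
    (1000, "SD_medium"),
    (0,    "SD_emergency"),
]

def select_preset(measured_bitrate_kbps: float) -> str:
    """Pick the single preset mapped to the current network condition.

    Note what is missing: no content features and no picture quality
    loss are assessed; this is a pure Quality-of-Service lookup.
    """
    for threshold, preset in PRESETS:
        if measured_bitrate_kbps >= threshold:
            return preset
    return PRESETS[-1][1]
```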
The most important parameters of an audio-visual content processing and delivery system are:
- spatial and/or temporal scaling of moving images and related artifacts
- color space conversion and related artifacts
- compression artifacts introduced by a chain of concatenated codecs.
Together these parameters define the degree of overall picture impairment (quality loss), which depends mainly on the degree of picture degradation and/or picture distortion introduced on the way from the content originator via content distribution channel(s) to the content consumer.
To optimize the system operation and provide high quality content output, it is highly desirable to introduce economically viable, fast, and reliable methods providing for objective measurement of video quality at every step of the video data processing workflow.
An important feature of a modern digital content delivery system is that the abovementioned quality parameters may vary depending on:
- area location within the image (spatial profile)
- current timeline position (temporal profile)
- test point within the data transmission path (cumulative impairment profile).
The quality impairment estimation becomes even more complicated when the video content processing system uses modern sophisticated solutions such as variable frame-rate, variable bitrate and/or switchable video frame size.
Many broadcast TV, IPTV and content delivery experts still think that the main problems are in the fields of inter-operability and compatibility, plus well-known issues linked with Quality of Service. Quality of Service should not be confused with Quality of Experience, though quite often service providers state that their Quality of Service is great, deliberately avoiding Quality of Experience issues.
Due to the astonishing progress in the Quality of Service area, the majority of interoperability problems are being solved. Unified server-based solutions are on the way. In the most optimistic variant the message is: "Produce and pack your content in a format compatible with some platform, and all other functions will follow automatically".
However, the Quality of Experience issues, i.e. the issues related to the handling of picture quality degradation in concatenated scalers/codecs, remain mainly unresolved. Quality Assurance implies Quality Control (QC), which in turn requires reliable, repeatable and objective measurements.
Video content re-purposing and delivery system QC should be fully automatic, for checking thousands of channels and hundreds of formats semi-automatically is not an economically viable option. Periodic checks, polling schemes and similar techniques are unreliable: the miss probability is unacceptably high. Deployment of thousands of stand-alone monitoring devices is unrealistic and uneconomical. Thus, fundamentally different video transcoding workflows and video QC technologies are needed.