Advances in high-speed, broadband Internet access networks are raising expectations for the spread of audiovisual communication services, which transfer audiovisual media containing video and audio data between terminals or between servers and terminals via the Internet.
To improve the transfer efficiency of the audiovisual medium, audiovisual communication services of this type use encoded communication, in which an audiovisual medium is encoded into a plurality of frames by exploiting the intra-image or inter-frame autocorrelation of the medium or human visual characteristics, and is then transferred.
On the other hand, a best-effort network such as the Internet, which is used for the audiovisual communication services, does not always guarantee communication quality. For this reason, when streaming content having temporal continuity, such as an audiovisual medium, is transferred via the Internet, narrow bandwidth or congestion in the communication lines is perceived as degradation in subjective video quality, i.e., the quality a viewer actually senses from the audiovisual medium received and reproduced via the communication lines. Additionally, encoding by an application adds encoding distortions to the video image, which are also perceived as degradation in subjective video quality. More specifically, the viewer perceives degradation in the quality of an audiovisual medium as defocus, blur, mosaic-shaped distortion, and jerkiness in the video image.
In audiovisual communication services that transfer audiovisual media, quality degradation is readily perceived. To provide a high-quality audiovisual communication service, both quality design of applications and networks before the service is provided and quality management after the service starts are important. This requires a simple and efficient video quality evaluation technique capable of appropriately expressing the video quality experienced by a viewer.
As a conventional technique for estimating the quality of an audio medium, which is one type of streaming content, ITU-T (International Telecommunication Union-Telecommunication Standardization Sector) recommendation P.862 defines an objective speech quality evaluation method, PESQ (Perceptual Evaluation of Speech Quality), which takes a speech signal as input. ITU-T recommendation G.107 describes an audio quality estimation method which takes audio quality parameters as input and is used for quality design in VoIP (Voice over IP).
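As an illustration of what a parameter-input estimation method produces, the following minimal sketch shows the well-known conversion from the transmission rating factor R of recommendation G.107 to an estimated mean opinion score (MOS). The function name `r_to_mos` is hypothetical, and the computation of R itself from the individual audio quality parameters is not shown; the sketch only assumes that R has already been obtained.

```python
def r_to_mos(r: float) -> float:
    """Map a transmission rating factor R to an estimated MOS.

    Uses the piecewise conversion given in ITU-T G.107:
    MOS is 1 for R <= 0, 4.5 for R >= 100, and a cubic in R otherwise.
    The function name is illustrative, not part of any standard API.
    """
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


if __name__ == "__main__":
    # R = 93.2 is used here only as a sample input for a high-quality connection.
    print(round(r_to_mos(93.2), 2))
```

A quality designer can thus vary the input parameters, recompute R, and read off the expected MOS without transmitting or analyzing any actual speech signal.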
On the other hand, as a technique for estimating the quality of a video medium, an objective video quality evaluation method which takes a video signal as input has been proposed as a recommendation (e.g., ITU-T recommendation J.144: to be referred to as reference 1 hereinafter). A video quality estimation method which takes video quality parameters as input has also been proposed (e.g., Yamagishi & Hayashi, “Video Quality Estimation Model based on Display size and Resolution for Audiovisual Communication Services”, IEICE Technical Report CQ2005-90, 2005/09, pp. 61-64: to be referred to as reference 2 hereinafter). This technique formulates the relationship between the video quality and each video quality parameter, and expresses the overall video quality as a linear sum of the products of these relationships. A quality estimation model that takes coding parameters and packet loss into account has also been proposed (e.g., Arayama, Kitawaki, & Yamada, “Opinion model for audio-visual communication quality for quality parameters by coding and packet loss”, IEICE Technical Report CQ2005-77, 2005/11, pp. 57-60: to be referred to as reference 3 hereinafter).
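The general shape of a parameter-based model that combines per-parameter quality terms by a linear sum of products, as described for reference 2, can be sketched as follows. This is only one possible reading of that description and not the model of reference 2 itself: the function name `estimate_quality`, the argument names, and all numerical values are hypothetical and chosen purely for illustration.

```python
from itertools import combinations


def estimate_quality(terms, bias, linear_weights, pair_weights):
    """Combine per-parameter quality terms into one quality estimate.

    terms          -- quality terms, one per video quality parameter (hypothetical)
    bias           -- constant offset (hypothetical)
    linear_weights -- one coefficient per term
    pair_weights   -- dict mapping an index pair (i, j), i < j, to the
                      coefficient of the product terms[i] * terms[j]
    """
    quality = bias
    # Linear contribution of each individual term.
    quality += sum(w * t for w, t in zip(linear_weights, terms))
    # Contribution of pairwise products; pairs without a weight contribute nothing.
    for i, j in combinations(range(len(terms)), 2):
        quality += pair_weights.get((i, j), 0.0) * terms[i] * terms[j]
    return quality


if __name__ == "__main__":
    # Illustrative values only; they are not taken from any reference.
    print(estimate_quality([2.0, 3.0], 1.0, [0.5, 0.25], {(0, 1): 0.1}))
```

In such a formulation the coefficients would typically be fitted to subjective evaluation results, which is an assumption of this sketch rather than a statement about the cited reports.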