A large number of multimedia (audio and/or video) data processing algorithms have been developed for various purposes. Typically, a multimedia processing algorithm has several parameters that must be tuned in order to achieve the best performance. At present, the parameter values for a given algorithm tend to be selected by a small number of algorithm developers. However, the preferred parameter values for a given algorithm may be content specific. That is, a fixed parameter value may be suitable for a certain set of content but not for all possible multimedia content. As a result, different multimedia data may need to be processed in different ways. For example, a dialog enhancement method is usually applied to movie content. If it is applied to music that contains no dialog, it may falsely boost some spectral sub-bands and introduce heavy timbre changes and perceptual inconsistency. Similarly, if a noise suppression method is applied to music signals, strong artifacts will be audible.
In light of the above, several solutions have been developed to dynamically adapt the configuration of multimedia processing algorithms as a function of the multimedia content being processed. For example, in the audio field, a method has been presented that automatically steers audio processing algorithms and selects the most appropriate parameter values based on the content category (such as speech, music, or movie) of the processed audio signal.
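By way of illustration only, category-based steering of the kind described above may be sketched as follows. The sketch assumes that a separate classifier (not shown) supplies a confidence value per content category. The names `Params`, `PRESETS`, and `select_params`, as well as all numeric values, are hypothetical and are not taken from any particular prior solution.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Params:
    """Hypothetical tunable parameters of two audio processing algorithms."""
    dialog_enhancement_gain_db: float
    noise_suppression_strength: float  # 0.0 (off) to 1.0 (strongest)


# Assumed per-category presets; the values are placeholders for illustration.
PRESETS = {
    "speech": Params(dialog_enhancement_gain_db=3.0, noise_suppression_strength=0.8),
    "music": Params(dialog_enhancement_gain_db=0.0, noise_suppression_strength=0.0),
    "movie": Params(dialog_enhancement_gain_db=6.0, noise_suppression_strength=0.4),
}


def select_params(confidences: dict[str, float]) -> Params:
    """Return the preset of the category with the highest confidence.

    `confidences` maps a category name to a classifier confidence and is
    assumed to come from an upstream content classifier.
    """
    unknown = set(confidences) - set(PRESETS)
    if unknown:
        raise ValueError(f"no preset for categories: {sorted(unknown)}")
    best_category = max(confidences, key=confidences.get)
    return PRESETS[best_category]


if __name__ == "__main__":
    # A signal classified mostly as music: dialog enhancement stays off.
    print(select_params({"speech": 0.1, "music": 0.7, "movie": 0.2}))
```

In this sketch the processing parameters depend solely on the predicted category, so two signals assigned to different categories always receive different presets.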
However, in some cases, steering multimedia processing by classifying multimedia content into predefined categories may not optimize the user experience. It will be appreciated that categories such as speech, music, and movie do not necessarily link the perturbations in the underlying algorithm with the preferred user experience. For example, some speech content and some music content may have similar or identical effects on human perception and should therefore be processed with similar processing parameters. In this case, processing them with different parameters may instead have a negative impact on the user experience.
In view of the foregoing, there is a need in the art for a solution capable of processing multimedia content in a manner that optimizes the experience in terms of human perception.