The invention relates to audio signal processing and, more particularly, to the enhancement of a desired portion of an audio signal for individual listeners.
The recent widespread adoption of digital audio file archiving, compression, encoding, transmission, decoding, and playback has created new opportunities at virtually every stage of the digital audio process. It was recently shown that the preferred ratio of voice-to-remaining audio (VRA) differs significantly among listeners and among types of media programs (sports programs versus music, etc.). See "A Study of Listener Preferences Using Pre-Recorded Voice-to-Remaining Audio," Blum et al., HEC Technical Report No. 1, January 2000.
Specifically, VRA refers to the personalized adjustment of an audio program's voice-to-remaining audio ratio by adjusting the vocal (speech) volume independently of the remaining audio volume. The independently user-adjusted voice audio information is then combined with the independently user-adjusted remaining audio information and sent to a playback device, where a further total volume adjustment may be applied. This technique was motivated by the discovery that each individual's hearing capabilities are as distinctly different as their vision capabilities, which leads to individual preferences for how they wish (or even need) to hear the vocal versus background content of an audio program. The conclusion is that the need for VRA capability in audio programs is as fundamental as the need for a broad range of prescription lenses in order to provide optimal vision characteristics to each and every person.
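By way of illustration only, the adjustment described above can be sketched in a few lines of Python. The sketch assumes that the voice and remaining audio signals are available as separate, sample-aligned sequences of floating-point samples and that each adjustment is a simple linear gain. The function name `vra_mix` and its parameters are hypothetical and are not taken from any embodiment; clipping and level normalization are deliberately left out.

```python
from typing import List, Sequence


def vra_mix(
    voice: Sequence[float],
    remaining: Sequence[float],
    voice_gain: float,
    remaining_gain: float,
    total_gain: float = 1.0,
) -> List[float]:
    """Combine independently adjusted voice and remaining audio samples.

    All names here are illustrative assumptions. Gains are linear
    multipliers, not decibel values.
    """
    if len(voice) != len(remaining):
        raise ValueError("voice and remaining audio must have the same length")

    mixed = []
    for v, r in zip(voice, remaining):
        # Apply the listener's two independent adjustments, then sum.
        combined = voice_gain * v + remaining_gain * r
        # Apply the further total volume adjustment at playback.
        mixed.append(total_gain * combined)
    return mixed


# Example: a listener who doubles the voice level and halves the remaining audio.
adjusted = vra_mix([1.0, 0.5], [0.25, 0.5], voice_gain=2.0, remaining_gain=0.5)
```

In this sketch, the ratio `voice_gain / remaining_gain` plays the role of the listener's preferred VRA, while `total_gain` corresponds to the overall volume control of the playback device.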
The invention enables the inclusion of voice and remaining audio information at different parts of the audio production process. In particular, the invention embodies special techniques for VRA-capable digital mastering and for accommodation of VRA by those classes of audio compression formats that sustain smaller losses of audio data than codecs whose net losses are comparable to or greater than those of the AC3 compression format.
The invention facilitates an end-listener's voice-to-remaining audio (VRA) adjustment upon the playback of digital audio media formats by focusing on new configurations of multiple parts of the entire digital audio system, thereby enabling a new technique intended to benefit audio end-users (end-listeners) who wish to control the ratio of the primary vocal/dialog content of an audio program relative to the remaining portion of the audio content in that program. The problems that motivate the specific invention described herein are twofold. First, it is recognized that there will be differing opinions on the best location in the audio program production path for construction of the two signals that enable VRA adjustments. Second, there are tradeoffs between the optimal audio compression formats, audio file storage requirements, audio broadcast transmission bit rates, audio streaming bit rates, and the perceived listening quality of both vocal and remaining audio content finally delivered to the end-listener. Various solutions to those two problems, for the ultimate purpose of providing VRA to the end-listener, are offered by this invention through new embodiments that may incorporate new or existing digital mastering, audio compression, encoding, file storage, transmission, and decoding techniques.
In addition, the invention may be adaptive to the various ways that an audio program may be produced, so that the so-called pure voice audio content and the remaining audio content are readily fabricated for storage and/or transmission. In this manner, the recording process is considered to be an integral component of the audio production process. The new audio content may be delivered to the end-listener in a transparent manner, irrespective of the specific audio compression algorithms that may be used in the digital storage and/or transmission of the audio signal. This will require the inclusion of the voice and remaining audio information in virtually any codec. Therefore, this invention defines a unique digital mastering process and uncompressed storage format that will be compatible with the lossless and minimally lossy compression algorithms used in many situations.
The embodiments of the invention may also focus on required features for VRA encoding and VRA decoding. Because of the commonality among audio codecs, all descriptions provided below can be considered to provide VRA functionality equally well for broadcast media (such as television or webcasting), streaming audio, CD audio, or DVD audio. The invention may also be intended for all forms of audio programs, including films, documentaries, videos, music, and sporting events.
With these and other advantages and features of the invention that will become apparent hereinafter, the nature of the invention may be more clearly understood by reference to the following detailed description of the invention, the appended claims, and the several drawings attached hereto.