1. Technical Field
Embodiments of the present disclosure relate generally to controlling the concurrent playback of multiple media files and, more particularly, to a technique for adaptively ducking one of the media files during the period of concurrent playback.
2. Description of the Related Art
This section is intended to introduce the reader to various aspects of art that may be related to various aspects of the present techniques, which are described and/or claimed below. This discussion is believed to be helpful in providing the reader with background information to facilitate a better understanding of the various aspects of the present disclosure. Accordingly, it should be understood that these statements are to be read in this light, and not as admissions of prior art.
In recent years, the growing popularity of digital media has created a demand for digital media player devices, which may be portable or non-portable. In addition to providing for the playback of digital media, such as music files, some digital media players may also provide for the playback of secondary media items that may be utilized to enhance the overall user experience. For instance, secondary media items may include voice feedback files providing information about a current primary track that is being played on a device. As will be appreciated, voice feedback data may be particularly useful where a digital media player has limited or no display capabilities, or if the device is being used by a disabled person (e.g., visually impaired).
When voice feedback and primary media are output concurrently (e.g., by mixing), it is generally preferable to “duck” the primary audio file, such that the volume of the primary audio file is temporarily reduced during the concurrent playback period in which the voice feedback data is mixed into the audio stream. The desired result of ducking the primary audio stream is typically that the audibility of the voice feedback data is improved from the viewpoint of a listener.
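By way of illustration only, the ducking operation described above may be sketched as follows. This is a minimal sketch rather than an implementation from the present disclosure: the function names `db_to_gain` and `duck_and_mix`, the representation of audio as lists of normalized floating-point samples, and the -6 dB ducking value are all assumptions chosen for the example.

```python
def db_to_gain(db):
    """Convert a decibel value to a linear amplitude multiplier."""
    return 10.0 ** (db / 20.0)


def duck_and_mix(primary, voice, duck_db):
    """Attenuate the primary samples by duck_db and sum them with the voice samples.

    primary, voice: equal-length lists of samples, assumed normalized to [-1.0, 1.0].
    duck_db: ducking amount in decibels (negative values reduce the primary volume).
    """
    gain = db_to_gain(duck_db)
    return [gain * p + v for p, v in zip(primary, voice)]


# Hypothetical sample data for the concurrent playback period.
primary = [0.5, -0.5, 0.5, -0.5]
voice = [0.2, 0.2, -0.2, -0.2]

# Duck the primary audio by an assumed 6 dB while the voice feedback plays.
mixed = duck_and_mix(primary, voice, duck_db=-6.0)
```

In this sketch the primary samples are scaled to roughly half their original amplitude before mixing, so the voice feedback samples account for a larger share of the composite signal.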
Known ducking techniques may rely upon hard-coded values for controlling the loudness of primary audio files during periods in which voice feedback data is being played simultaneously. However, these techniques generally do not take into account intrinsic factors of the audio files, such as genre or loudness information. For instance, where a primary audio file is extremely loud or constitutes speech-based data (e.g., an audiobook), ducking the primary audio file based on a hard-coded or preset ducking value may not always be sufficient to provide an aesthetically pleasing composite output stream. For example, if the primary media is ducked too little, the combined gain of the composite audio stream (e.g., with the simultaneous voice feedback) may exceed the power output threshold of an associated output device (e.g., a speaker, headphones, etc.). This may result in clipping and/or distortion of the combined audio output signal, thus negatively impacting the user experience. Further, if the primary audio file is already very “soft” (e.g., having a low loudness), then additional ducking of the primary audio file may cause a user to perceive the secondary voice feedback data as being “too loud.” Accordingly, there are continuing efforts to further improve the user experience with respect to digital media player devices.
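The two shortcomings of a hard-coded ducking value noted above may be illustrated with a short numerical sketch. The sketch is hypothetical: the -6 dB constant `FIXED_DUCK_DB`, the normalized full-scale threshold of 1.0, the helper `analyze_fixed_duck`, and the use of peak amplitude as a simple stand-in for loudness are assumptions made for the example and are not taken from any particular known technique.

```python
import math

FIXED_DUCK_DB = -6.0  # hypothetical hard-coded ducking value
FULL_SCALE = 1.0      # assumed normalized amplitude above which the output clips


def db_to_gain(db):
    """Convert a decibel value to a linear amplitude multiplier."""
    return 10.0 ** (db / 20.0)


def analyze_fixed_duck(primary_peak, voice_peak, duck_db=FIXED_DUCK_DB):
    """Return (composite_peak, clips, margin_db) for a fixed ducking value.

    composite_peak: worst-case peak of the mix, assuming both peaks coincide.
    clips: True if the composite peak exceeds full scale.
    margin_db: how far the voice peak sits above the ducked primary peak, in dB.
    """
    ducked_peak = primary_peak * db_to_gain(duck_db)
    composite_peak = ducked_peak + voice_peak
    margin_db = 20.0 * math.log10(voice_peak / ducked_peak)
    return composite_peak, composite_peak > FULL_SCALE, margin_db


# A very loud primary track: the same fixed duck leaves the mix above full scale.
loud = analyze_fixed_duck(primary_peak=1.0, voice_peak=0.6)

# A very soft primary track: the same fixed duck leaves the voice far above the music.
soft = analyze_fixed_duck(primary_peak=0.05, voice_peak=0.6)
```

Under these assumed values, the loud case yields a worst-case composite peak above full scale (i.e., potential clipping), while the soft case does not clip but places the voice feedback well above the ducked primary audio, consistent with the voice feedback being perceived as “too loud.”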