Accompanying the expansion of computer and other electronic technology is continued growth in the variety of available entertainment. Music is no exception. Listeners can now receive music from a multitude of sources, e.g., compact discs (CDs) and other digital recording media, audio streaming via the internet, cable channels dedicated to audio programming, and satellite radio. Despite this plethora of music sources, however, conventional radio broadcasting (e.g., AM, FM, shortwave) continues to be an important source of music for many people.
Although radio broadcasting still offers many advantages over other sources of musical programming, it also has disadvantages. One longstanding problem relates to the inclusion of non-musical programming in a radio broadcast. In particular, most radio broadcasting (at least by stations which play music) is a mixture of music, speech (e.g., announcements, news broadcasts, advertisements) and “jingles” (short sound tracks with musical qualities, such as may be used in a commercial advertisement). Many listeners find the non-musical programming to be distracting and/or annoying.
One solution to this problem is to record broadcasts without the non-music portions. However, many persons do not have the time to manually perform this recording, i.e., to manually start recording a broadcast at the beginning of a song and then stop recording when the song ends. An automatic way of recording broadcast music is therefore desired. Unfortunately, the wide variety of music types (having a wide variety of sound qualities), as well as the unpredictable ways in which music and non-music are combined in broadcast programming, makes this a difficult task.
FIGS. 1A and 1B show examples of this problem. In some cases, as shown in FIG. 1A, one music track may fade out toward its end, be followed by non-music (announcement, advertisement, etc.), after which another music track fades in. FIG. 1B shows another common scenario. In particular, a disc jockey (DJ) may speak over a song before the song ends, the song may then fade out as another song fades in, and the DJ may speak over the beginning of the next song. The problem can be compounded in many other ways: background music may be added to DJ or other announcements; a DJ may speak in the middle of a track; jingles (which have musical sound qualities) are included in advertisements and other non-music programming; some music contains speech and unconventional sound effects; etc.
There have been various prior efforts to automatically classify an audio or video stream (i.e., to automatically discriminate between different types of content within the stream), including speech-music discrimination. Although there are similarities in the algorithms and methods employed in some of these prior efforts, minor differences in the methods can have very significant effects. In some cases, a very small and non-obvious change in an algorithm can make the difference between success and failure in a particular application. Many of these prior efforts also employ very complex algorithms requiring substantial processing. In light of these and other challenges, there remains a need for different implementations of systems and methods for discriminating between music and non-music portions of an audio broadcast.