1. Field of the Invention
The present invention relates to music analysis and particularly to a method for tempo estimation, beat detection and micro-change detection for music, which yields indices for alignment of soundtracks with video clips in an automated video editing system.
2. Description of the Related Art
Automatic extraction of rhythmic pulse from musical excerpts has been a topic of active research in recent years. In this task, also called beat-tracking or foot-tapping, the goal is to construct a computational algorithm capable of extracting a symbolic representation which corresponds to the phenomenal experience of “beat” or “pulse” in a human listener.
The experience of rhythm involves movement, regularity, and grouping, and yet also accentuation and differentiation. There is no “ground truth” for rhythm to be found in simple measurements of an acoustic signal.
As contrasted with “rhythm” in general, “beat” and “pulse” correspond only to “the sense of equally spaced temporal units.”
It is important to note that there is no simple relationship between polyphonic complexity—the number and timbres of notes played at a single time—in a piece of music, and its rhythmic complexity or pulse complexity. There are pieces and styles of music which are texturally and timbrally complex, but have straightforward, perceptually simple rhythms; and there also exist musics which deal in less complex textures but are more difficult to rhythmically understand and describe.
The former sorts of musical pieces, as contrasted with the latter sorts, have a “strong beat”. For these kinds of music, the rhythmic response of listeners is simple, immediate, and unambiguous, and every listener will agree on the rhythmic content.
In Automated Video Editing (AVE) systems, a music analysis process is essential for acquiring the indices used to align soundtracks with video clips. In most pop music videos, video/image shot transitions usually occur at the beats. Moreover, fast music is usually aligned with many short video clips and fast transitions, while slow music is usually aligned with long video clips and slow transitions. Therefore, tempo estimation and beat detection are two major and essential processes in an AVE system. In addition to beat and tempo, another important piece of information for the AVE system is micro-changes, which are locally significant changes in a piece of music; they are especially important for music without drums or music for which it is difficult to accurately detect beats and estimate tempo.
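The relationship between tempo, beats, and shot transitions can be illustrated with a minimal sketch. It is not the method of the invention. It assumes that beat times are already known and that a transition is placed on every fourth beat. The function names `beat_grid` and `select_cut_times` and the parameter `beats_per_clip` are hypothetical and chosen only for illustration.

```python
def beat_grid(tempo_bpm, num_beats):
    """Return idealized beat times (seconds) for a constant tempo.

    Illustrative stand-in for the output of a real beat detector.
    """
    if tempo_bpm <= 0:
        raise ValueError("tempo_bpm must be positive")
    beat_period = 60.0 / tempo_bpm  # seconds between consecutive beats
    return [i * beat_period for i in range(num_beats)]


def select_cut_times(beat_times, beats_per_clip=4):
    """Place a shot transition on every `beats_per_clip`-th beat.

    Because cuts are counted in beats, a faster tempo automatically
    yields shorter clips and a slower tempo yields longer clips.
    """
    if beats_per_clip < 1:
        raise ValueError("beats_per_clip must be at least 1")
    return list(beat_times[::beats_per_clip])


# Two 12-second excerpts: one fast (120 BPM), one slow (60 BPM).
fast_cuts = select_cut_times(beat_grid(120, 25))
slow_cuts = select_cut_times(beat_grid(60, 13))

print(fast_cuts)  # [0.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
print(slow_cuts)  # [0.0, 4.0, 8.0, 12.0]
```

Under these assumptions the fast excerpt is cut into 2-second clips and the slow excerpt into 4-second clips, so every transition falls on a beat and clip length follows the tempo. For music without a detectable beat, the same selection step could in principle be driven by micro-change times instead of beat times.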