A content coherence metric is used to measure content consistency within an audio signal or between audio signals. The metric is obtained by computing the content coherence (content similarity or content consistency) between two audio segments, and serves as a basis for judging whether the segments belong to the same semantic cluster or whether there is a real boundary between them.
Methods of measuring content coherence between two long windows have been proposed. According to one such method, each long window is divided into multiple short audio segments (audio elements), and the content coherence metric is obtained by computing the semantic affinity between all pairs of segments drawn from the left and right windows, based on the general idea of overlapping similarity links. The semantic affinity can be computed by measuring the content similarity between the segments or from their corresponding audio element classes. (For example, see L. Lu and A. Hanjalic, “Text-Like Segmentation of General Audio for Content-Based Retrieval,” IEEE Trans. on Multimedia, vol. 11, no. 4, pp. 658-669, 2009, which is herein incorporated by reference for all purposes.)
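The pairwise computation described above can be illustrated with a minimal sketch. It assumes that each short segment has already been reduced to a fixed-length feature vector, that cosine similarity stands in for the semantic affinity, and that the pairwise affinities are combined by a plain average. These choices, and the names `cosine_similarity` and `content_coherence`, are illustrative assumptions and do not reproduce the weighting used in the cited reference.

```python
import math
from itertools import product


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors (assumed affinity)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as having no affinity
    return dot / (norm_a * norm_b)


def content_coherence(left_window, right_window, affinity=cosine_similarity):
    """Average affinity over all (left segment, right segment) pairs.

    Each window is a list of per-segment feature vectors. Averaging is an
    assumption made for this sketch; other combinations are possible.
    """
    if not left_window or not right_window:
        raise ValueError("both windows must contain at least one segment")
    total = sum(affinity(l, r) for l, r in product(left_window, right_window))
    return total / (len(left_window) * len(right_window))


# Hypothetical two-dimensional features for illustration only.
left = [[1.0, 0.0], [1.0, 0.0]]
right_same = [[1.0, 0.0]]
right_diff = [[0.0, 1.0]]
coherent = content_coherence(left, right_same)
incoherent = content_coherence(left, right_diff)
```

Under these assumptions, a high value suggests that the two windows belong to the same semantic cluster, while a low value suggests a possible boundary between them.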
The content similarity may be computed based on a feature comparison between two audio segments. Various metrics such as Kullback-Leibler Divergence (KLD) have been proposed to measure the content similarity between two audio segments.
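As one possible illustration of a KLD-based comparison, the sketch below models each audio segment as a diagonal-covariance Gaussian over its frame-level feature vectors and computes a symmetrized KLD between the two models. The Gaussian modeling, the variance floor, the symmetrization, and the function names are assumptions made for this example rather than details taken from the text above. Because KLD is a divergence, a smaller value indicates more similar content.

```python
import math


def fit_diag_gaussian(frames, var_floor=1e-6):
    """Per-dimension mean and variance of a list of frame feature vectors."""
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    variances = [
        max(sum((f[d] - means[d]) ** 2 for f in frames) / n, var_floor)
        for d in range(dim)
    ]
    return means, variances


def kld_diag_gaussian(p, q):
    """KL(p || q) for two diagonal Gaussians given as (means, variances)."""
    (mean_p, var_p), (mean_q, var_q) = p, q
    return 0.5 * sum(
        math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
        for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q)
    )


def symmetric_kld(frames_a, frames_b):
    """Symmetrized divergence KL(a || b) + KL(b || a) between two segments."""
    p = fit_diag_gaussian(frames_a)
    q = fit_diag_gaussian(frames_b)
    return kld_diag_gaussian(p, q) + kld_diag_gaussian(q, p)


# Hypothetical one-dimensional frame features for illustration only.
segment_a = [[0.0], [2.0]]  # mean 1, variance 1
segment_b = [[1.0], [3.0]]  # mean 2, variance 1
divergence = symmetric_kld(segment_a, segment_b)
```

The symmetrized form is used here only because plain KLD depends on the order of its arguments, which is inconvenient when neither segment is a natural reference.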
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section. Similarly, issues identified with respect to one or more approaches should not be assumed to have been recognized in any prior art on the basis of this section, unless otherwise indicated.