Multimedia information streams such as streaming audio, video, and text have become commonplace with the proliferation of information disseminated over networks such as the Internet, telephone, cable TV, and wireless media. Massive amounts of multimedia data are transmitted over such networks in the form of, for example, a digital stream, analog video, or text captioning. Repetitions or near-repetitions of such data often occur in these streams. Examples include paid advertisements, theme music at the commencement of a TV broadcast, and jingles and slogans that accompany transmissions from a common source.
Large amounts of multimedia data may be gathered by applications that store and process such data, such as SpeechBot™ and Mediaworqs™. If repetitive transmissions are not detected, they consume storage and computation resources redundantly. In addition, processing of transmitted information, such as tracking paid advertisements to verify their frequency and duration, is typically performed by manually observing the multimedia transmissions. Detecting repetitions, and then eliminating or processing them, can conserve resources, aid in tracking transmission patterns, and serve as a building block for further processing. Accordingly, it would be beneficial to monitor and detect repetitions in a multimedia information stream to allow selective processing according to a specific application. One prior art technique for exact-match audio detection is disclosed in Johnson, et al., “A Method for Direct Audio Search with Applications to Indexing and Retrieval,” IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2000), Jun. 5–9, 2000. Johnson, however, discloses a system that compares a single vector derived from a portion of audio with another single vector.