Applications that seek to identify video or audio signal content, including those that attempt to detect pirated content conveyed by video and audio signals or to resynchronize disassociated video and audio signals, typically rely on processes that examine signal content to derive sets of signatures that represent and identify the content. For many of these applications, it is important to obtain a reliable identification of signals even when their content has been modified, whether unintentionally or intentionally, in a way that leaves the modified content recognizable to a human observer as being substantially the same as the original content. If the perceived difference between the content of an original signal and that of a modified signal is small, then the identification process preferably derives signature sets from the original signal and from the modified signal that are very similar to one another. A few processes that may be used to derive signature sets for video and audio signals are disclosed in U.S. provisional patent application No. 60/872,090 entitled “Extracting Features of Video and Audio Signal Content to Provide a Reliable Identification of the Signals” filed Nov. 30, 2006 by Regunathan Radhakrishnan, et al., and in U.S. provisional patent application No. 60/930,905 entitled “Deriving Video Signatures That Are Insensitive to Picture Modification and Frame-Rate Conversion” filed May 17, 2007 by Regunathan Radhakrishnan, et al., the contents of which are incorporated herein by reference.
Applications that attempt to identify the content of some test signal typically obtain a large number of reference signature sets representing a library of reference content, arrange the reference signature sets into some type of data structure, derive test signature sets from the content of the test signal, and then search the data structure to determine whether reference signature sets exist that match the test signature sets. If an acceptable degree of matching exists, the test signal content and the corresponding reference content are likely to share a common origin. If the reference content is original content, then the test signal content is deemed to be a copy of the reference content.
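The general search flow described above may be illustrated by the following minimal sketch. It is not taken from the referenced applications. It assumes, purely for illustration, that each signature is a fixed-width bit pattern stored as an integer, that a signature set is an ordered list of such signatures, that the data structure is a simple dictionary, and that similarity is measured by the mean Hamming distance between aligned signatures. The names `hamming_distance`, `set_distance`, `find_best_match`, and the threshold `max_distance` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Assumption: a signature is a fixed-width bit pattern held in an int,
# and a signature set is an ordered list of such signatures.
SignatureSet = List[int]


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two signatures differ."""
    return bin(a ^ b).count("1")


def set_distance(test: SignatureSet, ref: SignatureSet) -> float:
    """Mean Hamming distance between aligned signatures of two equal-length sets."""
    if not test or len(test) != len(ref):
        raise ValueError("signature sets must be non-empty and of equal length")
    total = sum(hamming_distance(t, r) for t, r in zip(test, ref))
    return total / len(test)


def find_best_match(
    library: Dict[str, SignatureSet],
    test: SignatureSet,
    max_distance: float,
) -> Optional[Tuple[str, float]]:
    """Scan every reference set and return the closest one within max_distance.

    Returns (content_id, distance), or None if no reference set is close enough.
    """
    best: Optional[Tuple[str, float]] = None
    for content_id, ref in library.items():
        if len(ref) != len(test):
            continue  # this sketch only compares sets of equal length
        d = set_distance(test, ref)
        if d <= max_distance and (best is None or d < best[1]):
            best = (content_id, d)
    return best


# Hypothetical reference library with two items of reference content.
library = {
    "clip_a": [0b10110010, 0b01101100],
    "clip_b": [0b00001111, 0b11110000],
}

# Test signature set that differs from clip_a in a single bit,
# standing in for slightly modified content.
test_set = [0b10110011, 0b01101100]
match = find_best_match(library, test_set, max_distance=1.0)
```

In this sketch the search visits every reference signature set, so the work done per query grows in proportion to the size of the library, and every reference set must be held in storage. The threshold plays the role of the acceptable degree of matching; its value here is arbitrary.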
For many video and audio applications, the library referred to above contains an extensive amount of reference content, and the data structure includes a very large number of signature sets. A very large amount of storage is needed to record all of the signature sets in the data structure, and extensive processing resources are required to search it.