The present invention relates generally to methods and apparatuses for automatically identifying media (or content) samples, and more particularly to a method and apparatus for automatically identifying a media sample using a database of known media files, by comparing certain aspects of the media sample to similarly obtained aspects of the known media files.
The related applications disclosed various methods and apparatuses for identifying media samples, and applications for such identification. At the heart of these methods and apparatuses is a database of known media files. Creating the database is an expensive proposition. Buying a single copy of every known media file, and of every new one as it is created, would be effective but is probably cost prohibitive. Simply making copies of media files would also be effective, but may in certain instances violate copyright laws in some countries. Moreover, the uncertainty as to whether certain acts do in fact violate copyright laws, coupled with the fact that copyright laws vary, sometimes significantly, from country to country, makes it difficult to invest in or implement a system or method that relies upon the use of unlicensed media.
The present invention is therefore directed to the problem of developing a method and apparatus for automatically creating a database of known media files at low cost and without violating copyright laws.