As industries move toward multimedia-rich working environments, use of all forms of audio and visual content representations (radio broadcast transmissions, streaming video, audio canvas, visual summarization, etc.) becomes more frequent. Users and content providers alike search for ways to make optimal use of such content. For example, one method that has much potential for creative uses is content identification. Enabling a user to identify content that the user is listening to or watching offers a content provider new possibilities for success.
Content identification may be used in a service provided for a consumer device (e.g., a cell phone) that includes a broadcast receiver, to supply broadcast program metadata to a user. For example, title, artist, and album information can be provided to the user on the device for broadcast programs as the programs are being played on the device. Existing systems that provide content information of a broadcast signal to a user may provide only limited metadata, as with a radio data system (RDS) signal. In addition, existing systems may not monitor every broadcast station in every locale, and a desired radio station mapping may not always be available.
Still further, other existing systems may require the consumer device to sample/record a broadcast program and to send the sample of the broadcast program to a recognition server for direct identification. The computational cost of performing a recognition on one media sample may be small. However, potentially many millions of consumer devices may be active at the same time, and if each were to query the server once per minute, the recognition server would have to perform millions of recognitions every minute, at which point the computational cost becomes significant. Such a system may be able to allow a time budget of only a few microseconds or less per recognition request, which is a few orders of magnitude smaller than typical processing times for media content identification. Furthermore, since broadcast media is often presented as a continuous stream without segmentation markers, providing matching program metadata that is timely and synchronized with the current program could require a brute-force sample-and-query method to use fine-granularity sampling intervals, increasing the required query load even more.
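The load argument above can be made concrete with a short calculation. The following is a minimal sketch only: the function name, the figure of ten million active devices, and the `parallel_workers` parameter are illustrative assumptions and do not come from any existing system. It computes the average time available per recognition request if the server is to keep pace with the incoming queries.

```python
def per_request_budget_seconds(
    active_devices: int,
    queries_per_device_per_minute: float = 1.0,
    parallel_workers: int = 1,
) -> float:
    """Average time available per recognition for the server to keep up.

    All parameters are hypothetical inputs chosen for illustration.
    """
    if active_devices <= 0 or queries_per_device_per_minute <= 0 or parallel_workers <= 0:
        raise ValueError("all inputs must be positive")
    # Total request rate arriving at the recognition server.
    requests_per_second = active_devices * queries_per_device_per_minute / 60.0
    # Each worker can spend this long on one request without falling behind.
    return parallel_workers / requests_per_second


# Assumed scenario: ten million devices, one query per minute, one worker.
budget = per_request_budget_seconds(10_000_000)
print(f"{budget * 1e6:.1f} microseconds per request")  # 6.0 microseconds per request
```

Under these assumed numbers, a single worker would have about 6 microseconds per request. Doubling the sampling rate to two queries per minute halves that budget, which illustrates why finer sampling intervals increase the query load further.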
In the field of broadcast monitoring and subsequent content identification, it is desirable to identify as much audio content as possible, within every locale, while minimizing effort expended. The present application provides techniques for doing so.