1. Technical Field
The invention is related to identification of media objects in broadcast media streams, and in particular, to a system and method for providing concurrent server-side identification of media objects, such as songs, in synchronized data streams to large numbers of individual clients while minimizing server database query loading.
2. Related Art
There are many existing schemes for extracting “features” from signals to be used for identification purposes. For example, with respect to a one-dimensional signal such as an audio signal or audio file, audio feature extraction has been used as a necessary step for classification, retrieval, and identification tasks involving media objects in the audio signal. For identification purposes, the extracted features or “traces” are typically compared to a known “fingerprint” for identifying either elements within the audio signal or the entire audio signal. Such media object identification schemes are conventionally known as “audio fingerprinting.”
A number of conventional schemes have adapted such audio fingerprinting techniques to provide identification of particular songs in an audio stream, such as a radio or Internet broadcast. For example, a user listening to an audio stream may hear some song for which he or she would like to know the title, artist, album, etc. Conventional audio fingerprinting techniques are then used to extract one or more traces from samples of the song. Typically, these traces are then compared to fingerprints in a database of known music to identify a match, with the results then being provided to the user.
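The trace-versus-fingerprint comparison described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the toy feature (mean absolute amplitude per time slice), the Euclidean distance measure, the `max_distance` threshold, and the function names `extract_trace` and `identify` are hypothetical and are not taken from any particular conventional fingerprinting scheme, which would typically use far more robust features.

```python
import math


def extract_trace(samples, num_slices=8):
    """Toy 'trace' (hypothetical feature): mean absolute amplitude of each
    of num_slices equal-length time slices of the sampled signal."""
    slice_len = len(samples) // num_slices
    if slice_len == 0:
        raise ValueError("need at least num_slices samples")
    trace = []
    for s in range(num_slices):
        chunk = samples[s * slice_len:(s + 1) * slice_len]
        trace.append(sum(abs(x) for x in chunk) / slice_len)
    return trace


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(trace, fingerprint_db, max_distance=0.1):
    """Return the title whose stored fingerprint is closest to the trace,
    or None if no fingerprint lies within max_distance (assumed threshold)."""
    best_title, best_dist = None, float("inf")
    for title, fingerprint in fingerprint_db.items():
        d = distance(trace, fingerprint)
        if d < best_dist:
            best_title, best_dist = title, d
    return best_title if best_dist <= max_distance else None


# Usage: build a tiny database of known "songs", then look up a noisy sample.
song_a = [math.sin(0.01 * i) for i in range(800)]
song_b = [(i % 100) / 100 for i in range(800)]
fingerprint_db = {
    "Song A": extract_trace(song_a),
    "Song B": extract_trace(song_b),
}
noisy_sample = [x + 0.001 for x in song_a]
match = identify(extract_trace(noisy_sample), fingerprint_db)
```

In this sketch the database lookup is a linear scan over every stored fingerprint, so each identification request costs work proportional to the size of the database.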
Further, such techniques have also been adapted to a number of conventional services to provide a fee-based song identification or lookup service that is generally based on audio fingerprinting techniques and database comparisons. For example, several song identification services, such as the relatively well known “Shazam” music identification service, operate to identify specific songs for users via a cell phone network. In particular, systems such as that offered by Shazam generally operate by first requiring the user to dial a number on his cell phone and then to hold the phone up to the music for around 15 to 30 seconds. The Shazam service then identifies the music by comparing the music (or traces computed from the music) to a database of known music. The Shazam service then returns a text message to the user with the title, artist, album, etc. of the identified song.
Unfortunately, one problem with lookup services of the type described above is that as the number of users accessing the music identification system at any given time increases, the number of database lookup requests per second also increases. This problem is mitigated in services such as that provided by Shazam because the user must pay for the telephone call, and for the service itself, for each song he or she wishes to identify. Charging users in this fashion tends to limit the number of concurrent users of the system, thereby reducing overall server load. Another problem with such a system is that it requires samples of the full song (limited by the frequency/bandwidth constraints of the telephone service) to be transmitted to the server, which is then required to compute traces from the transmitted sample of the media stream.
Consequently, as the number of concurrent users becomes increasingly large, the corresponding computational load for computing traces or fingerprints from the incoming music, performing database lookups for identifying those fingerprints, and responding to the individual users can quickly overwhelm even relatively large banks of dedicated servers. As a result, such schemes tend to be limited by the assumption that the number of concurrent users will be relatively low. Further, while it is possible to scale up such schemes to provide a sufficient number of servers to handle large numbers of concurrent users, potentially in the tens of millions, the dollar cost for such a system would likely be prohibitive.
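The scaling concern above can be made concrete with a rough back-of-the-envelope sketch. All figures used here are hypothetical placeholders chosen only to show the arithmetic: the request interval and the per-server lookup capacity are assumptions, not measurements of any real service. The sketch further assumes that each client request triggers exactly one database lookup.

```python
import math


def lookups_per_second(concurrent_users, seconds_between_requests):
    """Average lookup rate if every user issues one request per interval
    and each request causes one database lookup (assumed)."""
    return concurrent_users / seconds_between_requests


def servers_needed(lookup_rate, lookups_per_server_per_second):
    """Minimum number of servers that can sustain the given lookup rate."""
    return math.ceil(lookup_rate / lookups_per_server_per_second)


# Hypothetical inputs: 10 million concurrent users, one request every
# 30 seconds, and 1,000 lookups per second per server.
rate = lookups_per_second(10_000_000, 30)
servers = servers_needed(rate, 1_000)
```

Under these assumed inputs the load grows linearly with the number of concurrent users, which is why per-request lookups become costly at large scale.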
Therefore, what is needed is a system and method for providing real-time identification of songs. Further, such a system and method should be capable of efficiently providing song identification services to large numbers of concurrent users while simultaneously minimizing server load and database lookups. Finally, such a system and method should further minimize server load by eliminating the burden of computing traces from samples of the media stream by requiring that task to be performed by each of a plurality of client computers.