A television display device equipped with significant associated data processing capability, often called a “Smart TV,” can be configured to optimize a viewer's experience through the provision of contextually relevant material. One example is offering additional background information or special messages associated with the programming or commercial material being displayed at that moment. To accomplish this goal, the processing means within the TV device itself, or within an associated device such as a set-top box, needs to have real-time “awareness” of what programming is being displayed on the TV screen at that moment.
There are currently two primary forms of automatic content recognition (ACR) in use for enhanced TV experiences. One method is digital watermarking, and the other is content fingerprinting. The digital watermark method requires the broadcast content to be preprocessed so that the watermark data may be hidden within the content signal. That data is then detected by the TV processing means in order to enable identification and synchronization.
Another method of automated content recognition involves using audio or video content fingerprinting to identify the content signal as it is displayed, by computing a sequence of content fingerprints and matching them against a database. While this does away with the need to have all content preprocessed, it is more challenging to identify audio or video programming with content fingerprinting, particularly if the system is intended to operate simultaneously across potentially hundreds of television program channels while dealing with a variety of user behaviors, such as channel changing, pausing, or viewing time-shifted content, that cause a loss of identification.
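The fingerprint-and-match idea described above can be illustrated with a minimal sketch. It is not the method of any particular ACR system: the toy average-hash fingerprint, the in-memory dictionary standing in for the fingerprint database, and the names frame_fingerprint, build_database, and identify are all assumptions made for illustration.

```python
from collections import Counter

def frame_fingerprint(frame):
    """Toy average-hash (an assumption, not a production fingerprint):
    one bit per pixel, set when the pixel is brighter than the frame mean."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits

def build_database(programs):
    """Map each fingerprint to the (program_id, frame_index) pairs that produced it."""
    db = {}
    for program_id, frames in programs.items():
        for index, frame in enumerate(frames):
            db.setdefault(frame_fingerprint(frame), []).append((program_id, index))
    return db

def identify(db, observed_frames):
    """Vote over (program, time offset) candidates across a sequence of frames.
    Returns (program_id, offset_of_first_observed_frame), or None if nothing matches."""
    votes = Counter()
    for position, frame in enumerate(observed_frames):
        for program_id, index in db.get(frame_fingerprint(frame), []):
            votes[(program_id, index - position)] += 1
    if not votes:
        return None
    (program_id, offset), _count = votes.most_common(1)[0]
    return program_id, offset
```

In this sketch every displayed frame is fingerprinted and looked up, which is the frame-by-frame workload whose cost the following paragraphs discuss. Voting on the time offset is one simple way to recover the position within a show, so a viewer who joins partway through or watches time-shifted content can still be located.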
Therefore, for video content fingerprint-based ACR, a person skilled in the art might implement a solution whose demands for computational processing power, both at a centralized server or other computing means and within each local display device, are too costly to be commercially reasonable and practical. For example, the ACR system might be programmed to operate continuously on a video frame-by-frame basis in order to identify the program and track the relative time location within each show across a wide time range and for a large database of shows, while simultaneously needing to account for time-shifted content, channel surfing, or the like.
At almost any point in time in the US market, most cable TV or satellite providers offer several hundred program choices. In addition, there are over one hundred major television market areas with dozens of local television channels. On a national level, an ACR system must monitor thousands of unique television programs. Computational efficiency is therefore required for a system to operate reliably and at a commercially reasonable cost.
Despite continuing advances in computing power, automated matching of audio or video content remains a daunting task. Such “brute force” identification implies that the TV display device's processing means is continuously computing fingerprints and sending these fingerprints and other associated content signals to a centralized fingerprint database for identification. Such a process would use an excessive amount of the computing resources of a typical smart TV, leaving little for the other smart TV applications that a user may wish to utilize.
The challenge is even greater on a system level than at the local TV set because the centralized system must have sufficient computing power to handle simultaneous processing demands from potentially millions of TV sets. As noted in the detailed description below, the costs of memory and processing power needed to support each ACR system in the field can soon overwhelm the revenue being generated by each such system. Therefore, a method of optimizing these systems to make them commercially viable remains an unmet need for the operators of such systems. In addition, it has been found that improvements in the efficiency of these systems also improve the accuracy of the content matching process.