A set-top box (STB), also known as a digibox or set-top unit (STU), is a device that connects to a television and an external source of signal, turning the signal into content which may then be delivered as an audiovisual (A/V) signal for display on the television screen or other A/V device. Most frequently, the external source of signal is provided by a satellite or cable connection.
As with other consumer products, both manufacturers and suppliers are keen to ensure that products operate correctly and as specified. Initially, and to this day, a significant part of the testing is performed manually: a tester issues a command to the STB, either via the user interface on the STB itself or via a remote control device as illustrated in FIG. 1, and observes the response of the STB on a TV display. As is shown in FIG. 1, a typical STB 10 has a number of signal inputs including an RF signal 14, which may for example be from a satellite or cable connection. An A/V signal 16 may also be provided as input, allowing the set-top box to feed through a signal to a television from a VCR, DVD, Blu-ray disc, media jukebox or other similar device. The output to the television is an A/V signal 18 which may be provided over a variety of standard interfaces including SCART and HDMI. To allow the user to control the operation of the STB, a number of buttons and similar controls may be provided on the STB itself. Additionally, and more commonly employed by users, the STB may have an infrared (IR) or wireless remote control input configured to operate with a remote control device 12.
As manual testing can be time consuming, prone to error and in some instances lacking accuracy, an effort has been made to automate some of these tests. In respect of these automated tests, it will be appreciated that this testing is typically performed on the end product by users who do not necessarily have any detailed knowledge of, or access to, the internal circuitry of the STB. Accordingly, the testing of STBs is generally performed on a “black-box” basis, where only the inputs and outputs are available for modification and examination. Test methods and systems have therefore been developed specifically for set-top boxes. An example of such a system is StormTest™, provided by the S3 Group of Dublin, Ireland, which may be employed to test STBs, televisions, and similar devices such as digital media players. An arrangement for an STB test system is shown in FIG. 2. The STB test system 20 comprises a controller 28 which manages the test function and interacts with the other features of the STB test system. The other features include an output interface for controlling a remote control device 12, allowing commands to be sent to the STB, and an input interface for receiving the video and/or audio signals from the STB 10. This input interface may include an audio capture device 24 for accepting the audio as a test signal and/or a frame grabber 22 or similar device for accepting the video frames from the STB. The data captured is then made available to a processor for analysis, which in turn produces a test result and provides this to a user. Thus, during a typical test, the test system issues a command or sequence of commands to the STB, suitably via the remote control interface. Each frame of video and/or the audio is captured and made available to the test system for analysis of the STB response to the commands issued to it.
Typically, tests might include the generation of a “change channel” command to the STB followed by analysis of the audio and/or video outputs to ensure that the channel change command was correctly received and executed by the STB.
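By way of illustration only, the command, capture and analysis sequence described above might be sketched as follows. This is a minimal sketch, not the interface of any actual test system: the names `FakeStb`, `send_command`, `capture_frame`, `run_test` and `is_channel`, and the command strings, are hypothetical, and the fake device simply encodes its channel number in every pixel of the frame it returns.

```python
from dataclasses import dataclass
from typing import Callable, List

Frame = List[List[int]]  # toy greyscale frame: rows of pixel values


@dataclass
class FakeStb:
    """Hypothetical stand-in for a real STB plus its capture hardware."""
    channel: int = 1

    def send_command(self, command: str) -> None:
        # A real system would emit this via the remote control interface.
        if command == "CHANNEL_UP":
            self.channel += 1
        elif command == "CHANNEL_DOWN":
            self.channel -= 1

    def capture_frame(self) -> Frame:
        # A real system would read from a frame grabber; here every pixel
        # simply carries the current channel number.
        return [[self.channel] * 4 for _ in range(3)]


def run_test(stb: FakeStb, commands: List[str],
             check: Callable[[Frame], bool]) -> bool:
    """Issue the commands, capture one frame, and analyse it."""
    for command in commands:
        stb.send_command(command)
    return check(stb.capture_frame())


def is_channel(expected: int) -> Callable[[Frame], bool]:
    """Build a check that passes when the frame shows the expected channel."""
    return lambda frame: all(p == expected for row in frame for p in row)


# Black-box usage: only commands go in and only frames come out.
passed = run_test(FakeStb(channel=1), ["CHANNEL_UP"], is_channel(2))
```

In this sketch the test never inspects the internal state of the device; the pass or fail decision is made solely from the captured frame, in keeping with the black-box approach.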
Set-top boxes are complex devices containing a powerful main embedded CPU and various peripheral devices. They generally run a sophisticated operating system (e.g. Linux or VxWorks), on top of which a large and complex software stack performs complex functions. These devices generally present to the user a sophisticated graphical user interface with on-screen menus and graphics. Automation of the testing typically involves, amongst other areas, optical character recognition (OCR) to read text from the screen and compare it with a known or expected value, to determine if the user interface presented to the user is as expected.
As shown in FIG. 2, OCR may be performed on a captured frame or a section of a frame. Typically, in configuring a test routine, the person configuring the test will identify the section of the frame where text is to be found. The person may also configure the test by defining the expected result from the OCR process. The expected result may be a pre-determined known value or it may be obtained elsewhere in the test routine. Additionally, the OCR process may be employed to obtain and store information from the device under test. OCR is generally performed by a software component known as an “OCR engine”. There are several OCR engines (software products) generally available for this purpose, i.e. the engines can process images and extract text from them. However, there are several practical difficulties in using an OCR engine in the context of a frame captured from an STB or other device with a display, e.g. a digital television. One of the reasons for this is that OCR engines conventionally come from the document processing area, and document images differ in several respects from images captured from a video stream. The resolution of document scanners used to capture document images is generally much greater than the resolution of images, even high definition images, captured from video.
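The configuration described above (a frame section, an OCR step and an expected result) might be sketched as follows. This is an illustrative sketch under stated assumptions: the frame is a toy grid of characters rather than pixels, `toy_ocr` is a hypothetical stand-in for a real OCR engine, and the names `crop`, `check_text` and `Region` are introduced here for illustration only.

```python
from typing import Callable, List, Tuple

Frame = List[str]                   # toy frame: each string is one row
Region = Tuple[int, int, int, int]  # (left, top, width, height)


def crop(frame: Frame, region: Region) -> Frame:
    """Return only the section of the frame where text is expected."""
    left, top, width, height = region
    return [row[left:left + width] for row in frame[top:top + height]]


def toy_ocr(image: Frame) -> str:
    """Hypothetical OCR engine: the toy frame already holds characters."""
    return " ".join(row.strip() for row in image if row.strip())


def check_text(frame: Frame, region: Region,
               ocr_engine: Callable[[Frame], str], expected: str) -> bool:
    """Run OCR on the configured region and compare with the expected value."""
    return ocr_engine(crop(frame, region)).strip() == expected


frame = [
    "         ",
    "  GUIDE  ",
    "         ",
]
title_region = (2, 1, 5, 1)  # chosen by the person configuring the test

matches = check_text(frame, title_region, toy_ocr, "GUIDE")
stored = toy_ocr(crop(frame, title_region))  # value kept for later test steps
```

Passing the OCR engine in as a callable keeps the comparison logic independent of whichever engine is actually used; the same `crop` step also serves the case where the recognised text is stored rather than compared.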
Documents generally have a fixed foreground/background contrast pattern: generally dark text on a light background. In addition, the contrast for a given document page is not variable. Once the OCR engine has determined the foreground and background colours for a page, these will not change. Thus, for example, it is known to use filters to optimise an OCR engine where the page colour is not white. However, contrast on modern dynamic user interfaces can be highly variable. As an example, transparent panes in the user interface are commonplace, and thus the contrast of the text will vary as the background television program changes. To make matters worse, images captured from video streams can be quite noisy. As a result, the present inventor's experience has been that the accuracy of a typical OCR engine when employed with captured video may only be in the region of 65-90%.
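The effect of a transparent pane on text contrast can be illustrated with simple alpha-blending arithmetic. The following is a sketch under assumptions that are not taken from any particular user interface: luminance is a single value from 0 to 255, the pane is blended linearly over the program video, contrast is taken as the absolute luminance difference between the text and what lies behind it, and the function names `blend` and `text_contrast` are hypothetical.

```python
def blend(pane: float, background: float, alpha: float) -> float:
    """Alpha-blend a pane over the background; alpha=1.0 is fully opaque."""
    return alpha * pane + (1.0 - alpha) * background


def text_contrast(text: float, pane: float,
                  background: float, alpha: float) -> float:
    """Absolute luminance difference between text and the blended pane."""
    return abs(text - blend(pane, background, alpha))


WHITE_TEXT = 255.0
BLACK_PANE = 0.0

# Half-transparent pane: contrast depends on the program behind it.
dark_scene = text_contrast(WHITE_TEXT, BLACK_PANE, 20.0, alpha=0.5)     # 245.0
bright_scene = text_contrast(WHITE_TEXT, BLACK_PANE, 240.0, alpha=0.5)  # 135.0

# Opaque pane (the document-like case): contrast is fixed.
opaque = text_contrast(WHITE_TEXT, BLACK_PANE, 240.0, alpha=1.0)        # 255.0
```

Under these assumptions the same white text loses a large part of its contrast when the scene behind a half-transparent pane changes from dark to bright, whereas an opaque pane behaves like a printed page.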
It would be beneficial to improve the performance of OCR engines in analysing video images.