In any software solution that deals with rendering closed captions or subtitles on a television (TV) screen along with the video content, a verification of closed captions or subtitles being rendered correctly is required. For closed captions this means verifying that the closed captions have the correct content, duration, styles (e.g., font, foreground color, background color, etc.), and language. Validation that the rendered closed captions meet the standards specification is also required. Due to the complicated nature of this problem, this verification is typically performed manually by a software tester visually inspecting the output on the screen. This type of manual testing needs to be done after every update to the software solution in order to ensure that the closed captions rendering is still functioning correctly. This software testing can be a tedious, labor intensive task leaving a large footprint for potential software bugs which necessitates an automation framework.
Conventional solutions that deal with this problem use speech to text technology to match the spoken word with the closed captioning from the metadata manifest file included as part of the video source. Such a solution, however, can only validate the content and not style of the closed caption. The speech to text solution also does not test closed captioning rendering. If the software has bugs in rendering the source file, they will not be identified since the speech to text solution is not an end-to-end black box solution. Speech to text does not have a high accuracy if the video content is noisy and the accuracy drastically reduces when using languages other than English.