Powerful search engines now dominate the way people acquire information, and they are increasingly used to search for rich content. Compared with searching plain text, retrieving rich-content media requires new technologies to describe, organize, and manage data in a variety of formats. Automatic Content Retrieval (ACR) has proven to be a very effective method of searching for rich content. Existing ACR systems are effective in certain situations, and many value-added services have been derived from them.
Thus, it is a promising business to allow users to enjoy value-added services by using their mobile devices, such as smartphones, tablets, or even smart watches, to retrieve abundant information about content sources. An emerging mobile search-ready technology may enable users to use their mobile devices to obtain value-added services based on the retrieval of media information, such as pictures on posters, videos on public bulletins, and audio in media players.
Mobile devices may retrieve media information from a smart terminal. For example, a smart display, such as a public bulletin display or a home TV display, usually faces multiple users at once. In a scenario where the smart display interfaces with multiple mobile devices and the number of displays may vary over time, a 2-way communication channel may not be efficient, and a 1-way communication channel from the smart display to the mobile devices, such as a 1-way broadcasting channel, may be more appropriate. The mobile devices may then receive signals from the smart display and use those signals in a mobile search to obtain the value-added services.
However, according to the present disclosure, 1-way communication from the smart display to the mobile devices raises some concerns. For example, the mobile devices may listen to the sound of the smart display and use ACR for audio-based retrieval, but this approach does not work well for multiple users if the surrounding noise level, e.g., from chatting or music, is above a certain threshold. In addition, a user may take a photo of the smart display screen for ACR, but the ACR results are affected by noise such as light reflections and color changes. Thus, video-based retrieval or audio-video-based retrieval may achieve a better result.
However, in a video retrieval application, it may be highly desirable to utilize the transmission channel capacity efficiently and to find a balance between error resilience and transmission time. The disclosed methods and systems are directed to solving one or more of the problems set forth above and other problems.