In the multimedia application area, a variety of new mobile devices, such as Pocket PCs, Smartphones, SPOT watches, Tablet PCs and personal digital assistants, are becoming increasingly popular in people's daily lives. These devices are becoming more and more powerful in both numerical computing and data storage. Moreover, people have become enthusiastic about watching videos on these mobile devices.
However, low-bandwidth connections and small displays are still two serious obstacles that undermine the usefulness of these devices in people's daily lives. Although a few commercial video players, such as Windows Media Player and PocketTV, have been developed to enable users to browse videos on small-form-factor devices, the limited bandwidth and small window size remain critical obstacles. With the rapid and successful development of 2.5G and 3G wireless networks, bandwidth is expected to become less of a constraint in the near future. At the same time, the limitation on display size is likely to remain unchanged for some time.
Some existing work has focused on displaying images on mobile devices. These approaches calculate and provide an optimal image viewing path, based on an image attention model, to simulate human viewing behaviour. Since most of the valuable information is presented by videos, improving the experience of video viewing on small displays is very important for unleashing the power of these mobile devices.
One solution for providing a better user experience when viewing videos on displays of limited and heterogeneous screen size has been proposed by X. Fan et al. in “Looking into Video Frames on Small Displays”, ACM MM'03, 2003, which introduces three browsing methods: a manual browsing method, a full-automatic browsing method and a semi-automatic browsing method.
However, in the proposed full-automatic browsing method, both direction and zoom controls are disabled. The resulting video stream uses more screen space to display the attention-getting regions while cropping out the other parts. Consequently, when video frames contain many separate focuses, little of the frame can be cropped out, and this approach differs little from the conventional down-sampling scheme.
In the semi-automatic browsing method, human interaction is still required to switch the browsing focus when there is more than one important attention object (AO). The display focus is calculated only after the user presses the control button, and artefacts appear when the focus changes.
Therefore, the existing schemes cannot provide a good solution for automatically browsing videos on devices with small displays while maintaining a good tradeoff between video display quality and the display size constraint, especially when frames contain multiple focuses.