With the advent and widespread popularity of digital video cameras and internet multimedia, it is common for people to store large numbers of audiovisual items on personal computers and other computer-related devices. Users of these devices need to be able to access and navigate through their documents in order to view items and to search for items visually.
Modern computing systems often provide a variety of methods for viewing large collections of documents. These methods can be controlled by computer interface control devices such as a mouse and pointer, by keyboard input, or by other physical controls such as the scroll wheels found on some mouse devices. The methods generally provide a means to select a location within a storage structure and return the set of items within that location, or to return a set of items matching a certain query. A viewing area is then used to display representations of items from the set, typically in a sequence. For large sets, it is common that only a limited number of the items in the set can be viewed in the viewing area at any one time. The user can use the mouse and pointer or other input control devices to execute commands which move items through the viewing area so that items earlier or later in the sequence are displayed. The action of visually moving items past a display area is herein referred to as “scrolling”. The action of controlling scrolling for the purpose of exploring a set is herein referred to as “browsing”. These terms are widely known in the art according to these general definitions.
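By way of illustration only, the scrolling behaviour described above may be sketched as follows. This is a minimal sketch under stated assumptions, not a description of any particular system: the names Viewport, capacity, offset, visible and scroll are hypothetical and are introduced here solely for the example. The sketch models the viewing area as a fixed-size window onto an ordered set, with a scroll command shifting the window towards earlier or later items.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Viewport:
    """A fixed-size window onto an ordered set of items (hypothetical model)."""

    items: Sequence[str]  # the full set returned for a location or a query
    capacity: int         # number of items the viewing area can show at once
    offset: int = 0       # index of the first item currently displayed

    def visible(self) -> List[str]:
        """Return the items currently shown in the viewing area."""
        return list(self.items[self.offset:self.offset + self.capacity])

    def scroll(self, delta: int) -> None:
        """Move the window by delta items; positive values show later items."""
        # Clamp so the window never runs past either end of the set.
        max_offset = max(0, len(self.items) - self.capacity)
        self.offset = min(max(self.offset + delta, 0), max_offset)


# Browsing: the user issues scroll commands to explore a set of ten items
# through a viewing area that can show only four at a time.
view = Viewport(items=[f"item{i}" for i in range(10)], capacity=4)
view.scroll(3)
print(view.visible())
```

In this sketch a scroll command changes only which items are displayed; the underlying set and its order are unchanged, which matches the general definition of scrolling given above.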
Audiovisual items represent a class of digital content which includes audio content, visual content or a combination of both. The visual content may be a static (single) image or a sequence of image frames that is typically, but not always, accompanied by some audio content. This latter type of content is generally referred to as video content and can include a sequence of images of one or more real-life scenes, animation, or a combination of the two. The scene need not be contiguous and may be formed into a single reproducible sequence through the editing or splicing together of a number of discrete sequences or “clips”, as they are known in the art. The video content may be stored or recorded in a number of formats, each able to be reproduced with an appropriate decoder.
Some browsing systems display such audiovisual items as static images, each representing the first or a later frame in the corresponding visual sequence. This has the disadvantage that it is not always possible to reliably identify an item from the single frame displayed. Furthermore, where the content type in any one collection is mixed, it may not be possible to distinguish audiovisual items which have a moving video sequence from those items which are a single frame, such as a photograph. Other systems automatically play back all the frames in the sequence in real time, or transition between different frames in the sequence. This has the disadvantage that, when many items are shown, the viewer's attention is often drawn to items which change appearance when the viewer wishes to remain focused on the sequence of a single item.