Thumbnails are a highly desirable graphical user interface component for most multimedia and document applications. A thumbnail is a smaller, resized version of an image that is representative of the full image; in some applications, the full image can be displayed by clicking on the thumbnail. The resizing is typically done by traditional smoothing followed by downsampling. In most traditional applications, such as listings on a web page, the size of a thumbnail is fixed. A common problem with such thumbnail displays is that the image information is often not recognizable to the viewer, so the thumbnail does not provide the desired usefulness.
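The traditional resizing described above can be illustrated with a minimal sketch. It assumes a grayscale image stored as a list of rows, a box filter as the smoothing kernel, and an integer downsampling factor; the function names `box_smooth`, `downsample`, and `make_thumbnail` are hypothetical and are not taken from any particular software package.

```python
def box_smooth(image, radius=1):
    """Smooth a grayscale image with a (2*radius+1) x (2*radius+1) box filter.

    The box filter is an assumed choice of smoothing kernel; pixels outside
    the image are simply left out of the average.
    """
    height, width = len(image), len(image[0])
    smoothed = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total, count = 0.0, 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < height and 0 <= xx < width:
                        total += image[yy][xx]
                        count += 1
            smoothed[y][x] = total / count
    return smoothed


def downsample(image, factor):
    """Keep every `factor`-th pixel in each direction."""
    return [row[::factor] for row in image[::factor]]


def make_thumbnail(image, factor, radius=1):
    """Traditional thumbnail: smooth the whole image, then downsample it."""
    return downsample(box_smooth(image, radius), factor)
```

In this sketch the entire image is reduced uniformly, regardless of its content. For instance, a pattern of one-pixel-wide stripes collapses to a uniform area when `downsample` is applied without prior smoothing, which is one way fine detail becomes unrecognizable.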
Newer multimedia communication tools allow a free-format composition of images from various sources on a representative canvas. In this case, the size of a thumbnail is allowed to vary. In such an application, besides the question of what to display in a fixed-size thumbnail, the additional question arises of what constitutes a suitable thumbnail size or shape for a given image. For example, in representing a photo of a person as captured from a visitor's kiosk, a downsampled image of the face portion of the photo would be sufficient for a thumbnail, whereas for a web document the title at full resolution might be a useful display.
The authors in Mohan, R., Smith, J. R., and Li, C.-S., “Adapting Multimedia Internet Content for Universal Access,” IEEE Trans. Multimedia, Vol. 1, no. 1, pp. 104–114, 1999, describe a method for the transcoding of images to specific devices such as monitors or cell phones. In Lee, K., Chang, H. S., Choi, H., and Sull, S., “Perception-based image transcoding for universal multimedia access,” which appeared in ICIP 2001, Proceedings of International Conference on Image Processing, Thessaloniki, Greece, 2001, this approach is extended to sending only specifically selected parts of an image at a specific resolution. The selection is specified by the sender and is not performed automatically.
Several software packages exist that provide for thumbnail creation. These software packages focus on speed, but all of them resize the entire image to user-defined or application-defined sizes using traditional downsampling. Thus, image information is often not recognizable to the viewer.
In Woodruff, A., Faulring, A., Rosenholtz, R., Morrison, J., and Pirolli, P., “Using Thumbnails to Search the Web,” Proceedings of SIGCHI'01, Seattle, April 2001, enhanced thumbnails are introduced to provide a better representation of documents. The enhancement consists of lowering the contrast in traditionally created thumbnails and superimposing, in larger fonts, keywords that were detected via an Optical Character Recognition (OCR) system. The result is a limited improvement at best and is only applicable to images that contain text.
The authors in Burton, C. A., Johnston, L. J., and Sonenberg, E. A., “Case study: an empirical investigation of thumbnail image recognition”, Proceedings of Visualization Conference 1995, found that the filtering of images (contrast enhancement, edge enhancement) before downsampling increases a viewer's ability to recognize thumbnails. Even so, image information is often not recognizable to the viewer.
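The idea of filtering before downsampling can be sketched as follows. This is an illustration only: it assumes a simple linear contrast stretch as the contrast enhancement and block averaging as the downsampling step, neither of which is stated to be the specific filter studied by Burton et al. The names `stretch_contrast`, `block_average`, and `enhanced_thumbnail` are hypothetical.

```python
def stretch_contrast(image, lo=0.0, hi=255.0):
    """Linearly map the darkest pixel to `lo` and the brightest to `hi`."""
    flat = [p for row in image for p in row]
    p_min, p_max = min(flat), max(flat)
    if p_max == p_min:
        # A uniform image has no contrast to stretch; return a float copy.
        return [[float(p) for p in row] for row in image]
    scale = (hi - lo) / (p_max - p_min)
    return [[lo + (p - p_min) * scale for p in row] for row in image]


def block_average(image, factor):
    """Downsample by averaging each factor x factor block of pixels."""
    out_h, out_w = len(image) // factor, len(image[0]) // factor
    result = []
    for by in range(out_h):
        row = []
        for bx in range(out_w):
            block = [
                image[by * factor + dy][bx * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            ]
            row.append(sum(block) / len(block))
        result.append(row)
    return result


def enhanced_thumbnail(image, factor):
    """Enhance contrast first, then downsample the enhanced image."""
    return block_average(stretch_contrast(image), factor)
```

Even with such an enhancement, the whole image is still reduced by the same factor, so small but important regions can remain too small to recognize.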
For the creation of video summaries, there exist methods that display groupings of video frames in which the individual frames have specific sizes or resolutions, as in Uchihashi, S., Foote, J., Girgensohn, A., and Boreczky, J., “Video Manga: Generating Semantically Meaningful Video Summaries,” Proceedings of Seventh ACM International Multimedia Conference, Orlando 1999. The decision on how to resize a frame is made by measuring the frame's importance in the video sequence, not by the actual image content.
The area of page layout has been discussed in many papers. Typically, the authors assume, disadvantageously, that the image content of documents is mostly text with perhaps some small images or graphics, and perform text-specific operations involving clustering techniques to determine connected components. A common method for page layout analysis is the one described in O'Gorman, L., “The Document Spectrum for Page Layout Analysis,” IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 15, no. 11, pp. 1162–1173, 1993.
One existing image file format stores multiple resolutions of an image (e.g., as created by a Laplacian pyramid). As a result, this image file format is usually disadvantageously larger than a file of a wavelet coded image. This image file format has the option of incorporating specific parameters into the file. Those parameters could be, for example, result aspect ratio, rectangle of interest, filtering or contrast adjustment. The parameters are set by the creator of the file and may or may not be used by the receiver.
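The size disadvantage of storing multiple resolutions can be made concrete by counting samples. The sketch below assumes that each pyramid level halves the width and height of the previous one (rounding up) and counts stored samples only; actual file sizes also depend on the compression applied, which the text above does not specify. The function name `pyramid_sample_count` is hypothetical.

```python
def pyramid_sample_count(width, height, levels):
    """Total number of samples stored across all levels of a dyadic pyramid."""
    total = 0
    for _ in range(levels):
        total += width * height
        # Each coarser level has half the width and height, rounded up.
        width, height = (width + 1) // 2, (height + 1) // 2
    return total


original = 512 * 512
stored = pyramid_sample_count(512, 512, levels=4)
overhead = stored / original
```

For a 512 x 512 image and four levels, this gives 348,160 samples, about 1.33 times the 262,144 pixels of the original. A critically sampled wavelet decomposition, by contrast, has as many coefficients as the image has pixels.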