Computer-readable representations of real-world source images are generated in a number of different ways, such as by using a scanner or digital camera. These devices optically scan the real image and produce a sampled video signal. Each sample of the video signal is a digital value, called a pixel value. The value of a pixel corresponds to the light intensity, or some colorimetric property, of a particular point in the real image. The sampled video signal is typically read by a computer, which organizes the pixel values into a two-dimensional array called a pixel map. Each pixel value in the pixel map thus represents the intensity of a corresponding elemental area of the real image. The array coordinates at which a pixel value is stored are determined by the spatial position of its corresponding elemental area in the source image.
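As a minimal sketch of how a stream of samples might be organized into a pixel map, the following assumes that the samples arrive in row-major scan order (left to right, top to bottom) and that the width of each scan line is known. The function name build_pixel_map and both assumptions are illustrative only and are not taken from any particular scanner interface.

```python
def build_pixel_map(samples, width):
    """Arrange a flat sequence of pixel values into rows of `width` pixels.

    Assumes row-major scan order, so the sample at position i lands at
    row i // width, column i % width.
    """
    if width <= 0 or len(samples) % width != 0:
        raise ValueError("sample count must be a positive multiple of width")
    return [list(samples[start:start + width])
            for start in range(0, len(samples), width)]


# Six samples from a hypothetical 3-pixel-wide scan become a 2-by-3 pixel map.
pixel_map = build_pixel_map([10, 20, 30, 40, 50, 60], width=3)
```

With this layout, pixel_map[row][column] holds the value sampled from the elemental area at that spatial position in the source image.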
The resolution of the resulting computer-readable image representation depends upon the number of pixel map entries, as well as the number of different levels used in digitizing the sampled video signal. For example, even if a fairly small 3 by 5 inch photograph is scanned at what is considered to be a medium resolution of 300 pixels per inch, the pixel map is 900 by 1500, or 1.35 million pixel values. If each sample of the video signal is digitized to one of 256 possible levels, requiring eight bits per pixel, the pixel map occupies 1.35 megabytes of memory.
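The arithmetic in the example above can be checked with a short sketch. The helper name pixel_map_size is hypothetical, and the calculation assumes that pixels are packed with no padding or file header.

```python
def pixel_map_size(width_in, height_in, pixels_per_inch, levels):
    """Return (columns, rows, pixel_count, byte_count) for a scanned image."""
    cols = round(width_in * pixels_per_inch)
    rows = round(height_in * pixels_per_inch)
    # Smallest number of bits able to distinguish `levels` values.
    bits_per_pixel = max(1, (levels - 1).bit_length())
    pixel_count = cols * rows
    # Round up to whole bytes; assumes tightly packed pixels.
    byte_count = (pixel_count * bits_per_pixel + 7) // 8
    return cols, rows, pixel_count, byte_count


# 3 by 5 inch photograph, 300 pixels per inch, 256 levels.
cols, rows, pixels, size_bytes = pixel_map_size(3, 5, 300, 256)
```

For these inputs the pixel map is 900 by 1500, or 1,350,000 pixel values, and at eight bits per pixel it occupies 1,350,000 bytes.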
While a data file of that size is not unmanageable with present-day computer technology, it is too large for most applications, which typically require the storage and handling of a number of images. It is clear, therefore, that a more efficient mechanism for handling scanned images is desirable.
It is also desirable to provide a mechanism whereby an image sampled at one resolution can be subjected to various manipulations, and then be displayed or printed at a different resolution. When this is possible, an image can be rendered on output devices having a range of resolutions, regardless of the resolution of the original pixel map. Pixel map representations are usually not easily processed in this manner, since only a discrete number of samples of the source image are available.
In other words, sampled images should be represented in a condensed form, and yet be readily amenable to scaling, rotating, clipping, windowing, and other manipulations which are commonly performed on synthetic computer-originated images.
Certain images, such as type fonts, can already be represented in a form that a raster image processor can manipulate directly. Such font images can typically be specified by a human as analytic descriptions of the outline of each character. The analytic descriptions can be a series of lines, arcs, or other graphic primitive elements. Then, when a character needs to be rendered, the analytic expressions can be easily subjected to the desired graphic manipulations.
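To illustrate why an analytic outline is easy to manipulate, the following sketch stores a character outline as a list of vertices joined by straight lines and applies scaling, rotation, and translation to the vertices alone. The outline data and the function name transform_outline are hypothetical; real font formats also use arcs or curves, which are omitted here.

```python
import math


def transform_outline(points, scale=1.0, angle_deg=0.0, dx=0.0, dy=0.0):
    """Rotate about the origin, scale uniformly, then translate each vertex."""
    angle = math.radians(angle_deg)
    c, s = math.cos(angle), math.sin(angle)
    return [(scale * (x * c - y * s) + dx,
             scale * (x * s + y * c) + dy)
            for x, y in points]


# Hypothetical outline of a capital "L", in arbitrary design units.
L_OUTLINE = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 6), (0, 6)]

# The same six vertices describe the character at any size or orientation.
doubled = transform_outline(L_OUTLINE, scale=2.0)
rotated = transform_outline(L_OUTLINE, angle_deg=90.0)
```

Because only six vertices are transformed, the cost of the manipulation does not depend on the resolution of the output device, in contrast to a pixel map of the same character.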
Others have proposed various ways to model a graphic object, given an analytic description of the image as an input. Such models are usually in the form of a linked list with pointers to successively smaller portions of the object. A computer builds the model by evaluating the analytic expression for one portion of the image at a time. For example, if the object is a sphere, a mathematical representation for the sphere is evaluated in a number of particular spatial ranges, to determine how each portion can be more simply described, say, as a line segment. In order to render the sphere, the linked list is then traversed by evaluating each line segment.
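The kind of model described above can be sketched for a two-dimensional cross-section of the sphere, that is, a circle. In this sketch, which is an illustration under stated assumptions rather than any particular published method, each node covers an angular range of the circle and points to two successively smaller ranges until a single line segment (a chord) lies within a chosen tolerance of the arc. The names Node, build, and segments, and the use of a binary tree rather than a flat linked list, are assumptions made for the example.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    start: float                      # start angle of the arc, in radians
    end: float                        # end angle of the arc, in radians
    children: list = field(default_factory=list)


def build(radius, start, end, tolerance):
    """Evaluate the analytic circle over [start, end] and subdivide as needed."""
    node = Node(start, end)
    # Largest gap between the arc and its chord (the sagitta).
    gap = radius * (1.0 - math.cos((end - start) / 2.0))
    if gap > tolerance:
        mid = (start + end) / 2.0
        node.children = [build(radius, start, mid, tolerance),
                         build(radius, mid, end, tolerance)]
    return node


def segments(node, radius):
    """Traverse the model, yielding one line segment per leaf node."""
    if node.children:
        for child in node.children:
            yield from segments(child, radius)
    else:
        yield ((radius * math.cos(node.start), radius * math.sin(node.start)),
               (radius * math.cos(node.end), radius * math.sin(node.end)))


model = build(1.0, 0.0, 2.0 * math.pi, tolerance=0.01)
outline = list(segments(model, 1.0))
```

Rendering consists only of traversing the model and drawing each leaf segment. A looser tolerance yields a shallower model with fewer segments, which is the trade-off between fidelity and overhead that such models expose.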
However, conventional wisdom has been that such hierarchical representations cannot easily be derived for sampled images, since there is no pre-existing human-specified analytic description of the image available. And even if such a description can be derived, it is also thought that the overhead of the hierarchical model will be far too large to justify its use.