The term “computer system” today applies to a wide variety of devices. The term includes mainframe and personal computers, as well as battery-powered computer systems, such as personal digital assistants and cellular telephones. In computer systems, a graphics controller is commonly employed to couple a CPU to a display device, such as a CRT or an LCD. The graphics controller performs certain special purpose functions related to processing image data for display so that the CPU is not required to perform such functions. For example, the graphics controller may include circuitry for decompressing image data as well as an embedded memory for storing it.
In a display device, an image is formed from an array, often referred to as a frame, of small discrete elements known as “pixels.” Display devices receive image data arranged in raster sequence and render it in a viewable form. A raster sequence begins with the left-most pixel on the top line of an array of pixels, proceeds pixel-by-pixel from left to right, and when the end of the top line is reached proceeds to the second line, again beginning with the left-most pixel. The sequence continues in this manner to each successively lower line until the end of the last line is reached.
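The raster sequence described above can be illustrated with a minimal sketch. The function name `raster_sequence` and the (line, column) coordinate convention are assumptions made for illustration only.

```python
def raster_sequence(width, height):
    """Yield (line, column) positions in raster order."""
    for line in range(height):        # top line first, then each lower line
        for column in range(width):   # left-most pixel first, then rightward
            yield (line, column)


# A frame 3 pixels wide and 2 lines high: the whole top line is visited
# before the sequence moves to the left-most pixel of the second line.
order = list(raster_sequence(3, 2))
print(order)
```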
The term pixel has another meaning; it also refers to the datum used to define a displayed pixel's attributes, such as its brightness and color. For example, in a digital color image, pixels are commonly comprised of 8-bit component triplets, which together form a 24-bit word that defines the pixel in terms of a particular color model. A color model is a method for specifying individual colors within a specific gamut of colors and is defined in terms of a three-dimensional Cartesian coordinate system (x, y, z). The RGB model is commonly used to define the gamut of colors that can be displayed on an LCD or CRT. In the RGB model, each primary color—red, green, and blue—represents an axis, and particular values along each axis are added together to produce the desired color. Each color is represented by 8 of the 24 bits. Similarly, pixels in display devices have three elements, each for producing one primary color, and particular values for each component are combined to produce a displayed pixel having the desired color. In a digital gray scale image, a single 8-bit component defines each pixel.
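As a hedged illustration of how three 8-bit components may form one 24-bit word, the sketch below packs and unpacks an RGB pixel. The bit ordering (red in the most significant byte) and the names `pack_rgb` and `unpack_rgb` are assumptions; the text above does not fix a particular ordering.

```python
def pack_rgb(red, green, blue):
    """Combine three 8-bit components into one 24-bit word (red highest)."""
    for component in (red, green, blue):
        if not 0 <= component <= 255:
            raise ValueError("each component must fit in 8 bits")
    return (red << 16) | (green << 8) | blue


def unpack_rgb(word):
    """Split a 24-bit word back into its red, green, and blue components."""
    return (word >> 16) & 0xFF, (word >> 8) & 0xFF, word & 0xFF


orange = pack_rgb(255, 128, 0)
print(hex(orange))
print(unpack_rgb(orange))
```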
Image data requires considerable storage and transmission capacity. For example, consider a single 512×512 color image comprised of 24-bit pixels. The image requires 768 K bytes of memory and, at a transmission rate of 128 K bits/second, 48 seconds for transmission. While it is true that memory has become relatively inexpensive and high data transmission rates more common, the demand for image storage capacity and transmission bandwidth continues to grow apace. Further, larger memories and faster processors increase energy demands on the limited resources of battery-powered computer systems. One solution to this problem is to compress the image data before storing or transmitting it. The Joint Photographic Experts Group (JPEG) has developed a popular method for compressing still images. Using a JPEG coding method, a 512×512 color image can be compressed into a JPEG file that may be only 40-80 K bytes in size (depending on the compression rate and the visual properties of the particular image) without creating visible defects in the image when it is displayed.
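The storage and transmission figures above can be checked with a short calculation. This sketch assumes that "K" denotes 1024 for both bytes and bits; the function names are illustrative.

```python
K = 1024  # assumption: "K" means 1024 for both K bytes and K bits


def image_bytes(width, height, bits_per_pixel):
    """Uncompressed size of an image in bytes."""
    return width * height * bits_per_pixel // 8


def transmission_seconds(num_bytes, bits_per_second):
    """Time needed to send num_bytes over a link of the given bit rate."""
    return num_bytes * 8 / bits_per_second


raw = image_bytes(512, 512, 24)
print(raw // K)                             # size in K bytes
print(transmission_seconds(raw, 128 * K))   # seconds at 128 K bits/second

# Compression ratios implied by a 40 to 80 K byte JPEG file.
ratio_low = raw / (80 * K)
ratio_high = raw / (40 * K)
print(ratio_low, ratio_high)
```

Under these assumptions the uncompressed image is 768 K bytes, takes 48 seconds to transmit, and is reduced by a factor of roughly 10 to 19 by JPEG coding.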
Before JPEG coding of a color image, the pixels are commonly converted from the RGB color model to a YUV model. In addition, a color source image is separated into component images, that is, Y, U, and V images. (Of course, this step is not necessary if the source image is a gray-scale image as these images have only one component.)
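The conversion and separation steps may be sketched as follows. The text above does not specify which YUV variant is used; this sketch assumes the full-range YCbCr equations commonly used with JPEG files (JFIF), and the function names are hypothetical.

```python
def clamp8(value):
    """Round to the nearest integer and limit the result to 0..255."""
    return max(0, min(255, int(round(value))))


def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to Y, U, V (assumed: JFIF full-range YCbCr)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.168736 * r - 0.331264 * g + 0.5 * b + 128
    v = 0.5 * r - 0.418688 * g - 0.081312 * b + 128
    return clamp8(y), clamp8(u), clamp8(v)


def split_components(rgb_pixels):
    """Separate a list of RGB pixels into Y, U, and V component images."""
    y_image, u_image, v_image = [], [], []
    for r, g, b in rgb_pixels:
        y, u, v = rgb_to_yuv(r, g, b)
        y_image.append(y)
        u_image.append(u)
        v_image.append(v)
    return y_image, u_image, v_image


# White, black, and pure red pixels.
y_img, u_img, v_img = split_components(
    [(255, 255, 255), (0, 0, 0), (255, 0, 0)]
)
print(y_img, u_img, v_img)
```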
The JPEG standard employs a forward discrete cosine transform (DCT) as one step in the compression (or coding) process and an inverse DCT as part of the decoding process. In an image, pixels and their components are distributed at equally spaced intervals. Just as an audio signal may be sampled at equally spaced time intervals and represented in a graph of amplitude versus time, pixel components may be viewed as samples of a visual signal, such as brightness, and plotted in a graph of amplitude versus distance. The audio signal has a time frequency, whereas the visual signal has a spatial frequency. Moreover, just as the audio signal can be mapped from the time domain to the frequency domain using a Fourier transform, the visual signal may be mapped from the spatial domain to the frequency domain using the forward DCT. The human auditory system is often unable to perceive certain frequency components of an audio signal. Similarly, the human visual system is frequently unable to perceive certain frequency components of a visual signal. JPEG coding recognizes that the data needed to represent imperceptible components may be discarded, allowing the quantity of data to be reduced.
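To make the mapping between the spatial domain and the frequency domain concrete, the following is a direct, unoptimized sketch of the two-dimensional 8×8 forward and inverse DCT in the form used by JPEG. It is an illustration only: the level shift, quantization, and entropy coding steps of an actual JPEG coder are omitted, and the function names are assumptions.

```python
import math

N = 8  # JPEG operates on 8x8 blocks of samples


def _scale(k):
    """Normalization factor: 1/sqrt(2) for the zero-frequency term."""
    return 1 / math.sqrt(2) if k == 0 else 1.0


def _basis(position, frequency):
    return math.cos((2 * position + 1) * frequency * math.pi / (2 * N))


def forward_dct(block):
    """Map an 8x8 block of samples to 8x8 frequency coefficients."""
    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += block[x][y] * _basis(x, u) * _basis(y, v)
            coeffs[u][v] = 0.25 * _scale(u) * _scale(v) * total
    return coeffs


def inverse_dct(coeffs):
    """Map 8x8 frequency coefficients back to an 8x8 block of samples."""
    block = [[0.0] * N for _ in range(N)]
    for x in range(N):
        for y in range(N):
            total = 0.0
            for u in range(N):
                for v in range(N):
                    total += (_scale(u) * _scale(v) * coeffs[u][v]
                              * _basis(x, u) * _basis(y, v))
            block[x][y] = 0.25 * total
    return block


# A flat block has no spatial variation, so all of its energy lands in
# the zero-frequency coefficient at position [0][0].
flat = [[10] * N for _ in range(N)]
flat_coeffs = forward_dct(flat)
print(round(flat_coeffs[0][0], 6))
```

For a block in which every sample has the same value, only the zero-frequency coefficient is non-zero; blocks with fine detail spread energy into higher-frequency coefficients, which are the ones a coder may represent coarsely or discard.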
According to the JPEG standard, the smallest group of data units coded in the DCT is a minimum coded unit (MCU), which comprises three or more blocks for a YUV image and one block for a gray-scale image. A “block” is an 8×8 array of “samples”, a sample being one element in the two-dimensional array that describes a component. (In this specification, the terms “sample”, “pixel component”, and simply “component” have the same meaning.) Every sample in each component image may be selected for JPEG compression. In this case, the MCU for a YUV image comprises three blocks, one for each component. Commonly, however, only a subset of the samples in the U and V blocks is selected for compression. This step is often referred to as chroma sub-sampling. For instance, only 50% or 25% of the samples in the U and V components may be selected (chroma sub-sampled) for compression. In these cases, the MCU comprises four blocks and six blocks, respectively. The phrase “sampling format” is used to distinguish among the various types of chroma sub-sampling. Typical sampling formats are 4:4:4, 4:2:2, 4:2:0, and 4:1:1, which are further described below. The blocks for each MCU are grouped together in an ordered sequence, e.g., Y0 U0 V0 or Y0 Y1 U0 V0, where the numeral following each letter denotes the block. The MCUs are arranged in an alternating or interleaved sequence before being compressed, and this data format is referred to as “block-interleaved.”
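The relationship between sampling format and MCU size can be summarized in a small table. In this sketch the block orderings for the six-block formats (Y0 Y1 Y2 Y3 U0 V0) are an assumption extended from the three-block and four-block sequences given above, and the names `MCU_LAYOUT`, `blocks_per_mcu`, and `chroma_fraction` are illustrative.

```python
# Ordered block sequence of one MCU for each sampling format.
MCU_LAYOUT = {
    "4:4:4": ("Y0", "U0", "V0"),
    "4:2:2": ("Y0", "Y1", "U0", "V0"),
    "4:2:0": ("Y0", "Y1", "Y2", "Y3", "U0", "V0"),  # assumed ordering
    "4:1:1": ("Y0", "Y1", "Y2", "Y3", "U0", "V0"),  # assumed ordering
}


def blocks_per_mcu(sampling_format):
    """Number of 8x8 blocks in one MCU."""
    return len(MCU_LAYOUT[sampling_format])


def chroma_fraction(sampling_format):
    """Fraction of U (or V) samples kept relative to Y samples."""
    layout = MCU_LAYOUT[sampling_format]
    y_blocks = sum(1 for name in layout if name.startswith("Y"))
    u_blocks = sum(1 for name in layout if name.startswith("U"))
    return u_blocks / y_blocks


for fmt in MCU_LAYOUT:
    print(fmt, blocks_per_mcu(fmt), chroma_fraction(fmt))
```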
When a JPEG file is received, it is normally decoded by a special purpose block of logic known as a CODEC (compressor/decompressor). The output from the decoding process is block-interleaved image data. As the CODEC is adapted to work in many different computer systems, it is not designed to output image data in any format other than the block-interleaved format. Display devices, however, are not adapted to receive block-interleaved image data; rather display devices expect pixels arranged in raster sequence. Moreover, operations performed by the graphics controller before the pixels are provided to the display, such as resizing and color space conversion, are adapted to be performed on raster-ordered pixels.
In order that the image data can be operated on and provided to the display as raster ordered pixels, the output of the CODEC, that is, the block-interleaved image data, is stored as blocks in a memory commonly referred to as a line buffer. As the image data for any particular pixel is needed, three samples are fetched from respective component blocks that are stored in scattered locations in the line buffer. The samples are assembled into pixels, processed, and stored in raster sequence in a memory, usually referred to as a display or frame buffer. Pixels are then sequentially fetched from the frame buffer and provided to the display device.
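The fetching of three samples from scattered locations can be sketched as an address calculation. This is a simplified illustration, not the circuitry of any particular graphics controller: it assumes the 4:4:4 sampling format (one Y, U, and V block per MCU), samples stored row by row within each block, and image dimensions that are multiples of 8. The helper `to_block_interleaved` exists only to build test data.

```python
BLOCK = 8
SAMPLES_PER_BLOCK = BLOCK * BLOCK          # 64 samples per 8x8 block
SAMPLES_PER_MCU = 3 * SAMPLES_PER_BLOCK    # assumed 4:4:4: Y0 U0 V0


def fetch_pixel(buffer, width, x, y):
    """Gather the Y, U, and V samples of pixel (x, y) from the buffer."""
    mcus_per_row = width // BLOCK
    mcu = (y // BLOCK) * mcus_per_row + (x // BLOCK)   # which MCU
    offset = (y % BLOCK) * BLOCK + (x % BLOCK)         # position in a block
    base = mcu * SAMPLES_PER_MCU
    return (buffer[base + offset],                          # from Y block
            buffer[base + SAMPLES_PER_BLOCK + offset],      # from U block
            buffer[base + 2 * SAMPLES_PER_BLOCK + offset])  # from V block


def to_raster(buffer, width, height):
    """Assemble pixels in raster sequence from block-interleaved data."""
    return [fetch_pixel(buffer, width, x, y)
            for y in range(height) for x in range(width)]


def to_block_interleaved(planes, width, height):
    """Test helper: arrange raster-ordered Y, U, V planes as MCUs."""
    buffer = []
    for block_y in range(0, height, BLOCK):
        for block_x in range(0, width, BLOCK):
            for plane in planes:
                for y in range(block_y, block_y + BLOCK):
                    for x in range(block_x, block_x + BLOCK):
                        buffer.append(plane[y * width + x])
    return buffer


width, height = 16, 8
y_plane = [i % 256 for i in range(width * height)]
u_plane = [(3 * i) % 256 for i in range(width * height)]
v_plane = [(7 * i) % 256 for i in range(width * height)]
interleaved = to_block_interleaved((y_plane, u_plane, v_plane), width, height)
pixels = to_raster(interleaved, width, height)
print(pixels[:3])
```

Note that the three samples of a single pixel lie 64 positions apart in the buffer, and horizontally adjacent pixels on either side of a block boundary lie in different MCUs, which is why whole blocks must be present before any one line can be assembled.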
The line buffer must be large enough to hold at least one line of pixels of the image because the graphics controller is designed to operate on raster ordered data. Moreover, the line buffer generally must be large enough to hold at least two display lines. This is because one line is read from the line buffer while another is being stored by the CODEC in a “ping-pong” scheme. Because JPEG decoded block-interleaved image data is made up of 8×8 blocks of samples, it is not possible to simply store a single line. Instead, all of the blocks needed to assemble a line must be stored. Because each block spans 8 lines, the number of blocks needed to assemble one line is the same as the number of blocks needed to store 8 lines. In other words, to store one line, the line buffer must be large enough to hold 8 lines. And to alternately store one line while another is being read in ping-pong fashion, the line buffer must be large enough to store 16 lines.
Because the line buffer must be able to hold at least 16 lines of image data, it requires a considerable amount of memory. Further, the size of a line buffer embedded in a graphics controller is predetermined when the integrated circuit (IC) is designed. Because the maximum width of a source image that can be JPEG decoded is limited by the size of the line buffer, the only way to provide the flexibility for handling source images of varying sizes is to provide a line buffer that is large enough to hold the largest expected image width.
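The dependence of line buffer size on image width can be estimated with a short calculation. This sketch assumes the 4:4:4 sampling format with one byte per sample (three bytes per pixel); sub-sampled formats would need fewer chroma bytes. The function names and the rounding of the maximum width down to a whole number of blocks are assumptions made for illustration.

```python
BLOCK_LINES = 8        # each 8x8 block spans 8 lines
PING_PONG_HALVES = 2   # one half is read while the other is written


def line_buffer_bytes(image_width, bytes_per_pixel=3):
    """Memory needed to hold 16 lines of an image of the given width."""
    return PING_PONG_HALVES * BLOCK_LINES * image_width * bytes_per_pixel


def max_image_width(buffer_bytes, bytes_per_pixel=3):
    """Widest image a fixed-size line buffer can accommodate."""
    width = buffer_bytes // (PING_PONG_HALVES * BLOCK_LINES * bytes_per_pixel)
    return width - width % 8   # assumed: only whole 8-sample blocks count


print(line_buffer_bytes(640))    # bytes needed for a 640-pixel-wide image
print(max_image_width(32768))    # widest image for a 32 K byte buffer
```

Under these assumptions a 640-pixel-wide image needs 30,720 bytes of line buffer, and the required memory grows linearly with the widest image the IC is designed to decode.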
Memory is expensive in terms of the physical space and the power it requires. For these reasons, it would be desirable to reduce the amount of memory required for the line buffer. Moreover, it would be desirable to provide the flexibility to decompress JPEG coded images of various sizes without having to create an IC in which the line buffer is large enough to accommodate the largest expected image width even though narrower images are commonly processed. Accordingly, there is a need for a method and apparatus for transforming the dimensions of an image represented by block-interleaved data.