The present invention relates to an image processing apparatus for processing hierarchically encoded image data, and a method therefor.
First, the concept of hierarchical encoding associated with the present invention will be described below. The object of hierarchical encoding is to allow the outline of an entire image to be grasped at an early stage, and the encoding will be described with reference to the schematic diagram of FIG. 1. A process for reducing an image 601 (a bold rectangle in FIG. 1) input at the encoder side to 1/2 in the vertical and horizontal directions is repeated n times, thereby generating n reduced images, of which the image having the minimum size is reduced to 2.sup.-n the size of the original image in the vertical and horizontal directions. The reduction processing adopts a method of suppressing degradation of information, so that information of the entire image can be grasped even from a low-resolution image. For example, the JBIG (Joint Bi-level Image Experts Group) algorithm adopts a so-called PRES method disclosed in U.S. Pat. No. 5,159,468 as reduction processing. In this case, the original image is called a zeroth-layer image (601), and images sequentially obtained in the reduction processing are respectively called a first-layer image (602) and a second-layer image (603).
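The relationship between the layer number and the image size can be illustrated by the following minimal sketch. The function name `layer_sizes`, the example dimensions, and the rounding-up of odd dimensions are assumptions made for illustration only; the sketch computes sizes and does not implement the PRES reduction itself.

```python
def layer_sizes(width, height, n_layers):
    """Return [(width, height)] for layers 0..n_layers.

    Layer 0 is the original image; each further layer halves both
    dimensions, so layer n is 2**-n the size of the original.
    """
    sizes = [(width, height)]
    for _ in range(n_layers):
        w, h = sizes[-1]
        # Assumption: odd dimensions are rounded up so the reduced
        # image still covers the whole original image.
        sizes.append(((w + 1) // 2, (h + 1) // 2))
    return sizes


if __name__ == "__main__":
    # Hypothetical original size, chosen only for illustration.
    for layer, (w, h) in enumerate(layer_sizes(1728, 2376, 2)):
        print(f"layer {layer}: {w} x {h}")
```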
In hierarchical encoding, encoded data is decoded and displayed starting from the layer image having the smallest size. That is, the display operation is performed in order from the layer image having the lowest resolution. In the display operation, each image is displayed after being enlarged at a proper magnification, so that the resolution is gradually increased as information is transmitted. For example, the second-layer image is displayed in a .times.4 enlarged scale in the vertical and horizontal directions, and subsequently, the first-layer image is displayed in a .times.2 enlarged scale in the vertical and horizontal directions. Such a display method is called a progressive display method. In the JBIG algorithm, encoding is achieved by arithmetic codes.
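The order and magnification of the progressive display can be sketched as follows. The names `progressive_schedule` and `enlarge` are hypothetical, and enlargement by simple pixel replication is an assumption; the description above does not specify how the enlargement is performed.

```python
def progressive_schedule(n_layers):
    """Return [(layer, magnification)] from lowest to highest resolution.

    Layer d is displayed enlarged by 2**d in both directions so that
    every layer fills the same display area as the zeroth layer.
    """
    return [(layer, 2 ** layer) for layer in range(n_layers, -1, -1)]


def enlarge(pixels, factor):
    """Enlarge a 2-D list of pixels by pixel replication (an assumption)."""
    out = []
    for row in pixels:
        wide = [p for p in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


if __name__ == "__main__":
    for layer, scale in progressive_schedule(2):
        print(f"display layer {layer} at x{scale}")
```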
In the JBIG algorithm, stripes obtained by dividing an image in the sub-scanning direction are defined as units of encoding, thus realizing sequential transmission. FIG. 2 is a view showing this concept assuming that the number of layers is 2. In FIG. 2, bold lines represent images. A zeroth-layer image is divided into stripes S.sub.0.0, S.sub.1.0, S.sub.2.0, S.sub.3.0, S.sub.4.0, S.sub.5.0, S.sub.6.0, and S.sub.7.0. Similarly, a first-layer image is divided into stripes S.sub.0.1 to S.sub.7.1, and a second-layer image is divided into stripes S.sub.0.2 to S.sub.7.2.
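As a minimal sketch of the stripe division, the following divides the scanning lines of one layer image into stripes. The function name `stripe_ranges` and the line counts are hypothetical. The sketch also assumes that the number of lines per stripe is halved for each reduced layer, so that every layer has the same number of stripes, as in FIG. 2.

```python
def stripe_ranges(height, lines_per_stripe):
    """Return [(first_line, end_line)] for each stripe of one layer.

    The image is divided in the sub-scanning direction; the last
    stripe may be shorter when height is not an exact multiple.
    """
    return [(start, min(start + lines_per_stripe, height))
            for start in range(0, height, lines_per_stripe)]


if __name__ == "__main__":
    # Hypothetical sizes: 1024 lines in layer 0, 128 lines per stripe.
    for layer in range(3):
        height = 1024 >> layer
        lines = 128 >> layer  # assumption: halved together with the image
        print(f"layer {layer}: {len(stripe_ranges(height, lines))} stripes")
```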
In the JBIG algorithm, two transmission modes are available in association with stripes, as shown in FIGS. 3 and 4.
In FIG. 3, header information obtained by encoding the size of an original image, the number of layers, the number of scanning lines per stripe, a data format, and the like is followed by encoded data (C.sub.0.2 to C.sub.7.2) obtained by sequentially encoding the stripes (S.sub.0.2 to S.sub.7.2) of the second-layer image. Thereafter, encoded data (C.sub.0.1 to C.sub.7.1) obtained by sequentially encoding the stripes (S.sub.0.1 to S.sub.7.1) of the first-layer image with reference to the second-layer image, and encoded data (C.sub.0.0 to C.sub.7.0) obtained by sequentially encoding the stripes (S.sub.0.0 to S.sub.7.0) of the zeroth-layer image with reference to the first-layer image follow. The format of these encoded data will be referred to as a "plane data format" hereinafter.
In FIG. 4, header information obtained by encoding the size of an original image, the number of layers, the number of scanning lines per stripe, a data format, and the like is followed by encoded data C.sub.0.2 of the stripe S.sub.0.2 of the second-layer image corresponding to the stripe S.sub.0.0, encoded data C.sub.0.1 of the stripe S.sub.0.1 of the first-layer image, which is encoded with reference to the stripe S.sub.0.2 of the second-layer image, and encoded data C.sub.0.0 of the stripe S.sub.0.0 of the zeroth-layer image, which is encoded with reference to the stripe S.sub.0.1 of the first-layer image. Then, encoded data (C.sub.1.2 to C.sub.1.0) corresponding to the stripe S.sub.1.0, . . . , and encoded data (C.sub.7.2 to C.sub.7.0) corresponding to the stripe S.sub.7.0 follow in the order of stripes. The format of these encoded data will be referred to as a "stripe data format" hereinafter.
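The difference between the plane data format and the stripe data format lies only in the order in which the same encoded data follow the header. A minimal sketch of the two orders is given below; the function names and the label strings such as "C0.2" (standing for C.sub.0.2) are chosen for illustration only.

```python
def plane_order(n_stripes, n_layers):
    """Order of encoded data in the plane data format (FIG. 3).

    All stripes of the lowest-resolution layer come first, then all
    stripes of the next layer, and so on down to the zeroth layer.
    """
    return [f"C{s}.{d}"
            for d in range(n_layers, -1, -1)
            for s in range(n_stripes)]


def stripe_order(n_stripes, n_layers):
    """Order of encoded data in the stripe data format (FIG. 4).

    For each stripe in turn, the data of every layer is sent from
    the lowest resolution to the zeroth layer.
    """
    return [f"C{s}.{d}"
            for s in range(n_stripes)
            for d in range(n_layers, -1, -1)]


if __name__ == "__main__":
    print("plane :", " ".join(plane_order(8, 2)))
    print("stripe:", " ".join(stripe_order(8, 2)))
```

Both functions produce the same set of encoded data; only the sequence differs.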
In general, the total code length increases as the number of layers generated from an original image is increased. For example, in the JBIG algorithm, the reduction algorithm satisfactorily preserves image information even at a low resolution, so that the total code length is increased. More specifically, the total code length obtained when the number of layers is 4 with respect to an original image is longer than that obtained when the number of layers is 3.
Similarly, the code length increases as the number of stripes is increased. For example, the JBIG algorithm adopts dynamic arithmetic codes, and encoding becomes more efficient as the object to be encoded becomes larger. When an image is divided into stripes, however, encoding is initialized at the beginning of each stripe, and the merit of dynamic encoding is difficult to obtain. For this reason, encoding efficiency is not improved, and the code length is increased. More specifically, the code length obtained when the number of stripes is 4 with respect to an image is longer than that obtained when the number of stripes is 2.
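The effect of initializing a dynamic code at each stripe can be illustrated by the following sketch. It is not the JBIG arithmetic coder: it assumes a single adaptive binary model without contexts, estimates each probability from the counts seen so far, and sums the ideal code length of each bit. The function name `ideal_code_length` and the test sequence are hypothetical.

```python
import math


def ideal_code_length(bits, n_stripes):
    """Ideal code length (in bits) of an adaptive model reset per stripe.

    The probability of each bit is estimated from the counts seen so
    far in the current stripe: p = (count + 1) / (total + 2).
    """
    stripe_len = math.ceil(len(bits) / n_stripes)
    total = 0.0
    for start in range(0, len(bits), stripe_len):
        counts = [0, 0]  # statistics are initialized at each stripe
        for bit in bits[start:start + stripe_len]:
            p = (counts[bit] + 1) / (counts[0] + counts[1] + 2)
            total += -math.log2(p)
            counts[bit] += 1
    return total


if __name__ == "__main__":
    # Hypothetical data with the same statistics in every stripe.
    bits = [0, 0, 0, 1] * 64
    for n in (1, 2, 4):
        print(f"{n} stripe(s): {ideal_code_length(bits, n):.1f} bits")
```

For data whose statistics are the same in every stripe, as in this sequence, each reset discards what the model has already learned, so the sum grows with the number of stripes.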
However, in the above-mentioned processing, even when neither the progressive display mode nor the sequential transmission mode is required, for example, when data are copied or moved between databases or when data are simply sent, communications are performed while the number of layers or stripes remains large, resulting in high communication cost.
In addition, upon sending of data, a transmitting device may wastefully transmit unnecessary layers to a receiving device.
Conventionally, a terminal for receiving sequentially encoded data, that is, data obtained by encoding pixel values themselves, or data obtained by converting the pixel values by MMR coding or MH coding used in a communication terminal such as a facsimile apparatus, or by DPCM or vector quantization, has an arrangement as shown in, e.g., FIG. 5.
In FIG. 5, reference numeral 101 denotes a communication line such as a telephone line, an ISDN, a computer network, or the like; 102, a communication interface for extracting data supplied through the communication line 101; 103, a storage device comprising, e.g., a semiconductor memory, a magnetic storage device, or the like; 104, a decoder for decoding sequentially encoded data to generate image data; 105, a video frame memory; 106, a display comprising a liquid crystal display, a CRT display, or the like; and 107, a recording device for recording image data on, e.g., a paper sheet.
In the conventional terminal, encoded data received through the communication line 101 and the communication interface 102 is directly stored in the storage device 103, or is immediately decoded by the decoder 104, and is then recorded on, e.g., a paper sheet by the recording device 107. Also, encoded data is temporarily stored in the storage device 103, and image data stored in the storage device is written in the video frame memory 105, as needed, thereby displaying image data on the display 106. Then, image data, which must be recorded on, e.g., a paper sheet, is selected based on the displayed data, and the selected image data is output from the recording device 107.
However, in the conventional communication terminal, since the resolution of the display 106 is too low as compared to the reading/recording density of image data, when sequentially encoded data is displayed while being decoded, the entire image cannot be displayed on the display 106, and a scroll display operation is required. As a result, it takes a long time for a user to grasp the entire image.
In order to obtain desired image data from the image data stored in the storage device 103, a user must display and search reproduced images one by one on the display 106 at a high resolution, which is troublesome and time-consuming.
Furthermore, sequentially encoded data cannot be directly subjected to hierarchical display or encoding.