1. Field of the Invention
The present invention generally relates to an image processing apparatus and an image processing method which access encoded data of a partial image by using a cache model, and in particular, to an apparatus and a method in which a layer structure showing a relationship among packets of encoded data stored in a memory such as a cache is estimated, the stored packets are referred to based on the layer structure, and a reference structure of the encoded data in the memory is formed.
2. Description of the Related Art
Generally, an image processing apparatus has been known that accesses encoded data of a partial image stored based on a layer structure by using a cache model. In addition, an image processing apparatus has been proposed which realizes all of the encoding processes stipulated in JPEG 2000 (Joint Photographic Experts Group 2000), which is a next-generation image encoding system.
Encoded data are generally stored in a cache by packets comprising encoded data. When encoded data in the cache are used, the encoded data are referred to by collecting necessary packets.
Next, a layer encoding algorithm that is the basis of JPEG 2000 is described.
FIG. 25 is a block diagram of a system which realizes the layer encoding algorithm that is the basis of JPEG 2000. The system includes a color space converting/reverse converting portion, a two-dimensional wavelet transforming/reverse transforming portion, a quantization/reverse quantization portion, an entropy encoding/decoding portion, and a tag processing portion (packet string forming portion).
One of the biggest differences between the JPEG 2000 algorithm and the JPEG algorithm is the transforming method. In JPEG, the DCT (discrete cosine transform) is used; however, in the layer encoding algorithm of JPEG 2000, the DWT (discrete wavelet transform) is used. The DWT has the advantage that better image quality can be obtained in a high compression region than with the DCT. This is the reason that JPEG 2000, which is a successor of JPEG, has adopted the DWT. As another big difference from JPEG, a functional block called the tag processing portion is newly added in JPEG 2000 in order to form codes in the final stage. In the tag processing portion, when encoding operations are executed, encoded data are formed as a code stream, and when decoding operations are executed, a necessary code stream is interpreted. The code stream enables JPEG 2000 to realize various advantageous functions.
FIG. 26 is a diagram showing that, in JPEG 2000, compression and expansion operations of a still image can be freely stopped at an arbitrary layer (decomposition level) corresponding to an octave division in the block-based DWT.
The color space converting/reverse converting portion is used as an input and output section of an original image. For example, the color space converting/reverse converting portion executes conversion or reverse conversion between an RGB color system composed of the primary color components R (red), G (green), and B (blue), or a YMC color system composed of the complementary color components Y (yellow), M (magenta), and C (cyan), and a YUV color system or a YCbCr color system.
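As an illustration of such a conversion and its reverse conversion, the following is a minimal sketch of the reversible component transform (RCT) that JPEG 2000 Part 1 defines for lossless coding. The function names are hypothetical, and the description above does not state which transform the color space converting/reverse converting portion actually applies; the RCT is used here only because it is exactly invertible on integers.

```python
def rct_forward(r, g, b):
    """Forward reversible component transform on integer samples."""
    y = (r + 2 * g + b) // 4  # floor division, also for negative values
    u = b - g
    v = r - g
    return y, u, v


def rct_inverse(y, u, v):
    """Reverse transform; recovers the original integer R, G, B exactly."""
    g = y - (u + v) // 4
    r = v + g
    b = u + g
    return r, g, b


# Round trip on one sample triple.
assert rct_inverse(*rct_forward(200, 30, 100)) == (200, 30, 100)
```

Because the forward and reverse steps use the same floor division, no rounding error accumulates, which is the property that makes this transform suitable for a lossless path.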
Next, the layer encoding algorithm (JPEG 2000 algorithm) is described in detail.
FIG. 27 is a diagram in which each component of an original image is divided into plural rectangular tiles. In FIG. 27, the RGB primary color system is used, and each component of a color image is divided into plural rectangular tiles. Each tile, for example, R00, R01, . . . , R15, G00, G01, . . . , G15, or B00, B01, . . . , B15 is a basic unit when compression or expansion operations are executed. Therefore, the compression or expansion operations are independently executed in each tile of each component.
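The tile division can be sketched as follows. This is a minimal illustration, assuming a 4 x 4 tile grid (inferred from the sixteen tiles R00 to R15 per component), a hypothetical 512 x 512 image, and a tile grid that starts at the image origin; the function name is also hypothetical.

```python
def split_into_tiles(width, height, tile_w, tile_h):
    """Return (index, x0, y0, x1, y1) for each tile, in raster order."""
    tiles = []
    index = 0
    for y0 in range(0, height, tile_h):
        for x0 in range(0, width, tile_w):
            # Tiles on the right and bottom edges are clipped to the image.
            x1 = min(x0 + tile_w, width)
            y1 = min(y0 + tile_h, height)
            tiles.append((index, x0, y0, x1, y1))
            index += 1
    return tiles


# Assumed example: a 512 x 512 image with 128 x 128 tiles gives 16 tiles.
tiles = split_into_tiles(512, 512, 128, 128)

# Each component (R, G, B) is tiled identically, giving R00..R15, G00..G15, B00..B15.
labels = [f"{c}{i:02d}" for c in "RGB" for (i, *_rest) in tiles]
```

Each entry of `labels` then names one independently compressed or expanded unit.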
At the time of encoding, data of each tile of each component are input to the color space converting/reverse converting portion (FIG. 25), and the color space conversion is applied. Then, the two-dimensional wavelet transformation (forward transformation) is applied to the color-converted tile data by the two-dimensional wavelet transforming/reverse transforming portion (FIG. 25), and the tile data are spatially divided into frequency bands.
In FIG. 26, sub bands in each decomposition level are shown for the case where the number of decomposition levels is three. That is, the two-dimensional wavelet transformation is applied to an original image tile (0LL) (decomposition level 0) obtained by tile division of an original image, and the 0LL is divided into the sub bands (1LL, 1HL, 1LH, and 1HH) shown in decomposition level 1. After this, the two-dimensional wavelet transformation is applied to the low frequency component 1LL in this layer, and the 1LL is divided into the sub bands (2LL, 2HL, 2LH, and 2HH) shown in decomposition level 2. Subsequently, the two-dimensional wavelet transformation is applied to the low frequency component 2LL, and the 2LL is divided into the sub bands (3LL, 3HL, 3LH, and 3HH) shown in decomposition level 3. In FIG. 26, the sub bands to which encoding is applied in each decomposition level are shown with oblique lines. For example, when the decomposition level is 3, the sub bands (3HL, 3LH, 3HH, 2HL, 2LH, 2HH, 1HL, 1LH, and 1HH) shown with the oblique lines are the sub bands to which encoding is applied, and the sub band 3LL is not encoded.
Next, the bits to be encoded are determined according to a designated encoding order, and the quantization/reverse quantization portion forms a context from the bits surrounding each bit to be encoded.
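The sub band bookkeeping described for FIG. 26 can be sketched as follows. This only enumerates sub band names and sizes; it does not perform a wavelet transformation. The function names are hypothetical, and the size calculation assumes tile dimensions divisible by two at every level.

```python
def subbands(levels):
    """Return (encoded, remaining_ll) sub band names after `levels` decompositions.

    Follows the description of FIG. 26: the HL, LH, and HH sub bands of every
    level are listed as encoded, and the final LL sub band is reported separately.
    """
    encoded = []
    for level in range(levels, 0, -1):
        encoded += [f"{level}HL", f"{level}LH", f"{level}HH"]
    return encoded, f"{levels}LL"


def subband_size(tile_w, tile_h, level):
    """Size of each sub band at `level`, assuming dimensions divisible by 2**level."""
    return tile_w >> level, tile_h >> level


encoded, remaining = subbands(3)
# encoded   == ['3HL', '3LH', '3HH', '2HL', '2LH', '2HH', '1HL', '1LH', '1HH']
# remaining == '3LL'
```

Each decomposition halves both dimensions, so for an assumed 128 x 128 tile the level 3 sub bands are 16 x 16.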
The quantized wavelet coefficients of each sub band are divided into non-overlapping rectangles called precincts. The precincts are adopted in order to use memory effectively when JPEG 2000 is implemented. FIG. 28 is a diagram showing precincts. As shown in FIG. 28, one precinct is formed by three rectangular regions which spatially coincide with each other. In addition, each precinct is divided into non-overlapping rectangular code blocks. Each code block is a basic unit when the entropy encoding is applied.
The entropy encoding/decoding portion (FIG. 25) encodes the tiles of each component by probability estimation from the context and the bits to be encoded. In this way, all the components of the original image are encoded tile by tile.
The minimum unit of encoded data formed at the entropy encoding/decoding portion is called a packet. The packets are arranged in a progression order, and the progression order is indicated by one of the image header segments. Packets are ordered by progression data, for example, a region, resolution, a layer, and a color component. That is, in JPEG 2000, the following five progressions are defined by changing the priority order of the four image elements of image quality (layer L), resolution (R), a component (C), and a precinct (P).
The five progressions are an LRCP progression, an RLCP progression, an RPCL progression, a PCRL progression, and a CPRL progression. In each name, the leftmost element changes most slowly and the rightmost element changes most quickly. In the LRCP progression, packets are decoded with the precinct index varying fastest, followed by the component, the resolution level, and the layer; therefore, the image quality of the entire image is improved each time the layer index advances, that is, the image quality progression can be realized. The LRCP progression is also called a layer progression. In the RLCP progression, the precinct index varies fastest, followed by the component, the layer, and the resolution level; therefore, the resolution progression can be realized. In the RPCL progression, the layer index varies fastest, followed by the component, the precinct, and the resolution level; therefore, the resolution progression can be realized similar to the RLCP progression, and in addition, the priority of a specific precinct can be made high. In the PCRL progression, the layer index varies fastest, followed by the resolution level, the component, and the precinct; therefore, priority is given to decoding a specific part, and the spatial position progression can be realized. In the CPRL progression, the layer index varies fastest, followed by the resolution level, the precinct, and the component; therefore, the component progression can be realized, in which, for example, a gray image is first reproduced when a color image is progressively decoded.
As described above, in JPEG 2000, an image is divided by region (tile and precinct), resolution, layer, and color component, and the divided elements are independently encoded as packets. The packets can be identified and extracted from a code stream without being decoded.
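The five progressions can be viewed as five nestings of the same four loops, where the leftmost letter of the name is the outermost loop. The following is a minimal sketch under a simplifying assumption: every resolution level and component has the same number of precincts, whereas in an actual code stream the precinct count can differ per resolution level and the position-based progressions iterate over spatial positions. The function name and the tuple layout are hypothetical.

```python
from itertools import product

PROGRESSIONS = ("LRCP", "RLCP", "RPCL", "PCRL", "CPRL")


def packet_order(progression, layers, resolutions, components, precincts):
    """Return packets as (layer, resolution, component, precinct) tuples
    in the sequence given by `progression`."""
    if progression not in PROGRESSIONS:
        raise ValueError(f"unknown progression: {progression}")
    sizes = {"L": layers, "R": resolutions, "C": components, "P": precincts}
    # itertools.product varies its last argument fastest, so the leftmost
    # letter of the progression name becomes the outermost loop.
    ranges = [range(sizes[key]) for key in progression]
    order = []
    for indexes in product(*ranges):
        packet = dict(zip(progression, indexes))
        order.append((packet["L"], packet["R"], packet["C"], packet["P"]))
    return order


# Two layers and two resolution levels, one component, one precinct.
lrcp = packet_order("LRCP", 2, 2, 1, 1)
rlcp = packet_order("RLCP", 2, 2, 1, 1)
```

In `lrcp`, all resolution levels of layer 0 come before any packet of layer 1 (image quality progression); in `rlcp`, all layers of resolution level 0 come first (resolution progression).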
The tag processing portion (FIG. 25, packet string forming portion) combines all of the encoded data from the entropy encoding/decoding portion into one code stream, and adds tags to the code stream. FIG. 29 is a diagram showing a structure of a code stream. As shown in FIG. 29, tag information called a main header is added at the head of the code stream, tag information called a tile-part header is added at the head of each partial tile, and encoded data (bit stream) are positioned after each tile-part header. Further, tag information called an end of code stream is added at the end of the code stream.
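The layout of FIG. 29 can be walked with a few marker checks. The following is a minimal sketch, assuming the JPEG 2000 Part 1 marker codes SOC (0xFF4F), SOT (0xFF90), SOD (0xFF93), and EOC (0xFFD9) and the 12-byte SOT marker segment layout; the function name is hypothetical, and the sketch locates tile-parts only, without interpreting any header contents or packet data.

```python
SOC, SOT, SOD, EOC = b"\xff\x4f", b"\xff\x90", b"\xff\x93", b"\xff\xd9"


def list_tile_parts(stream):
    """Return (tile_index, tile_part_index, offset, length) for each tile-part."""
    if stream[:2] != SOC:
        raise ValueError("missing SOC marker")
    pos = 2
    # Skip the main header: each marker segment is a 2-byte marker followed by
    # a 2-byte length that counts itself but not the marker.
    while stream[pos:pos + 2] != SOT:
        if pos + 4 > len(stream):
            raise ValueError("no tile-part header found")
        pos += 2 + int.from_bytes(stream[pos + 2:pos + 4], "big")
    parts = []
    while stream[pos:pos + 2] == SOT:
        tile = int.from_bytes(stream[pos + 4:pos + 6], "big")     # Isot
        length = int.from_bytes(stream[pos + 6:pos + 10], "big")  # Psot
        tile_part = stream[pos + 10]                              # TPsot
        if length == 0:
            # Psot of 0 means the tile-part runs up to the EOC marker.
            length = len(stream) - 2 - pos
        parts.append((tile, tile_part, pos, length))
        pos += length
    if stream[pos:pos + 2] != EOC:
        raise ValueError("missing EOC marker")
    return parts
```

Because each tile-part header records the length of its tile-part, the walker can jump from one tile-part header to the next without decoding the bit stream in between.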
On the other hand, at the time of decoding, image data are formed from a code stream of each tile of each component. Referring to FIG. 25 again, decoding operations are briefly described. The tag processing portion interprets tag information added to a code stream input from the outside, and divides the code stream into a code stream of each tile of each component. The decoding operations are applied to the code stream of each tile of each component.
The positions of bits to be decoded are determined by the order based on tag information in the code stream, and in the quantization/reverse quantization portion, a context is formed from the array of bits surrounding the bits to be decoded. The entropy encoding/decoding portion forms bits by probability estimation from the context and the code stream and writes the bits in the bit positions. Decoded data are divided into spaces in each frequency band. Therefore, each tile of each component of image data is restored by the two-dimensional wavelet transformation at the two-dimensional wavelet transforming/reverse transforming portion. The restored data are converted into original color system data by the color space converting/reverse converting portion.
As a method for accessing encoded data of JPEG 2000, there is JPIP (JPEG 2000 image coding system-part 9: interactivity tools, APIs and protocols). In JPIP, encoded data which a client has already received can be accessed at high speed without being transferred again from a server; in order to achieve this, the client is provided with a cache which stores a part of the encoded data.
In Japanese Laid-Open Patent Application No. 2005-12686, an image processing apparatus and an image processing method are disclosed. In this technology, in order to obtain a desired image of each tile from an apparatus which stores encoded data of an image divided into tiles and composed of plural packets, when data of a necessary number of packets are received, management information for managing each packet to be received is formed. Received packets are sequenced corresponding to a tile number to which each packet belongs, packets belonging to the same tile are sequenced again corresponding to the order in the tiles, and data of sequenced packets are added to the management information in order. Information of data disposition of each packet, in cache data including the added packet data and the management information, is registered in the management information.
In Japanese Laid-Open Patent Application No. 2004-274758, a method and an apparatus for converting a JPP stream into a JPEG 2000 code stream are disclosed. In this technology, a main header bin is written in an output JPEG 2000 code stream together with a marker segment which specifies the use of a progression order in which the layer progression is last. For each tile, a regular tile header is written, and for each component, position, and resolution in the tile, the number of completely received layers is determined. Bytes of a precinct data bin corresponding to a completed layer are written in the output JPEG 2000 code stream, and an empty packet header is written for each packet of each incomplete layer. In addition, the SOT (start of tile-part) marker segment of each tile is updated so that its length matches the number of bytes of packet data. An EOC (end of code stream) marker is written at the end of the output JPEG 2000 code stream. Here, the SOT marker segment signifies the head of the tile-part and shows an index of the tile and an index of the tile-part. The EOC marker signifies the end of the encoded data.
In Japanese Laid-Open Patent Application No. 2004-242273, an encoded data forming method and an apparatus thereof are disclosed. In this technology, a client stores first encoded data, which are a part of the encoded data managed by a server, and determines the lacking second encoded data from the first encoded data and the encoded data required for forming JPEG 2000 encoded data. The second encoded data are obtained from the server, the header information is analyzed, and the encoded data are divided into plural independent encoded data parts. Further, when not all of the independent encoded data are stored, dummy encoded data are stored for each divided unit, and the resulting encoded data are defined as JPEG 2000 encoded data.
In Japanese Laid-Open Patent Application No. 2003-169216, a method and an apparatus for forming encoded data are disclosed. In this technology, the method includes the steps of receiving fragmented encoded data to store in a memory, determining a progression order based on a displaying requirement of a user, revising header information based on the determination, determining whether the memory stores the corresponding encoded data based on the revised header information, reading the fragmented encoded data determined and stored in the memory, converting the read data into the progression order suitable to the displaying requirement, applying a ZLP (zero length packet) process to encoded data not stored in the memory, and forming encoded data in compliance with JPEG 2000.
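The ZLP substitution step can be sketched as follows. This is an illustration only, not the method of the cited application: it assumes that packets are identified by hypothetical keys, that the required packet sequence is already known, that SOP and EPH markers are not used, and that an empty packet is a single zero byte (a packet header whose first bit is 0).

```python
EMPTY_PACKET = b"\x00"  # assumed zero length packet: header bit 0, padded to a byte


def assemble_packets(order, cache):
    """Concatenate packets in `order`, substituting a ZLP for each packet
    that is not stored in `cache` (a dict mapping packet key to bytes).

    Returns the packet data and the number of substituted packets.
    """
    out = bytearray()
    substituted = 0
    for key in order:
        data = cache.get(key)
        if data is None:
            out += EMPTY_PACKET
            substituted += 1
        else:
            out += data
    return bytes(out), substituted


# Hypothetical keys (layer, resolution); the packet for (0, 1) is not cached.
body, substituted = assemble_packets(
    [(0, 0), (0, 1), (1, 0)],
    {(0, 0): b"AA", (1, 0): b"B"},
)
```

The substitution keeps the number and order of packets intact, so the result keeps the packet sequence that the header information announces.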
However, in an image forming apparatus that performs encoding operations in compliance with JPEG 2000 using hardware circuits, the following problems occur.
That is, when encoded data of a partial image are downloaded and decoded by using JPIP, the downloaded encoded data (or the encoded data loaded and stored in a memory such as a cache) seldom include all of the packets of which the complete encoded data are composed. This is because the encoded data of the partial image are fragmented partial data which include only a part of those packets.
In JPIP, there is a function in which packets of encoded data of a partial image transferred from a server to a client are stored in a cache (memory) and are reused in the client. That is, the client can efficiently reproduce the encoded data. New packets of encoded data of a partial image transferred from a server to a client are sequentially stored in a cache of the client. However, generally, the encoded data that are stored in the cache are stored by a packet unit which is a minimum unit of which the encoded data are composed. As described above, the encoded data of the partial image are a part of fragmented encoded data. Therefore, the packets stored in the cache cannot be efficiently utilized.
Further, in JPIP, which refers to encoded data of a partial image, the data accessing unit is not a structural element unit but a packet, which is the minimum unit of the encoded data. Therefore, when data are accessed by a structural element unit, the packets required to form the structural element unit are collected, and the data are decoded and reproduced only after it is confirmed that all of the necessary packets exist in the cache with no packet lacking. However, since the total structure of the packets is unknown when a lacking packet is to be detected, the packets of encoded data cannot be efficiently referred to.
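The check for lacking packets can be sketched as follows. This is a minimal illustration of the bookkeeping, not of JPIP itself: it assumes the quantities that the passage above describes as unknown (the number of layers and resolution levels and the component and precinct indexes belonging to the structural element) are already available, and it uses a hypothetical cache keyed by (layer, resolution, component, precinct).

```python
from itertools import product


def required_packets(layers, resolutions, components, precincts):
    """Keys of all packets needed for a structural element.

    `layers` and `resolutions` are counts; `components` and `precincts`
    are iterables of the indexes that belong to the element.
    """
    return set(product(range(layers), range(resolutions), components, precincts))


def missing_packets(required, cache):
    """Return the required packet keys that are not stored in `cache`."""
    return sorted(required - set(cache))


# Hypothetical cache holding two of the three packets needed.
cache = {(0, 0, 0, 0): b"\x01", (0, 1, 0, 0): b"\x02"}
needed = required_packets(1, 3, [0], [0])
lacking = missing_packets(needed, cache)
```

Here `lacking` names the single packet for resolution level 2 that is absent. Without knowledge of the layer structure, the set `needed` cannot be constructed, which is the difficulty described above.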
In addition, in JPIP, when encoded data of a partial image are loaded and decoded, since the encoded data are fragmented partial encoded data, cases where all the encoded data exist are few. Since there is a case where the encoded data are partially lacking, efficient access cannot be executed.