My invention relates to compressing and decompressing data (acoustic and still/moving images). More precisely, the invention is directed to still document image data compression utilizing adaptive vector quantization (AVQ) theory and its implementation as reflected in the Kohonen Self-Organizing Feature Map (KSOFM) neural model and Adaptive Resonance Theory (ART). My work is in the same direction as the well-known Wavelet and JPEG image compression techniques, with the major difference lying in the making of the lookup tables, i.e., codebooks. The formation of the lookup tables in my model is carried out via the training of hybrid SOFM and ART neural models, with some modification. The main motivation of my invention is the poor performance (and sometimes the inability to perform at all) of the aforementioned peer techniques in some image domains (e.g., document image compression), as we experimentally discovered. By contrast, in some document image domains, my model's compression engine performs exceptionally well, in both compression ratio and image quality, compared to Wavelet and JPEG. For some very complex documents with complicated colors, charts, writings, and backgrounds, Wavelet (represented by the state-of-the-art commercial product DjVu Solo) failed to compress/decompress the document correctly while maintaining a competitive compression ratio. Its recovered decompressed documents lost the original colors and/or the text is distorted in some places. DjVu applies a character recognition approach in compressing documents, a feature which caused this deficiency in compression performance. In the same document domain, my model is able to correctly recover “all” documents that the peer technique could not recover.
Even though another Wavelet commercial product, MrSID, was able to recover such images correctly, as pictures instead of documents, my technique performed much better in compression ratio for the same (or better) visual quality.
My model, as well as Wavelet/JPEG, utilizes the vector quantization (VQ) approach. The VQ mechanism clusters the input subunit vectors (of dimension k) by mapping all similar vectors (based on some error difference) into one representative vector called a centroid. The difference measure between vectors X and Y is the squared error ∥X − Y∥², i.e., the squared Euclidean distance. All obtained centroids are stored in a simple lookup table (codebook), which is used in the encoding/decoding phases. The manufacturing mechanism and the nature of the generated codebook are the key distinction of our work.
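As a minimal illustrative sketch of the VQ lookup described above (not the claimed implementation), the following Python fragment maps each subunit vector to its nearest centroid under the squared error measure and decodes by table lookup. The function names, the two-entry codebook, and the sample pixel values are hypothetical and chosen only for illustration.

```python
def squared_error(x, y):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def nearest_centroid(vector, codebook):
    """Return the index of the codebook entry closest to `vector`."""
    return min(range(len(codebook)),
               key=lambda i: squared_error(vector, codebook[i]))


# Hypothetical codebook of two centroids, each a 4-pixel subunit (k = 4).
codebook = [[0, 0, 0, 0], [255, 255, 255, 255]]
subunits = [[3, 1, 0, 2], [250, 255, 251, 254], [10, 0, 5, 0]]

# Encoding: each subunit is replaced by the index of its centroid.
indices = [nearest_centroid(s, codebook) for s in subunits]

# Decoding: each index is replaced by the centroid it points to.
decoded = [codebook[i] for i in indices]
```

Only the indices and the codebook need to be stored, which is where the compression comes from; the decoded vectors are approximations of the originals.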
Based on the VQ approach, we introduce a new hybrid neural model (DC) that combines some features of the well-known neural models KSOFM and ART. The DC model follows the main theme of both models and adds its own modifications to the exact details of their learning mechanisms. The Kohonen SOFM is an unsupervised neural model implementation of the AVQ theory. It uses a set of neurons, each of which is connected to the input training vector via a set of weighted connections called a neuron weight vector (NWV). The NWVs are initialized to some small random values (or some pre-known values). Then, with each introduction of a training vector X, the NWV closest to X (based on their Euclidean distance, ED) is declared the winner and brought closer to X (and, for that matter, to all vectors similar to X). Not only the winning NWV is updated, but also all of its neighboring vectors, by decreasing amounts. The training of the SOFM engine is unsupervised and theoretically terminates at centroid convergence time, i.e., when the movement of all centroids is virtually zero for two consecutive training epochs. The above stopping condition results in a training time complexity of Θ(N²), where N is the input/centroid vector length.
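The SOFM update described above can be sketched, under stated assumptions, as a single training step in Python. The one-dimensional chain of neurons, the learning rate, the neighborhood radius, and the rule that halves the update strength for immediate neighbors are all illustrative choices that are not specified in this description.

```python
def sofm_step(weights, x, rate=0.5, radius=1):
    """One SOFM training step on a 1-D chain of neurons (updates in place).

    `weights` is a list of neuron weight vectors (NWVs); `x` is the
    training vector. Returns the index of the winning neuron.
    """
    # Squared Euclidean distance from every NWV to x, taken before any update.
    dist = [sum((w - v) ** 2 for w, v in zip(nwv, x)) for nwv in weights]
    winner = dist.index(min(dist))

    for j, nwv in enumerate(weights):
        gap = abs(j - winner)
        if gap <= radius:
            # Assumed schedule: update strength decreases away from the winner.
            strength = rate / (1 + gap)
            weights[j] = [w + strength * (v - w) for w, v in zip(nwv, x)]
    return winner


weights = [[0.0, 0.0], [10.0, 10.0], [20.0, 20.0]]
winner = sofm_step(weights, [9.0, 9.0])
```

In this example the middle neuron wins and moves halfway toward the input, while its two neighbors move a quarter of the way. A full training run would repeat such steps over all training vectors until the weights stop moving.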
The other clustering neural model is ART. Even though SOFM and ART share the same goal (AVQ), their training is totally different. In ART, upon receiving an input subunit, a quick scan of the existing set of centroids results in the formation of a set of possible winning centroids (PWC) for the input subunit. The input subunit is then matched more accurately (pixel by pixel) against each member of the PWC set, and the closest match is announced the winning centroid. A failure to identify the subunit's membership in some class (under some matching constraints) forces the addition of a new centroid with the subunit's value, as the only member of a newly formed class.
In our unsupervised clustering DC model, the centroid codebook table is initially empty. As the training progresses, we apply the ART philosophy: “if there is no true winner centroid for a subunit, add the subunit as a new centroid in the codebook”. The decision on a true winner centroid for an input subunit is made in two phases. We initially construct a PWC set (described above) of centroids whose number of pixel differences with the input subunit is less than some tight threshold, T1. Then, comparing the input subunit with each member of the PWC set, we add up the pixels' error differences (pixel by pixel) to obtain a mean error value, ME. The centroid with the minimal ME is announced the true winner of the input subunit, i.e., its representative, provided that the computed ME does not exceed some threshold, T2. Only the true winner centroid's value is updated, using the input subunit. Otherwise, if there is no true winner for the input subunit (based on T1 and T2 above), the input subunit is added to the codebook as its own new centroid. Thus, in deciding the true winner, we follow the spirit of ART. But in the process of updating the winning centroid, we follow the SOFM “winner-takes-all” mechanism. Only the winning centroid is moved closer to the input subunit, by updating its value to be the average of all of its class members together with the newly added subunit. Thus, we can say that our DC model is a mixture of simplified SOFM and ART models, gaining the training speed/simplicity of SOFM and the elasticity/accuracy of ART.
We present a direct classification (DC) model for pictorial document or acoustic data compression which is based on adaptive vector quantization (AVQ) theory. We deploy a quantization mechanism that utilizes some concepts of the SOFM and ART neural net models. Starting with an empty set of synaptic weight vectors (codebook), a very simplified ART approach is adopted by the DC model to build up the codebook table. Then, the training and the updating of the centroids in the codebook are based on some of the KSOFM rules. Simply, the DC model divides the input data (e.g., a document image) into N same-size subunits (each of n² gray/colored pixels). It then divides the N subunits into L equal-size regions, each of s subunits (s = N/L), grouping all subunits with the same index i_k (0 ≤ k ≤ s − 1) in all regions into a “runi(k)” of size L subunits. Notice that the selection of subunit i_k from a given region, when constructing a run, may be in any order, yet must be consistent over all regions for each run. The concatenation of run1 through runsL−1 forms the total input domain of subunits. For some input domains, e.g., pictorial images, the construction of the regions and the selection of a subunit per region is done for one subunit selection only (L subunits in total), to initialize the L centroids in the codebook. Then, after codebook initialization, the input subunit domain is just the remaining N − L subunits, scanned sequentially with no regional divisions. Then, the DC model clusters the input domain into classes of similar subunits (vectors of close orientation); each class has a representative vector, called a centroid. The mechanism incrementally constructs a codebook of centroids to be placed in the synaptic weight vectors of the DC neurons. Each centroid will have a limited-size (size = TrainSetMax) subset of all its classified subunits (SCS), which will be the only subunits contributing to its training.
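The division into subunits, regions, and runs described above can be sketched as follows. This is a sketch under assumptions, not the claimed procedure: it assumes a two-dimensional image whose sides are multiples of n, row-major ordering of subunits, regions made of consecutive subunits, and hypothetical function names.

```python
def split_into_subunits(image, n):
    """Cut a 2-D list of pixels into flattened n-by-n subunits, row-major."""
    height, width = len(image), len(image[0])
    return [
        [image[r + dr][c + dc] for dr in range(n) for dc in range(n)]
        for r in range(0, height, n)
        for c in range(0, width, n)
    ]


def build_runs(subunits, num_regions):
    """Group subunits into runs: run k holds the k-th subunit of every region.

    Assumes regions are blocks of s = N / L consecutive subunits.
    """
    s = len(subunits) // num_regions
    regions = [subunits[i * s:(i + 1) * s] for i in range(num_regions)]
    return [[region[k] for region in regions] for k in range(s)]


# A 4x4 image with pixel values 0..15, cut into N = 4 subunits of 2x2 pixels.
image = [[4 * r + c for c in range(4)] for r in range(4)]
subunits = split_into_subunits(image, 2)

# L = 2 regions of s = 2 subunits each, giving 2 runs of L subunits.
runs = build_runs(subunits, 2)
```

Feeding the runs one after another interleaves subunits from all regions, so the early part of training sees samples from across the whole input rather than from one corner of it.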
The reason we restrict the centroid updating to only a subset of the class member subunits is to enhance quality at decompression. We consider the first TrainSetMax subunits to be the closest to the formed centroid, by the locality principle. Any other incoming subunits beyond TrainSetMax might falsely move the centroid value away from the already used subunits, resulting in a centroid that poorly represents its training subunits. Each centroid represents the center of mass (average) of its contributing member subunits. The centroid index (in the hosting codebook) is used to replace any member subunit in the final compressed image file of indices. At the decompression phase, the indices in the compressed file are used to obtain the corresponding centroids from the codebook and store them, in order, in the decompressed file.
The goal of the first DC processing phase (training/compression) is to build up the centroid codebook while obtaining the winning centroid (WC) for every input subunit and storing its index in the indices file. The process of obtaining the WC has two steps. First, the set of possible winning centroids (PWC) is established, based on a very strict similarity measure (pixel by pixel, within a very tight threshold). The DC model starts by sequencing through the divisions of the original image domain of subunits. Every input subunit S is checked against all existing nonempty codebook table entries, one by one, forming the PWC set of centroids with the closest pixel distance (PD), steps 12 through 18. The PD calculation is based on a pixel-by-pixel comparison between S and every centroid in the codebook, counting the number of different pixels (ND), where two pixels are considered different if they differ by more than some threshold value (T1). All centroids with ND less than some threshold count (T2) are grouped in a PWC set for S, an idea which is borrowed and modified from the ART approach. An empty PWC indicates that there is no centroid similar to S (based on T1 and T2), and we prepare to insert S in the codebook as a new centroid. In order to insert S in the codebook, we check the codebook for free entries. If there is a free entry, we insert S in it as a new centroid and add S to the new centroid's SCS, i.e., the class membership, as the first member. Then we use its new table entry as the compression index of S (CI). If the codebook is full, then S is associated with the nearest centroid in the codebook (based on the ME error measurement), i.e., we consider all codebook entries as the PWC for S. Second, we calculate the ME between S and every centroid member of the obtained PWC. The centroid with the minimal ME reading is the winner, i.e., S's WC. For the obtained WC, we add S to its SCS; and if the size of the SCS does not exceed some limiting threshold, S is allowed to update the WC.
The mechanism of updating only the winning centroid is based on the KSOFM “winner-takes-all” model. We update the WC by re-computing all of its pixel values as the average of all member subunits' pixels, including its newly added member S, if possible. Finally, we store the WC's codebook index as S's CI in the compressed file of indices. We repeat the process for every input subunit S in the original file. At the end of the training/compression phase, we add the obtained set of synaptic vectors (the codebook) at the end of the compressed file of CIs (the indices file). For compression ratio efficiency, we further compress the obtained compressed image file using the lossless LZW technique. Notice that the size of an index is an order of magnitude shorter than the size of the original input subunit vector, thus yielding the effect of image data compression.
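The training/compression phase described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: ME is taken to be the mean absolute pixel difference, a pixel counts as different when its absolute difference exceeds T1, only the contributing (first TrainSetMax) members of each class are kept, the codebook is assumed to hold at least one entry, and the final LZW stage is omitted. All function and parameter names are hypothetical.

```python
def mean_error(s, centroid):
    """Assumed ME: mean absolute pixel difference."""
    return sum(abs(a - b) for a, b in zip(s, centroid)) / len(s)


def dc_compress(subunits, max_entries, t1, t2, train_set_max):
    """Build the codebook and the index list in one pass over the subunits."""
    codebook = []   # centroid vectors
    members = []    # per-centroid contributing subunits (at most train_set_max)
    indices = []    # one compression index (CI) per input subunit
    for s in subunits:
        # Step 1: PWC = centroids with fewer than t2 pixels differing by more than t1.
        pwc = [i for i, c in enumerate(codebook)
               if sum(abs(a - b) > t1 for a, b in zip(s, c)) < t2]
        if not pwc and len(codebook) < max_entries:
            # No similar centroid and a free entry exists: S becomes a new centroid.
            codebook.append([float(p) for p in s])
            members.append([s])
            indices.append(len(codebook) - 1)
            continue
        if not pwc:
            # Codebook is full: every entry is a possible winner.
            pwc = list(range(len(codebook)))
        # Step 2: the PWC member with the minimal ME is the winning centroid.
        wc = min(pwc, key=lambda i: mean_error(s, codebook[i]))
        if len(members[wc]) < train_set_max:
            # Winner-takes-all: only the WC is re-averaged over its members.
            members[wc].append(s)
            codebook[wc] = [sum(col) / len(members[wc])
                            for col in zip(*members[wc])]
        indices.append(wc)
    return indices, codebook


subunits = [[10, 10, 10, 10], [12, 10, 10, 10],
            [200, 200, 200, 200], [100, 100, 100, 100]]
indices, codebook = dc_compress(subunits, max_entries=2, t1=5, t2=1,
                                train_set_max=4)
```

In this example the codebook is limited to two entries, so the fourth subunit, which resembles neither centroid, is assigned to the nearest one and pulls that centroid's average toward itself. A larger codebook or a smaller TrainSetMax would limit that kind of drift.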
The second phase of the process is the decompression step. The indices file is first LZW-decompressed. Each index in the indices file, in sequence, is then used to obtain a centroid from the codebook, which is placed in the decompressed file. The decompressed file is an approximation of the original image file, within a certain controlled quality (based on the user's choice of the codebook limiting size and of the thresholds T1 and T2 used above).
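The decompression step can be sketched as a table lookup followed by reassembly of the subunits into an image. The sketch assumes that the indices have already been LZW-decompressed, that subunits are n-by-n pixel blocks stored in row-major order, and that the image width is a multiple of n; the function name and sample data are hypothetical.

```python
def dc_decompress(indices, codebook, width, n):
    """Rebuild a 2-D image from centroid indices (n-by-n subunits, row-major)."""
    per_row = width // n                      # subunits per image row
    tile_rows = len(indices) // per_row
    image = [[0] * width for _ in range(tile_rows * n)]
    for pos, idx in enumerate(indices):
        top = (pos // per_row) * n            # top-left pixel of this subunit
        left = (pos % per_row) * n
        for k, pixel in enumerate(codebook[idx]):
            image[top + k // n][left + k % n] = pixel
    return image


# Hypothetical codebook of two flat 2x2 centroids and four stored indices.
codebook = [[0, 0, 0, 0], [9, 9, 9, 9]]
restored = dc_decompress([0, 1, 1, 0], codebook, width=4, n=2)
```

Every subunit in the restored image is an exact copy of a codebook centroid, which is why the result approximates the original rather than reproducing it.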