1. Field of the Invention
The present invention relates to a method for compressing and decompressing integer-vector data, by which a large volume of numerically expressed information can be compressed and decompressed.
2. Related Art
To perform high-speed processing with a computer, a large volume of information is accumulated as a so-called database in a memory area. The types of information to be stored vary, and it is sometimes convenient to represent the various types of information as integer vectors.
As one example, to accumulate the results of questionnaires concerning leisure experiences, appropriate integers are assigned to N items, such as respondents, trips and sports, impressions of plays, the places (country names) and the times at which the plays were watched, etc., and these integers are combined to represent the information as N-dimensional integer vectors.
As another example, when index information is to be provided at the end of an electronic book, N items, such as a header, the page number of a page on which the header appears, a line number, etc., may be stored. For this purpose, the index entries can be conveniently represented by N-dimensional integer vectors by assigning an appropriate integer to each header.
All the relationships among multiple items can be described very naturally by integer vectors, as long as the individual items can be identified by the integer values that are assigned to them.
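The assignment of integers to items described above can be sketched as follows. This is only an illustrative sketch: the function name `encode_records`, the per-item codebooks, the choice to number values from 1 in order of first appearance, and the sample records are all assumptions made for the example, not details taken from the text.

```python
def encode_records(records):
    """Turn records of N item values into N-dimensional integer vectors.

    Hypothetical helper: each item position gets its own codebook, and a
    value receives the next unused integer (starting at 1) the first time
    it is seen.
    """
    n = len(records[0])
    codebooks = [{} for _ in range(n)]
    vectors = []
    for record in records:
        vector = []
        for i, value in enumerate(record):
            book = codebooks[i]
            if value not in book:
                book[value] = len(book) + 1
            vector.append(book[value])
        vectors.append(tuple(vector))
    return vectors, codebooks


# Made-up questionnaire rows: (respondent, activity, country).
records = [
    ("alice", "play", "France"),
    ("bob", "play", "Japan"),
    ("alice", "sport", "Japan"),
]
vectors, codebooks = encode_records(records)
print(vectors)
```

Each record becomes a three-dimensional integer vector, and the codebooks allow the original values to be recovered from the integers.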
Well known methods for storing N-dimensional integer vectors in the memory area of the computer are as follows.
First, a method, which shall be called background 1, will be explained, by which integer vectors are sorted in ascending order, with priority given to lower-numbered components, and all the components of all the vectors are then stored in sequence. The example below shows a set T of three-dimensional integer vectors that have been arranged in ascending order:
T = {(1, 1, 3), (1, 1, 5), (1, 1, 6), (1, 2, 1), (1, 2, 2), (1, 3, 2), (1, 3, 3), (2, 2, 1), (2, 2, 2)}.
defining transformation function G that satisfies these conditions: (1) all i-th component values (i is a component number selected from a range wherein i < N) of vectors that belong to the set T are included in a domain of the transformation function, (2) the transformation function provides a many-to-one mapping, and (3) the transformation function is an increasing function;
selecting component number j from a range wherein j > i, while a k-th smallest value of i-th component values, excluding overlapping values, is defined as [k];
defining transformation function A that satisfies a condition such that when G([k+1]) = G([k]),
$$A([k+1]) + \min(v_j \mid v \in T,\ v_i = [k+1]) > A([k]) + \max(v_j \mid v \in T,\ v_i = [k]);$$
performing a transformation by the transformation function G so as to converge a distribution of the i-th component values of vectors that belong to the set T; and
adding, to j-th component values of the vectors that belong to the set T, values that are acquired by applying the corresponding transformation function A to the i-th components.
combining an i-th component $v_i$, transformation function $G(v_i)$, and function $A(v_i)$ for each vector of the set T during compression of integer-vector data in a form of a three-dimensional vector $(G(v_i), v_i, A(v_i))$;
forming a three-dimensional vector set U from which overlapping is removed; and
reproducing an N-dimensional integer vector set T by performing a reverse transformation of an N-dimensional integer vector set T' with the three-dimensional vector set U.
With the background 1 method, the set T is stored in the memory area of the computer as follows:
1, 1, 3, 1, 1, 5, 1, 1, 6, 1, 2, 1, 1, 2, 2, 1, 3, 2, 1, 3, 3, 2, 2, 1, 2, 2, 2.
In this example, a total of 27 integers are stored.
Although the background 1 method has the advantage of being easy to handle, (number of dimensions N × number of vectors) integers must be stored in the memory area. When a large volume of data is to be employed, the number of vectors to be processed increases and the data occupies a large memory area. This method, therefore, presents a processing difficulty.
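The storage cost of the background 1 method can be checked with a short sketch. The function names `flatten_sorted` and `unflatten` are hypothetical and are introduced only for this illustration; the set T is the one given in the text.

```python
T = [(1, 1, 3), (1, 1, 5), (1, 1, 6), (1, 2, 1), (1, 2, 2),
     (1, 3, 2), (1, 3, 3), (2, 2, 1), (2, 2, 2)]


def flatten_sorted(vectors):
    """Sort vectors in ascending order and lay out all components in sequence."""
    # Python compares tuples component by component, so sorted() gives
    # priority to the lower-numbered components.
    ordered = sorted(vectors)
    return [component for vector in ordered for component in vector]


def unflatten(flat, n):
    """Recover the n-dimensional vectors from the flat sequence."""
    return [tuple(flat[i:i + n]) for i in range(0, len(flat), n)]


flat = flatten_sorted(T)
print(len(flat))  # N x vector count = 3 x 9 = 27
```

Recovering the vectors only requires slicing the sequence every N integers, which is why this layout is easy to handle even though its size grows in direct proportion to the number of vectors.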
A method, which will be called background 2, will now be explained, by which integer vectors are sorted in ascending order, with priority given to lower-numbered components, and all the vectors are then arranged in a tree structure. As an example, the set T of three-dimensional integer vectors introduced above is shown again below:
T = {(1, 1, 3), (1, 1, 5), (1, 1, 6), (1, 2, 1), (1, 2, 2), (1, 3, 2), (1, 3, 3), (2, 2, 1), (2, 2, 2)}.
This set T is stored in the memory area of the computer in the following tree structure.
______________________________________
first component:     1 X                         2 X
                      ↓                           ↓
second component:    1 Y       2 Y       3 Y     2 Y
                      ↓         ↓         ↓       ↓
third component:     3 5 6     1 2       2 3     1 2
______________________________________
wherein X is information that indicates the beginning of the corresponding second component information, and Y is information that indicates the beginning of the corresponding third component information. The total number of stored items, counting the components together with the X and Y information items, is 21, so the volume of data to be stored is smaller than the 27 integers required by the background 1 method.
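The count of 21 items can be reproduced with the following sketch. It assumes a nested-dictionary representation of the tree and counts one X item per distinct first component and one Y item per distinct (first, second) pair; the names `build_tree`, `count_stored_items`, and `iter_vectors` are hypothetical, and an actual memory layout would use offsets or pointers rather than Python dictionaries.

```python
T = [(1, 1, 3), (1, 1, 5), (1, 1, 6), (1, 2, 1), (1, 2, 2),
     (1, 3, 2), (1, 3, 3), (2, 2, 1), (2, 2, 2)]


def build_tree(vectors):
    """Group sorted 3-D vectors as first -> second -> list of third components."""
    tree = {}
    for first, second, third in sorted(vectors):
        tree.setdefault(first, {}).setdefault(second, []).append(third)
    return tree


def count_stored_items(tree):
    """Count component values plus the X and Y items shown in the tree."""
    total = 0
    for second_level in tree.values():
        total += 2                  # first component value and its X item
        for thirds in second_level.values():
            total += 2              # second component value and its Y item
            total += len(thirds)    # third component values
    return total


def iter_vectors(tree):
    """Walk the tree and yield the original vectors in ascending order."""
    for first, second_level in tree.items():
        for second, thirds in second_level.items():
            for third in thirds:
                yield (first, second, third)


tree = build_tree(T)
print(count_stored_items(tree))  # 4 + 8 + 9 = 21
```

The saving comes from storing each shared prefix only once, so it shrinks when few vectors share their leading components.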
3. Objective
In practice, however, when the amount of data to be stored becomes large, because either the number N of components of the N-dimensional integer vectors or the number of vectors in a set increases, a large memory area must be acquired even with the above described background 2 method.
Therefore, a more efficient data compression method than the background methods is required. In addition, a fast, efficient decompression method for a large volume of compressed data is also required. When the results of questionnaires concerning leisure experiences are accumulated as mentioned above, only the accumulated information that satisfies specific conditions tends to be referred to; for example, only information concerning places (country names) may be referred to. It is therefore preferable that only the compressed data that is necessary be decompressed, rather than all of the compressed data, so that the required data can be obtained within a short time.
It is therefore one objective of the present invention to provide a method for compressing and decompressing integer vector data, by which N-dimensional integer vectors are compressed efficiently, and to ensure that only those vectors that satisfy specific conditions will be quickly decompressed and extracted.