1. Technical Field
The present invention relates to the field of data handling and, more particularly, to a method and a device for handling data comprising a number of data objects in order to compress this data. The invention further relates to a computer program product with a computer-readable medium and a computer program stored on the computer-readable medium with program coding means which are suitable for carrying out such a method when the computer program is run on a computer. Furthermore, the invention relates to a method for setting up a repository.
2. Description of the Related Art
In database systems, particularly in regularly updated database systems, the use of data compression methods to save storage capacity is well known. As the information entropy of the stored data increases, compression efficiency decreases over time. Accordingly, the compression process requires increasing processing resources and, furthermore, yields low compression ratios. Hence, if the entropy of the data to be compressed is very high, the size reduction will be very low and the compression will not be effective. In this case, the compression merely consumes additional processing power and processing time. The compression factor, i.e. the ratio of the size of the uncompressed data to the size of the compressed data, will then fall below an acceptable value. At this point, users often decide to switch off the compression.
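The relationship between entropy and compression factor described above can be illustrated with a minimal sketch. It assumes Python's standard zlib module as a stand-in compressor; the function name compression_factor and the two sample inputs are hypothetical and chosen only for illustration.

```python
import os
import zlib


def compression_factor(data: bytes) -> float:
    """Return the size of the uncompressed data divided by the compressed size."""
    if not data:
        raise ValueError("data must be non-empty")
    return len(data) / len(zlib.compress(data))


# Hypothetical inputs: one highly repetitive, one effectively random.
low_entropy = b"A" * 100_000
high_entropy = os.urandom(100_000)

factor_low = compression_factor(low_entropy)    # much greater than 1
factor_high = compression_factor(high_entropy)  # close to 1, typically slightly below

print(f"low-entropy factor:  {factor_low:.1f}")
print(f"high-entropy factor: {factor_high:.3f}")
```

For the random input the factor stays near 1, so the processing time spent on compression yields no meaningful saving in storage.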
Data to be compressed often consist of many individual data objects having different compression rates. Although some of the data objects have a good compression rate, the overall compression rate of the whole data stream may be unacceptable. The problem is that, in order to compress only those data objects that compress well, the system conducting the compression would need to have knowledge of the data objects before the compression. However, analyzing the data objects before compression requires additional processing power and time and is therefore not a valid option.
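The effect of mixing well-compressible and poorly compressible data objects can be sketched as follows. This is an illustration under stated assumptions, not a method from the cited art: zlib serves as the compressor, each object is compressed independently, and the object names and contents are hypothetical.

```python
import os
import zlib


def object_factor(data: bytes) -> float:
    """Uncompressed size divided by compressed size for a single data object."""
    return len(data) / len(zlib.compress(data))


# Hypothetical data objects with very different compressibility.
objects = {
    "log_text": b"2024-01-01 INFO request served\n" * 2000,
    "random_blob": os.urandom(200_000),
}

per_object = {name: object_factor(data) for name, data in objects.items()}

# Overall factor of the whole stream, with every object compressed on its own.
total_raw = sum(len(data) for data in objects.values())
total_compressed = sum(len(zlib.compress(data)) for data in objects.values())
overall = total_raw / total_compressed

for name, factor in per_object.items():
    print(f"{name}: {factor:.2f}")
print(f"overall: {overall:.2f}")
```

Although the text object alone compresses very well, the large incompressible object dominates the total size, so the overall factor remains low.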
A file compression processor that monitors the currently available capacity of a file unit has been proposed in U.S. Pat. No. 5,675,789. The disclosed method is based on a threshold-driven compression of files that depends on the capacity of the file unit. U.S. Pat. No. 4,847,619 discloses an adaptive data compression system which is reset when performance drops below a predetermined threshold.