1. Field of Invention
The invention relates to information technology, and more particularly to techniques for compressing databases containing large amounts of information.
2. Description of Related Art
In relational database technology, one of the most important factors affecting storage requirements is the amount of indexing required, that is, the pointers maintained into related data fields. Indexing can be expensive in terms of storage in database tables with many rows per index entry. Unless the index information is implicit in the physical position of the data, as in some implementations of clustered indexes or hash indexes, creating an index can greatly expand the total storage requirement. The implementation used in such commercially available systems as SQLServer, Informix, and Sybase in fact causes the index information to be repeated for every row of information. With long index fields, this repetition becomes costly.
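The storage cost described above can be illustrated with a small calculation. The following is a minimal sketch, not a model of any named product: the 40-byte key width, the 4-byte row count, and the function names are all assumptions chosen for illustration. It compares storing the full index key with every row against storing each run of identical keys once.

```python
from itertools import groupby

def per_row_index_bytes(keys):
    """Bytes used when the full key is stored once for every row."""
    return sum(len(k) for k in keys)

def grouped_index_bytes(keys, count_bytes=4):
    """Bytes used when each run of identical adjacent keys is stored once,
    followed by a fixed-width row count (count_bytes is an assumed width)."""
    return sum(len(k) + count_bytes for k, _ in groupby(keys))

# Hypothetical index: 3 distinct 40-byte keys, 1000 rows each, sorted by key.
keys = [("customer-%02d" % i).ljust(40).encode()
        for i in range(3) for _ in range(1000)]

print(per_row_index_bytes(keys))   # 3000 rows * 40 bytes = 120000
print(grouped_index_bytes(keys))   # 3 runs * (40 + 4) bytes = 132
```

The gap between the two figures widens with the key length and with the number of rows per index entry, which is the situation the paragraph above describes.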
Since repeated byte strings are easily represented in a compressed format, such situations, especially where very large databases (VLDBs, typically on the order of terabytes) are concerned, are prime candidates for compression technology. It is desirable, though, to find some intermediate ground between known large-block, high-compression algorithms, which degrade access time for small data segments and impede indexing capability, and the usual VLDB situation, in which no compression is employed.
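The trade-off between block size and access cost can be sketched as follows. This is an illustrative example only, using the general-purpose zlib compressor as a stand-in rather than the technique of the invention; the row layout, the block size, and all function names are assumptions. Compressing the whole table as one block generally gives the smallest output but requires decompressing everything to read one row, whereas small independent blocks allow a single row to be read by decompressing only the block that contains it.

```python
import zlib

ROW_SIZE = 64          # assumed fixed row width in bytes
ROWS_PER_BLOCK = 16    # assumed small-block size, in rows

def make_rows(n):
    # Hypothetical rows: a long repeated index key plus a row-specific value.
    key = b"REGION-NORTHWEST-ACCOUNTS".ljust(56)
    return [key + b"%08d" % i for i in range(n)]

def compress_whole(rows):
    # One large block: best ratio, but no access to a row without
    # decompressing the entire table.
    return zlib.compress(b"".join(rows))

def compress_blocks(rows, rows_per_block=ROWS_PER_BLOCK):
    # Many small blocks, each compressed independently.
    return [zlib.compress(b"".join(rows[i:i + rows_per_block]))
            for i in range(0, len(rows), rows_per_block)]

def read_row(blocks, row_number, rows_per_block=ROWS_PER_BLOCK):
    # Decompress only the block holding the requested row.
    block = zlib.decompress(blocks[row_number // rows_per_block])
    offset = (row_number % rows_per_block) * ROW_SIZE
    return block[offset:offset + ROW_SIZE]

rows = make_rows(1024)
blocks = compress_blocks(rows)
print(len(b"".join(rows)), len(compress_whole(rows)), sum(map(len, blocks)))
```

In this sketch the position of a row is still computable from its row number, so indexing by position survives the small-block scheme, at the price of a poorer compression ratio than the single large block.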