1. Field of the Invention
The present invention generally relates to data compression and more particularly to a method and system for data compression with dictionary pre-load of a set of character strings that can be expected to appear only once or a few times in an input data stream.
2. Discussion of the Background
In recent years, various compression algorithms have been developed. For example, the DEFLATE compression algorithm is used in the IP Payload Compression Protocol (IPComp) application. The DEFLATE algorithm improves upon the Lempel-Ziv 1977 (LZ77) compression algorithm by providing a second compression step that takes the compressed output of the LZ77 algorithm and further compresses it using either fixed or dynamic Huffman coding.
Similarly, the Lempel-Ziv-Jeff-Heath (LZJH) data compression algorithm has been developed (e.g., as further described in commonly owned U.S. Pat. Nos. 5,955,976; 5,973,630 and 6,292,115 to Heath, incorporated by reference herein) and includes improvements in the data compression via minimum redundancy coding, such as fixed Huffman coding, dynamic Huffman coding, etc. (e.g., as further described in commonly owned U.S. patent application Ser. No. 10/054,219 of Heath (Dock. Nos. 10792-1052/PD-201167), entitled "METHOD, SYSTEM AND COMPUTER PROGRAM PRODUCT FOR LZJH DATA COMPRESSION WITH MINIMUM REDUNDANCY CODING," filed on Nov. 9, 2001, incorporated by reference herein).
However, although the DEFLATE and the LZJH algorithms, being adaptive, represent a redundant character string by a compressed code after encountering the character string within an input data stream at least twice, such algorithms do not take advantage of character strings that can be expected to appear only once or a few times per input data stream.
Therefore, there is a need for a method and system for improving data compression with respect to character strings that can be expected to appear only once or a few times per input data stream.
The above and other needs are addressed by the present invention, which provides an improved method and system for data compression with dictionary pre-load of a set of character strings that can be expected to appear only once or a few times per input data stream. Advantageously, data compression can be improved by pre-loading encoder and decoder compression dictionaries with a set of expected character strings that can, depending upon a specific application, be expected to appear in data to be compressed.
Accordingly, in one aspect of the present invention, there is provided an improved method, apparatus and computer program product for encoding data transmitted over a communications channel, including pre-loading an encoder dictionary with a set of character strings expected to appear in input data to be encoded; and encoding the input data with the set of expected character strings pre-loaded in the encoder dictionary.
In another aspect of the present invention, there is provided an improved method, apparatus and computer program product for decoding encoded data received over a communications channel, including pre-loading a decoder dictionary with a set of character strings expected to appear in the encoded data; and decoding the encoded data with the set of expected character strings pre-loaded in the decoder dictionary.
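By way of illustration only, the pre-loading of matching encoder and decoder dictionaries can be sketched with the preset-dictionary feature of DEFLATE as exposed by Python's standard zlib module. This is an analogous mechanism, not the LZJH-based implementation of the present invention. The header strings, the sample message, and the names EXPECTED_STRINGS, PRELOAD, encode and decode are hypothetical and chosen only for the example.

```python
import zlib

# Hypothetical set of strings expected to appear about once per message
# (an assumption for illustration; a real set depends on the application).
EXPECTED_STRINGS = [
    b"Content-Type: text/html; charset=utf-8\r\n",
    b"Accept-Encoding: gzip, deflate\r\n",
    b"Connection: keep-alive\r\n",
]
# The same pre-load bytes must be known to both encoder and decoder.
PRELOAD = b"".join(EXPECTED_STRINGS)


def encode(data: bytes, preload: bytes = b"") -> bytes:
    """Compress data, optionally with a pre-loaded (preset) dictionary."""
    enc = zlib.compressobj(zdict=preload) if preload else zlib.compressobj()
    return enc.compress(data) + enc.flush()


def decode(blob: bytes, preload: bytes = b"") -> bytes:
    """Decompress data using the same pre-loaded dictionary as the encoder."""
    dec = zlib.decompressobj(zdict=preload) if preload else zlib.decompressobj()
    return dec.decompress(blob) + dec.flush()


# Sample message in which each expected string appears exactly once.
message = b"GET / HTTP/1.1\r\n" + PRELOAD

plain = encode(message)                # adaptive only: no earlier occurrence to reference
preloaded = encode(message, PRELOAD)   # expected strings matched on first occurrence

print(len(message), len(plain), len(preloaded))
```

In this sketch the pre-loaded encoding is shorter than the purely adaptive one because each expected string can be referenced on its first occurrence in the message. The decoder recovers the original data only if it is pre-loaded with the same dictionary as the encoder.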
Still other aspects, features, and advantages of the present invention are readily apparent from the following detailed description, simply by illustrating a number of particular embodiments and implementations, including the best mode contemplated for carrying out the present invention. The present invention is also capable of other and different embodiments, and its several details can be modified in various respects, all without departing from the spirit and scope of the present invention. Accordingly, the drawing and description are to be regarded as illustrative in nature, and not as restrictive.