1. Technical Field
The present invention relates to a method and system for compressing data in general and in particular to a method and system for compressing executable code. Still more particularly, the present invention relates to a method and system for compressing executable code in the context of Reduced Instruction Set Computer (RISC) architectures.
2. Description of the Prior Art
Reduced Instruction Set Computer (RISC) architectures simplify processor and compiler design by giving all instructions the same size and a few simple formats. The price of these advantages is the large size of executable program code written using these instruction sets. The large code size reduces instruction cache effectiveness and makes less efficient use of memory resources. It also increases program-loading time when code is shipped over a network or retrieved from a slow mechanical device such as a disk.
Currently, network computers, embedded controllers, set-top boxes, hand-held devices and the like receive executables over a network or possibly through slow phone links or communication channels. Additionally, these devices may have very limited memory capacity, so that large programs may not fit in the memory available on the device. Therefore, for devices using RISC processors to be competitive in the marketplace, they may require highly efficient code compression that mitigates the disadvantage of large executable sizes.
Executable code written for RISC processors has traditionally been difficult to compress. Therefore there is a need for compressing instructions in a reduced instruction set computer (RISC) architecture such as the PowerPC family owned by International Business Machines. Traditional compressors in the prior art treat the instructions in a program as a stream of bits, and try to find patterns within this stream to help construct a more compact representation of the program (e.g. Ziv-Lempel compression, Huffman encoding, etc.). However, RISC instructions often contain redundant fields. Redundant fields pose two problems. First, they pollute the compression model that a traditional compressor builds as it compresses the data, and the compressor therefore produces lower quality compression. Second, they do not carry any information, yet a traditional compressor still needs to generate code for them. The code generated for these fields, however small it may be, does not convey any information. If one instead exploits the semantics of the instructions, a better solution is to eliminate the redundant fields so that the compressor does not have to generate code for them. These redundant fields can then be reconstructed during decompression in a straightforward manner. Therefore a need exists for a technique of identifying redundancy in RISC instructions and utilizing this information with commercial compression methods to yield better compression results. The present invention solves this problem by presenting a technique in a novel and unique manner, which is not previously known in the art.
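By way of illustration only, the idea of removing redundant fields before compression and reconstructing them during decompression can be sketched as follows. The sketch is not taken from the disclosure and makes several assumptions: the instruction format is a hypothetical toy encoding (4-byte instructions whose first byte is the opcode, with the last byte unused and always zero for the opcodes listed in `REDUNDANT_LAST_BYTE`), not an actual PowerPC or other RISC encoding; the function names are invented for the example; and Python's `zlib` module stands in for whatever conventional compressor is used.

```python
import zlib

# Hypothetical toy encoding (NOT a real RISC format): every instruction is
# 4 bytes and byte 0 is the opcode. For the opcodes below, the final byte is
# assumed to be unused and always zero, so it carries no information.
REDUNDANT_LAST_BYTE = {0x10, 0x11}
WORD = 4


def strip_redundant(code: bytes) -> bytes:
    """Drop the redundant last byte of each instruction that has one."""
    if len(code) % WORD:
        raise ValueError("code length must be a multiple of 4 bytes")
    out = bytearray()
    for i in range(0, len(code), WORD):
        word = code[i:i + WORD]
        if word[0] in REDUNDANT_LAST_BYTE:
            if word[3] != 0:
                raise ValueError("field assumed redundant is not zero")
            out += word[:3]  # keep opcode and two operand bytes only
        else:
            out += word      # no redundant field: keep the whole word
    return bytes(out)


def restore_redundant(stripped: bytes) -> bytes:
    """Rebuild the original instructions by re-inserting the dropped bytes."""
    out = bytearray()
    i = 0
    while i < len(stripped):
        if stripped[i] in REDUNDANT_LAST_BYTE:
            out += stripped[i:i + 3] + b"\x00"  # reconstruct the zero field
            i += 3
        else:
            out += stripped[i:i + WORD]
            i += WORD
    return bytes(out)


def compress_code(code: bytes) -> bytes:
    """Strip redundant fields, then apply a conventional compressor."""
    return zlib.compress(strip_redundant(code))


def decompress_code(blob: bytes) -> bytes:
    """Undo the conventional compressor, then reconstruct the fields."""
    return restore_redundant(zlib.decompress(blob))


if __name__ == "__main__":
    # Three toy instructions; the first and third have a redundant last byte.
    program = bytes([0x10, 1, 2, 0, 0x20, 3, 4, 5, 0x11, 6, 7, 0])
    assert decompress_code(compress_code(program)) == program
```

In this sketch the conventional compressor never sees the redundant bytes, so it neither models them nor emits code for them, and the decompressor restores them from the opcode alone. Real instruction fields are generally not byte-aligned, so a practical implementation would likely operate on bit fields rather than whole bytes.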