1. Technical Field
The present invention relates to a method and system for compressing data in general. More particularly, the present invention relates to a method and system for compressing an executable program.
2. Description of the Prior Art
As software systems become more complex, the executable code of the programs implementing these systems has grown large in size. The large code size reduces instruction cache effectiveness and the utilization of memory resources. It also increases program-loading time when code is shipped over a network or retrieved from a slow mechanical device like a disk.
Currently, network computers, embedded controllers, set-top boxes, hand-held devices and the like receive executables over a network or possibly through slow phone links or communication channels. These devices may have very limited memory capacity, and when memory is constrained, large programs may not fit in the memory available to run them on the device. Highly efficient code compression mitigates the disadvantage of large executable sizes. However, compressing executable programs has traditionally been difficult.
The fundamental reason why compressing executable code is difficult is the lack of common patterns that a traditional compressor can easily discern. Prior-art compression schemes rely on some form of spatial or temporal proximity when analyzing the input stream. Alternatively, a traditional compressor may use some form of histogram to gather information about the various patterns that occur within an input stream. In either case, similar instructions that follow a common pattern may not necessarily occur close to one another. Furthermore, forming a histogram may not be straightforward if instructions do not follow uniform formats, as in a Complex Instruction Set Computer (CISC) architecture or an elaborate Reduced Instruction Set Computer (RISC) architecture like the PowerPC, which contains over 30 different instruction formats.
Disclosed herein is a better approach than the traditional schemes, in which the instructions within a program are clustered according to common patterns. The instructions within each cluster are then compressed independently by an appropriate compressor. Operating on a cluster instead of the entire instruction stream, a traditional compressor is effective in producing compact code because patterns among structurally similar instructions are easy to discern and the number of patterns within each cluster is limited. It is also desirable to use different compressors for different clusters. Therefore, there is a need to maximize the compression of an overall program by grouping instructions within the program into clusters and then compressing each cluster using the compression scheme that would yield the best results for that particular cluster. The present invention solves these problems with a novel technique that is not previously known in the art.
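The clustering idea described above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the clustering rule (the top 6 bits of a fixed-width 32-bit instruction word), the candidate compressors (`zlib`, `bz2`, `lzma` from the Python standard library), and the function names are all assumptions chosen for illustration. The sketch also stores each instruction's original position so the stream can be rebuilt, and it ignores the space that this bookkeeping would cost in practice.

```python
import bz2
import lzma
import zlib
from collections import defaultdict

# Hypothetical candidate compressors; the text does not name specific ones.
COMPRESSORS = {"zlib": zlib.compress, "bz2": bz2.compress, "lzma": lzma.compress}
DECOMPRESSORS = {"zlib": zlib.decompress, "bz2": bz2.decompress, "lzma": lzma.decompress}


def cluster_key(word):
    """Assumed clustering rule: the top 6 bits of a 32-bit instruction word."""
    return (word >> 26) & 0x3F


def compress_clusters(words):
    """Group instruction words by cluster key, then compress each cluster
    with whichever candidate compressor yields the smallest output."""
    clusters = defaultdict(list)
    for position, word in enumerate(words):
        clusters[cluster_key(word)].append((position, word))

    compressed = {}
    for key, members in clusters.items():
        raw = b"".join(word.to_bytes(4, "big") for _, word in members)
        candidates = ((name, fn(raw)) for name, fn in COMPRESSORS.items())
        name, blob = min(candidates, key=lambda item: len(item[1]))
        positions = [position for position, _ in members]
        compressed[key] = (name, blob, positions)
    return compressed


def decompress_clusters(compressed, count):
    """Rebuild the original instruction stream from the per-cluster blobs."""
    words = [0] * count
    for name, blob, positions in compressed.values():
        raw = DECOMPRESSORS[name](blob)
        for slot, position in enumerate(positions):
            words[position] = int.from_bytes(raw[4 * slot:4 * slot + 4], "big")
    return words
```

Given a list of 32-bit instruction words, `compress_clusters(words)` returns one entry per cluster, each recording the chosen compressor, the compressed bytes, and the original positions; `decompress_clusters(result, len(words))` restores the stream. A real scheme would likely cluster on richer structural features than a single opcode field and would need a compact encoding of instruction order.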