1. The Field of the Invention
The present invention relates to compression technology. More specifically, the present invention relates to methods, systems, and computer program products for compressing and decompressing a sequential list of computer-executable instructions (also called herein an “executable list”). A predictive model generated from one segment of the executable list is uniformly applied as a common predictive starting point for the other segments of the executable list, thereby permitting random access to, and decompression of, the executable list even though the executable list was compressed using predictive compression techniques.
2. Background and Relevant Art
Computing systems have revolutionized the way people work and play. Early computing systems were rather monolithic, stand-alone mainframe computing systems, often occupying entire rooms despite their relatively low processing and memory capabilities by modern standards. Currently, however, a wide variety of computing systems are available that are often even more powerful than their much larger mainframe ancestors. For example, a computing system may include a desktop computer, a laptop computer, a Personal Digital Assistant (PDA), a mobile telephone, or any other system or device in which machine-readable instructions (also called “program binaries” or simply “binaries”) may be executed by one or more processors. Computers may even be networked together to allow information to be exchanged electronically over large distances, as when using the Internet.
Despite monumental advances in computing technology, computing systems still have limited memory resources and network bandwidth, the extent of which varies from one computing system to another. In order to preserve memory resources and network bandwidth, compression technology is often employed to reduce the size of a data segment (such as a file, program, software module, software library, or any other identifiable segment) with minimal, if any, loss in information. While there are many varying compression technologies, all compression technologies reduce the size of a data segment by taking advantage of redundancies in the segment. By reducing the size of the data segment, the memory needed to store the data segment and the bandwidth needed to transmit the data segment are both reduced. The power requirements for processing compressed segments are also often reduced, which is especially relevant to low-power environments such as mobile devices.
Text is often compressed, because the semantic and syntactic rules that structure the text also introduce a high degree of redundancy in the text. Patterns can be detected in such text that allow one to make reasonable guesses as to the text that follows based on the text that was just read. Skilled human readers can, for example, often reasonably predict how a sentence will be completed before reading the entire sentence. Such prediction would not be possible if the text were simply a random sequence of arbitrary text characters, following no syntactic or semantic rules.
Due to the predictability of text, text is said to have a high degree of local sequential correlation. That is, a human can make reasonable predictions as to what text will follow, based on the immediately preceding text. Even computers can make such reasonable predictions by creating a statistical model that may be used to predict the text character that will follow based on the immediately preceding text characters. Such statistical models are often called predictive models. One compression technology that takes advantage of the high degree of local sequential correlation in text is called Prediction by Partial Matching compression or “PPM” compression for short.
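The kind of statistical model described above can be illustrated with a short sketch. The Python below is only an illustration under stated assumptions and is not PPM itself: the function names `build_model` and `predict`, the fixed context length `order`, and the sample sentence are all hypothetical choices made for this example. The sketch counts which character follows each fixed-length context and predicts the most frequent follower.

```python
from collections import Counter, defaultdict


def build_model(text, order=2):
    """Count which character follows each `order`-character context.

    Hypothetical helper for illustration; not part of any PPM library.
    """
    model = defaultdict(Counter)
    for i in range(order, len(text)):
        context = text[i - order:i]   # the immediately preceding characters
        model[context][text[i]] += 1  # tally the character that followed
    return model


def predict(model, context):
    """Return the most frequent follower of `context`, or None if unseen."""
    followers = model.get(context)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Illustrative sample text (an assumption, not taken from any corpus).
sample = "the cat sat on the mat; the hat sat on the cat"
model = build_model(sample, order=3)
guess = predict(model, "the")  # character most often seen after "the"
```

In this sample, every occurrence of "the" is followed by a space, so the model predicts a space after that context. A context the model has never seen yields no prediction at all, which is the situation that practical predictive compressors must handle by falling back to shorter contexts.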
While both compressing and decompressing, PPM builds a predictive model of the input data stream that aims to estimate the probability that a certain symbol occurs after a certain context. When compressing (and decompressing) a particular text file, the model is gradually built as the compression (and decompression) proceeds from beginning to end through the text file. The state of the predictive model as it exists when evaluating a particular point in the text file is naturally heavily dependent on the text that was encountered prior to that point.
PPM and other predictive compression techniques were previously used primarily to compress text information. However, PPM and other predictive compression techniques have also been used to compress program binaries. As used herein, “program binaries” means a sequence of machine-level executable instructions. As they do for text, the predictive compression and decompression techniques build a predictive model of the program binaries as they compress or decompress the program binaries. Here, however, instead of a human-language alphabet for text, a different alphabet is used that represents each of the 256 possible values of each byte of the program binaries.
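The dependence of an adaptive predictive model on previously seen input can be sketched as follows. This is a simplified illustration and not an implementation of PPM: the class `AdaptiveModel`, the function `ideal_code_length`, the single fixed context length, and the add-one smoothing are all assumptions made for the example, whereas PPM blends several context lengths using escape probabilities. The alphabet is the 256 possible byte values, and the model is updated after every byte, just as a compressor and decompressor would each update their own copy.

```python
import math
from collections import Counter, defaultdict


class AdaptiveModel:
    """Hypothetical adaptive byte model with a fixed context length."""

    def __init__(self, order=1, alphabet_size=256):
        self.order = order
        self.alphabet_size = alphabet_size  # one symbol per byte value
        self.counts = defaultdict(Counter)

    def probability(self, context, symbol):
        # Add-one smoothing is an assumption of this sketch; it keeps
        # unseen symbols from receiving zero probability.
        followers = self.counts.get(context, Counter())
        total = sum(followers.values())
        return (followers[symbol] + 1) / (total + self.alphabet_size)

    def update(self, context, symbol):
        # The model state changes with every symbol processed.
        self.counts[context][symbol] += 1


def ideal_code_length(data, order=1):
    """Sum of -log2(p) over all bytes: the size in bits that an ideal
    entropy coder driven by this model would approach."""
    model = AdaptiveModel(order)
    bits = 0.0
    for i, symbol in enumerate(data):
        context = data[max(0, i - order):i]  # preceding bytes
        bits += -math.log2(model.probability(context, symbol))
        model.update(context, symbol)        # learn only after coding
    return bits


# Illustrative input (an assumption): a highly repetitive byte string.
repetitive = b"ab" * 50
estimated_bits = ideal_code_length(repetitive)
```

Because the probability assigned to a byte depends on the counts accumulated so far, the same context yields a different estimate at different points in the input. A decompressor therefore needs the same model state that the compressor had at that point, which in a purely adaptive scheme generally means processing the input from its beginning.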
While compression of files does reduce the amount of information that needs to be communicated over a network or to or from a mass storage device, it is always beneficial to improve the bandwidth use of the network when accessing compressed program binaries over a network, and to improve the bandwidth use of the local read/write channel when accessing compressed program binaries from a local mass storage device. Accordingly, what is desired are methods, systems, and computer program products for reducing the bandwidth usage needed to access and run program binaries (or any other sequential list of computer-executable instructions, for that matter), whether over a remote or local channel.