In overview, in information theory, entropy is a measure of the uncertainty in a random variable, and pertains to physical continuous systems, for example as in thermodynamics, as well as to information systems, for example data communication systems. Moreover, in the context of data communication systems, entropy is more appropriately referred to as Shannon entropy; reference is made to Claude E. Shannon's 1948 scientific paper "A Mathematical Theory of Communication", which is hereby incorporated by reference. Shannon entropy is useable for quantifying an expected value of the information contained in a given message. Furthermore, Shannon entropy is expressible in terms of nats and bans, as well as bits.
Shannon entropy provides an absolute limit regarding a best possible lossless encoding or compression of any communication, assuming that a given communication is susceptible to being represented as a sequence of independent and identically-distributed random variables. Moreover, Shannon's theory has identified that an average length L of a shortest possible representation to encode a given message in a given alphabet is the entropy E of the message divided by a logarithm of a number of symbols N present in the alphabet, namely as defined in Equation 1 (Eq. 1):
                    L = E / log10 N                    (Eq. 1)
Thus, the entropy E is a measure of unpredictability or information content. In a lossless data compression method, compressed output data generated by applying the method to corresponding input data conveys a similar quantity of information to the input data, but the output data includes fewer data bits relative to the input data. In consequence, the compressed output data is more unpredictable, because it contains less redundancy of information therein.
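By way of illustrative sketch only, the entropy computation and the bound of Eq. 1 can be expressed as follows; the function name and the example message are assumptions introduced purely for illustration:

```python
import math
from collections import Counter

def shannon_entropy(data: bytes, base: float = 2.0) -> float:
    """Shannon entropy per symbol: base 2 gives bits, e gives nats, 10 gives bans."""
    total = len(data)
    probabilities = (count / total for count in Counter(data).values())
    return -sum(p * math.log(p, base) for p in probabilities)

message = b"abracadabra"
print(shannon_entropy(message))          # entropy in bits per symbol
print(shannon_entropy(message, math.e))  # the same entropy in nats
print(shannon_entropy(message, 10))      # the same entropy in bans

# Eq. 1: the average length L of a shortest possible encoding of the
# message in an alphabet of N symbols is the entropy E divided by the
# logarithm of N (entropy and logarithm taken in the same base).
N = 4
L = shannon_entropy(message) / math.log2(N)
print(L)
```

A uniform message such as b"aaaa" has zero entropy and is thus maximally compressible, whereas a message of equiprobable distinct symbols attains the maximum entropy for its alphabet.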
Shannon's theory has been taken into account in the design of known data encoding apparatus. For example, in a published international patent application no. WO2010/050157A1 (PCT/JP2009/005548, "Image encoding apparatus, image encoding method, and image encoding program", Applicant: Thomson Licensing), there is described an image encoding apparatus, an image encoding method and an image encoding program, which are operable to homogenize the image quality of an image as a whole without lowering encoding efficiency, which are operable at high speed, and which are capable of reducing circuit scale by performing macroblock shuffling without changing slice structure. Moreover, there is also provided an image encoding apparatus, including:
    (i) a shuffling portion which collects and shuffles a plurality of macroblocks constituting image data from respective positions within an image;
    (ii) an encoding portion which performs spatial frequency transformations and entropy encoding on the plurality of macroblocks which are collected and shuffled by the shuffling portion; and
    (iii) a rate control portion which controls the encoding portion to adjust the rate of the plurality of macroblocks after encoding has been executed.
Data, irrespective of their type, require data storage space, and also bandwidth in communication network capacity when moved from one spatial location to another. Such bandwidth corresponds, in practice, to investment in communication infrastructure and to utilization of energy. As volumes of data to be communicated are projected to increase in future, more data storage space and more communication system capacity will be needed, and often also more energy. In the contemporary Internet, there are stored huge quantities of data, often in multiple copies. Thus, any approach which is able to compress data, especially when such compression is lossless, is potentially of great technical and economic benefit. Contemporarily, there are several known methods for reducing entropy within data sets, in order to compress the data sets. Moreover, there are known methods of modifying entropy present in data sets, for example Delta coding and Run-Length Encoding (RLE), but new methods are still required which provide more effective compression of data.
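The aforesaid Delta coding modifies entropy by replacing each value with its difference from the preceding value; for smoothly varying data, the residuals cluster near zero, which lowers entropy. A minimal sketch, in which the function names are assumptions for illustration:

```python
def delta_encode(values):
    """Replace each value by its difference from the previous value.
    Smooth data collapses toward small residuals, lowering entropy."""
    out, prev = [], 0
    for v in values:
        out.append(v - prev)
        prev = v
    return out

def delta_decode(deltas):
    """Invert delta_encode by accumulating the differences."""
    out, prev = [], 0
    for d in deltas:
        prev += d
        out.append(prev)
    return out

samples = [100, 101, 103, 103, 104, 106]
deltas = delta_encode(samples)   # [100, 1, 2, 0, 1, 2]
assert delta_decode(deltas) == samples
```

The transform itself is lossless and does not shrink the data; its benefit is that the residual stream is more compressible by a subsequent entropy coder.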
Thus, there are many different compression methods that are employable to compress given data by reducing entropy present therein, for example aforesaid RLE coding, aforesaid Delta coding, variable-length coding (VLC), Huffman coding and Arithmetic coding. These methods are typically designed to compress alphabets, numbers, bytes and words. However, such methods are not particularly suitable for compressing individual bits, and for this reason are not ideally suited for compressing data that can change in a bit-by-bit manner, for example streams of data bits.
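Of the methods listed above, Huffman coding illustrates symbol-oriented entropy coding: it assigns shorter codewords to frequent symbols and longer codewords to rare ones. The following is an illustrative sketch only, not a reproduction of any apparatus described herein; the function name and data are assumptions:

```python
import heapq
from collections import Counter

def huffman_code(data: bytes) -> dict:
    """Build a Huffman code table mapping each symbol to a bit string."""
    counts = Counter(data)
    if len(counts) == 1:  # degenerate single-symbol alphabet
        return {next(iter(counts)): "0"}
    # Heap entries: (frequency, tie-breaker, {symbol: code-so-far}).
    heap = [(freq, i, {sym: ""}) for i, (sym, freq) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        # Prepend a branch bit to every code in each merged subtree.
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]

table = huffman_code(b"abracadabra")
# The frequent symbol 'a' receives a shorter codeword than the rare 'c'.
assert len(table[ord("a")]) < len(table[ord("c")])
```

Because codewords correspond to leaves of a binary tree, the resulting code is prefix-free, so a decoder can parse the bit stream unambiguously; however, as noted above, such symbol-oriented codes are not ideally suited to data that change in a bit-by-bit manner.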
In respect of conventional RLE coding, there is stored either a given value, namely a bit, or two times the value, and then a count of similar values, namely bits, after that. RLE can also be applied selectively, for example its coding can be reserved solely for coding runs including at least a known number of similar values. Such selective application of RLE requires that the same value be put once or twice again into a given data stream with each new run. However, there are contemporarily required better approaches to data compression, which reduce entropy in sets of data, independently of the type of the data.
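The value-and-count scheme described above can be sketched at bit level as follows; this is a minimal illustration of conventional RLE, with function names assumed for the purpose of the example:

```python
def rle_encode_bits(bits: str) -> list:
    """Encode a bit string as (bit value, run length) pairs."""
    runs, i = [], 0
    while i < len(bits):
        j = i
        while j < len(bits) and bits[j] == bits[i]:
            j += 1  # extend the current run of identical bits
        runs.append((bits[i], j - i))
        i = j
    return runs

def rle_decode_bits(runs) -> str:
    """Expand (bit value, run length) pairs back into a bit string."""
    return "".join(bit * length for bit, length in runs)

data = "0000011110000000"
runs = rle_encode_bits(data)   # [('0', 5), ('1', 4), ('0', 7)]
assert rle_decode_bits(runs) == data
```

As the sketch shows, each run repeats the bit value in the output, which is efficient for long runs but wasteful for data alternating bit-by-bit, where runs of length one predominate; this motivates the need for better approaches identified above.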