Before turning to the invention, the background of compressed data transmission over different kinds of transmission protocols, and of the efficiency of data compression, is explained. A major influence on the efficiency of a data compression process is the portion of the uncompressed data that the data compression algorithm can use to analyse the data and their structure. The more data the algorithm can analyse, the greater the efficiency of the compression process, as long as the data have a more or less homogeneous structure. If, for example, a text document is to be compressed, the compression result will be best if the compression algorithm can analyse a certain amount of the text before starting to compress. If the algorithm had to handle small pieces of the text independently of one another, the result would certainly be worse.
Evidently, an algorithm that does not handle an amount of data as a whole but compresses the data block by block, each block independently of the others, at the sending side of a data transmission, and correspondingly decompresses the data block by block at the receiving side, will not reach maximum efficiency.
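The effect can be illustrated with a minimal sketch, assuming Python's standard zlib module as a stand-in for the data compression algorithm. The sample document, the block size and the function names are assumptions made only for this illustration and are not taken from the description above.

```python
import zlib

# Hypothetical sample data: repetitive text standing in for a homogeneous document.
document = b"".join(
    b"line %d: the quick brown fox jumps over the lazy dog\n" % i
    for i in range(200)
)
BLOCK_SIZE = 256  # assumed block size, chosen only for illustration


def compressed_size_whole(data: bytes) -> int:
    """Size in bytes when the data is compressed as one unit."""
    return len(zlib.compress(data))


def compressed_size_blockwise(data: bytes, block_size: int) -> int:
    """Total size in bytes when every block is compressed independently."""
    total = 0
    for start in range(0, len(data), block_size):
        total += len(zlib.compress(data[start:start + block_size]))
    return total


whole_size = compressed_size_whole(document)
blockwise_size = compressed_size_blockwise(document, BLOCK_SIZE)
print(len(document), whole_size, blockwise_size)
```

For data of this kind, the block-by-block total should come out noticeably larger than the size obtained when the document is compressed as a whole, because each block starts without any knowledge of the preceding text.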
An algorithm that breaks the data into smaller blocks but stores information about each compressed block, and thus about the parts of the document already compressed, can use that information to optimize the compression of the respective following block and will therefore work with improved performance. In this case, the information about the data increases with every block compressed, and the efficiency of the compression will increase as well, provided the data are of an appropriate structure. Since, as mentioned above, the compression depends on the information already gathered about the document, the receiver has to analyse the incoming data as well and collect information in the same way as the sender does in order to be able to decompress the data. As a result of analysing the already compressed or decompressed data, sender and receiver have exactly the same information about the data at hand when processing any particular piece of the data. This means in particular that the individual parts of the document have to pass through the decompression algorithm in exactly the same order in which they passed through the compression algorithm.
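The dependence on shared information and on the order of processing can be sketched as follows, again assuming Python's zlib module as the compression algorithm. The sample blocks and the helper decode_alone are hypothetical and serve only to illustrate the principle.

```python
import zlib

# Hypothetical blocks of one document, in sending order.
blocks = [
    b"alpha beta gamma delta " * 4,
    b"alpha beta gamma delta " * 4,
    b"alpha beta gamma epsilon " * 4,
]

# Sender: one compressor object keeps its history across all blocks.
# Z_SYNC_FLUSH emits the bytes for the current block without discarding that history.
compressor = zlib.compressobj()
compressed = [
    compressor.compress(block) + compressor.flush(zlib.Z_SYNC_FLUSH)
    for block in blocks
]

# Receiver, same order: it rebuilds the same history and recovers every block.
decompressor = zlib.decompressobj()
restored = [decompressor.decompress(chunk) for chunk in compressed]


def decode_alone(chunk: bytes):
    """Try to decode a chunk without the history of the preceding chunks."""
    try:
        return zlib.decompressobj().decompress(chunk)
    except zlib.error:
        return None  # the chunk cannot be interpreted out of context


second_block_alone = decode_alone(compressed[1])
```

In this sketch the second compressed block is shorter than it would be if compressed on its own, since it can refer back to the first block. For the same reason a receiver that has not processed the first block cannot recover the second one from its compressed form alone.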
Data transmission between a sender and a receiver via a data network is organized by a protocol that defines the rules for establishing a connection between sender and receiver and for exchanging data between them. In information technology a distinction is made between so-called connectionless and connection oriented networking protocols. In the case of a connectionless protocol, sender and receiver do not exchange any information beyond the actual data. If more than one data packet is sent, each packet is transmitted independently of all the others, and the sender does not receive a confirmation as to whether or not the packet has been received. It is therefore not certain that the first packet sent will be the first packet received, i.e. the order in which the packets are received does not necessarily equal the order in which they were sent. It is not even guaranteed that a packet reaches its destination at all.
In the case of a connection oriented protocol, the data packets contain additional information that allows the communication partners to determine when a connection starts and ends, in which order the individual packets were sent and whether a packet is missing. Moreover, the protocol normally specifies that the sender waits for a confirmation of receipt from the receiver, and what is to be done if a packet is missing, e.g. resending the packet.
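How such additional information lets a receiver restore the sending order and detect a missing packet can be sketched as follows. The Packet structure, its field names and the function reassemble are hypothetical and do not correspond to the packet format of any particular protocol.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Packet:
    seq: int             # hypothetical field: position within the connection
    payload: bytes
    last: bool = False   # hypothetical field: marks the end of the connection


def reassemble(received: List[Packet]) -> Tuple[bytes, List[int], bool]:
    """Return (data in sending order, missing sequence numbers, connection complete)."""
    by_seq = {packet.seq: packet for packet in received}
    if not by_seq:
        return b"", [], False
    highest = max(by_seq)
    # Any gap below the highest sequence number seen so far is a missing packet.
    missing = [seq for seq in range(highest + 1) if seq not in by_seq]
    complete = not missing and by_seq[highest].last
    # Deliver only the contiguous prefix; later packets wait until the gap is filled.
    data = b""
    for seq in range(highest + 1):
        if seq not in by_seq:
            break
        data += by_seq[seq].payload
    return data, missing, complete


# Packets arrive out of order, as a connectionless base protocol permits.
in_order = reassemble([Packet(2, b"C", last=True), Packet(0, b"A"), Packet(1, b"B")])

# Packet 1 is lost; the receiver can name it so that the sender can resend it.
with_gap = reassemble([Packet(0, b"A"), Packet(2, b"C", last=True)])
```

The sketch leaves out the confirmation of receipt and the resending itself; it only shows that a sequence number and an end marker are sufficient for the receiver to put the packets in order and to identify a gap.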
It is possible to use a connectionless protocol as the base protocol for a connection oriented protocol. In TCP/IP, for example, the connectionless IP protocol is used to carry connection oriented TCP packets, i.e. with TCP, IP packets are exchanged which contain information about the start and end of the connection and about the packet order.
From the above it is apparent that data traffic over connectionless protocols cannot be compressed as efficiently as data sent over a connection oriented protocol. The reason is that, according to the rules of the transmission protocol used, the data are separated into individual packets which are then sent separately, one after the other, to the receiver. In the case of a connectionless protocol each packet can only be compressed on its own, independently of all the others, because it is not known when the transmission of a connected amount of data begins or ends, nor in which order the packets arrive at the receiving end. The compression therefore cannot be as efficient as with a connection oriented protocol, where the packets are sent in a specific order and the whole transmission can be compressed in context.
In contrast to a connectionless protocol, a connection oriented protocol would, as stated above, make it possible to compress the complete set of data sent during a connection as if it were one homogeneous document. The problem, however, is that it is not known when a single transmission ends and the sender begins waiting for the receiver to reply. In that situation the whole connection would stall if the data were held back because the compression algorithm is still waiting for more data to compress. Thus, normally only that part of the data is compressed which the sending program puts into one block. To distinguish these blocks from the data packets created by the transmission protocol, they are referred to as records. Instead of compressing each record separately from the others, however, the compression algorithm can collect information about each record and use it to optimize the compression of the next record, as described above. This is called “record oriented compression”. The act of emptying the buffers used by the compression algorithm and passing the compressed data to the network protocol is called a “flush”. With record oriented compression such a flush is performed after each record.
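The role of the flush can be sketched as follows, assuming Python's zlib module and its Z_SYNC_FLUSH mode as one possible realization of a flush. The sample records and the functions send_records and receive_records are hypothetical names introduced only for this illustration.

```python
import zlib

# Hypothetical records handed over by the sending program, one per request.
records = [b"GET /status id=1\n", b"GET /status id=2\n", b"GET /status id=3\n"]

# Without a flush, a small record stays inside the compressor's buffer:
# the receiver cannot recover it from the bytes emitted so far.
no_flush = zlib.compressobj()
emitted_early = no_flush.compress(records[0])
seen_without_flush = zlib.decompressobj().decompress(emitted_early)


def send_records(items):
    """Record oriented compression: one shared compressor, one flush per record."""
    compressor = zlib.compressobj()
    for record in items:
        yield compressor.compress(record) + compressor.flush(zlib.Z_SYNC_FLUSH)


def receive_records(chunks):
    """Decode each flushed chunk as soon as it arrives, in sending order."""
    decompressor = zlib.decompressobj()
    for chunk in chunks:
        yield decompressor.decompress(chunk)


wire = list(send_records(records))
received = list(receive_records(wire))
```

Each flush costs a few additional bytes on the wire, but it ensures that the receiver can reconstruct a record completely as soon as its compressed form has arrived, so that it is able to reply.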
While record oriented compression poses no problem with a connection oriented protocol, it cannot be used with connectionless protocols, since there, as described above, individual data packets are transmitted independently of one another and without regard to their order. But even with connection oriented protocols, record oriented compression is not necessarily the most efficient technique, as can be seen from the following.