It is known that Huffman coding may be used to improve the efficiency of transferring information over a communication link. In Huffman coding, the length of the code assigned to each character or group of characters is based on the probability of occurrence of the characters represented by the code. A frequently used character is assigned a short code, whereas an infrequently used character is assigned a long code. If a single data source is used, a Huffman code can be readily generated based upon the frequency of occurrence of each of the characters. For example, if English is the source, the letter "e" may be assigned a short code and the letter "z" a long code. Different data sources may require different Huffman codes. For example, a Huffman code for normal English text may not be very efficient for Greek text or for FORTRAN data. Implementation complexity is thus increased in a multisource system if compression efficiency is to be maintained. Of even more importance in data communication, the source may in general fall within one of three basic types: (1) principally regular text, (2) principally numbers, or (3) principally text in capitals, each of which is likely to require a very different Huffman code.
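The assignment of short codes to frequent characters and long codes to rare ones can be illustrated with a minimal Python sketch. The function name `huffman_code` and the sample sentence are assumptions made for illustration only; the sketch shows the general textbook construction, not any particular implementation.

```python
import heapq
from collections import Counter
from itertools import count

def huffman_code(text):
    """Return {symbol: bitstring} for a Huffman code built from the symbol counts in text."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single-symbol source still needs one bit per symbol.
        return {next(iter(freq)): "0"}
    tiebreak = count()  # unique index so heap entries never compare their dicts
    heap = [(weight, next(tiebreak), {sym: ""}) for sym, weight in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees; their symbols gain one leading bit.
        w0, _, c0 = heapq.heappop(heap)
        w1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c0.items()}
        merged.update({s: "1" + b for s, b in c1.items()})
        heapq.heappush(heap, (w0 + w1, next(tiebreak), merged))
    return heap[0][2]

# Hypothetical sample of English text.
sample = "the letter e appears far more often than the letter z in english text"
code = huffman_code(sample)
for sym in ("e", "z"):
    print(repr(sym), code[sym], len(code[sym]), "bits")
```

In this sample "e" occurs far more often than "z", so its codeword is no longer than that of "z". A code built from a different kind of source would rank the characters differently.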
For a data communication application, Huffman coding may be used to reduce the number of bits required to send normal ASCII text data by substituting a unique code for each character to be sent. By substituting short codes for those characters used frequently and longer codes for those characters used infrequently, a reduction in the total number of bits sent is achieved, even though some characters may take as many as 20 bits. In order to enhance transmission efficiency, however, it is desirable to create a different Huffman code for each type of incoming data. For example, a code set that yields good efficiency for sources of principally capital letters, such as FORTRAN data sources, or one for sources which are principally numbers, such as financial reports, might each yield gross inefficiency for normal text data, which contains a mixture of upper and lower case letters and numbers. Thus, each of these types preferably would be encoded with its own distinct Huffman code. It is therefore desirable to be able to match the Huffman code to the source data so that good compression efficiency is maintained without regard to source type.
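The cost of applying a code to the wrong kind of source can be sketched as follows. This is an illustrative Python example under stated assumptions: the helper names `code_lengths` and `bits_per_char`, the two sample strings, and the step of appending every symbol once to each sample (so that each code can encode both sources) are all hypothetical and serve only to make the comparison runnable.

```python
import heapq
from collections import Counter

def code_lengths(training_text):
    """Huffman code length in bits for each symbol seen in training_text."""
    freq = Counter(training_text)
    if len(freq) == 1:
        return {sym: 1 for sym in freq}
    # Heap items: (weight, unique index, symbols under this subtree).
    heap = [(w, i, [s]) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freq, 0)
    next_id = len(heap)
    while len(heap) > 1:
        w0, _, s0 = heapq.heappop(heap)
        w1, _, s1 = heapq.heappop(heap)
        for s in s0 + s1:
            lengths[s] += 1  # each merge pushes these symbols one level deeper
        heapq.heappush(heap, (w0 + w1, next_id, s0 + s1))
        next_id += 1
    return lengths

def bits_per_char(text, lengths):
    """Average encoded bits per character of text under the given code lengths."""
    return sum(lengths[ch] for ch in text) / len(text)

# Hypothetical samples of two source types.
prose_raw = "the quarterly report shows that sales rose in the north"
numbers_raw = "1024 2048 4096 1000 2000 1024 0001 9937 5568"
alphabet = "".join(sorted(set(prose_raw + numbers_raw)))
prose = prose_raw + alphabet      # every symbol appears at least once
numbers = numbers_raw + alphabet

prose_code = code_lengths(prose)
number_code = code_lengths(numbers)
print(f"numeric data, numeric code: {bits_per_char(numbers, number_code):.2f} bits/char")
print(f"numeric data, prose code:   {bits_per_char(numbers, prose_code):.2f} bits/char")
```

Because a Huffman code is optimal for the symbol counts it was built from, the matched code never needs more bits for its own sample than a code built from the other sample.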
In addition, there are at least two more basic problems encountered in attempting to use Huffman coding in a microprocessor-based multiplexed data communications link. The first is that many message source types may require 16 or more bits to encode seldom-used characters with regular Huffman coding. If an 8-bit microprocessor and memory are used to implement such a link, this could require at least triple-precision storage for the coded characters, consuming extra memory and processing time, and might also require bus transfers of more than the standard 8 bits.
The second problem in using Huffman coding in a multiplexer is encountered at the receiving end of the multiplexed transmission link. Since the incoming characters are of variable length, the incoming bit stream must typically be decoded and examined bit by bit to determine where each character starts and ends. Frequently, in order to obtain the maximum compression efficiency, the data multiplexed at and sent from one port of the transmitter, and destined for a particular port (an individual low speed I/O) in a multiplexed receiver, could be using a different Huffman encoding table from that used to multiplex data from another port in the transmitter. Thus, decoding the incoming bit stream using all the different encoding tables in parallel would impose an intolerable burden on the aggregate logic, which has the task of combining the multiple paths of low speed data into multiplexed high speed data. If the decoding were instead done in the logic of each low speed I/O port, then at normal data transmission rates the number of transfers between the aggregate logic and the various low speed I/O port logic could very well exceed the capacity of the interconnecting bus.
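The need to examine a variable-length stream bit by bit can be seen in a short Python sketch. The function name `decode_bits` and the four-symbol table are hypothetical and chosen only for illustration; the sketch assumes the stream was encoded with exactly the table supplied to the decoder.

```python
def decode_bits(bits, table):
    """Decode a string of '0'/'1' characters using a prefix-code table.

    table maps codeword bitstrings to characters. Because codewords differ
    in length, the decoder cannot tell where a character ends until the bits
    read so far match a complete codeword.
    """
    out = []
    current = ""
    for bit in bits:
        current += bit
        if current in table:
            out.append(table[current])
            current = ""  # the next bit starts a new character
    if current:
        raise ValueError("bit stream ended in the middle of a codeword")
    return "".join(out)

# Hypothetical 4-symbol prefix code, for illustration only.
TABLE = {"0": "e", "10": "t", "110": "a", "111": "z"}
print(decode_bits("0101100111", TABLE))  # prints "etaez"
```

Character boundaries are found only as a side effect of decoding, so a receiver that does not know which table applies to a given stretch of bits cannot even split the stream into characters without trying the tables.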
The problems enumerated in the foregoing are not intended to be exhaustive, but rather are among many which tend to impair the effectiveness of multiplexed data transmission employing compression encoding wherein different kinds of data sources are multiplexed together.
It is therefore a general object of the present invention to provide an improved method and apparatus for the multiplexed transmission of data from various different types of data sources which are compaction or compression encoded for transmission. The terms compaction and compression are used interchangeably and considered synonymous for purposes of this document.