The present invention relates to calculating checksums, and in particular to calculating a Fletcher checksum using a CPU with single instruction, multiple data (SIMD) capability.
When data are transmitted, they can be corrupted in transit. It is therefore necessary to utilize appropriate safeguards. These usually take the form of a number of redundant check bits which are generated by the sender as some function of the associated data elements, and which give the recipient a very good indication of whether the data have arrived intact. Appending this "signature" to the data to be transmitted, and causing the recipient to generate its own version of the signature using exactly the same generation algorithm, allows the recipient to detect corrupted data (indicated by mismatched signatures). Error correction can then be achieved by requesting retransmission of the corrupted data packet. The two most common methods for performing this error detection are checksums and Cyclic Redundancy Checks (CRCs).
While numerous variants of these two methods exist, success can only be achieved if the participants in the data exchange agree to use the same error detection method. To this end, communication protocols exist to standardize, amongst other considerations, the error detection methods employed during data transmission. A variety of different communication protocols are widely used today, each exhibiting different desirable qualities. For instance, for TCP (Transmission Control Protocol) the domain for error detection is the entirety of the data packet, while for IP (Internet Protocol) only the headers are checked. (Generally, a header provides vital information about the destination address for the packet and is attached to the front of the data segment.)
For the more rigorous protocols, error detection can be a time-consuming process which limits the machine's throughput. It is for this reason that checksums are often utilized for error detection, even though their error detection ability is generally inferior to that exhibited by CRCs. CRCs, which are easier to implement in hardware, are not particularly suited to software implementation. Generally, computers exhibit better performance with integer computations. This problem was addressed by J. G. Fletcher, who developed the Fletcher Checksum as an alternative to CRCs (J. Fletcher, "An Arithmetic Checksum for Serial Transmissions", IEEE Transactions on Communications, vol. COM-30, p. 247, January 1982). The Fletcher checksum is an integer arithmetic checksum that exhibits a reasonable level of error detection and lends itself to software implementation on non-dedicated processors.
Reducing the overhead of error detection increases transmission efficiency and allows higher data transmission rates. Experimentation suggests that throughput can be significantly affected by the implementation of the error detection algorithm utilized. On one occasion it was reported that throughput tripled when the implementation of Fletcher's checksum was changed from an unoptimized version to an optimized version (A. Nakassis, "Fletcher's Error Detection Algorithm: How to implement it efficiently and avoid the most common pitfalls", ACM Comp. Commun. Rev., vol. 18, p. 63, October 1988). It is suggested here that, with protocols becoming increasingly streamlined and data rates constantly increasing, checksum computation can become a performance bottleneck.
For communication over the Internet there are specified error detection parameters that constrain the error detection techniques utilized by the relevant protocols, e.g. TCP, IP and UDP. One checksum implementation that satisfies the Internet error detection parameters is the Fletcher Checksum.
The Fletcher Checksum
Unlike CRCs, which involve the computation of polynomials, an arithmetic checksum is generally implemented as some form of linear function over the data to be transmitted. The arithmetic Fletcher checksum is generated by the following pair of iterative series:
C(0)_{1} = d(0); C(1)_{1} = C(0)_{1}
C(0)_{2} = C(0)_{1} + d(1); C(1)_{2} = C(1)_{1} + C(0)_{2}
...
C(0)_{N} = C(0)_{N-1} + d(N-1); C(1)_{N} = C(1)_{N-1} + C(0)_{N}
R(0) = (C(0)_{N}) mod 256
R(1) = (C(1)_{N}) mod 256
checkresult = R(0) + (R(1) << 8)
where N is the number of data elements, d(X), in the packet.
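The pair of series above can be sketched in C as follows. This is a minimal illustration rather than the optimized implementation the invention is concerned with: the function name fletcher_checkresult and the choice of 32-bit accumulators are assumptions made for the example. The running sums are reduced modulo 256 only once, at the end, as in the series as written.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative sketch only (hypothetical helper name).
 * c0 is the running sum of the data elements, C(0).
 * c1 is the running sum of the successive values of c0, C(1).
 * Unsigned 32-bit wrap-around is harmless here: 2^32 is a multiple
 * of 256, so the final "mod 256" results are unaffected by overflow.
 */
uint16_t fletcher_checkresult(const uint8_t *d, size_t n)
{
    uint32_t c0 = 0; /* C(0) */
    uint32_t c1 = 0; /* C(1) */

    for (size_t i = 0; i < n; i++) {
        c0 += d[i]; /* C(0)_k = C(0)_{k-1} + d(k-1) */
        c1 += c0;   /* C(1)_k = C(1)_{k-1} + C(0)_k */
    }

    uint8_t r0 = (uint8_t)(c0 % 256u); /* R(0) */
    uint8_t r1 = (uint8_t)(c1 % 256u); /* R(1) */

    /* checkresult = R(0) + (R(1) << 8) */
    return (uint16_t)(r0 + ((uint16_t)r1 << 8));
}
```

For the three data bytes 1, 2, 3, the first sum takes the values 1, 3, 6 and the second 1, 4, 10, so R(0) = 6, R(1) = 10 and the check result is 0x0A06.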
A desirable property for a checksum to display is uniformly distributed checksum values. Unfortunately, there is a tendency for the checksum results to display a normal distribution. However, using the lower bits of a larger checksum provides a value with a better distribution (T. Kientzle, "The Working Programmer's Guide to Serial Protocols", Coriolis Group Books, 1995). Consequently, if the data to be transmitted are a sequence of bytes, then the checksum results, or check bytes as they are often called, are appended as two octets to the data packet undergoing checksumming. Hence, computation of the Fletcher checksum is facilitated by: