Embodiments of the present invention relate to data processing, and more particularly relate to techniques for efficient generation of Cyclical Redundancy Check (CRC) values in network devices.
A Cyclical Redundancy Check, or CRC, is a type of function that is used to detect errors in digital data. A typical n-bit CRC (e.g., CRC-16, CRC-32, etc.) receives as input a data block represented as a binary value, and divides the binary value, using modulo-2 arithmetic, by a predetermined binary divisor to generate an n-bit remainder that is characteristic of the data block. The remainder may be used as a checksum to determine, for example, whether the data block is later altered during transmission or storage. In the art, the term CRC is often used to refer to both the function and its generated remainder; however, for clarity, the present disclosure will refer to the function as the CRC and the remainder as the CRC value.
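As an illustration of the division described above, the following minimal Python sketch computes an n-bit remainder by bitwise modulo-2 long division. The function name `crc_remainder`, the zero initial value, and the most-significant-bit-first ordering are assumptions made for illustration only; standardized CRCs such as CRC-32 typically add details (non-zero initial values, bit reflection, final inversion) that are omitted here.

```python
def crc_remainder(data: bytes, poly: int, n: int) -> int:
    """Remainder of `data` (with n zero bits appended) divided modulo 2 by the divisor.

    `poly` holds the low n bits of the divisor; its leading 1 bit is implicit.
    """
    top_bit = 1 << (n - 1)
    mask = (1 << n) - 1
    rem = 0  # assumed initial value
    for byte in data:
        for shift in range(7, -1, -1):  # most significant bit first
            bit = (byte >> shift) & 1
            # Subtract (XOR) the divisor whenever the bit shifted out of the
            # remainder differs from the incoming data bit.
            feedback = bool(rem & top_bit) != bool(bit)
            rem = (rem << 1) & mask
            if feedback:
                rem ^= poly
    return rem


block = b"123456789"
crc_value = crc_remainder(block, poly=0x1021, n=16)

# Flipping a single bit of the data block changes the CRC value.
altered = bytes([block[0] ^ 0x01]) + block[1:]
assert crc_remainder(altered, poly=0x1021, n=16) != crc_value
```

The divisor 0x1021 used here is one common 16-bit choice; any other divisor could be substituted through the `poly` and `n` parameters.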
In the field of data communications, network protocols such as Ethernet, ATM, and the like employ CRCs to detect transmission errors in messages (i.e., packets or frames) that are sent from one network device to another. For example, in a conventional Ethernet implementation, a transmitting network device (e.g., router, switch, host network interface, etc.) generates a CRC-32 value for each outgoing Ethernet frame, and appends the value to the frame prior to transmission. When the frame is received at a receiving network device, the CRC-32 value is stripped and a new CRC-32 value is generated for the frame. The new CRC-32 value is then compared to the received CRC-32 value to verify the integrity of the data contained within the frame.
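The append-and-verify flow described above can be sketched in software as follows. This is a simplified illustration, not a model of any particular device: the helper names `append_crc32` and `verify_crc32` are hypothetical, the 4-byte little-endian placement of the CRC-32 value is an assumption, and Python's standard `zlib.crc32` stands in for the CRC-32 generation that a network device would perform in hardware.

```python
import zlib


def append_crc32(payload: bytes) -> bytes:
    """Transmit side: generate a CRC-32 value and append it to the frame."""
    crc_value = zlib.crc32(payload)
    return payload + crc_value.to_bytes(4, "little")  # byte order is an assumption


def verify_crc32(frame: bytes) -> bool:
    """Receive side: strip the received CRC-32 value, regenerate, and compare."""
    payload, received = frame[:-4], frame[-4:]
    return zlib.crc32(payload).to_bytes(4, "little") == received


frame = append_crc32(b"example frame contents")
assert verify_crc32(frame)

corrupted = bytes([frame[0] ^ 0x80]) + frame[1:]  # single-bit transmission error
assert not verify_crc32(corrupted)
```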
A problem with conventional CRC processing circuit implementations is that they cannot efficiently support the high data throughput rates demanded by emerging wire transmission standards such as 100G (i.e., 100 Gigabits per second (Gbps)) Ethernet. The data throughput of a CRC processing circuit is a function of its data line width and its clock speed. For example, a conventional parallel CRC circuit may process one 64-bit wide data line per clock cycle at a clock speed of 300 Megahertz (MHz), thereby achieving a theoretical data throughput rate of 64 bits × 300 MHz = 19.2 Gbps, which is sufficient to support 10G (i.e., 10 Gbps) Ethernet. However, achieving a data throughput rate of 100 Gbps and beyond is difficult for conventional CRC processing circuits.
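The throughput arithmetic above can be checked with a short calculation. The sketch below is illustrative only; the function name `throughput_bps` is hypothetical, and the last two figures simply show what a 100 Gbps target would demand if only one of the two parameters from the example were scaled.

```python
import math


def throughput_bps(line_width_bits: int, clock_hz: int) -> int:
    """Theoretical throughput: bits per clock cycle times cycles per second."""
    return line_width_bits * clock_hz


MHZ = 1_000_000
GBPS = 1_000_000_000

rate = throughput_bps(64, 300 * MHZ)
print(rate / GBPS)  # 19.2

# What a 100 Gbps target would demand if only one parameter were scaled.
target = 100 * GBPS
min_width_bits = math.ceil(target / (300 * MHZ))  # line width needed at 300 MHz
min_clock_hz = target / 64                        # clock needed with 64-bit lines
```

Under these assumptions, the line width would have to exceed 333 bits at 300 MHz, or the clock would have to exceed 1.5 GHz with 64-bit data lines.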
There are several reasons why conventional CRC processing circuits cannot efficiently support high throughput rates such as 100 Gbps. First, supporting 100 Gbps generally requires a 10× or greater increase in either data line width or clock speed relative to a conventional 10 Gbps CRC processing circuit design, which is difficult to physically implement in hardware using currently available technologies. Second, implementing a large data line width introduces timing issues at the gate level. Further, conventional parallel CRC designs require that the data lines of the input data stream be processed in order. Accordingly, in the case of Ethernet, an entire frame must be received by a receiving network device before a CRC value for the frame can be generated. This imposes a latency that makes it difficult to achieve theoretical data throughput rates, thereby further adversely affecting scalability.
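The in-order constraint noted above can be seen in a software analogy. The following sketch is not a model of a parallel CRC circuit; it merely assumes a frame split into 64-bit data lines and uses the running-value argument of Python's standard `zlib.crc32` to show that each step consumes the result of the previous one.

```python
import zlib

frame = bytes(range(64))  # placeholder frame contents
lines = [frame[i:i + 8] for i in range(0, len(frame), 8)]  # 64-bit data lines

crc_value = 0
for line in lines:
    # Each step needs the running value produced by the previous line,
    # so line k cannot be folded in before lines 0..k-1 have been processed.
    crc_value = zlib.crc32(line, crc_value)

# Chaining the lines in order reproduces the CRC-32 value of the whole frame.
assert crc_value == zlib.crc32(frame)
```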