Statistics collection is an important function performed by integrated circuits (“ICs”) in various applications. In data networking applications, for example, statistics are essential for managing bandwidth and quality of service. These statistics typically include the number of packets, bytes, etc., in a network flow or data path. Maintaining and updating these statistics is a challenging task, as network applications may demand statistics for thousands of flows at ever-increasing traffic rates. To keep up with these demands, a rapid update cycle is required for each statistic.
In a network device such as a switch or router, a field programmable gate array (“FPGA”) may be used for statistics collection. An FPGA is an IC that can be programmed in the field after manufacture. FPGAs typically contain programmable logic components and programmable interconnects. The programmable logic components can be programmed to duplicate the functionality of basic logic gates such as AND, OR, XOR, and NOT, or of more complex combinational functions such as decoders or simple math functions. In most FPGAs, these programmable logic components (or logic blocks, in FPGA parlance) also include memory elements, which may be simple flip-flops, registers, or more complete blocks of memory. A hierarchy of programmable interconnects allows the logic blocks of an FPGA to be interconnected as needed by the system designer, somewhat like a one-chip programmable breadboard. These logic blocks and interconnects can be programmed after the manufacturing process by the customer/designer (hence the term “field programmable”) so that the FPGA can perform whatever logical function is needed. In addition to an FPGA, statistics collection typically requires the use of one or more large memory devices external to the FPGA for storing the statistics. The memory device may be a random access memory (“RAM”) device such as a quad data rate (“QDR” or “QDR™II”) synchronous RAM (“SRAM”) device or a reduced latency dynamic random access memory (“RLDRAM”) device.
The total round-trip delay (i.e., the time to read from a location in memory including internal synchronization) is a serious bottleneck in maintaining per-flow statistics at the rate of packet arrival at the network device. Simple pipeline techniques can speed up memory access for statistic updates, but they provide only a partial solution. Their limitation becomes apparent when multiple updates for a particular flow are required within a period of time that is shorter than the round-trip delay. In such a case, each statistic update requires the previous update to have been completed, because it must read the value written by that previous update.
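By way of illustration only, the limitation described above can be sketched as a small simulation of a naive pipelined read-modify-write counter memory. The function name `pipelined_updates`, the cycle-based timing model, and the `round_trip` parameter are assumptions made for this sketch and do not describe any particular device.

```python
from collections import deque


def pipelined_updates(arrivals, round_trip):
    """Simulate a naive pipelined read-modify-write counter memory.

    arrivals:   list of (cycle, flow_id, increment), sorted by cycle.
    round_trip: assumed number of cycles between issuing a read and the
                corresponding write-back becoming visible in memory.
    Returns the final counter value stored for each flow.
    """
    memory = {}
    in_flight = deque()  # pending write-backs: (write_cycle, flow_id, value)

    for cycle, flow, inc in arrivals:
        # Retire every write-back whose round trip has elapsed.
        while in_flight and in_flight[0][0] <= cycle:
            _, done_flow, value = in_flight.popleft()
            memory[done_flow] = value

        # The read only sees values that have already been written back,
        # so an update still in flight for the same flow is invisible here.
        stale = memory.get(flow, 0)
        in_flight.append((cycle + round_trip, flow, stale + inc))

    # Drain the remaining write-backs in issue order.
    for _, done_flow, value in in_flight:
        memory[done_flow] = value
    return memory


# Two packets of the same flow, one cycle apart, with a 4-cycle round trip:
# the second update reads a stale value and the first increment is lost.
print(pipelined_updates([(0, "A", 100), (1, "A", 200)], round_trip=4))  # {'A': 200}

# The same packets spaced by the round-trip delay are counted correctly.
print(pipelined_updates([(0, "A", 100), (4, "A", 200)], round_trip=4))  # {'A': 300}
```

In this sketch, updates to different flows can overlap freely; only back-to-back updates to the same flow within the round-trip delay produce an incorrect count.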
Because packet arrivals are random, one solution to this problem is to separate each memory access by the total round-trip delay. However, such a solution would slow down the operation of the statistics system and would necessitate additional storage as a result of the slowdown. Another solution is to separate only the memory accesses for a given flow by the round-trip delay. However, because of bursts in packet arrivals at the network device, this solution would alleviate the problem only to a limited degree and would not totally eliminate the need for extra buffering and speed reduction.
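As a rough illustration of the trade-off between these two approaches, the following sketch computes the cycle at which the last update can be issued under each spacing rule. It assumes one packet arrives per cycle, updates are issued in arrival order, and at most one update is issued per cycle; the function name `last_issue_cycle` and its parameters are hypothetical.

```python
def last_issue_cycle(flows, round_trip, per_flow):
    """Return the cycle at which the final update is issued.

    flows:      sequence of flow ids in arrival order (one arrival per cycle).
    round_trip: assumed memory round-trip delay in cycles.
    per_flow:   False -> every access waits round_trip after the previous one;
                True  -> only accesses to the same flow are spaced apart.
    """
    last_issue = {}    # flow id -> cycle of its most recent issue
    prev_issue = None  # cycle of the most recent issue for any flow
    next_free = 0      # earliest cycle at which the issue slot is free

    for arrival, flow in enumerate(flows):
        earliest = max(arrival, next_free)
        if per_flow:
            if flow in last_issue:
                earliest = max(earliest, last_issue[flow] + round_trip)
        elif prev_issue is not None:
            earliest = max(earliest, prev_issue + round_trip)
        last_issue[flow] = earliest
        prev_issue = earliest
        next_free = earliest + 1
    return prev_issue


# Four packets from four different flows, 4-cycle round trip.
print(last_issue_cycle("ABCD", 4, per_flow=False))  # 12
print(last_issue_cycle("ABCD", 4, per_flow=True))   # 3

# A burst of four packets from a single flow: per-flow spacing no longer helps.
print(last_issue_cycle("AAAA", 4, per_flow=True))   # 12
```

Under these assumptions, per-flow spacing keeps pace with arrivals when consecutive packets belong to different flows, but a burst on one flow is delayed as much as under global spacing, so the packets waiting to be counted must still be buffered.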
A need therefore exists for an improved method and system for updating network flow statistics for a network device gathered by a processor and stored in a memory device external to the processor. Accordingly, a solution that addresses, at least in part, the above and other shortcomings is desired.