A computer network is a geographically distributed collection of interconnected communication links for transporting data between nodes, such as computers. Many types of computer networks are available, ranging from local area networks (LANs) to wide area networks (WANs). The nodes typically communicate by exchanging discrete frames or packets of data according to pre-defined protocols, such as the Transmission Control Protocol/Internet Protocol (TCP/IP) or the Internetwork Packet Exchange (IPX) protocol.
The topology of a computer network can vary greatly. For example, the topology may comprise a single LAN containing a single intermediate node, such as a hub, with end nodes attached to the hub. A more complex network may contain one or more local area networks interconnected through a complex intermediate internetwork comprising a plurality of other types of intermediate nodes, such as switches or routers, to form a WAN. Each of these latter intermediate network nodes typically contains a processor that enables the intermediate node to, inter alia, route or switch the packets of data along the interconnected links from, e.g., a source end node that originates the data to a destination end node that is designated to receive the data. Often, these intermediate network nodes employ packet buffers to temporarily hold packets that are processed by the nodes.
Packet buffers often comprise one or more memory devices that are arranged to form one or more first-in first-out (FIFO) queues, where each queue may have a level (class) of service associated with a particular input or output line. The size of each “service” queue often depends on the rate of the line associated with the queue, as well as the time it takes for a packet to be processed by the intermediate network node. For example, assume an input line on an intermediate node has a line rate of 1 gigabit per second (Gb/s) and a packet takes 250 milliseconds (ms) to be processed by the node. The service queue size can be determined by multiplying the line rate by the processing time, thus yielding a queue size of at least 250 megabits (Mb).
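The sizing example above can be checked with a few lines of arithmetic. The following is a minimal sketch only; the function name `service_queue_bits` and its parameter names are illustrative assumptions and do not appear in the description.

```python
def service_queue_bits(line_rate_bps: float, processing_time_s: float) -> float:
    """Minimum service queue size in bits: line rate multiplied by processing time.

    Hypothetical helper; the name and signature are assumptions for illustration.
    """
    return line_rate_bps * processing_time_s


# Values from the example: a 1 Gb/s line and 250 ms of processing per packet.
queue_bits = service_queue_bits(line_rate_bps=1e9, processing_time_s=0.250)
queue_megabits = queue_bits / 1e6

print(f"minimum queue size: {queue_megabits} Mb")
```

Under these assumptions the result matches the 250 Mb figure given above. A faster line or a longer processing time scales the required queue size proportionally.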
In addition to the processor, the intermediate network node may comprise a packet memory system and an input/output (I/O) system. Packet data is transferred between the packet memory, processor and I/O system over an interconnect, such as a bus, comprising address, data and control lines, with the control lines carrying control signals specifying the direction and type of transfer. For example, the processor (i.e., a requestor) may issue a read command request over the bus to retrieve packet data from an addressed location in the packet memory coupled to the processor. The processor may thereafter issue a write command request to store additional packet data in the same or another addressed location in the memory system.
The speed of the memory read request impacts the throughput of the intermediate node because resources of the node, such as the processor, often cannot continue operating until they receive the requested information. The speed of the memory write request, however, is not as significant since the processor does not immediately need the write data once it is written to the memory system. As a result, the intermediate node may include a write buffer configured to store write data associated with write requests destined for the packet memory. Use of the write buffer enables write operations to the memory to occur at efficient times, while further enabling read operations to be serviced before write operations. In addition, use of the write buffer between the memory system and processor reduces the amount of flow control applied to the processor. For example, use of the write buffer reduces the possibility that a memory controller may need to exert flow control on a requestor, such as the processor, because the controller cannot accept more write data.
In a typical memory system employing a write buffer, data coherency plays an important role in ensuring that the latest data are delivered to the requestor. In this context, data coherency is defined as ensuring that all copies of a data set always reflect the same, and most current, data values. One approach is to resolve data coherency in such a memory system at a common data size. This is often done when the “particle” size of data stored (written) to the packet memory is the same as the size of the data retrieved (read) from the memory. However, this approach is not very “user friendly” with typical network traffic, where the size of a packet varies. Resolving coherency of the packet data at a common particle size may require multiple interrogations until the overall amount of requested data is satisfied. Such an approach may further result in lengthy resolutions, thus stalling resource utilization of the intermediate node.
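Coherency resolution at a common particle size can be illustrated with a small software model. This is a sketch under stated assumptions, not the arrangement of any particular memory system: the class `WriteBuffer`, its methods, and the use of one address per fixed-size particle are all hypothetical.

```python
from collections import OrderedDict


class WriteBuffer:
    """Toy model of a write buffer in front of a packet memory.

    Assumption: every address names exactly one fixed-size particle, so a
    read and a write always cover the same amount of data.
    """

    def __init__(self):
        self.pending = OrderedDict()  # particle address -> data, oldest first
        self.memory = {}              # stands in for the packet memory

    def write(self, addr, data):
        # A newer write to the same particle supersedes the buffered one.
        self.pending.pop(addr, None)
        self.pending[addr] = data

    def drain_one(self):
        # Retire the oldest buffered write to memory at a convenient time.
        if self.pending:
            addr, data = self.pending.popitem(last=False)
            self.memory[addr] = data

    def read(self, addr):
        # Coherency check: one interrogation of the buffer per particle.
        if addr in self.pending:
            return self.pending[addr]
        return self.memory.get(addr)


wb = WriteBuffer()
wb.memory[0] = b"old"
wb.write(0, b"new")
latest = wb.read(0)  # served from the write buffer, not from stale memory
```

In this model each read interrogates the buffer once per particle, so a request spanning many particles requires many interrogations.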
For example, multiple data packets of differing sizes from a service queue may be stored (i.e., enqueued) in a write buffer before being written to a packet memory. These packets may be interspersed in the write buffer among other packets of differing sizes from other service queues. As a function of quality of service (QoS), an operation issued by a requestor to retrieve data packets for a service queue may request a plurality (e.g., thousands) of bytes from the packet memory system, where a sizable portion of the requested data is located in the write buffer, enveloped among many interspersed packets. Accordingly, certain data packets may need to be retrieved (i.e., dequeued) from the write buffer to satisfy the request.
If the particle size of the packet memory is relatively small, e.g., 32 bytes, a conventional coherency state machine would need to interrogate the contents of the write buffer hundreds of times, each interrogation occurring at a 32-byte particle size granularity, until all of the requested dequeue data is satisfied. While it may take a single cycle to transfer 32 bytes of data during normal operation, resolving coherency for each 32-byte particle of data in the write buffer can take many cycles, which may adversely impact the output rate of the service queue. Until the entire coherency resolution is complete, the latency associated with the service queue can block dequeue operations of other packets on other service queues located behind that queue.
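The number of interrogations implied by a 32-byte particle size can be estimated directly. The following sketch assumes one interrogation per particle; the helper `interrogations_needed` and the request sizes of 8192 and 9000 bytes are hypothetical values chosen only to illustrate a request of thousands of bytes.

```python
PARTICLE_BYTES = 32  # particle size used in the example above


def interrogations_needed(request_bytes: int, particle_bytes: int = PARTICLE_BYTES) -> int:
    """Write buffer interrogations for a request, assuming one per particle.

    Rounds up, since a partially used particle still needs its own lookup.
    """
    return (request_bytes + particle_bytes - 1) // particle_bytes


lookups_8192 = interrogations_needed(8192)  # 256 interrogations
lookups_9000 = interrogations_needed(9000)  # 282 interrogations
```

If each interrogation takes several cycles rather than the single cycle needed to transfer 32 bytes, the total resolution time grows in proportion to these counts.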
One solution to this problem is to “manually flush” the entire contents of the write buffer to packet memory, retrieve the requested data from the memory and then relay that data to the requestor. However, if the size of the write buffer is relatively large, this solution may take a long time, which could degrade performance of the system. The solution may be extended to store write commands up to a predetermined threshold before flushing those commands. Once the commands are flushed, the corresponding data from the write buffer has already been written to memory. This is similar to waiting for the write buffer to flush itself naturally. Another solution is to perform full lookup operations on a per-particle (e.g., 32-byte) basis; however, this latter solution also incurs latency, which may degrade performance.
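The “manual flush” solution can be modeled in a few lines to show why its cost grows with the size of the write buffer. This is an illustrative sketch only; the class `FlushingWriteBuffer` and the returned count of flushed writes are assumptions made for the example.

```python
class FlushingWriteBuffer:
    """Toy model: every read first flushes all buffered writes to memory."""

    def __init__(self):
        self.pending = []  # (address, data) pairs in arrival order
        self.memory = {}   # stands in for the packet memory

    def write(self, addr, data):
        self.pending.append((addr, data))

    def flush(self):
        # Retire every buffered write in order; later writes overwrite earlier ones.
        flushed = len(self.pending)
        for addr, data in self.pending:
            self.memory[addr] = data
        self.pending.clear()
        return flushed

    def read(self, addr):
        # Coherency is trivially satisfied after the flush, but the requestor
        # waits for every buffered write, whether or not it is related.
        writes_flushed = self.flush()
        return self.memory.get(addr), writes_flushed


fwb = FlushingWriteBuffer()
for i in range(1000):
    fwb.write(i, i * 2)
data, cost = fwb.read(5)  # waits for all 1000 buffered writes to be retired
```

In this model the read of a single address pays for the retirement of every buffered write, which corresponds to the performance concern noted above for a relatively large write buffer.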