1. Field of the Invention
The present invention relates to sharing use of descriptors for memory among multiple devices on the same data bus, and in particular to synchronizing the use of descriptors for memory buffers between a direct memory access (DMA) circuit block and a programmable processor.
2. Description of the Related Art
Networks of general purpose computer systems connected by external communication links are well known. The networks often include one or more network devices that facilitate the passage of information between the computer systems. A network node is a network device or computer system connected by the communication links.
Routers and switches are network devices that determine which communication link or links to employ to support the progress of data packets through the network. Routers and switches can employ software executed by a general purpose processor, called a central processing unit (CPU), or can employ special purpose hardware, or can employ some combination to make these determinations and forward the data packets from one communication link to another.
While the use of hardware causes data packets to be processed extremely quickly, there are drawbacks in flexibility. As communications protocols evolve through subsequent versions and as new protocols emerge, the network devices that rely on hardware become obsolete and have to ignore the new protocols or else be replaced. As a consequence, many network devices, such as routers, which forward packets across heterogeneous data link networks, include a CPU that operates according to an instruction set (software) that can be modified as protocols change. The software set constitutes a networking operating system, such as the Cisco Internetwork Operating System (IOS) available from Cisco Systems of San Jose, Calif.
Software-executed operations in a CPU proceed more slowly than hardware-executed operations, so there is a tradeoff between flexibility and speed in the design and implementation of network devices.
In many current routers, the data of a data packet are streamed to several memory buffers reserved for such use. A direct memory access (DMA) controller circuit block on any network interface and one or more CPUs in the router use these memory buffers to store, retrieve, use, and possibly modify the data packet after it is received on one link and before it is forwarded on the same or a different link. To avoid collisions among the various components that use these buffers, only one user at a time is given ownership of a buffer, e.g., an exclusive right to write to the buffer. In many routers this and other buffer management functions are supported by descriptor records (called descriptors herein) stored in a portion of local memory.
Each DMA block on a corresponding network interface is allocated a set of descriptors (sometimes divided into ingress descriptors for data packets being received and egress descriptors for data packets being transmitted). Each descriptor indicates the location of a buffer in memory and indicates whether the DMA of that network interface or the CPU currently owns the descriptor and the corresponding buffer. Other information is also stored in each descriptor. Typically, the descriptors allocated to a DMA are arranged in one or more circular queues called rings, so that descriptors can be used in order and automatically cycle to the beginning of the portion of memory allocated to descriptors after reaching the end. The CPU assigns a free buffer location to ingress descriptors from a pool of free descriptors. The CPU moves descriptors from ingress rings to egress rings after processing the received packet and determining the egress port. Buffers cleared after transmission of their data packet are collected in a buffer pool and reassigned to ingress descriptor rings of individual DMAs as appropriate.
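The descriptor and ring arrangement described above can be sketched in a few lines. This is a minimal illustration only; the names `Owner`, `Descriptor`, `DescriptorRing`, and `next_index`, and the buffer addresses, are hypothetical and do not come from any particular router implementation.

```python
from dataclasses import dataclass
from enum import Enum


class Owner(Enum):
    DMA = "dma"
    CPU = "cpu"


@dataclass
class Descriptor:
    buffer_addr: int  # location of the associated buffer in memory
    owner: Owner      # component that currently owns descriptor and buffer


class DescriptorRing:
    """Fixed-size circular queue of descriptors (illustrative only)."""

    def __init__(self, buffer_addrs, owner=Owner.DMA):
        self.slots = [Descriptor(addr, owner) for addr in buffer_addrs]

    def __len__(self):
        return len(self.slots)

    def next_index(self, index):
        # Cycle from the last slot back to the first.
        return (index + 1) % len(self.slots)


# Assumed example addresses for four buffers.
ring = DescriptorRing([0x1000, 0x1800, 0x2000, 0x2800])
```

A real descriptor would carry the other information mentioned above (for example, a length or status field); those fields are omitted here because the text does not enumerate them.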
When one stage of processing is completed, the status of a buffer is updated by writing to the associated one or more descriptors an update in which the ownership is changed to the next user of the data buffer. The DMA block includes a head pointer register that indicates a position in the descriptor ring where the next descriptor is to be written. For example, an ingress head pointer register indicates the position in the ingress descriptor ring of the descriptor associated with a buffer for the newest data packet received by the network interface. The owner of this descriptor is changed to the CPU during this update. Conversely, an egress head pointer register indicates a position in the egress descriptor ring where a descriptor, associated with a buffer for the newest data packet to be transmitted by the network interface, is updated with the buffer location where that data packet is already stored. In this case, the descriptor is updated by the CPU and the owner is changed to the DMA during the update. The DMA head pointer advances by one (including cycling automatically from the end of the allocated memory portion to the beginning) as each descriptor is updated with a new owner.
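As a rough sketch of the ingress case, the following models a DMA block that fills the buffer at the head position, changes the owner to the CPU, and advances the head pointer with wraparound. The dictionary layout, the `dma_receive` helper, and the ring size of four are assumptions made for illustration.

```python
RING_SIZE = 4  # assumed size, chosen only to show the wraparound
DMA, CPU = "DMA", "CPU"


def dma_receive(ring, head, payload):
    """Fill the buffer at `head`, hand it to the CPU, return the new head."""
    desc = ring[head]
    if desc["owner"] != DMA:
        raise RuntimeError("descriptor at head is not owned by the DMA")
    desc["buffer"] = payload
    desc["owner"] = CPU               # ownership passes to the next user
    return (head + 1) % len(ring)     # advance by one, cycling at the end


ring = [{"owner": DMA, "buffer": None} for _ in range(RING_SIZE)]
head = 0
for chunk in (b"a", b"b", b"c", b"d"):
    head = dma_receive(ring, head, chunk)
```

After four updates on a four-entry ring the head pointer has cycled back to position 0 and every descriptor is owned by the CPU. The egress case is symmetric, with the CPU writing the descriptor and the owner changing to the DMA.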
The next descriptor to be read is indicated by a tail pointer. A tail pointer is usually maintained by the CPU for ingress descriptor rings; in some routers, the DMA block also includes a tail pointer register. For example, the next descriptor to be read by the CPU on the ingress ring is indicated by an ingress tail pointer; and the next descriptor to be read by the DMA on the egress ring is indicated by an egress tail pointer.
When the CPU is ready to process data packets for one of the network interfaces (e.g., by reading descriptors on the ingress ring or writing descriptors on the egress ring), the CPU requests the value of the head pointer from the DMA. The DMA responds with the value of the head pointer. The CPU then processes a number of buffers based on the head pointer value returned from the DMA. For example, the difference between the head pointer returned by the DMA and the CPU's copy of the tail pointer is computed to determine the number of descriptors of this ingress ring that the CPU is to process.
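The difference computation can be illustrated as follows. Because the ring is circular, this sketch assumes the difference is taken modulo the ring size; the function name `pending_descriptors` and the ring size of eight are hypothetical.

```python
def pending_descriptors(head, tail, ring_size):
    """Number of descriptors between the CPU's tail copy and the DMA head.

    Assumes modulo arithmetic so the count stays correct after the head
    pointer has cycled past the end of the ring.
    """
    return (head - tail) % ring_size


no_wrap = pending_descriptors(head=5, tail=2, ring_size=8)
wrapped = pending_descriptors(head=1, tail=6, ring_size=8)
```

Both calls yield three descriptors; in the second the head pointer has already cycled past the end of the ring. Note that with this formula equal head and tail values are read as an empty ring, so a completely full ring would need separate handling.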
While suitable for many cases, the above approach suffers deficiencies when data packets are received or transmitted at a rapid rate at one network interface. This problem arises because there is latency in the memory access operations used by the DMA and CPU. This latency can lead to a discrepancy between the value in the head pointer and the ownership of the descriptors returned in response to a read command.
Consider this example. On a link supported by a particular DMA block, a hundred data packets arrive of sizes sufficient to average four buffers per packet. The DMA reads the head pointer, retrieves the descriptor indicated, finds the associated buffer, fills that buffer, updates the descriptor, changing ownership to the CPU to further process the buffer, increments the head pointer, and then retrieves the next descriptor, repeating these steps 400 times. A queue of memory commands stacks up at the DMA. Each command on the queue is sent in turn over a data communication channel to a memory device. A bus controller shares the bus among multiple DMA blocks on corresponding network interfaces, and may deny a DMA block access to the bus for some time as other components employ the bus. When the memory device receives the command, it responds by updating a portion of memory. The time from placing the memory command on the queue until the portion of memory is updated and visible for subsequent reads is the latency of the operation.
During these 400 operations, the CPU requests the head pointer value. The head pointer value indicates 400 new descriptors past the tail pointer. The DMA responds to this CPU request at a high priority, bypassing commands in the queue. The CPU processes 400 descriptors on the ingress descriptor ring. However, due to latency, only 300 have been updated by the time the CPU reads the 301st descriptor. This descriptor has not been updated and the CPU should not use the buffer pointed to by this descriptor.
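The discrepancy in this example can be reproduced with a small simulation. The sketch assumes a ring of 512 descriptors and models the queued memory commands as a simple FIFO; the figures 400 and 300 come from the example, while every name and the ring size are illustrative.

```python
from collections import deque

RING_SIZE = 512                     # assumed; the example gives no ring size
owner_in_memory = ["DMA"] * RING_SIZE
pending_writes = deque()            # memory commands queued at the DMA
head = 0                            # DMA head pointer register

# The DMA fills 400 buffers. Each descriptor update is only queued,
# but the head pointer register advances immediately.
for _ in range(400):
    pending_writes.append(head)
    head = (head + 1) % RING_SIZE

# Because of latency, only 300 of the queued updates have reached memory.
for _ in range(300):
    owner_in_memory[pending_writes.popleft()] = "CPU"

tail = 0                                     # CPU's copy of the tail pointer
reported = (head - tail) % RING_SIZE         # answered at high priority
visible = owner_in_memory.count("CPU")       # what memory actually shows
first_stale = (tail + visible) % RING_SIZE   # index of the 301st descriptor
```

Here `reported` is 400 while `visible` is 300, and the descriptor at index 300 (the 301st) still shows the DMA as owner.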
In one approach to resolve the discrepancy, the CPU checks the ownership of each descriptor before using it. Thus, in the example, the CPU will see that the 301st descriptor is still owned by the DMA block. The CPU will stop processing data packets from this interface, and move to processing data packets for another interface.
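A minimal sketch of this ownership-checking approach follows. The `process_owned` function and the list-of-strings ring are hypothetical simplifications; the point is the check performed on every descriptor before its buffer is used.

```python
def process_owned(owners, tail, count, handle):
    """Process up to `count` descriptors starting at `tail`.

    Stops at the first descriptor not owned by the CPU and returns the
    number processed together with the updated tail pointer.
    """
    size = len(owners)
    processed = 0
    while processed < count:
        index = (tail + processed) % size
        if owners[index] != "CPU":   # per-descriptor ownership check
            break
        handle(index)
        processed += 1
    return processed, (tail + processed) % size


# The head pointer implied five descriptors, but only three are updated.
owners = ["CPU", "CPU", "CPU", "DMA", "DMA", "DMA"]
seen = []
processed, tail = process_owned(owners, 0, 5, seen.append)
```

In this run the CPU handles three descriptors and stops at index 3, leaving the tail pointer there so that processing can resume later.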
A disadvantage of this approach is that CPU processing is wasted as the CPU checks the ownership of every descriptor. Further waste is involved as overhead functions are executed to close down processing of a packet at mid-packet, and to start or restart a different packet, also possibly mid-packet. Further waste of CPU and bus resources occurs as the CPU polls the DMA or descriptor ring to determine when the descriptors to be processed are available. The result is perceptibly poor performance.
In another approach, the DMA responds to a request for the head pointer by placing a command in the queue, so the head pointer value is sent after the memory commands already in the queue.
A disadvantage of this approach is that it also wastes CPU resources. The CPU could process the first 300 buffers immediately but is forced to wait and do nothing with these buffers until all 400 are ready. There is also no guarantee that its position on the queue is on a packet boundary, so after waiting for those last 100 memory commands, the CPU is still faced with a mid-packet end of data and must expend resources to close down processing mid-packet for restarting at a later time. Further, there is no guarantee that the command issued before the response with the head pointer value will be completed before the CPU reads the associated descriptor, so the CPU might still consume resources checking the ownership of the descriptors.
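The waiting cost of this second approach can be illustrated with a simple FIFO model. The sketch assumes, as a simplification, that commands complete strictly in queue order; as noted above, real memory gives no such guarantee. The tuple encoding of commands is hypothetical.

```python
from collections import deque

# 300 descriptor updates are already in memory; 100 are still queued.
queue = deque(("write_descriptor", i) for i in range(300, 400))

# The CPU's head pointer request joins the back of the queue
# instead of being answered at high priority.
queue.append(("head_response", 400))

writes_waited_for = 0
head_seen_by_cpu = None
while queue:
    kind, value = queue.popleft()
    if kind == "write_descriptor":
        writes_waited_for += 1      # CPU is idle for this interface meanwhile
    else:
        head_seen_by_cpu = value    # response arrives only after all writes
```

The CPU receives the head pointer value of 400 only after waiting for all 100 outstanding writes, even though 300 buffers were ready when it asked.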
Based on the foregoing, there is a clear need for techniques to synchronize the use of descriptors for memory buffers that do not suffer all the deficiencies of prior approaches. In particular, there is a need to guarantee that all descriptors read by a user up to a most recent descriptor indicated for that user are owned by that user. There is further particular need to guarantee that the descriptors being read end on a packet boundary to avoid the overhead of closing down processing mid-packet.
The approaches described in this section could be pursued, but are not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated herein, the approaches described in this section are not to be considered prior art to the claims in this application merely due to the presence of these approaches in this background section.