1. Field of the Invention
The present invention relates to computer systems and specifically to controlling access to shared resources in a computer system.
2. Background Information
Computer architecture generally defines the functional operation, including the flow of information and control, among individual hardware units of a computer. One such hardware unit is the processor or processing engine, which contains arithmetic and logic processing circuits organized as a set of data paths. In some implementations, the data path circuits may be configured as a central processing unit (CPU) having operations that are defined by a set of instructions. The instructions are typically stored in an instruction memory and specify a set of hardware functions that are available on the CPU.
A high-performance computer may be realized by using a number of CPUs or processors to perform certain tasks in parallel. For a purely parallel multiprocessor architecture, each processor may have shared or private access to resources, such as program instructions (e.g., algorithms) or data structures stored in a memory coupled to the processors. Access to an external memory is generally handled by a memory controller, which accepts memory requests from the various processors and processes them in an order that often is controlled by logic contained in the memory controller. Moreover, certain complex multiprocessor systems may employ many memory controllers where each controller is attached to a separate external memory subsystem.
One place where a parallel, multiprocessor architecture can be advantageously employed involves the area of data communications and, in particular, the forwarding engine for an intermediate network station or node. An intermediate node interconnects communication links and subnetworks of a computer network through a series of ports to enable the exchange of data between two or more software entities executing on hardware platforms, such as end nodes. The nodes typically communicate by exchanging discrete packets or frames of data according to predefined protocols, such as the Transmission Control Protocol/Internet Protocol (TCP/IP) or the Internetwork Packet Exchange (IPX) protocol. The forwarding engine is often used by the intermediate node to process packets received on the various ports. This processing may include determining the destination of a packet, such as an output port, and placing the packet on an output queue associated with the destination.
Intermediate nodes often employ output queues to control the flow of packets placed into the network. In a typical arrangement, the output queues are configured as first-in-first-out (FIFO) queues where packets are placed (enqueued) at the end (tail) of a queue and removed (dequeued) from the beginning (head) of the queue. Placement and removal often entail accessing the queue, which includes writing and reading the packet or information related to the packet, such as a packet header, to and from the queue.
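By way of illustration only, the FIFO arrangement described above may be sketched as a fixed-capacity circular buffer. The class name FifoQueue, its fields, and the choice of a circular buffer are assumptions made for this sketch and are not taken from any particular intermediate node.

```python
class FifoQueue:
    """Illustrative fixed-capacity FIFO queue (hypothetical names throughout)."""

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.head = 0   # index of the next element to dequeue
        self.tail = 0   # index of the next free slot for an enqueue
        self.count = 0  # number of elements currently on the queue

    def enqueue(self, element):
        """Write the element at the tail; return False if the queue is full."""
        if self.count == len(self.slots):
            return False
        self.slots[self.tail] = element
        self.tail = (self.tail + 1) % len(self.slots)
        self.count += 1
        return True

    def dequeue(self):
        """Read and remove the element at the head; return None if empty."""
        if self.count == 0:
            return None
        element = self.slots[self.head]
        self.slots[self.head] = None
        self.head = (self.head + 1) % len(self.slots)
        self.count -= 1
        return element
```

In this sketch, each enqueue and each dequeue both reads and writes queue state (the slot contents, an index, and the element count), which is the kind of multi-step access referred to above.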
In some systems, packets are enqueued and dequeued by the forwarding engine. In intermediate nodes that employ forwarding engines containing multiple processors, the output queues may be treated as shared resources, meaning that more than one processor can access a given queue at a given time. One problem with shared resources, however, is that certain race conditions may occur when two or more processors attempt to perform conflicting operations on the same resource at the same time. For example, a race condition may occur when a shared queue is empty and a first processor begins to enqueue an element (e.g., a packet header) onto the queue while a second processor accesses the same queue and attempts to dequeue the same element. If the first processor has not completely placed the element on the queue when the second processor begins to dequeue the element, the second processor may end up dequeuing an incomplete element. Another race condition may occur when a shared queue is full and a first processor begins to dequeue an element while a second processor attempts to enqueue an element onto the same queue before the first processor has completely dequeued its element. If the first processor has not completely removed the element from the queue before the second processor begins to place its element on the queue, the second processor may end up overwriting the element being dequeued by the first processor and thus the first processor may end up removing erroneous information.
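The first race condition described above may be illustrated with a deliberately simplified, single-threaded simulation. The sketch assumes a one-element queue and an enqueue operation that marks the slot as occupied before all header fields have been written. A Python generator stands in for the first processor, with each yield marking a point at which the second processor may run. All names (OneSlotQueue, enqueue_steps, dst_port, length) are hypothetical.

```python
class OneSlotQueue:
    """Illustrative one-element queue with no access control."""

    def __init__(self):
        self.slot = {}         # storage for one packet header
        self.occupied = False  # True once an element is considered queued


def enqueue_steps(queue, header):
    """Enqueue one field at a time; each yield is a possible interleaving point."""
    queue.occupied = True              # element becomes visible too early
    for field, value in header.items():
        queue.slot[field] = value      # fields are written one by one
        yield


def dequeue(queue):
    """Remove and return whatever the slot currently holds, or None if empty."""
    if not queue.occupied:
        return None
    header, queue.slot = queue.slot, {}
    queue.occupied = False
    return header


def demonstrate_race():
    queue = OneSlotQueue()
    writer = enqueue_steps(queue, {"dst_port": 7, "length": 64})
    next(writer)            # first processor has written only "dst_port"
    seen = dequeue(queue)   # second processor dequeues in mid-write
    for _ in writer:        # first processor finishes, unaware of the dequeue
        pass
    return seen
```

Under the stated interleaving, the header returned to the second processor contains the dst_port field but not the length field, which corresponds to the incomplete element described above.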
A prior technique that may be used to avoid race conditions associated with accessing shared resources in a multiprocessing system involves a lock. A lock is an abstraction representing permission to access the resource. Typically, when an entity, such as a processor, wishes to access the shared resource, it obtains “permission” by acquiring the lock before accessing the resource. When the entity finishes accessing the resource, it releases the lock so that other entities may obtain permission to access the resource. By requiring that the lock be acquired by an entity before the resource is accessed, entities that do not acquire the lock are prevented (locked out) from interfering with an entity that has acquired the lock.
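The acquire, access, and release pattern described above may be sketched as follows, assuming for illustration that threads stand in for processors and that Python's standard threading.Lock serves as the lock. The class name LockedQueue is hypothetical.

```python
import threading
from collections import deque


class LockedQueue:
    """Illustrative shared queue guarded by a single lock."""

    def __init__(self):
        self._items = deque()
        self._lock = threading.Lock()

    def enqueue(self, element):
        # Acquire the lock, access the queue, then release on leaving the block.
        with self._lock:
            self._items.append(element)

    def dequeue(self):
        # The emptiness check and the removal happen under one lock hold,
        # so no other thread can change the queue between the two steps.
        with self._lock:
            if not self._items:
                return None
            return self._items.popleft()
```

In this sketch, a thread that calls enqueue or dequeue while another thread holds the lock is blocked until the lock is released, so no thread observes a partially completed operation.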
One problem with locks is that they tend to “serialize” access to resources. This may be troublesome in parallel processing systems, such as multiprocessor systems, where the benefits associated with parallel processing may be greatly diminished due to the serial nature of the locking mechanism. For example, if a processor must wait until another processor releases a lock before it proceeds, the time spent waiting for the lock is time wasted that the processor could have used to perform other useful (parallel) work. Thus, in certain systems, especially parallel processing systems, locking mechanisms may not represent an efficient way to control access to a shared resource.
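The serializing effect described above may be observed with a small timing sketch. It assumes that threads stand in for processors and that a fixed sleep stands in for the time spent accessing the shared resource. The function name run_workers and its parameters are hypothetical, and measured times will vary from system to system.

```python
import threading
import time


def run_workers(worker_count, work_seconds, use_lock):
    """Run the workers concurrently and return the elapsed wall-clock time."""
    lock = threading.Lock()

    def worker():
        if use_lock:
            with lock:                    # only one worker at a time gets past here
                time.sleep(work_seconds)  # stand-in for accessing the resource
        else:
            time.sleep(work_seconds)      # same work, no lock

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    start = time.monotonic()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.monotonic() - start
```

With the lock enabled, the workers sleep one after another, so the elapsed time is roughly the number of workers multiplied by the work period. Without the lock, the sleeps overlap and the elapsed time is typically close to a single work period. The difference approximates the time the waiting workers could otherwise have spent on other work.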