The present invention generally relates to multiprocessor systems and, more specifically, to out-of-order data processing by processors of a
A systolic array provides a common approach for increasing processing capacity of a computer system when a problem can be partitioned into discrete units of work. In the case of a one-dimensional systolic array comprising a single “row” of processing elements or processors, each processor in the array is responsible for executing a distinct set of instructions on input data before passing it to a next element of the array. To maximize throughput, the problem is divided such that each processor requires approximately the same amount of time to complete its portion of the work. In this way, new input data can be “pipelined” into the array at a rate equivalent to the processing time of each processor, with as many units of input data being processed in parallel as there are processors in the array. Performance can be improved by adding more elements to the array as long as the problem can continue to be divided into smaller units of work. Once this dividing limit has been reached, processing capacity may be further increased by configuring multiple rows in parallel, with new input data allocated to the first processor of a next row of the array in sequence.
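The pipelining behavior described above can be illustrated with a minimal sketch of a single row. The function name `run_pipeline`, the stage functions and the slot list are hypothetical and serve only as an illustration; the sketch assumes that every stage takes exactly one phase and that neither inputs nor stage results are `None`.

```python
from typing import Callable, List, Optional, Tuple

def run_pipeline(stages: List[Callable[[int], int]],
                 inputs: List[int]) -> Tuple[List[int], int]:
    """Simulate one row of processors; returns (outputs, phases elapsed)."""
    n = len(stages)
    slots: List[Optional[int]] = [None] * n  # slots[i]: item waiting at stage i
    pending = list(inputs)
    outputs: List[int] = []
    phases = 0
    while pending or any(s is not None for s in slots):
        # A new unit of input enters the first processor each phase.
        slots[0] = pending.pop(0) if pending else None
        # Every processor works on its own item during the same phase.
        processed = [stages[i](slots[i]) if slots[i] is not None else None
                     for i in range(n)]
        # The last processor's result leaves the row.
        if processed[-1] is not None:
            outputs.append(processed[-1])
        # All other results advance to the next processor.
        slots = [None] + processed[:-1]
        phases += 1
    return outputs, phases

# Three hypothetical stages applied in sequence: ((x + 1) * 2) - 3
stages = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
outputs, phases = run_pipeline(stages, [1, 2, 3, 4, 5])
```

In this sketch, five inputs pass through three stages in 3 + 5 - 1 = 7 phases, because once the row is full one result emerges per phase.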
A symmetric multiprocessor system configured as a systolic array typically guarantees first in, first out (FIFO) ordering of context data processing. As used herein, context data or “context” is defined as an entire packet or, preferably, a header of a packet. According to FIFO ordering, the contexts processed by the processors of the rows must complete in the order received by the processors before the rows of the array advance. Each processor is allocated a predetermined time interval or “phase” within which to complete its processing of a context; when each processor completes its context processing within the phase, this control mechanism is sufficient. However, if a processor stalls or otherwise cannot complete its processing within the phase interval, all processors of the array stall in order to maintain FIFO ordering. Here, the FIFO ordering control mechanism penalizes both the processors of the row of the stalled processor and the processors of the remaining rows of the multiprocessor array. For most applications executed by the array, FIFO ordering is not necessary.
However, FIFO ordering may be needed to maintain an order of contexts having a dependency among one another; an example of a mechanism used to identify dependencies is a “flow”. A flow is defined as a sequence of packets having the same layer 3 (e.g., Internet Protocol) source and destination addresses, and the same layer 4 (e.g., Transmission Control Protocol) port numbers. In addition, the packets of a flow also typically have the same protocol value. The present invention is generally directed to a mechanism that enables selective FIFO ordering of contexts processed by a symmetric multiprocessor system.
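By way of illustration only, the definition of a flow given above can be sketched as a key built from the packet header fields. The names `FlowKey` and `flow_key`, and the dictionary field names, are hypothetical and are not taken from any particular implementation; the protocol value is included on the assumption that it is used as part of the key.

```python
from typing import NamedTuple

class FlowKey(NamedTuple):
    """Header fields that, in this sketch, identify a flow."""
    src_addr: str   # layer 3 source address
    dst_addr: str   # layer 3 destination address
    src_port: int   # layer 4 source port
    dst_port: int   # layer 4 destination port
    protocol: int   # protocol value (e.g., 6 for TCP)

def flow_key(header: dict) -> FlowKey:
    """Build a flow key from a parsed packet header (hypothetical field names)."""
    return FlowKey(header["src_addr"], header["dst_addr"],
                   header["src_port"], header["dst_port"],
                   header["protocol"])

# Two packets with identical fields belong to the same flow.
p1 = {"src_addr": "10.0.0.1", "dst_addr": "10.0.0.2",
      "src_port": 1234, "dst_port": 80, "protocol": 6}
p2 = dict(p1)
p3 = dict(p1, dst_port=443)  # a different destination port means a different flow
same_flow = flow_key(p1) == flow_key(p2)
other_flow = flow_key(p1) != flow_key(p3)
```

Because the key is hashable, it could serve as an index into a table of per-flow state; only contexts sharing a key would then need to be kept in order.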
The present invention comprises a sequence control mechanism that enables out-of-order processing of contexts by processors of a symmetric multiprocessor system having a plurality of processors arrayed as a processing engine. The processors of the engine are preferably arrayed as a plurality of rows or clusters embedded between input and output buffers, wherein each cluster of processors is configured to process contexts in a first in, first out (FIFO) synchronization order. According to the invention, however, the sequence control mechanism allows out-of-order context processing among the clusters of processors, while selectively enforcing FIFO synchronization ordering among those clusters on an as-needed basis, i.e., for certain contexts.
In the illustrative embodiment, the control mechanism comprises an input sequence controller coupled to the input buffer and an output sequence controller coupled to the output buffer. Each context contains a queue identifier (ID) that uniquely identifies a flow of the context and a sequence number that denotes an order of the context within the flow. A minimum sequence number is used to enforce ordering within a flow having a common queue ID and, to that end, reflects the lowest sequence number of a context for a flow that is active in the processing engine. Synchronization logic of the controllers maintains the minimum (lowest) sequence number for each active flow.
Broadly stated, out-of-order context processing among the clusters is allowed for contexts having different queue IDs, while FIFO synchronization is enforced among the clusters for contexts having the same queue ID. Ordering of contexts associated with a flow is enforced at the output buffer using the queue ID, sequence number and minimum sequence number information associated with each context. That is, the sequence controllers use the information to maintain FIFO synchronization throughout the processing engine, i.e., from the input buffer to the output buffer, for those contexts belonging to a flow and, thus, having the same queue ID.
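The interplay of queue ID, sequence number and minimum sequence number may be easier to follow in a minimal software sketch. The class `SequenceControl` and its methods `issue` and `complete` are hypothetical names chosen for illustration; the sketch assumes that sequence numbers within a flow are consecutive integers, and it models only the ordering decision, not the buffers or the hardware synchronization logic.

```python
import heapq
from collections import defaultdict
from typing import Dict, List

class SequenceControl:
    """Per-flow ordering sketch: different queue IDs never wait on each other."""

    def __init__(self) -> None:
        self._next_seq: Dict[str, int] = defaultdict(int)    # next number to assign
        self._min_seq: Dict[str, int] = {}                    # lowest active number
        self._held: Dict[str, List[int]] = defaultdict(list)  # finished, not yet sent

    def issue(self, queue_id: str) -> int:
        """Input side: tag a context of this flow with the next sequence number."""
        seq = self._next_seq[queue_id]
        self._next_seq[queue_id] += 1
        self._min_seq.setdefault(queue_id, seq)  # flow becomes active if it was idle
        return seq

    def complete(self, queue_id: str, seq: int) -> List[int]:
        """Output side: return the sequence numbers that may now be transmitted."""
        held = self._held[queue_id]
        heapq.heappush(held, seq)
        released: List[int] = []
        # Release only while the lowest held number equals the flow's minimum.
        while held and held[0] == self._min_seq[queue_id]:
            released.append(heapq.heappop(held))
            self._min_seq[queue_id] += 1  # assumption: consecutive numbering
        if self._min_seq[queue_id] == self._next_seq[queue_id]:
            # No context of this flow remains in the engine.
            del self._min_seq[queue_id]
            del self._held[queue_id]
        return released

sc = SequenceControl()
a0, a1 = sc.issue("A"), sc.issue("A")  # two contexts of flow A
b0 = sc.issue("B")                     # one context of flow B
first = sc.complete("A", a1)   # A's later context finishes early and is held
second = sc.complete("B", b0)  # B is unaffected by A and goes out at once
third = sc.complete("A", a0)   # A's earlier context finishes; both are released
```

In this usage, flow B is transmitted without waiting for flow A, while the two contexts of flow A leave in their original order even though they finished out of order.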
Advantageously, the inventive sequence control mechanism reduces undesired processing delays among the processors of the arrayed processing engine. Use of the queue ID, sequence number and minimum sequence number information enables the input sequence controller to issue contexts of the same flow to any cluster, thereby exploiting the parallelism inherent in the arrayed processing engine. In addition, the information is used by the output sequence controller to ensure that transmission of a previous context from (off) the processing engine occurs prior to transmission of a subsequent context associated with that flow.