Conventional network processor units (NPUs) may be interfaced to integrated IP coprocessors (IIPCs) in a manner that enables both SRAMs and IIPCs to be operated on the same memory mapped bus. As illustrated by FIG. 1, a conventional IIPC 30 may be coupled through a standard memory mapped interface to an NPU 10, which operates as a command source. The address bits ADDR[23:22] represent a two-bit select field that identifies one of four possible IIPCs on the bus to which a read operation is directed. The NPU 10 may include an SRAM controller that is based on FIFO communication. The SRAM controller includes internal bus control state machines 20 and pin control state machines 14. Data and address information is transferred between these state machines using push and pull data FIFOs 12a and 12d and read and write command FIFOs 12b and 12c that supply read and write addresses to the pin control state machines 14.
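The address split described above can be illustrated with a short sketch. It assumes only what the text states: a 24-bit address in which the upper two bits select one of four IIPCs and the lower 22 bits (ADDR[21:0]) are passed to the selected device. The function and constant names are hypothetical and are not taken from any actual NPU or IIPC interface.

```python
from typing import Tuple

SELECT_SHIFT = 22             # select field sits in the two bits above ADDR[21:0]
SELECT_MASK = 0x3             # two bits, so four possible IIPCs
OFFSET_MASK = (1 << 22) - 1   # ADDR[21:0], seen by the selected IIPC


def decode_address(addr: int) -> Tuple[int, int]:
    """Split a 24-bit bus address into (IIPC select, 22-bit device address)."""
    if not 0 <= addr < (1 << 24):
        raise ValueError("address must fit in 24 bits")
    select = (addr >> SELECT_SHIFT) & SELECT_MASK
    offset = addr & OFFSET_MASK
    return select, offset


def encode_address(select: int, offset: int) -> int:
    """Build a 24-bit bus address from an IIPC select and a device address."""
    if not 0 <= select <= SELECT_MASK:
        raise ValueError("select must be in the range 0..3")
    if not 0 <= offset <= OFFSET_MASK:
        raise ValueError("offset must fit in 22 bits")
    return (select << SELECT_SHIFT) | offset
```

For example, `decode_address(0xC00001)` yields select 3 and device address 1 under these assumptions.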
The IIPC 30 is illustrated as including a content addressable memory (CAM) core 36 and logic 38 that couples the CAM core 36 to the memory mapped interface. This memory mapped interface is illustrated as including read control logic 32 and write control logic 34. The write control logic 34 is configured to receive an address ADDR[21:0], a write enable signal WE_N[1:0], input data DATAIN[15:0] and input parameters PARIN[1:0]. The read control logic 32 is configured to receive the address ADDR[21:0] and a read enable signal RE_N[1:0] and generate output data DATAOUT[15:0] and output parameters PAROUT[1:0]. Like the SRAM controller within the NPU 10, this memory mapped interface is based on FIFO communication. The IIPC 30 performs operations using the input data DATAIN[15:0] and input parameters PARIN[1:0] and then passes back result values to the NPU 10. The timing between the receipt of the input parameters and the return of the corresponding result values is not fixed. Instead, it is determined by the amount of time the IIPC 30 requires to execute the specified instruction and depends on the number and type of other instructions currently pending within the IIPC 30.
These pending instructions are initially logged into respective instruction control registers 50 that support a plurality of separate contexts (shown as a maximum of 128). These instructions may be processed in a pipelined manner. The result values generated at the completion of each context are provided to respective result mailboxes 40. The validity of the result values within the mailboxes 40 is identified by the status of the done bit within each result mailbox 40. Accordingly, if a read operation is performed before the result values are ready, the NPU 10 will be able to check the done bit associated with each set of result values to determine whether the corresponding values are valid. However, because there can be multiple contexts in progress within the IIPC 30 at any given time and because the completion of the contexts does not necessarily occur in the same sequence as the requests were made, the NPU 10 may need to regularly poll the result mailboxes 40 at relatively high frequency to obtain new results as they become valid. Unfortunately, such regular polling can consume a substantial amount of the bandwidth of instructions that are issued to the IIPC 30 and lead to relatively high levels of operational inefficiency when the IIPC 30 is running a large number of contexts. Thus, notwithstanding the IIPC 30 of FIG. 1, which is capable of supporting a large number of contexts, there continues to be a need for more efficient ways to communicate result status information from an IIPC to a command source, such as an NPU.
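The cost of done-bit polling described above can be made concrete with a small simulation. This is a minimal sketch under stated assumptions, not a model of the actual hardware: the class `ToyIIPC`, the function `poll_rounds`, and the completion schedule are all hypothetical, and each mailbox read is simply counted as one bus transaction.

```python
from dataclasses import dataclass


@dataclass
class ResultMailbox:
    done: bool = False   # done bit: True once the result value is valid
    result: int = 0


class ToyIIPC:
    """Hypothetical stand-in for a coprocessor with one mailbox per context."""

    def __init__(self, num_contexts: int = 128):
        self.mailboxes = [ResultMailbox() for _ in range(num_contexts)]

    def complete(self, context: int, result: int) -> None:
        mailbox = self.mailboxes[context]
        mailbox.result = result
        mailbox.done = True

    def read_mailbox(self, context: int):
        mailbox = self.mailboxes[context]
        return mailbox.done, mailbox.result


def poll_rounds(iipc, pending, schedule):
    """Poll every pending context once per round and count mailbox reads.

    schedule[i] lists the (context, result) pairs that complete just before
    polling round i, so contexts may finish out of request order.
    """
    pending = set(pending)
    results = {}
    reads = 0
    for batch in schedule:
        for context, value in batch:
            iipc.complete(context, value)
        for context in sorted(pending):
            reads += 1
            done, value = iipc.read_mailbox(context)
            if done:
                results[context] = value
        pending.difference_update(results)  # drop contexts whose results arrived
        if not pending:
            break
    return results, reads


iipc = ToyIIPC(num_contexts=4)
schedule = [[(2, 20)], [], [(0, 5), (1, 7)]]  # context 2 finishes first
results, reads = poll_rounds(iipc, {0, 1, 2}, schedule)
wasted_reads = reads - len(results)
```

In this run, three results cost seven mailbox reads, four of which return a cleared done bit. The proportion of wasted reads grows with the number of contexts in flight and with the polling rate, which is the inefficiency noted above.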
Referring now to FIG. 2A, another conventional IIPC 300 includes a memory mapped interface 302 having a write interface 304 and a read interface 306 therein. These write and read interfaces 304 and 306 may be configured as quad data rate interfaces that communicate to and from a command source (e.g., ASIC or NPU) having a compatible interface. A clock generator circuit 308 may also be provided that is responsive to an external clock EXTCLK. This clock generator circuit 308 may include delay and/or phase locked loop integrated circuits that operate to synchronize internal clocks within the IIPC 300 with the external clock EXTCLK. A reset circuit 310, which is configured to support reset and/or power-up operations, is responsive to a reset signal RST. Context sensitive logic 312 may support the processing of multiple contexts. The context sensitive logic 312 may include an instruction memory 316 that receives instructions from the write interface 304 and a results mailbox 314 that may be accessed via the read interface 306. The instruction memory 316 may be configured as a FIFO memory device. The results mailbox 314 is a context specific location where the IIPC 300 places results returned from a previously issued command.
The internal CAM core 330 is illustrated as a ternary CAM core that contains a data array and a mask array 328. This CAM core 330 may be configurable into a plurality of independently searchable databases. General and database configuration registers 318 are also provided along with global mask registers (GMRs) 320. These registers provide data to instruction loading and execution logic 332, which may operate as a finite state machine (FSM). The instruction loading and execution logic 332 communicates with the CAM core 330 and the result logic 334. If the IIPC 300 is configured to support a depth-cascaded mode of operation, a cascade interface 338 may be provided for passing data and results to (and from) another IIPC (not shown). The instruction loading and execution logic 332 may also pass data to and from an external memory device, via an SRAM interface 336. The IIPC 300 may include aging logic 321 that automatically removes stale entries from the internal CAM core 330. The aging logic 321 is illustrated as including two memory arrays: an age enable array 322 and an age activity array 324. These memory arrays may have bit positions that map directly to entries within the CAM core 330.
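The text above does not specify how the aging logic uses its two arrays, so the following is only one plausible scheme, offered as an illustration. It assumes that an age enable bit marks an entry as subject to aging, that an age activity bit is set when the entry is written or hit, and that a periodic sweep clears set activity bits and removes enabled entries whose activity bit was already clear. All names are hypothetical.

```python
class AgingSketch:
    """Hypothetical per-entry aging using one bit per CAM entry in each array."""

    def __init__(self, num_entries: int):
        self.valid = [False] * num_entries         # entry currently holds data
        self.age_enable = [False] * num_entries    # entry is subject to aging
        self.age_activity = [False] * num_entries  # entry was used since last sweep

    def write_entry(self, index: int, aging: bool = True) -> None:
        self.valid[index] = True
        self.age_enable[index] = aging
        self.age_activity[index] = True

    def record_hit(self, index: int) -> None:
        if self.valid[index]:
            self.age_activity[index] = True

    def sweep(self):
        """Run one aging pass and return the indices of removed entries."""
        removed = []
        for index in range(len(self.valid)):
            if not (self.valid[index] and self.age_enable[index]):
                continue
            if self.age_activity[index]:
                # Recently used: give the entry one more sweep interval.
                self.age_activity[index] = False
            else:
                # No activity since the previous sweep: treat as stale.
                self.valid[index] = False
                removed.append(index)
        return removed
```

Under this assumed scheme an entry survives as long as it is hit at least once between consecutive sweeps, and entries written with aging disabled are never removed by the sweep.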
The CAM core 330 (and other CAM cores in other IIPCs depth cascaded with the IIPC 300) is partitioned into segments (or blocks). Individual segments or groups of segments may be allocated, for example, to various databases, such as search tables associated with various packet headers or other packet content. In the conventional IIPC 300, search results are generated in the form of absolute indices, which provide information on the device (i.e., an identifier of an NSE in a search machine comprising a plurality of depth-cascaded NSEs), segment, and segment offset of a match to a particular search key, as shown in FIG. 2B. These absolute indices may be provided to the results mailbox 314 for use by, for example, an NPU. Absolute indices may also be provided to the SRAM interface 336, where they may be used as addresses for accessing associated data (e.g., next hop addresses) in an external SRAM.
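The composition of an absolute index from device, segment, and segment offset fields can be sketched as follows. The field widths are not given in the text and are chosen here purely for illustration (3 device bits, 5 segment bits, 13 offset bits), as are the function names; the actual layout is the one shown in FIG. 2B.

```python
from typing import Tuple

# Hypothetical field widths, for illustration only.
DEVICE_BITS = 3
SEGMENT_BITS = 5
OFFSET_BITS = 13


def pack_absolute_index(device: int, segment: int, offset: int) -> int:
    """Concatenate device, segment, and segment offset into one index."""
    if not 0 <= device < (1 << DEVICE_BITS):
        raise ValueError("device identifier out of range")
    if not 0 <= segment < (1 << SEGMENT_BITS):
        raise ValueError("segment out of range")
    if not 0 <= offset < (1 << OFFSET_BITS):
        raise ValueError("segment offset out of range")
    return (device << (SEGMENT_BITS + OFFSET_BITS)) | (segment << OFFSET_BITS) | offset


def unpack_absolute_index(index: int) -> Tuple[int, int, int]:
    """Recover (device, segment, segment offset) from an absolute index."""
    offset = index & ((1 << OFFSET_BITS) - 1)
    segment = (index >> OFFSET_BITS) & ((1 << SEGMENT_BITS) - 1)
    device = (index >> (SEGMENT_BITS + OFFSET_BITS)) & ((1 << DEVICE_BITS) - 1)
    return device, segment, offset
```

Because the index encodes the physical device and segment of the match, it can serve directly as an address into an external SRAM, as noted above, provided the associated data is laid out to mirror the CAM.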