Conventional network processor units (NPUs) may be interfaced to integrated IP coprocessors (IIPCs) in a manner that enables both SRAMs and IIPCs to be operated on the same memory mapped bus. As illustrated by FIG. 1, a conventional IIPC 30 may be coupled through a standard memory mapped interface to an NPU 10, which operates as a command source. The address bits ADDR[23:22] represent a two-bit select field that identifies the one of four possible IIPCs on the bus to which a read operation is directed. The NPU 10 may include an SRAM controller that is based on FIFO communication. The SRAM controller includes internal bus control state machines 20 and pin control state machines 14. Data and address information is transferred between these state machines using push and pull data FIFOs 12a and 12d and read and write command FIFOs 12b and 12c that supply read and write addresses to the pin control state machines 14.
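The role of the select field can be illustrated with a short sketch. The following Python fragment is offered only as an illustration, assuming a 24-bit bus address in which ADDR[23:22] carries the device select and ADDR[21:0] carries the address presented to the selected IIPC; the function and constant names are hypothetical and do not appear in FIG. 1.

```python
from typing import Tuple

# Assumed layout: ADDR[23:22] = device select, ADDR[21:0] = local address.
SELECT_SHIFT = 22
SELECT_MASK = 0x3                       # two bits, so four possible IIPCs
LOCAL_ADDR_MASK = (1 << SELECT_SHIFT) - 1


def decode_address(addr: int) -> Tuple[int, int]:
    """Split a 24-bit bus address into (device_select, local_address)."""
    if not 0 <= addr < (1 << 24):
        raise ValueError("address must fit in 24 bits")
    device_select = (addr >> SELECT_SHIFT) & SELECT_MASK
    local_address = addr & LOCAL_ADDR_MASK
    return device_select, local_address


if __name__ == "__main__":
    select, local = decode_address(0xC00010)
    print(f"select={select} local=0x{local:06X}")
```

In this sketch an address of 0xC00010 has both select bits set, so the access is directed to the fourth device, with the low 22 bits passed through unchanged.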
The IIPC 30 is illustrated as including a content addressable memory (CAM) core 36 and logic 38 that couples the CAM core 36 to the memory mapped interface. This memory mapped interface is illustrated as including read control logic 32 and write control logic 34. The write control logic 34 is configured to receive an address ADDR[21:0], a write enable signal WE_N[1:0], input data DATAIN[15:0] and input parameters PARIN[1:0]. The read control logic 32 is configured to receive the address ADDR[21:0] and a read enable signal RE_N[1:0] and generate output data DATAOUT[15:0] and output parameters PAROUT[1:0]. Like the SRAM controller within the NPU 10, this memory mapped interface is based on FIFO communication. The IIPC 30 performs operations using the input data DATAIN[15:0] and input parameters PARIN[1:0] and then passes back result values to the NPU 10. The timing between the receipt of the input parameters and the return of the corresponding result values is not fixed. Instead, it is determined by the amount of time the IIPC 30 requires to execute the specified instruction and depends on the number and type of other instructions currently pending within the IIPC 30.
These pending instructions are initially logged into respective instruction control registers 50 that support a plurality of separate contexts (shown as a maximum of 128). These instructions may be processed in a pipelined manner. The result values generated at the completion of each context are provided to respective result mailboxes 40. The validity of the result values within the mailboxes 40 is identified by the status of the done bit within each result mailbox 40. Accordingly, if a read operation is performed before the result values are ready, the NPU 10 will be able to check the done bit associated with each set of result values to determine whether the corresponding values are valid. However, because there can be multiple contexts in progress within the IIPC 30 at any given time and because the completion of the contexts does not necessarily occur in the same sequence as the requests were made, the NPU 10 may need to regularly poll the result mailboxes 40 at relatively high frequency to obtain new results as they become valid. Unfortunately, such regular polling can consume a substantial amount of the bandwidth available for instructions that are issued to the IIPC 30 and lead to relatively high levels of operational inefficiency when the IIPC 30 is running a large number of contexts. Thus, notwithstanding the IIPC 30 of FIG. 1, which is capable of supporting a large number of contexts, there continues to be a need for more efficient ways to communicate result status information from an IIPC to a command source, such as an NPU.
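The cost of this polling can be illustrated with a simplified software model. The sketch below is not the NPU's actual procedure; it assumes one read operation per pending mailbox per polling pass, and the names ResultMailbox and poll_mailboxes are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class ResultMailbox:
    """One context's mailbox: a done bit plus the result values."""
    done: bool = False
    result: Optional[int] = None


def poll_mailboxes(mailboxes: List[ResultMailbox],
                   pending: Set[int]) -> Tuple[Dict[int, int], int]:
    """Read every pending mailbox once; return (valid results, reads issued).

    Contexts whose done bit is set are removed from `pending`.
    """
    results: Dict[int, int] = {}
    reads = 0
    for ctx in sorted(pending):
        reads += 1                      # each check costs one bus read
        box = mailboxes[ctx]
        if box.done and box.result is not None:
            results[ctx] = box.result
    pending.difference_update(results)  # stop polling completed contexts
    return results, reads


if __name__ == "__main__":
    boxes = [ResultMailbox() for _ in range(4)]
    pending = {0, 1, 2, 3}
    boxes[2] = ResultMailbox(done=True, result=0xAB)  # completes out of order
    print(poll_mailboxes(boxes, pending))
```

In this model, a pass over four pending contexts issues four reads to retrieve a single valid result, and the number of unproductive reads grows with the number of contexts in progress.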
Referring now to FIG. 2A, another conventional IIPC 300 may include an aging feature that automatically removes stale entries from an internal CAM core 330. This aging feature can be operated as a fully independent hardware function requiring no software intervention or as a software-managed procedure with hardware assist. The IIPC 300 of FIG. 2A includes a memory mapped interface 302 having a write interface 304 and a read interface 306 therein. These write and read interfaces 304 and 306 may be configured as quad data rate interfaces that communicate to and from a command source (e.g., ASIC or NPU) having a compatible interface. A clock generator circuit 308 may also be provided that is responsive to an external clock EXTCLK. This clock generator circuit 308 may include delay and/or phase locked loop integrated circuits that operate to synchronize internal clocks within the IIPC 300 with the external clock EXTCLK. A reset circuit 310, which is configured to support reset and/or power-up operations, is responsive to a reset signal RST. Context sensitive logic 312 may support the processing of multiple contexts. The context sensitive logic 312 may include an instruction memory 316 that receives instructions from the write interface 304 and a results mailbox 314 that may be accessed via the read interface 306. The instruction memory 316 may be configured as a FIFO memory device. The results mailbox 314 is a context specific location where the IIPC 300 places results returned from a previously issued command.
The internal CAM core 330 is illustrated as a ternary CAM core that contains a data array and a mask array 328. This CAM core 330 may be configurable into a plurality of independently searchable databases. General and database configuration registers 318 are also provided along with global mask registers GMRs 320. These registers provide data to instruction loading and execution logic 332, which may operate as a finite state machine (FSM). The instruction loading and execution logic 332 communicates with the CAM core 330 and the result logic 334. If the IIPC 300 is configured to support a depth-cascaded mode of operation, a cascade interface 338 may be provided for passing data and results to (and from) another IIPC (not shown). The instruction loading and execution logic 332 may also pass data to and from an external memory device, via an SRAM interface 336.
The aging logic 321 is illustrated as including two memory arrays: an age enable array 322 and an age activity array 324. These memory arrays may have bit positions that map directly to entries within the CAM core 330. Thus, if the CAM core 330 has 128 k entries (e.g., x72 entries), then the age enable array 322 and age activity array 324 may each have a capacity of 128 k bits. The illustrated aging logic 321 may operate with the instruction loading and execution logic 332 to (i) reset age activity bits that have been previously set within the age activity array 324 in response to successful search operations and (ii) age out entries associated with previously reset activity bits by invalidating the corresponding entries.
The aging operations may include periodically inserting an aging instruction into an instruction pipeline within the IIPC 300. As illustrated by the global and database aging request circuit 350 of FIG. 2B, a global aging register 352 (e.g., 32-bit countdown counter) may be used to specify the number of cycles of a system clock SYSCLK that are to occur before each aging operation request is inserted into the instruction pipeline. Each aging operation that is inserted may operate to age one entry within a database that is programmed to support aging. Each database within the CAM core 330 may have an individually specified time period for aging, which means the frequency of the age service requests for the plurality of databases (shown as DB0–DB15) may be independently controlled. These time periods may be specified by a plurality of 24-bit countdown counters 356 that are set to database specific time constants (i.e., count values) and clocked at 1/256th the system clock frequency. This slower clocking rate may be achieved with a divide-by-256 circuit 354 that is responsive to the system clock SYSCLK. As long as a database is enabled for aging, a database age service request is issued every time the corresponding 24-bit countdown counter 356 decrements to zero and is reinitialized. The IIPC 300 determines which database is to be serviced during each aging operation using a round-robin arbitration of all pending database age service requests. One entry within a selected database is aged in response to a selected age service request. The aging of a selected entry proceeds as follows. If a corresponding age enable bit for the entry is set to 0 within the age enable array 322, then the aging operation does nothing because the entry is not subject to aging.
If the age enable bit is set to 1 within the age enable array 322 and a corresponding age activity bit is set to 1 (i.e., the entry is active) within the age activity array 324, then the aging operation clears (i.e., resets) the age activity bit to 0. Finally, if the age enable bit is set to 1 within the age enable array 322 and the corresponding age activity bit is set to 0 (i.e., the entry is inactive), then the aging operation removes the entry from the selected database by marking the entry as invalid (e.g., sets the valid bit associated with the entry in the CAM core 330 to an invalid state). The activity bit associated with an entry can be set to 1 whenever the entry is originally written into the CAM core 330 or a search operation results in a hit for the corresponding entry. A learn instruction and a set valid instruction may also operate to set an activity bit associated with a corresponding entry.
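The three cases of the per-entry aging operation described above can be summarized in a short behavioral sketch. The Python model below is illustrative only and is not the hardware of FIG. 2A; the class and method names are hypothetical, and the choice to set the age enable bit at write time is an assumption made to keep the example self-contained.

```python
from typing import List


class AgingModel:
    """Bit-per-entry model of the age enable, age activity and valid bits."""

    def __init__(self, size: int) -> None:
        self.age_enable: List[int] = [0] * size
        self.age_activity: List[int] = [0] * size
        self.valid: List[int] = [0] * size

    def write_entry(self, index: int, aging: bool = True) -> None:
        # Writing an entry marks it valid and active. Setting the enable
        # bit here is an assumption of this sketch, not a source detail.
        self.valid[index] = 1
        self.age_enable[index] = 1 if aging else 0
        self.age_activity[index] = 1

    def search_hit(self, index: int) -> None:
        # A search hit sets the activity bit for the matching entry.
        self.age_activity[index] = 1

    def age_entry(self, index: int) -> str:
        """Apply one aging operation to one entry and report the outcome."""
        if not self.age_enable[index]:
            return "skipped"                # entry is not subject to aging
        if self.age_activity[index]:
            self.age_activity[index] = 0    # active: clear the bit, keep entry
            return "activity_cleared"
        self.valid[index] = 0               # inactive: invalidate the entry
        return "aged_out"


if __name__ == "__main__":
    model = AgingModel(size=4)
    model.write_entry(0)
    print(model.age_entry(0), model.age_entry(0))
```

Under this model an entry survives only if a hit (or another activity-setting instruction) occurs between two consecutive aging operations applied to it; otherwise the second operation finds the activity bit still cleared and invalidates the entry.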
Conventional CAM-based search engine devices may also be configured to perform search and learn (SNL) operations that include an initial search operation and, if necessary, a subsequent learn operation of a new entry into a CAM core. As illustrated by FIG. 5 of U.S. Pat. No. 6,219,748, operations may be performed to decode a learn instruction and load comparand data (e.g., search key) into a comparand register within a search engine device. An operation is then performed to search a CAM array using the comparand data as a search word. If a match occurs in response to the search operation, then operations associated with generating conventional search results are performed. However, if no match is present and the CAM array is not full, then the comparand may be written as a CAM entry into an internally generated next free address within the CAM array. This write operation into a next free address is treated as a learn operation as opposed to a conventional write operation that includes an externally supplied write address. The next free address associated with the newly written CAM entry is then reported to a command host. Related CAM-based search engine devices that describe learn operations are also disclosed in U.S. Pat. Nos. 6,148,364 and 6,240,485.
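The search and learn sequence described above may be sketched in software as follows. This is a simplified model and not the circuit of U.S. Pat. No. 6,219,748: it assumes exact (binary) matching rather than ternary matching, it assumes the next free address is the lowest unoccupied address, and the names TinyCam and search_and_learn are hypothetical.

```python
from typing import List, Optional, Tuple


class TinyCam:
    """Toy CAM array: each address holds one key, or None if free."""

    def __init__(self, depth: int) -> None:
        self.entries: List[Optional[int]] = [None] * depth

    def search(self, key: int) -> Optional[int]:
        """Return the address of the matching entry, or None on a miss."""
        for addr, entry in enumerate(self.entries):
            if entry is not None and entry == key:
                return addr
        return None

    def next_free_address(self) -> Optional[int]:
        """Internally generated write address (assumed: lowest free slot)."""
        for addr, entry in enumerate(self.entries):
            if entry is None:
                return addr
        return None

    def search_and_learn(self, key: int) -> Tuple[str, Optional[int]]:
        """Search first; on a miss, learn the key if the array is not full."""
        hit = self.search(key)
        if hit is not None:
            return ("hit", hit)
        free = self.next_free_address()
        if free is None:
            return ("full", None)       # miss and no room to learn
        self.entries[free] = key        # learn: no external write address
        return ("learned", free)        # address reported to the host


if __name__ == "__main__":
    cam = TinyCam(depth=2)
    for key in (0xAA, 0xAA, 0xBB, 0xCC):
        print(hex(key), cam.search_and_learn(key))
```

The distinction from a conventional write is visible in the call signature: the caller supplies only the comparand, and the address of the newly learned entry is returned rather than provided.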