The present application relates to a reception packet processor, and more specifically to a protocol processor that processes first header information of a reception packet in one clock cycle and provides, by the end of that clock cycle, selected instructions for processing second header information.
Computer and data communication has been one of the dominant areas in the electronics market. Computer and data communication is based on packet processing and routing. In all packet-based communications, packet processing is defined as the handling of incoming packets, including de-framing, field search, field extraction, and payload handling. On the one hand, packet processing is performed either in terminals on layers 2-4 of the ISO-OSI reference model or in routers on layers 2-3. On the other hand, packet processing is also required on application layers above TCP/UDP, for example in MPEG packet processing. In general, three kinds of processes are handled by a communication system in the baseband: channel and packet processing, data processing, and voice/image processing. Therefore, packet processing is generally recognized as one of the most important activities in the computer and communication industries.
Traditionally this processing has been implemented by fixed function application specific integrated circuits (“ASICs”) and programmable general purpose processors. The ASICs typically handle layer 2 (e.g. Ethernet) functionality, while the general purpose processors handle layer 3 and 4 (e.g. TCP/IP).
As bit rates on communication networks increase to several gigabits per second and protocols keep evolving, these traditional implementations no longer provide adequate processing resources. The fixed function ASICs cannot accommodate updates to protocol standards, and the programmable general-purpose processors cannot keep up with the speed requirement. A new concept is to make domain-specific protocol processors, which are flexible within the protocol processing area and are still fast enough.
Another bottleneck exists in the operation of communication terminals. Low power consumption is required for a network terminal (NT) connected to a high speed network. The conflict between the high speed network and the low speed payload processing cannot be resolved by a general purpose processor. Therefore, a protocol processor is necessary to separate the protocol processing from the processing of the payload.
Since 1999, some new concepts for packet reception processing have been presented. Coresma, Agere, C-Port and Intel have all presented processors for this task. No common terminology has developed so far, and the terms network processor, protocol processor, and pattern processor can all be found in the literature. All processors from the companies mentioned include more functionality than packet reception processing, for example packet switching and packet compiling. Also, all processors mentioned above are based on a general purpose CPU with protocol processing adaptations. Consequently, protocol processing is not deeply optimized in these processors, and none of them is an ideal solution for network terminals.
Since packet reception processing is an integral part of portable battery-driven network terminals, low power consumption, small silicon area, and minimum processing delay are essential. These properties are also important for network infrastructure, because several packet reception processing units can be placed on the same chip in a switch, which likewise calls for a small silicon area. To avoid buffering an incoming packet, which creates delay and uses memory, on-the-fly operation is highly desired. However, true on-the-fly processing is very hard to achieve because of the required flexibility. The implementation must be able to adapt to several protocols on layers 2-4 of the OSI model and to their future versions. Application layer protocols (e.g. RTP or MPEG packets) should also be considered. This means that implementing true on-the-fly packet reception processing could be very complex, so that the requirements of low power consumption and small silicon area could not be fulfilled. Hence there is a need for an “almost” on-the-fly processing implementation.
The largest problem when trying to fulfill all the requirements is finding a hardware architecture that can perform the processing on-the-fly. As already stated, true on-the-fly processing is too expensive to achieve when flexibility is demanded. Instead, pseudo-on-the-fly processing is required, that is, the processor is allowed to delay some tasks for some clock cycles if the tasks can be performed later on. This is necessary when many small header fields are present and the parallelism of the processor is simply insufficient to handle them all at one time.
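The idea of delaying tasks that cannot be handled in the current clock cycle can be illustrated with a minimal software model. This is only a sketch under stated assumptions: the function name `schedule_pseudo_on_the_fly`, the FIFO backlog, and the `parallelism` parameter are hypothetical and are not taken from any particular hardware architecture.

```python
from collections import deque


def schedule_pseudo_on_the_fly(tasks_per_cycle, parallelism):
    """Return, for each clock cycle, the list of tasks executed in that cycle.

    tasks_per_cycle[i] lists the tasks that become ready when the i-th
    input word arrives (hypothetical model). At most `parallelism` tasks
    run per cycle; the remainder wait in a FIFO backlog and run later.
    """
    backlog = deque()
    executed = []
    cycle = 0
    while cycle < len(tasks_per_cycle) or backlog:
        if cycle < len(tasks_per_cycle):
            # New header fields arrive with the current input word.
            backlog.extend(tasks_per_cycle[cycle])
        # Execute as many pending tasks as the assumed parallelism allows.
        now = [backlog.popleft() for _ in range(min(parallelism, len(backlog)))]
        executed.append(now)
        cycle += 1
    return executed


if __name__ == "__main__":
    # Three small fields arrive at once, but only two can be handled per cycle.
    print(schedule_pseudo_on_the_fly([["a", "b", "c"], ["d"], []], parallelism=2))
```

In this model the third field of the first word is postponed by one cycle and handled together with the field of the second word, so no input buffering beyond the small backlog is needed as long as later cycles have spare capacity.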
Another problem arises from the fact that, in packet reception processing, conditional jumps and case-based jumps are frequently used. To succeed with pseudo-on-the-fly processing, the number of clock cycles consumed must be minimal and must be the same regardless of whether a jump is taken or not.
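The requirement that a case-based jump cost the same whether or not it is taken can be sketched in software as a single table lookup with a default target. The following is an illustrative model only: the names `ETHERTYPE_TARGETS` and `dispatch`, the target labels, and the one-cycle cost are assumptions made for the example. The EtherType values themselves are the standard ones for IPv4, ARP, and IPv6.

```python
# Hypothetical jump table: header field value -> label of the next routine.
ETHERTYPE_TARGETS = {
    0x0800: "ipv4",
    0x0806: "arp",
    0x86DD: "ipv6",
}


def dispatch(field_value, targets, default="discard"):
    """Resolve a case-based jump with one lookup.

    Returns (next_label, cycles). The modelled cost is one cycle whether
    a case matches (jump taken) or not (fall through to `default`).
    """
    return targets.get(field_value, default), 1


if __name__ == "__main__":
    for value in (0x0800, 0x86DD, 0x1234):
        label, cycles = dispatch(value, ETHERTYPE_TARGETS)
        print(hex(value), label, cycles)
```

A chain of sequential compare-and-branch instructions, by contrast, would consume a number of cycles that depends on which case matches, which is what the constant-cost requirement rules out.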
A further problem arises from the fact that, in some protocols, very long fields have to be compared against several values. Very wide comparators are not acceptable, since their delay is too long and they would have a negative impact on the silicon area.
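One general way to avoid a very wide comparator is to compare a long field word by word as it arrives, keeping one match flag per candidate value. The sketch below illustrates that general technique and is not necessarily the approach taken in the present application. The 32-bit word width, the function names `split_words` and `match_long_field`, and the example 128-bit values are assumptions.

```python
def split_words(value, total_bits, word_bits=32):
    """Split an integer field into words, most significant word first."""
    count = total_bits // word_bits
    mask = (1 << word_bits) - 1
    return [(value >> (word_bits * (count - 1 - i))) & mask for i in range(count)]


def match_long_field(words, candidates):
    """Compare a long field against several candidates, one word per step.

    words: the field's words in arrival order.
    candidates: list of word lists, each as long as `words`.
    Returns one boolean per candidate. Only word-wide comparisons are
    used; each candidate's flag is cleared at the first mismatching word.
    """
    still_matching = [True] * len(candidates)
    for i, word in enumerate(words):
        for c, candidate in enumerate(candidates):
            still_matching[c] = still_matching[c] and candidate[i] == word
    return still_matching


if __name__ == "__main__":
    field = 0x20010DB8000000000000000000000001  # example 128-bit value
    other = 0x20010DB8000000000000000000000002  # differs in the last word
    incoming = split_words(field, 128)
    table = [split_words(field, 128), split_words(other, 128)]
    print(match_long_field(incoming, table))
```

The trade-off in this sketch is that the result is only known after the last word has been seen, but each step needs comparators no wider than one word.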