1. Field of the Invention
The present invention relates to a method and an apparatus for improving data processing in data networks. More particularly, the present invention enables wire-speed multi-protocol data processing using a single machine at a single point in the network.
2. Description of the Related Art
Computer networks necessitate the provision of various communication protocols to transmit and receive data between the network elements. The enormous expansion in computer based data communications has resulted in a growing number of communications protocols that are used for enabling the various types and forms of communication. When a plurality of processors are connected to form a network for exchanging data, a set of procedural rules is set forth as the communication protocol. Every networked processing entity must follow this set of procedural rules, i.e., the protocol, which serves as the common language between these processors for communication with each other.
There are no known current technologies on the market that provide a simple and cost effective means of narrowing the gap between the multi-Gigabit speeds of modern data transmission and the multi-Megabit speeds of modern data processors, thereby allowing standard processors to process incoming data packets at wire-speed. As a result, traditional multi-protocol processing apparatus is unable to process data packets at a speed comparable to the speed at which the packets are transferred. Consequently the substantial improvements in communications infrastructures go largely unfelt, as the data is still typically processed at slower speeds. Each network device typically contains a combination of hardware and software that translate protocols and enable processing of data. For example, the subject medium of a communication may be voice, data, video, multi-media or facsimile, and such a communication may be received or transmitted by digital telephones, analog telephones, cellular telephones, computers, facsimile machines, etc. One drawback to the dramatic telecommunication advances is that the different protocols used are not naturally compatible with each other.
There are two current competitive technologies that attempt to improve networking that supports cross protocol communications.
The first technology performs the real time critical processing on a series of regular computers (for example PC, Unix Work Station, Mainframe, etc.) in parallel. In this case, each computer runs the full general IP protocol stack, and special (typically proprietary) apparatus supports load balancing between the processors. The advantage of this technology type is that general processors and general software stacks are used. The disadvantage is that too many processors are needed to fill the substantial processing gap (100-1000 times), and the load balancing requirements become increasingly complicated. Examples of such systems are Shasta by Nortel (http://www.nortelnetworks.com), and CoSine IP Service Delivery Platform by Cosine (http://www.cosinecom.com).
In this case the processing of additional protocols can be facilitated, since there is typically much instruction memory available. However, since general Computer Stations are relatively generic devices that are able to perform multiple functions, they are generally considered to be too slow to be used for dedicated network processing.
The second technology that deals with solutions for this problem is based on processing some protocols on dedicated Application Specific Integrated Circuits (ASICs). ASIC chips are custom designed for a specific application, rather than being general-purpose chips such as microprocessors. The use of ASICs improves performance over general-purpose CPUs, because ASICs are “hardwired” to do a specific job and do not incur the overhead of fetching and interpreting stored instructions. The disadvantage of such technology is that the number of protocols is continually growing, and therefore ASICs are required to be updated in order to support any new protocols or any new features in existing protocols. Such adaptations require near full ASIC re-design cycles. A consequence is that the weight of each protocol in the whole scheme of communications traffic is regularly changed. Since such adaptations require additional programming, pure hardware implementation (ASIC) is impossible, or at least ineffective. Some kind of CPU core for actions is required, and the use of several chips (general purpose CPU+ASIC) poses serious synchronization problems. A further problem with pure-hardware implementations is that even though protocols such as TCP/IP can be implemented in ASIC, it is almost impossible to do so bug free, and bug fixing in ASIC is extremely difficult.
Examples of such systems are the Nexsi 8000 Content Services System by Nexsi Systems (http://www.nexsi.com/), and the Celox SCx 192 by Celox (http://www.celoxnetworks.com/).
U.S. Pat. No. 6,034,963 describes a hardware-integrated system that both decodes multiple network protocols in a byte-streaming manner concurrently and processes packet data in one pass, thereby reducing system memory and form factor requirements, while also eliminating software CPU overhead. The '963 patent explicitly mentions displaying received data on different screens, i.e. it describes mainly interactive systems. The '963 patent is about TCP/IP support in silicon, and offers nothing for the acceleration of upper-layer protocols. This solution, therefore, is along the lines of the ASIC model, which is inflexible with regard to adaptation to new protocols.
Recent technological progress has introduced to the communications market a new kind of computer communications device, generally known as Network Processors (NP). Network Processors merge the advantages of both of the above-mentioned solutions, by typically including several dedicated processors with embedded parallel processing support, and software configurable lookup accelerator support.
A Network Processor is therefore programmable, in contrast with ASICs, and it is furthermore designed for parallel processing, in contrast with general processors. In addition, NP provides specialized hardware accelerators, in contrast with general processors.
NP was originally developed to speed the processing of low-level network features like QoS, MPLS, IPv6 and VLAN. NP, for example, bridges the hardware gap in wire speed processing. However, there are various software problems associated with the NP. Since NP's architecture is very different from that of conventional processors, conventional programming methods do not allow for effective utilization of NP resources. Such a weakness typically prohibits NPs from effectively processing higher-layer protocols, which were traditionally processed by software only. An additional Network Processor restriction is the limited instruction memory in NPs, which prohibits implementation of high-level network features.
Conventional protocol processing requires a network programmer to describe three main items in terms of some programming language: data extraction rules (“take two bytes at offset 7”), data analysis rules (“check if the value of those bytes is 1234”), and action items (“if yes, turn on the red LED”). Each network protocol is a long chain of items such as the above, which is specific to a particular protocol. Such an approach, however, has the following disadvantages: a larger quantity of instruction memory is consumed; parallel processing is limited because the instructions must be executed in order, one at a time (unless there is a plurality of CPU cores); algorithm implementation effectiveness depends first and foremost on the programmer's skills; and hardware acceleration is almost impossible (except using more powerful processors) because no hardware can compensate for the lack of programming skills.
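By way of illustration only, the three items named above (extraction rule, analysis rule, action item) can be sketched as a short hand-written routine. The function name, the big-endian byte order, and the returned strings are assumptions made for this sketch and are not taken from any particular protocol.

```python
import struct

def process_packet(packet: bytes) -> str:
    """Hypothetical hand-coded chain: extract, analyze, act."""
    # Guard: the extraction below needs bytes at offsets 7 and 8.
    if len(packet) < 9:
        return "packet too short"
    # Data extraction rule: take two bytes at offset 7
    # (network byte order is assumed here).
    (value,) = struct.unpack_from("!H", packet, 7)
    # Data analysis rule: check if the value of those bytes is 1234.
    if value == 1234:
        # Action item: stand-in for turning on the red LED.
        return "red LED on"
    return "no action"

# Example input: seven padding bytes followed by 1234 as a 16-bit field.
sample = bytes(7) + (1234).to_bytes(2, "big")
result = process_packet(sample)
```

Each further rule of a protocol would add another such extract/compare/act step executed in sequence, which is the source of the instruction memory and ordering costs noted above.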
Network devices are typically set up using hardware to handle the Link Layer protocol, and software to handle the Network, Transport, and Communication Protocols, as well as information data handling. The network device normally implements one Link Layer protocol in hardware, limiting the attached computer to only that particular LAN protocol. The higher protocols, e.g. Network, Transport, and Communication protocols, along with the Data handlers, are implemented as software programs which process the data once the data passes through the network device hardware into the system memory. However, for each additional protocol, the network device requires substantially more program memory resources in order to be processed. Therefore, dedicated network processing devices are often used to execute specialized network processing. These network processors are typically restricted to minimal processing memory and significant quantities of data memory, in order to rapidly execute specific tasks.
The disadvantage of this implementation, however, is that if processing of additional protocols is required, the system incurs a high processor overhead and requires a large amount of program memory. In addition, the computer used to coordinate such processing of different software protocols and data handlers demands a complicated configuration setup. During normal operation, once the hardware verifies the transport or link layer protocol, the resulting data packet is sent to a software layer which determines the packet's frame format and strips any specific frame headers. The packet is then sent to different protocol stacks where it is evaluated for the specific protocol. However, the packet may be sent to several protocol stacks before it is accepted or rejected. The time lag created by software protocol stacks prevents audio and video transmissions from being processed in real-time, and therefore typically requires the data to be buffered before playback.
In order to deal with the processing power limitations of such cross-processing devices, and properly follow the communication protocol for setting up a communication session between two processors in a network, each of the communication protocols is often represented as a ‘finite state’ machine, being designed with the operational states required to solve specific problems. The operational states of these protocols are cataloged as ‘state tables’, wherein, depending on the operating conditions of each of the processors being used, each processor is identified as being in a specific state.
Most of the protocols typically used are stateful protocols, wherein data is parsed with reference to previous processing stages. There are a limited number of stateless protocols (parsing without reference to previous processing stages) that can be presented like stateful protocols, yet this can typically be achieved only with a single state, alternatively referred to as ‘one-state stateful protocols’. Examples of such stateless protocols are IP as Layer 3 protocol, UDP on Layer 4, HTTP for WWW, SIP in VoIP protocol family etc.
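A state table of the kind described above can be sketched as a lookup keyed by the current state and the incoming event. The state names, event names, and the choice to ignore undefined events are hypothetical and serve only to illustrate the idea of a session-setup protocol expressed as a finite state machine.

```python
# Hypothetical state table for a simple session-setup protocol.
# Key: (current state, event). Value: next state.
STATE_TABLE = {
    ("IDLE", "send_request"): "REQUEST_SENT",
    ("REQUEST_SENT", "recv_accept"): "ESTABLISHED",
    ("REQUEST_SENT", "recv_reject"): "IDLE",
    ("ESTABLISHED", "close"): "IDLE",
}

def step(state: str, event: str) -> str:
    """Return the next state; undefined pairs leave the state unchanged
    (an assumption of this sketch)."""
    return STATE_TABLE.get((state, event), state)

def run(events, state: str = "IDLE") -> str:
    """Feed a sequence of events through the table, starting from `state`."""
    for event in events:
        state = step(state, event)
    return state

final_state = run(["send_request", "recv_accept"])
```

In such a sketch the protocol logic lives in the table data rather than in a protocol-specific chain of instructions, so the same `step` routine can serve any protocol whose table is loaded.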
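The notion of a ‘one-state stateful protocol’ can be illustrated with a minimal table-driven sketch in which the table has exactly one state and every transition returns to it. The state name `ANY`, the message kinds, and the output strings are hypothetical and do not model any real protocol.

```python
def make_machine(table, initial):
    """Build a runner for a table mapping (state, message) -> (next state, output)."""
    def run(messages):
        state = initial
        outputs = []
        for message in messages:
            state, output = table[(state, message)]
            outputs.append(output)
        return state, outputs
    return run

# A stateless protocol presented as a stateful one: a single state, ANY,
# so the handling of each message cannot depend on earlier messages.
ONE_STATE_TABLE = {
    ("ANY", "datagram"): ("ANY", "deliver"),
    ("ANY", "malformed"): ("ANY", "drop"),
}

run_stateless = make_machine(ONE_STATE_TABLE, "ANY")
end_state, actions = run_stateless(["datagram", "malformed", "datagram"])
```

Because the machine never leaves its only state, the same runner that handles multi-state tables also covers the stateless case without special treatment.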
Furthermore, existing protocols are often changed or become obsolete, while new protocols are created nearly every day. Each protocol is defined by its own specification, and is typically presented by the protocol specific state machines. The number of protocols in use makes it difficult to maintain wire-speed high level (protocol specific) processing of communication traffic, because nearly all protocols may be implemented in general processor based, low speed machines.
There is thus a widely recognized need for, and it would be highly advantageous to have, a generic way to simultaneously process a number of stateful protocols that belong to the same communication area (IP, voice decoding, VoATM, etc.) on a single high speed active component (machine), where this processing is enabled by software enhancements (at the micro-code or programming logic levels), and not by substantial hardware enhancements.