The present invention relates to broadband data networking equipment. Specifically, the present invention relates to a content processor that scans, classifies and modifies network traffic based on content.
The character and requirements of networks and networking hardware are changing dramatically as the demands on networks change. Not only is there an ever-increasing demand for more bandwidth, the nature of the traffic flowing on the networks is changing. With the demand for video and voice over the network in addition to data, end users and network providers alike are demanding that the network provide services such as quality-of-service (QoS), traffic metering, and enhanced security. However, the existing Internet Protocol (IP) networks were not designed to provide such services because of the limited information they contain about the nature of the data passing over them.
Existing network equipment that makes up the infrastructure was designed only to forward data through the network's maze of switches and routers without any regard for the nature of the traffic. The equipment used in existing networks, such as routers, switches, and remote access servers (RAS), is not able to process any information in the network data stream beyond the packet headers, and usually only the headers associated with a particular layer of the network or with a set of particular protocols. Inferences can be made about the type of traffic by the particular protocol, or by other information in the packet header such as address or port numbers, but high-level information about the nature of the traffic and the content of the traffic is impossible to discern at wire speeds.
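By way of illustration only, the limitation of header-based inference can be sketched as follows. The port table, function names, and payload prefixes are hypothetical and are not part of any described equipment; the sketch merely shows that a guess based on port numbers fails for traffic on a non-standard port, whereas an inspection of the payload does not.

```python
# Hypothetical port-to-service table; values are illustrative only.
WELL_KNOWN_PORTS = {25: "smtp", 53: "dns", 80: "http", 443: "https"}


def classify_by_header(protocol: str, dst_port: int) -> str:
    """Guess the traffic type from header fields alone."""
    if protocol not in ("tcp", "udp"):
        return "unknown"
    return WELL_KNOWN_PORTS.get(dst_port, "unknown")


def classify_by_payload(payload: bytes) -> str:
    """Guess the traffic type from the first bytes of the payload."""
    if payload.startswith((b"GET ", b"POST ", b"HEAD ")):
        return "http"
    return "unknown"


# An HTTP request sent to a non-standard port defeats the header-only guess.
request = b"GET /index.html HTTP/1.1\r\n"
header_guess = classify_by_header("tcp", 8080)   # "unknown"
payload_guess = classify_by_payload(request)     # "http"
```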
In order to better understand packet processing and the deficiencies of existing network equipment, it is helpful to have an understanding of its basic operation. The functionality of most network equipment can be broken down into four basic components. The first component is the physical layer interface (PHY layer), which converts an analog waveform transmitted over a physical medium such as copper wire pairs, coaxial cable, optical fiber, or air, into a bit stream which the network equipment can process, and vice versa. The PHY layer is the first or last piece of silicon that the network data hits in a particular device, depending on the direction of traffic. The second basic functional component is the switch fabric. The switch fabric forwards the traffic between the ingress and egress ports of a device across the bus or backplane of that device. The third component is host processing, which can encompass a range of operations that lie outside the path of the traffic passing through a device. This can include controlling communication between components, enabling configuration, and performing network management functions. Host processors are usually off-the-shelf general purpose RISC or CISC microprocessors.
The final component is the packet processing function, which lies between the PHY layer and the switch fabric. Packet processing can be characterized into two categories of operation, those classified as fast-path and those classified as slow-path. Fast-path operations are those performed on the live data stream in real time. Slow-path operations are performed outside the flow of traffic but are required to forward a portion of the packets processed. Slow-path operations include unknown address resolution, route calculation, and routing and forwarding table updates. Some of the slow-path operations can be performed by the host processor if necessary.
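The division between fast-path and slow-path operations may be sketched, purely as an illustration, as follows. The class name `Forwarder`, the dictionary-based forwarding table, and the `resolve` callback are assumptions introduced for this sketch and do not describe any particular device: a table hit is handled on the fast path, while an unknown address is deferred to the slow path, which resolves it and updates the table.

```python
from typing import Callable, Dict, List, Optional, Tuple


class Forwarder:
    """Toy model of fast-path lookup with slow-path fallback (illustrative only)."""

    def __init__(self) -> None:
        self.table: Dict[str, int] = {}          # destination address -> egress port
        self.slow_path_queue: List[dict] = []    # packets awaiting resolution

    def fast_path(self, packet: dict) -> Optional[int]:
        """Return the egress port on a table hit; otherwise defer the packet."""
        port = self.table.get(packet["dst"])
        if port is None:
            self.slow_path_queue.append(packet)
        return port

    def slow_path(self, resolve: Callable[[str], int]) -> List[Tuple[str, int]]:
        """Resolve unknown addresses, update the table, and forward deferred packets."""
        forwarded = []
        while self.slow_path_queue:
            packet = self.slow_path_queue.pop(0)
            port = resolve(packet["dst"])
            self.table[packet["dst"]] = port     # forwarding table update
            forwarded.append((packet["dst"], port))
        return forwarded


fwd = Forwarder()
fwd.table["10.0.0.1"] = 1
hit = fwd.fast_path({"dst": "10.0.0.1"})         # known address: fast path
miss = fwd.fast_path({"dst": "10.0.0.2"})        # unknown address: deferred
resolved = fwd.slow_path(lambda addr: 2)         # hypothetical resolver
later = fwd.fast_path({"dst": "10.0.0.2"})       # now served on the fast path
```

Once the slow path has updated the table, later packets for the same destination remain on the fast path, which is why only a portion of the packets processed requires slow-path handling.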
For a piece of network equipment to be useful and effective, the vast majority of traffic must be handled on the fast-path in order to keep up with network traffic and to avoid being a bottleneck. To keep up with the data flow, fast-path operations have always been limited both in number and in scope. There are five basic operations that have traditionally been fast-path operations: framing/parsing, classification, modification, encryption/compression, and queuing.
Traditionally, the fast-path operations have been performed by a general purpose microprocessor or custom ASICs. However, in order to provide some programmability while maintaining speed requirements, many companies have recently introduced highly specialized network processors (NPUs) to operate on the fast-path data stream. While NPUs are able to operate at the same data rates as ASICs, such as OC-12, OC-48 and OC-192, they provide some level of programmability. Even with state-of-the-art NPUs, however, fast-path operations must still be limited to specific, well-defined operations that operate only on very specific fields within the data packets. None of the current network devices, even those employing NPUs, are able to delve deep into a packet, beyond simple header information and into the packet contents, while on the fast-path of data flow. The ability to look beyond the header information while still in the fast-path and into the packet contents would allow a network device to identify the nature of the information carried in the packet, thereby allowing much more detailed packet classification. Knowledge of the content would also allow specific contents to be identified and scanned to provide security such as virus detection, denial of service (DoS) prevention, etc. Further, looking deeper into the data packets and being able to maintain an awareness of content over an entire traffic flow would allow for validation of network traffic flows, and verification of network protocols to aid in the processing of packets downstream.
Accordingly, what is needed is a network device that can look beyond simple header information and into the packet contents or payload, to be able to scan the payload on the fast-path at wire speeds beyond 1 gigabit per second, and to be able to maintain state information or awareness throughout an entire data traffic flow.
The present invention provides for a content processor that is able to scan the entire contents of data packets forming a network data flow, the contents of data packets including both header and payload information. The content processor includes a queue engine, which is used to reorder out-of-order data packets and to reassemble fragmented data packets in the network data flow. A session id is used to associate each data packet with a particular flow. After the data packets have been processed by the queue engine, a context engine schedules the scanning of the data packets. For scanning, data packets are broken into smaller blocks, each block associated with a particular data packet, or context. To make the content processor more efficient, multiple contexts, each belonging to a different session, are processed simultaneously. Once scheduled, the contexts are sent to the content scanning engine to be scanned. The content scanning engine includes a string preprocessor, which simplifies the string for scanning by compressing white space, etc. The content scanning engine then scans the data packets in two steps: first, the string memories, which hold the database of known strings, are used to identify potential matches to the data packet; second, using the leaf string memories and the leaf string compare engine, it is determined whether there is an exact match between any identified potential match and the contents of the data packet.
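The reordering, preprocessing, and two-step scanning described above may be sketched in software as follows. This is a minimal illustration under stated assumptions and is not the hardware implementation: the use of sequence numbers for reordering, the prefix-keyed dictionary standing in for the string memories, the constant `PREFIX_LEN`, and all function names are hypothetical, and the division of packets into blocks and the scheduling of contexts are omitted.

```python
import re
from typing import Dict, List, Tuple

PREFIX_LEN = 4  # hypothetical: number of bytes used for the first-step lookup


def reorder(packets: List[Tuple[int, bytes]]) -> bytes:
    """Queue-engine stand-in: order one session's packets by sequence number."""
    return b"".join(payload for _, payload in sorted(packets, key=lambda p: p[0]))


def preprocess(data: bytes) -> bytes:
    """String-preprocessor stand-in: compress each run of white space to one space."""
    return re.sub(rb"\s+", b" ", data)


def build_database(strings: List[bytes]) -> Dict[bytes, List[bytes]]:
    """Index known strings by prefix; the full strings act as the leaf entries."""
    db: Dict[bytes, List[bytes]] = {}
    for s in strings:
        if len(s) < PREFIX_LEN:
            raise ValueError("sketch assumes strings of at least PREFIX_LEN bytes")
        db.setdefault(s[:PREFIX_LEN], []).append(s)
    return db


def scan(data: bytes, db: Dict[bytes, List[bytes]]) -> List[Tuple[int, bytes]]:
    """Return (offset, string) for every known string found in the data."""
    matches = []
    for i in range(len(data)):
        # Step one: the prefix lookup identifies potential matches.
        candidates = db.get(data[i:i + PREFIX_LEN], [])
        # Step two: an exact comparison confirms or rejects each candidate.
        for s in candidates:
            if data.startswith(s, i):
                matches.append((i, s))
    return matches


# Two packets of one session arriving out of order, with extra white space.
session = [(2, b"cmd.exe  /c"), (1, b"GET   /scripts/")]
stream = preprocess(reorder(session))
db = build_database([b"cmd.exe", b"cmd.com", b"/etc/passwd"])
matches = scan(stream, db)
```

In this example both `cmd.exe` and `cmd.com` share the prefix `cmd.` and are therefore potential matches after the first step; only `cmd.exe` survives the exact comparison of the second step.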
A conclusion is generated in response to the scanning by the content scanning engine. The conclusion is programmable and can represent any information or instruction desired by the user. In general, the conclusion will indicate one of a number of likely scenarios. For example, the conclusion will indicate that more scanning is required using the next block of data; that an action, or instruction, needs to be performed by the content processor; that information needs to be sent to the host processor for further processing; or, when scanning is complete, that the packet is ready to be sent, with the conclusion representing routing and quality of service treatment for the data packet. Instructions or actions to be taken are carried out by a script engine in the context engine, which is able to execute preprogrammed scripts. The context engine also includes a host interface, which is used for communication between the content processor and the host microprocessor.
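The handling of a conclusion may be sketched, by way of example only, as a dispatch over the four scenarios listed above. The enumeration `Kind`, the `SCRIPTS` table, the `drop` script, and the returned strings are hypothetical names introduced for this sketch; the actual encoding of a conclusion is programmable and is not specified here.

```python
from enum import Enum, auto
from typing import Callable, Dict


class Kind(Enum):
    """Hypothetical encoding of the four scenarios a conclusion may indicate."""
    SCAN_NEXT_BLOCK = auto()   # more scanning is required
    RUN_SCRIPT = auto()        # an action is to be performed by the script engine
    SEND_TO_HOST = auto()      # information goes to the host processor
    FORWARD = auto()           # scanning complete; packet ready to be sent


# Hypothetical table of preprogrammed scripts, keyed by name.
SCRIPTS: Dict[str, Callable[[dict], str]] = {
    "drop": lambda pkt: f"dropped packet {pkt['id']}",
}


def dispatch(kind: Kind, packet: dict, arg: str = "") -> str:
    """Act on a conclusion and describe the step taken."""
    if kind is Kind.SCAN_NEXT_BLOCK:
        return f"scan next block of packet {packet['id']}"
    if kind is Kind.RUN_SCRIPT:
        return SCRIPTS[arg](packet)
    if kind is Kind.SEND_TO_HOST:
        return f"packet {packet['id']} sent to host"
    # FORWARD: arg carries the quality of service treatment for the packet.
    return f"forward packet {packet['id']} with QoS class {arg}"


packet = {"id": 7}
steps = [
    dispatch(Kind.SCAN_NEXT_BLOCK, packet),
    dispatch(Kind.RUN_SCRIPT, packet, "drop"),
    dispatch(Kind.FORWARD, packet, "gold"),
]
```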
The foregoing has outlined, rather broadly, preferred and alternative features of the present invention so that those skilled in the art may better understand the detailed description of the invention that follows. Additional features of the invention will be described hereinafter that form the subject of the claims of the invention. Those skilled in the art will appreciate that they can readily use the disclosed conception and specific embodiment as a basis for designing or modifying other structures for carrying out the same purposes of the present invention. Those skilled in the art will also realize that such equivalent constructions do not depart from the spirit and scope of the invention in its broadest form.