DPI (deep packet inspection) is an application layer-based flow detection and control technology. As content charging and various value-added services develop, applying DPI requires not only protocol recognition on the flow content (for example, recognizing the flow content as HTTP or another protocol) but also protocol parsing of the flow content.
During DPI processing in the prior art, the quintuple information, comprising the protocol field (Type), source port, destination port, source IP address, and destination IP address, is extracted from the data flow packet. Flow table matching is then performed to recognize the protocol type; the flow table stores the correspondence between each recognized protocol and its quintuple.
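The quintuple extraction and flow table lookup described above can be sketched as follows. This is only an illustrative sketch, not the prior-art implementation: it assumes an IPv4 packet carrying TCP or UDP, and the names `FiveTuple`, `extract_five_tuple`, `flow_table`, and `lookup_protocol` are hypothetical.

```python
import socket
import struct
from typing import Dict, NamedTuple, Optional


class FiveTuple(NamedTuple):
    protocol: int   # IP protocol number (6 = TCP, 17 = UDP)
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int


def extract_five_tuple(ip_packet: bytes) -> FiveTuple:
    """Extract the quintuple from a raw IPv4 packet (assumed TCP or UDP)."""
    # The low nibble of byte 0 is the header length in 32-bit words.
    header_len = (ip_packet[0] & 0x0F) * 4
    protocol = ip_packet[9]
    src_ip = socket.inet_ntoa(ip_packet[12:16])
    dst_ip = socket.inet_ntoa(ip_packet[16:20])
    # TCP and UDP headers both start with the source and destination ports.
    src_port, dst_port = struct.unpack_from("!HH", ip_packet, header_len)
    return FiveTuple(protocol, src_ip, dst_ip, src_port, dst_port)


# Hypothetical flow table: quintuple -> name of the recognized protocol.
flow_table: Dict[FiveTuple, str] = {}


def lookup_protocol(ip_packet: bytes) -> Optional[str]:
    """Return the recognized protocol for the packet's flow, or None on a miss."""
    return flow_table.get(extract_five_tuple(ip_packet))
```

A hash map keyed on the quintuple is one simple way to model the flow table; a real device might use a different structure.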
If the protocol type of a packet cannot be recognized by flow table matching, protocol recognition, including signature recognition, associated recognition, and heuristic recognition, is performed on the packet to recognize its protocol type, and the flow table is updated. For a packet whose protocol type is recognized during flow table matching, it is determined whether the protocol needs to be parsed. If so, protocol parsing is performed on the packet to obtain the content of certain keywords (also called fields or key fields) in the packet.
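The decision flow in this paragraph might look like the sketch below. It is an assumption-laden illustration: the signature used (an HTTP request method at the start of the payload) is invented for the example, the associated and heuristic recognizers are empty placeholders, and `PARSED_PROTOCOLS` is a hypothetical policy set naming the protocols that should be parsed.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

# (protocol, source IP, destination IP, source port, destination port)
FlowKey = Tuple[int, str, str, int, int]
Recognizer = Callable[[bytes], Optional[str]]


def signature_recognizer(payload: bytes) -> Optional[str]:
    # Hypothetical signature: the payload begins with an HTTP request method.
    if payload.startswith((b"GET ", b"POST ", b"HEAD ")):
        return "HTTP"
    return None


def associated_recognizer(payload: bytes) -> Optional[str]:
    return None  # placeholder: the source does not detail this method


def heuristic_recognizer(payload: bytes) -> Optional[str]:
    return None  # placeholder: the source does not detail this method


RECOGNIZERS: List[Recognizer] = [
    signature_recognizer,
    associated_recognizer,
    heuristic_recognizer,
]

# Hypothetical policy: protocols whose packets should be parsed.
PARSED_PROTOCOLS: Set[str] = {"HTTP"}


def classify(flow_table: Dict[FlowKey, str], key: FlowKey,
             payload: bytes) -> Tuple[Optional[str], bool]:
    """Return (protocol, needs_parsing) for one packet."""
    protocol = flow_table.get(key)
    if protocol is not None:
        # Flow table hit: decide whether the protocol is to be parsed.
        return protocol, protocol in PARSED_PROTOCOLS
    # Flow table miss: try each recognizer, then update the flow table.
    for recognize in RECOGNIZERS:
        protocol = recognize(payload)
        if protocol is not None:
            flow_table[key] = protocol
            break
    return protocol, False
```

Following the text, the parsing decision here is made only for packets whose protocol is found in the flow table; the first packet of a flow merely populates the table.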
During protocol parsing in the prior art, bytes are scanned one by one. Suppose that the HTTP header has the following content:
GET /cn HTTP1.1\n\r Accept:image/gif, image/x-xbitmap, image/jpeg\n\r
The parsing is performed according to the HTTP format, starting from the letter G. For example, once GET has been parsed and a space is found after it, the parser learns that the protocol version number (HTTP1.1) appears after the next several characters (/cn) and another space.
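A minimal sketch of such byte-by-byte scanning is given below, assuming the request line consists of three space-separated fields terminated by a line break. The function name `scan_request_line` and the field names in the returned dictionary are hypothetical.

```python
from typing import Dict

SPACE, CR, LF = 0x20, 0x0D, 0x0A


def scan_request_line(data: bytes) -> Dict[str, str]:
    """Scan the first line one byte at a time into method, target, version."""
    fields = [bytearray(), bytearray(), bytearray()]
    index = 0  # which field the scanner is currently filling
    for byte in data:
        if byte in (CR, LF):
            break  # end of the request line
        if byte == SPACE and index < 2:
            index += 1  # a space ends the current field
        else:
            fields[index].append(byte)
    method, target, version = (f.decode("ascii") for f in fields)
    return {"method": method, "target": target, "version": version}


print(scan_request_line(b"GET /cn HTTP1.1\n\r Accept:image/gif\n\r"))
# prints {'method': 'GET', 'target': '/cn', 'version': 'HTTP1.1'}
```

The positions of the fields are built into the scanning loop itself, so parsing a differently laid out packet means rewriting the loop.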
During the implementation of the present invention, the inventor discovers the following disadvantages in the prior art:
When protocol parsing is performed by scanning bytes one by one, parsing according to new rules requires knowing how the various fields are distributed in the new protocol packet. This process is time-consuming and complicated, and it does not facilitate extension to new rules.