With the growing number of Internet Protocol (IP) network bearer services and the growing importance of network security, it is no longer sufficient for a data communication device to identify only information at layer four and below of the Transmission Control Protocol (TCP)/IP suite. Deep Packet Inspection (DPI) technology implements deep parsing of a packet. The working process of DPI compares the data stream of a packet payload with a character word base, and the corresponding processing is performed according to whether the packet payload matches one or more character words in the base. The matching may include string matching and regular expression matching, and it has been shown that DPI is more efficient when regular expression matching is used. However, the operation complexity of regular expression matching is in direct proportion to the length of the packet to be matched: the longer the packet, the higher the operation complexity.
One current solution is to use a modified regular expression matching method, which generally proceeds as follows. String filtering is first performed on the data stream of the packet to be matched, and regular expression filtering is then performed on the data stream that passes the string filtering; each filtering stage uses the algorithm of the corresponding string matching or regular expression matching. That is to say, the packet is first grouped, each group is shorter, and correspondingly fewer regular expressions apply to it, so the operation complexity may be reduced compared with performing regular expression filtering on the packet as a whole. In the prior art, a precise matching method is used during the string filtering; that is, the string filtering is passed only when the keywords in the data stream to be matched are exactly the same as the character words. Such precise matching requires storing the keywords in the data stream to be matched, and may further require comparing the keywords with the character words.
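The two-stage flow described above can be illustrated with a minimal Python sketch. It is not the implementation of any particular device: the rule names, keywords, and patterns are hypothetical, and it assumes that each character word selects a small group of regular expressions. It only shows how a precise string filter can gate the more expensive regular expression stage, and that the precise filter has to keep the character words in full in order to compare them.

```python
import re

# Hypothetical rule base (an assumption for illustration only): each exact
# keyword selects a small group of named regular expressions.
RULES = {
    b"GET ": [("php-id-query", re.compile(rb"GET /\S*\.php\?id=\d+"))],
    b"USER ": [("ftp-user", re.compile(rb"USER \w+\r\n"))],
}


def match_payload(payload):
    """Return the names of the rules whose regular expression matches payload."""
    hits = []
    for keyword, group in RULES.items():
        # Stage 1: precise string filtering. The keyword is stored in full
        # and compared byte for byte against the payload.
        if keyword not in payload:
            continue
        # Stage 2: only the regular expressions tied to this keyword are run.
        for name, regex in group:
            if regex.search(payload):
                hits.append(name)
    return hits


if __name__ == "__main__":
    print(match_payload(b"GET /index.php?id=42 HTTP/1.1\r\n"))
    print(match_payload(b"POST /index.php?id=42 HTTP/1.1\r\n"))
```

In this sketch the second payload never reaches the regular expression stage, because neither keyword occurs in it.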
In implementing the present disclosure, the inventor found that the prior art has at least the following problems. The keywords in the data stream need to be stored, which occupies a larger storage space, and performance may be reduced because the keywords must be further compared with the character words.