Traffic classification refers to classifying received packets. It is one of the key functions of a router and provides technical support for complicated value-added services of the router, such as network security, QoS (Quality of Service), load balancing and traffic counting.
A basic idea of a traffic classification method based on a decision tree is as follows: a rule set is recursively divided according to a certain segmentation policy until the number of rules in each sub-rule set is less than a preset bucket size. The segmentation produces a decision tree in which each intermediate node stores the method used to segment the rule set at that node, and each leaf node stores a sub-rule set, that is, all the rules that may possibly match.
During the classification of a received packet, the related domains (header fields) are first extracted from the packet header to form keywords; the created decision tree is then traversed with the keywords, the keywords are compared with the rules in the leaf node that is reached, and finally the matching rule with the highest priority is obtained. Algorithms based on the decision tree include HiCuts (one-dimensional segmentation), HyperCuts (multi-dimensional segmentation) and Modular (bit-selection segmentation).
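The build and lookup steps described above can be illustrated with a minimal sketch. It is not HiCuts, HyperCuts or Modular: it assumes equal-width cuts that simply cycle through the domains, and the names `Rule`, `Node`, `build`, `classify`, the bucket size of 3 and the convention that a smaller `priority` value means higher priority are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Range = Tuple[int, int]  # inclusive (lo, hi)


@dataclass
class Rule:
    priority: int              # assumption: smaller value = higher priority
    ranges: Tuple[Range, ...]  # one range per domain; the full range acts as '*'


@dataclass
class Node:
    rules: Optional[List[Rule]] = None       # set on leaf nodes only
    dim: int = 0                             # domain cut at this node
    lo: int = 0                              # lower bound of the cut domain
    width: int = 1                           # width of each child interval
    children: Optional[List["Node"]] = None  # set on intermediate nodes only


def build(rules, space, bucket_size=3, cuts=2, depth=0, max_depth=8):
    """Recursively divide the rule set until a leaf holds fewer than bucket_size rules."""
    dim = depth % len(space)  # simplification: cycle through the domains
    lo, hi = space[dim]
    width = (hi - lo + 1) // cuts
    if len(rules) < bucket_size or depth >= max_depth or width == 0:
        return Node(rules=sorted(rules, key=lambda r: r.priority))
    children = []
    for i in range(cuts):
        c_lo = lo + i * width
        c_hi = hi if i == cuts - 1 else c_lo + width - 1
        # A rule goes into every child whose interval it overlaps.
        sub = [r for r in rules
               if r.ranges[dim][0] <= c_hi and r.ranges[dim][1] >= c_lo]
        sub_space = space[:dim] + ((c_lo, c_hi),) + space[dim + 1:]
        children.append(build(sub, sub_space, bucket_size, cuts, depth + 1, max_depth))
    return Node(dim=dim, lo=lo, width=width, children=children)


def classify(node, key):
    """Traverse the tree with the key, then compare the key with the leaf rules."""
    while node.children is not None:
        idx = min((key[node.dim] - node.lo) // node.width, len(node.children) - 1)
        node = node.children[idx]
    for rule in node.rules:  # leaf rules are sorted by priority
        if all(lo <= k <= hi for k, (lo, hi) in zip(key, rule.ranges)):
            return rule
    return None


# Two 4-bit domains; the last rule is a wildcard in both domains.
SPACE = ((0, 15), (0, 15))
RULES = [
    Rule(0, ((0, 3), (0, 15))),
    Rule(1, ((4, 7), (0, 7))),
    Rule(2, ((8, 15), (8, 15))),
    Rule(3, ((0, 15), (0, 15))),
]
TREE = build(RULES, SPACE)
```

Calling `classify(TREE, key)` with a two-element key returns the highest-priority matching rule stored in the leaf that the key reaches. In this sketch the wildcard rule ends up in every leaf, which is the rule duplication discussed next.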
However, in the traffic classification methods based on the decision tree, because wildcards ‘*’ exist in the rules, rule duplication (one rule being stored in several sub-rule sets) is hard to avoid, which results in problems such as growing memory usage and low segmentation efficiency.
To address the above problems, the prior art improves the traffic classification method based on the decision tree as follows: first, an original rule set is divided into several non-overlapping sub-rule sets, and then a decision tree is created for each of the obtained sub-rule sets.
A process of dividing the original rule set into several sub-rule sets may be implemented in the following manners:
1) classifying the rules according to a prefix; for example, during the classification of standard IPv4 quintuple rules, the rules may be classified according to the prefix of the source IP address and/or the destination IP address; and
2) classifying the rules according to a range; for example, during the classification of the standard IPv4 quintuple rules, the rules may be classified according to the range of the source port and/or the destination port.
If the original rule set is divided with respect to merely one domain, the subclasses obtained in manner 1) or 2) are the required sub-rule sets. If the original rule set must be divided with respect to multiple domains (for example, the IPv4 quintuple rules may need to be divided with respect to 5 domains), the subclasses obtained by the different classification methods may be combined according to an intersection product method to obtain multiple non-overlapping sub-rule sets. For example, if the original rule set is divided according to one address domain and one port domain, the original rule set is first divided into s1 subclasses by manner 1) and into s2 subclasses by manner 2), and the intersection product method then divides the original rule set into s1*s2 sub-rule sets.
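The intersection product step can be sketched as follows. The source does not state the criteria used to form the subclasses, so `prefix_class` and `port_class` below are placeholder assumptions (splitting by prefix length and by port-range width), and the dictionary-based rule layout and all names are hypothetical. Only the combination of s1 and s2 subclasses into s1*s2 sub-rule sets is the point of the example.

```python
from itertools import product


def prefix_class(rule):
    # Placeholder criterion for manner 1): class by source prefix length.
    return "short" if rule["src_prefix_len"] < 16 else "long"


def port_class(rule):
    # Placeholder criterion for manner 2): class by destination port range width.
    lo, hi = rule["dst_port"]
    return "wide" if hi - lo >= 1024 else "narrow"


def intersection_product(rules, classifiers, labels):
    """Place each rule in the sub-rule set named by its tuple of subclasses."""
    subsets = {combo: [] for combo in product(*labels)}  # s1 * s2 * ... sets
    for rule in rules:
        subsets[tuple(classify(rule) for classify in classifiers)].append(rule)
    return subsets


RULES = [
    {"id": "r1", "src_prefix_len": 8, "dst_port": (0, 65535)},
    {"id": "r2", "src_prefix_len": 24, "dst_port": (80, 80)},
    {"id": "r3", "src_prefix_len": 32, "dst_port": (1024, 65535)},
    {"id": "r4", "src_prefix_len": 12, "dst_port": (443, 443)},
]

SUBSETS = intersection_product(
    RULES,
    classifiers=(prefix_class, port_class),
    labels=(("short", "long"), ("wide", "narrow")),
)
```

With s1 = 2 address subclasses and s2 = 2 port subclasses, `SUBSETS` has 2*2 = 4 entries, and each rule appears in exactly one of them.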
By using the improved traffic classification algorithm based on the decision tree, the original rule set may be divided into “fully” non-overlapping sub-rule sets, which reduces rule duplication to a certain degree. However, in the process of classifying packets by using the above improved traffic classification algorithm, the inventors find that the prior art has at least the following problems.
Whether rule duplication occurs during segmentation depends on whether wildcards “*” exist at the bits of the rule that are used for segmentation, not on whether the domains of the rules overlap. Therefore, the above solution is applicable only to traffic classification algorithms that segment entirely according to the domain.
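A small sketch makes this dependence concrete. It assumes rules written as ternary strings over '0', '1' and '*', and a hypothetical helper `split_on_bits` that segments a rule set on selected bit positions; it is an illustration of bit-selection segmentation in general, not the Modular algorithm itself.

```python
from itertools import product


def split_on_bits(rules, bit_positions):
    """Segment ternary rules on the given bits; a '*' at a cut bit matches both values."""
    children = {"".join(bits): [] for bits in product("01", repeat=len(bit_positions))}
    for rule in rules:
        for key, bucket in children.items():
            if all(rule[pos] in ("*", bit) for pos, bit in zip(bit_positions, key)):
                bucket.append(rule)
    return children


def stored_rules(children):
    """Total number of rule copies stored across all children."""
    return sum(len(bucket) for bucket in children.values())


# The two rules differ at bit 0, so they do not overlap.
RULES = ["0*1*", "1*0*"]

CUT_ON_FIXED_BIT = split_on_bits(RULES, [0])     # no wildcard at bit 0
CUT_ON_WILDCARD_BIT = split_on_bits(RULES, [1])  # both rules have '*' at bit 1
```

Cutting on bit 0 stores each rule once, while cutting on bit 1 copies both rules into both children, even though the two rules never match the same packet.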