The present invention relates to data classification in general, more particularly to a method for classifying a datum according to a set of filters, and more specifically to a method for classifying packets of information traveling across networks.
Due to the explosive growth of networking (e.g., the Internet) in recent years, and with the advance of firewalls and service differentiation, there is an increased demand for fast message-filtering machines (routers) that can filter packets according to data in various fields of the packet's header at very high throughput.
The filtering enables the routers to decide which packets to block, which to forward and at what priority, how much to charge the flow, and what service to provide to each packet, i.e., on which queue to place the packet. The starting point, namely the problem description and setup, is given by V. Srinivasan, G. Varghese, S. Suri, and M. Waldvogel, "Fast and scalable layer four switching", in Proc. ACM SIGCOMM 98, September 1998.
Each router has a set of filters by which packets should be filtered. Each filter consists of a set of K value-ranges. Each value-range specifies a range of values that are acceptable for a certain parameter in the packet header. A value-range is either exact, a range or star (*).
An exact value-range specifies that the corresponding packet parameter has to match exactly the value given in the value-range. An exact value-range may be used, for example, to block packets coming from a particular source. A range value-range specifies a range of values in which the corresponding packet parameter should reside. A range value-range may be used to block packets arriving from a subnet. The range may be specified as a prefix or as a pair of minimum and maximum values. A star (*) value-range specifies "don't care", i.e., the corresponding packet parameter may have any legal value.
Each message going through the classifying machine has several parameters by which it is classified, such as destination address, source address, destination port number, source port number and protocol type. This information is contained in fields in the packet header. The classifier has a large set of filters; each one specifies a valid (acceptable) range of values for each of the message's parameters in the packet header.
A packet matches a filter only if the value in each field of the packet complies with the corresponding value range in the filter.
In addition, filters are ranked according to a predetermined priority. If a packet matches several filters, the highest priority filter it matches is the filter which applies to the packet.
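The filter model described above can be sketched as follows. This is a minimal illustration only: the names `Filter`, `exact`, `star`, `prefix`, `matches` and `best_match` are hypothetical, every value-range is stored as an inclusive (low, high) pair, and a larger priority number is assumed to mean a higher priority.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

# Assumption: every value-range is kept as an inclusive (low, high) pair.
ValueRange = Tuple[int, int]


def exact(value: int) -> ValueRange:
    """Exact value-range: only one value is acceptable."""
    return (value, value)


def star(width_bits: int) -> ValueRange:
    """Star (*) value-range: any value of the given bit width is acceptable."""
    return (0, (1 << width_bits) - 1)


def prefix(value: int, prefix_len: int, width_bits: int) -> ValueRange:
    """Range covered by the first prefix_len bits of value (e.g. a subnet)."""
    free_bits = width_bits - prefix_len
    low = (value >> free_bits) << free_bits
    return (low, low | ((1 << free_bits) - 1))


@dataclass(frozen=True)
class Filter:
    name: str
    priority: int                  # assumption: larger means higher priority
    ranges: Sequence[ValueRange]   # one value-range per header field


def matches(f: Filter, packet: Sequence[int]) -> bool:
    """A packet matches only if every field lies in the corresponding range."""
    return all(lo <= v <= hi for (lo, hi), v in zip(f.ranges, packet))


def best_match(filters: Sequence[Filter], packet: Sequence[int]) -> Optional[Filter]:
    """Return the highest priority matching filter, or None if none matches."""
    hits = [f for f in filters if matches(f, packet)]
    return max(hits, key=lambda f: f.priority, default=None)


# Two header fields: an 8-bit source address and a 16-bit port (toy widths).
filters = [
    Filter("block-host", 10, [exact(165), star(16)]),
    Filter("subnet", 5, [prefix(0b10100000, 4, 8), (80, 443)]),
]
```

With these two filters, a packet (165, 80) matches both, and the higher priority filter "block-host" applies.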
The specific classification problem for packets on the net is relatively new, but its relevance is growing because of the exponential increase of traffic across the net.
The best matching filter problem can be stated as a problem in computational geometry. Each filter specifies a K dimensional axis parallel box, and each packet with K parameters defines a point in K dimensions. The corresponding computational geometry problem is as follows: given a point and a set of axis parallel boxes in K dimensions, find the highest priority box in which the point resides. This problem is known as the Stabbing Query Problem in computational geometry. It was described by M. de Berg, M. van Kreveld, and J. Snoeyink, "Two and three-dimensional point location in rectangular subdivisions", in J. Algorithms, 18, 256-277, 1995.
The present art with regard to mechanisms for solving such classification problems is described by T. V. Lakshman and D. Stiliadis, "High speed policy-based packet forwarding using efficient multidimensional range matching", in Proc. ACM SIGCOMM 98, September 1998, and by Srinivasan et al., 1998. Both references deal with the best matching filter problem and provide efficient solutions for the two dimensional case, i.e., for filtering packets according to two fields.
The time it takes to solve the best matching filter query problem in Srinivasan et al., 1998 is proportional to w, while using memory of size nw, where w is the number of bits in a value (e.g., an IP address) and n is the total number of filters. The time it takes the hardware solution suggested in Lakshman et al., 1998 to solve the two dimensional case is proportional to (w + log n), and it requires O(n) space. Note that logarithms herein are base 2 logarithms.
When packet classification is performed on more than 2 parameters, the solutions provided by these papers are not as attractive. The multi-dimensional solution in Srinivasan et al., 1998 suggests maintaining a cross product table, each entry of which corresponds to a set of packet headers that comply with exactly the same subset of filters. An arriving packet is independently and separately classified according to each of its header fields to give it the K coordinates in the cross product table.
If that table entry is in the cache, classification is done. If, however, that entry is not in the table, a linear search costing O(nK) time is used. The full size of the cross product table is $d_1 \cdot d_2 \cdots d_K$, where $d_i$ is the number of different value-ranges in the i-th dimension, i.e., the number of filters with different value-ranges in the i-th coordinate.
The solution of Lakshman et al., 1998 is in hardware and requires O(n) steps in the multi-dimensional case. Being a hardware solution it is inflexible, and hard to adapt to changes in the filters, which is a must in networking today.
Applications that rely on packet filtering may add and remove new filters many times during operation. For example, in a network that supports QoS (Quality of Service), a new filter is added to the routers along a path of a new flow that has particular QoS requirements. Therefore, we cannot rely on having all the filters preprocessed and present in a large cross product table stored in memory, or in hardware.
The 2 dimensional solution is not always satisfactory, since there may be situations in which a range value-range is used in more than 2 fields of the filters. For example, in virtual networks it may be desirable to allow traffic from a range of sources to a range of destinations over a range of ports and protocols to go through the routers of our virtual private network.
In the general case, for filters with K dimensions, the set of n filters is represented as:
filter 1: $[l_1^1, r_1^1] \times \cdots \times [l_K^1, r_K^1]$
filter 2: $[l_1^2, r_1^2] \times \cdots \times [l_K^2, r_K^2]$
filter n: $[l_1^n, r_1^n] \times \cdots \times [l_K^n, r_K^n]$
where $l_i^j$ and $r_i^j$ are the lower and the upper boundaries, respectively, of filter j in dimension i.
The packet is considered to be a point in K dimensions. The parameters of the packet received are $(\upsilon_1, \upsilon_2, \ldots, \upsilon_K)$. These are called the coordinates of the packet.
The problem is then to find the highest priority filter in which the packet's parameters reside, i.e., the highest priority filter j such that: $l_1^j \le \upsilon_1 \le r_1^j$ and ... and $l_K^j \le \upsilon_K \le r_K^j$.
The packet satisfying these inequalities is said to stab filter j. To solve the problem of the filters being stabbed by the packet, several methods have been suggested which are briefly described below:
The most straightforward approach to solving the problem is to sort the filters in decreasing order of priority. A query is then carried out by scanning all the filters in the sorted order, each one in all dimensions, until the first filter that the packet stabs in all the dimensions is found. Such a solution requires linear space, linear O(Kn) query time, and at most O(log n) time to add a new filter to the sorted list.
Another naive approach is to first find all the filters that the packet matches in the first parameter only, i.e., all the filters such that $l_1 \le \upsilon_1 \le r_1$, namely, all the filters that the packet stabs in the first coordinate. This is shown in FIG. 2A, to which reference is now made. The next step is to solve the problem in $K-1$ dimensions (e.g., using the linear scan mentioned above) on the subset of "candidate" filters that have been stabbed. In the resulting subproblem the first parameter can be ignored, reducing the dimension of the problem by one.
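Both naive approaches can be sketched in a few lines. The function names are hypothetical, filters are kept as (negated priority, ranges) tuples so that an ascending sort gives decreasing priority, and a larger priority number is assumed to be more important. Note that `bisect.insort` finds the insertion position in O(log n) comparisons, but a Python list still shifts elements on insertion, so this sketch does not reproduce the O(log n) update bound of a suitable sorted structure.

```python
import bisect


def add_filter(sorted_filters, priority, ranges):
    """Insert a filter, keeping the list in decreasing order of priority."""
    bisect.insort(sorted_filters, (-priority, tuple(ranges)))


def linear_scan(sorted_filters, packet):
    """Return the priority of the first (highest priority) filter stabbed."""
    for neg_priority, ranges in sorted_filters:
        if all(lo <= v <= hi for (lo, hi), v in zip(ranges, packet)):
            return -neg_priority
    return None


def scan_after_first_coordinate(sorted_filters, packet):
    """Keep only filters stabbed in coordinate 1, then scan K-1 dimensions."""
    candidates = [
        (neg_priority, ranges[1:])            # first coordinate can be ignored
        for neg_priority, ranges in sorted_filters
        if ranges[0][0] <= packet[0] <= ranges[0][1]
    ]
    return linear_scan(candidates, packet[1:])


fs = []
add_filter(fs, 1, [(0, 255), (0, 65535)])
add_filter(fs, 9, [(10, 20), (80, 80)])
add_filter(fs, 5, [(0, 100), (0, 1023)])
```

Both functions return the same answer for any packet; the second merely separates the first-coordinate test from the remaining K-1 coordinates.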
To find all the filters stabbed by the packet's first coordinate, the following operation and data structure (which belong to the prior art) are defined:
a. 1. One Axis Projection:
Let EP be the set of all the beginnings and ends of value-ranges in one coordinate of a filter, as shown in FIG. 2B to which reference is now made, i.e.:
$EP = \{l_1^1, r_1^1, l_1^2, r_1^2, \ldots, l_1^n, r_1^n\}$
Next, the elementary intervals that this set of points define are considered. An elementary interval is either the maximal interval between two points in EP such that no other point of EP lies in its interior, or, it is a point of EP.
There are at most $4n-1$ elementary intervals defined by the 2n points of the projection on one axis. That is, let $\{I_1, I_2, \ldots, I_m\}$ be the result of sorting the distinct elements of EP in increasing order. Then the elementary intervals shown in FIG. 2B are: $\{[I_1, I_1], (I_1, I_2), [I_2, I_2], (I_2, I_3), \ldots, [I_m, I_m]\}$, a total of $2m-1 \le 4n-1$ intervals ($m \le 2n$, and m may be smaller since some points in EP may be equal).
The important property of an elementary interval is that all the values in this interval stab exactly the same subset of filters in the corresponding coordinate. Therefore, with each elementary interval we associate the subset of filters that are stabbed in this coordinate by values in this interval.
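The construction can be sketched as follows for one coordinate. The function name and the tuple layout (kind, low, high, filter ids) are illustrative assumptions; filters are identified by their position in the input list, and the subset of each interval is stored explicitly, which is the O(mn) space variant.

```python
def elementary_intervals(ranges):
    """Build elementary intervals for one axis with their stabbed filter sets.

    ranges: inclusive (l, r) pairs, one per filter, for a single coordinate.
    Returns a list alternating point intervals [p, p] and open intervals (p, q).
    """
    points = sorted({p for l, r in ranges for p in (l, r)})
    result = []
    for idx, p in enumerate(points):
        # Filters stabbed by the single value p.
        at_point = frozenset(f for f, (l, r) in enumerate(ranges) if l <= p <= r)
        result.append(("point", p, p, at_point))
        if idx + 1 < len(points):
            q = points[idx + 1]
            # No endpoint lies strictly between p and q, so a filter covers
            # the open interval (p, q) exactly when l <= p and q <= r.
            between = frozenset(
                f for f, (l, r) in enumerate(ranges) if l <= p and q <= r
            )
            result.append(("open", p, q, between))
    return result


intervals = elementary_intervals([(1, 5), (3, 8), (5, 5)])
```

Here n = 3 filters give m = 4 distinct endpoints and 2m - 1 = 7 elementary intervals, within the 4n - 1 bound.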
In particular, the first stage of the search is performed by locating the elementary interval on the first axis that contains the first coordinate of the packet. Here are three possible ways to locate the elementary interval that is stabbed by the packet in the first coordinate:
a. 1.1 Binary search: A binary search is performed over the set of $2m-1$ elementary intervals. This takes O(log n) time. If, however, we store with each elementary interval the list of filters that are stabbed by values in this interval, the overall space required is O(mn) in the worst case.
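A minimal sketch of this lookup, assuming the distinct endpoints are held in a sorted list and the elementary intervals are numbered from 0 so that even indices are point intervals and odd indices are the open intervals between consecutive points (the numbering and the function name are assumptions for illustration):

```python
import bisect


def locate_elementary_interval(points, value):
    """Return the index of the elementary interval containing value.

    points: sorted list of the distinct endpoints I_1 < I_2 < ... < I_m.
    Index 2k is the point interval at points[k]; index 2k + 1 is the open
    interval between points[k] and points[k + 1]. Returns None when value
    lies outside [points[0], points[-1]], where no filter is stabbed.
    """
    pos = bisect.bisect_left(points, value)
    if pos < len(points) and points[pos] == value:
        return 2 * pos
    if pos == 0 or pos == len(points):
        return None
    return 2 * pos - 1


points = [1, 3, 5, 8]
```

The index returned can then be used to look up the list of filters stored with that interval.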
a. 1.2 Segment tree: A segment tree is a balanced binary tree over the elementary intervals. It facilitates a similar binary search. However, its overall space requirement for storing the lists of filters associated with each interval is only O(n log n). The segment tree is described later in more detail.
The list of filters associated with each elementary interval is stored implicitly in the data structure, as described by J. L. Bentley, "Solutions to Klee's rectangle problems", Technical report, Carnegie Mellon Univ., Pittsburgh, Pa., 1977.
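The idea can be illustrated with a simplified one dimensional segment tree. This sketch deviates from the text in one labeled respect: because the values are integers, it uses half-open slabs [b_i, b_{i+1}) built from the points l and r + 1 instead of separate point and open elementary intervals. The class name and layout are assumptions. Each filter id is stored at O(log n) nodes, and a query collects the ids found on one root-to-leaf path.

```python
import bisect


class SegmentTree:
    def __init__(self, ranges):
        # ranges: inclusive integer (l, r) pairs, one per filter.
        self.bounds = sorted({b for l, r in ranges for b in (l, r + 1)})
        self.n_leaves = len(self.bounds) - 1   # number of half-open slabs
        self.node_ids = {}                     # node number -> filter ids
        for fid, (l, r) in enumerate(ranges):
            lo = bisect.bisect_left(self.bounds, l)
            hi = bisect.bisect_left(self.bounds, r + 1)
            self._insert(1, 0, self.n_leaves, lo, hi, fid)

    def _insert(self, node, node_lo, node_hi, lo, hi, fid):
        # The node covers slabs [node_lo, node_hi); the filter covers [lo, hi).
        if lo <= node_lo and node_hi <= hi:
            self.node_ids.setdefault(node, []).append(fid)
            return
        mid = (node_lo + node_hi) // 2
        if lo < mid:
            self._insert(2 * node, node_lo, mid, lo, hi, fid)
        if hi > mid:
            self._insert(2 * node + 1, mid, node_hi, lo, hi, fid)

    def stab(self, value):
        """Return the sorted ids of all filters whose range contains value."""
        leaf = bisect.bisect_right(self.bounds, value) - 1
        if leaf < 0 or leaf >= self.n_leaves:
            return []
        hits, node, node_lo, node_hi = [], 1, 0, self.n_leaves
        while True:
            hits.extend(self.node_ids.get(node, []))
            if node_hi - node_lo == 1:
                return sorted(hits)
            mid = (node_lo + node_hi) // 2
            if leaf < mid:
                node, node_hi = 2 * node, mid
            else:
                node, node_lo = 2 * node + 1, mid


tree = SegmentTree([(1, 5), (3, 8), (5, 5)])
```

The filter lists are thus never written out per interval; they are implied by the union of the ids along the search path.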
a. 1.3 Best matching prefix: Since all the values in this problem are discrete, each elementary interval can be presented as a disjoint union of at most 2W prefix intervals, where W is the logarithm of the maximum value in the first coordinate (see Srinivasan et al., 1998). A prefix interval is specified by a binary string s, and it contains all the values such that s is a prefix of their binary representation. In Srinivasan et al., 1998, a method is given to represent all the elementary intervals of an axis as prefix intervals. This is achieved by increasing the number of elementary intervals by a factor of at most 2W.
Interpreting the packet coordinate in the first axis as a binary string, the prefix interval it stabs is computed in O(log W) time, as described by M. Degermark, A. Brodnik, S. Carlsson, and S. Pink, "Small forwarding table for fast routing lookups", in Proc. ACM SIGCOMM 97, October 1997.
The space requirement of this scheme is similar to that of the binary search: in the worst case it is O(mn) (to store the lists of filters associated with each elementary interval).
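The decomposition of an interval into prefix intervals can be sketched with the usual greedy splitting. This is an illustration of the general technique, not the specific method of Srinivasan et al., 1998; the function names are hypothetical, and a prefix is represented as a bit string in which the empty string stands for the whole domain.

```python
def range_to_prefixes(lo, hi, width):
    """Split the inclusive range [lo, hi] of width-bit values into prefixes."""
    prefixes = []
    while lo <= hi:
        # Largest power-of-two block that is aligned at lo ...
        size = (lo & -lo) if lo else (1 << width)
        # ... and that still fits inside the remaining range.
        while size > hi - lo + 1:
            size >>= 1
        free_bits = size.bit_length() - 1
        prefix_len = width - free_bits
        bits = format(lo >> free_bits, "b").zfill(prefix_len) if prefix_len else ""
        prefixes.append(bits)
        lo += size
    return prefixes


def stabs_prefix(value, prefix, width):
    """True if prefix is a prefix of the width-bit binary form of value."""
    return format(value, f"0{width}b").startswith(prefix)


worst_case = range_to_prefixes(1, 14, 4)
```

For W = 4, the range [1, 14] splits into six prefixes (0001, 001, 01, 10, 110, 1110), which is within the 2W bound stated above.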
Given the elementary interval $I_\upsilon$ stabbed by the packet's first coordinate, the packet filtering can be completed by searching over the set of filters associated with $I_\upsilon$ (i.e., the filters that are associated with that elementary interval).
One option is to perform the search linearly, however more efficient methods are presented below:
b. K dimensional Packet Indexing: Let the index in the first dimension of packet P be the index of the elementary interval in the first coordinate that this packet stabs. This index is denoted by $i_1(P)$. For packet P we can repeat this process K times, once for each of the K coordinates, resulting in a K element vector: $(i_1(P), i_2(P), \ldots, i_K(P))$.
This is the packet signature. Packet signature of a packet in two dimensions is shown in FIG. 3A to which reference is now made, where the signature of packet P is (12,8).
The time to compute the packet signature in K dimensions is O(K log W) or O(K log n), depending on the method used. In Srinivasan et al., 1998, it is proven that all the packets that have the same signature are mapped to the same filter. It was also suggested there to use caching by signature to facilitate packet classification.
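Signature computation and caching by signature can be sketched as below. The class and function names are hypothetical, the interval numbering is an assumption (even indices for endpoints, odd indices for the gaps, with the two outer regions also receiving indices), and the fallback classifier is supplied by the caller.

```python
import bisect


def interval_index(points, value):
    """Index of the elementary interval of one axis that value falls in."""
    pos = bisect.bisect_left(points, value)
    if pos < len(points) and points[pos] == value:
        return 2 * pos          # value is exactly an endpoint
    return 2 * pos - 1          # gap before points[pos]; -1 below the first


def signature(axes, packet):
    """K element vector of elementary interval indices, one per coordinate."""
    return tuple(interval_index(pts, v) for pts, v in zip(axes, packet))


class SignatureCache:
    def __init__(self, axes, slow_classify):
        self.axes = axes                    # sorted distinct endpoints per axis
        self.slow_classify = slow_classify  # fallback, e.g. a linear scan
        self.cache = {}
        self.misses = 0

    def classify(self, packet):
        sig = signature(self.axes, packet)
        if sig not in self.cache:
            self.misses += 1
            self.cache[sig] = self.slow_classify(packet)
        return self.cache[sig]


# Two filters in decreasing priority order; the result is the filter index.
boxes = [((10, 20), (80, 80)), ((0, 100), (0, 1023))]
axes = [sorted({p for box in boxes for p in box[d]}) for d in range(2)]


def first_match(packet):
    for idx, box in enumerate(boxes):
        if all(lo <= v <= hi for (lo, hi), v in zip(box, packet)):
            return idx
    return None


classifier = SignatureCache(axes, first_match)
```

Packets (15, 80) and (12, 80) share one signature, so the second lookup is served from the cache without invoking the fallback.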
Even for a small number of parameters and filters, the resulting table of all possible signatures is too large to fit into primary or secondary memory devices. Consider, for example, 4 coordinates: source, destination, port, and protocol type, inducing 20,000, 20,000, 100, and 3 elementary intervals, respectively. This combination results in a cache of size $12 \cdot 10^{10}$. This implies, to a certain extent, that even with the use of hashing and caching the hit rate might be too low to be of practical use.
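The table size quoted above is simply the product of the per-coordinate interval counts, as the following short check shows (the dictionary keys are labels chosen for illustration):

```python
import math

# Number of elementary intervals induced on each coordinate in the example.
intervals_per_field = {
    "source": 20_000,
    "destination": 20_000,
    "port": 100,
    "protocol": 3,
}

# One table entry per possible signature, i.e. per combination of intervals.
table_entries = math.prod(intervals_per_field.values())
```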
Furthermore, if the signature of a packet is not present in the cache we have to compute its classification from scratch. Namely, this approach only serves to delay the problem but does not provide a solution to the original problem.
For the general case, when the packet index is not in the cache and no special properties of the packet can be used to reduce the problem to two dimensions, the best method suggested in Srinivasan et al., 1998 is to perform a linear search over the relevant filters. In the worst case this might cost O(Kn) time. This is perhaps acceptable today, when the use of QoS and/or large firewalls is not that common; however, we expect this to change in the near future. In particular, the ability to efficiently handle the general case for a large number of filters will be a critical requirement.
Moreover, it is not clear what such a set of filters will look like in the future, i.e., how many parameters it will have and over what value ranges.
c. Multi dimensional Segment-tree. The general packet classification problem was solved in computational geometry, e.g., by M. de Berg, M. van Kreveld, M. Overmars, and O. Schwarzkopf, in "Computational Geometry: Algorithms and Applications", Springer-Verlag, 1997. There, as in M. de Berg et al., 1995, it is referred to as the stabbing query problem.
The best theoretical solution takes $O(\log^K n)$ time. It uses multi-dimensional segment trees and requires $O(n \log^K n)$ space. While asymptotically, for n going to infinity and constant K, this is considered a good solution, for the typical values of n and K in the packet classification context there are better and more practical solutions. Specifically, this solution is impractical for the K and n values that are typical in our problem (K between 4 and 6 and n more than a hundred).
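A rough feel for why these bounds are unattractive at such sizes can be obtained by evaluating the bound expressions directly. This is only an order-of-magnitude illustration: constant factors are ignored, the function names are hypothetical, and n = 1024 with K = 5 is an assumed example rather than a measured workload.

```python
import math


def segment_tree_query_bound(n, k):
    """log^K n, the query time bound with constants ignored."""
    return math.log2(n) ** k


def segment_tree_space_bound(n, k):
    """n log^K n, the space bound with constants ignored."""
    return n * math.log2(n) ** k


def linear_scan_bound(n, k):
    """K n, the cost of the naive linear scan with constants ignored."""
    return k * n


n, k = 1024, 5
query = segment_tree_query_bound(n, k)   # (log2 1024)^5 = 10^5
space = segment_tree_space_bound(n, k)   # 1024 * 10^5
scan = linear_scan_bound(n, k)           # 5 * 1024
```

For these values the query bound expression is larger than that of the linear scan, which is consistent with the statement that the asymptotic advantage only appears for much larger n.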
d. KD-tree. There are alternative solutions in computational geometry to the stabbing query problem whose query time is sub-linear in the number of filters, n. Although never applied before to packet classification, these solutions are expected to provide practical alternatives with acceptable performance when applied in this context.
One of these solutions actually solves the problem by converting it into its dual problem, called orthogonal range searching. In the dual problem, n points in 2K dimensions are given, and the query is to compute which of the points lie inside a given axis parallel box. A data structure used to solve this problem is known as the KD-tree (M. de Berg et al., 1997). It requires linear space, O(Kn), and can answer a query in $O(n^{1-1/(2K)})$ time.
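The dual formulation can be sketched as follows, under stated assumptions: each K dimensional filter box is mapped to the 2K dimensional point (l_1, ..., l_K, r_1, ..., r_K), and a packet becomes the query box in which the first K coordinates are at most the packet values and the last K coordinates are at least the packet values. The names `Node`, `build`, `range_search` and `stabbed_filters` are hypothetical, and the tree below is a plain textbook KD-tree rather than a tuned implementation.

```python
class Node:
    def __init__(self, point, fid, axis, left, right):
        self.point, self.fid, self.axis = point, fid, axis
        self.left, self.right = left, right


def build(items, depth=0):
    """Build a KD-tree from (point, filter id) pairs by median splitting."""
    if not items:
        return None
    axis = depth % len(items[0][0])
    items = sorted(items, key=lambda it: it[0][axis])
    mid = len(items) // 2
    point, fid = items[mid]
    return Node(point, fid, axis,
                build(items[:mid], depth + 1),
                build(items[mid + 1:], depth + 1))


def range_search(node, lows, highs, out):
    """Collect ids of all points inside the axis parallel box [lows, highs]."""
    if node is None:
        return
    if all(lo <= c <= hi for lo, c, hi in zip(lows, node.point, highs)):
        out.append(node.fid)
    split = node.point[node.axis]
    if lows[node.axis] <= split:       # left subtree holds values <= split
        range_search(node.left, lows, highs, out)
    if split <= highs[node.axis]:      # right subtree holds values >= split
        range_search(node.right, lows, highs, out)


def build_from_filters(filters):
    """Map each box [(l_1, r_1), ...] to the point (l_1..l_K, r_1..r_K)."""
    items = [(tuple(l for l, _ in box) + tuple(r for _, r in box), fid)
             for fid, box in enumerate(filters)]
    return build(items)


def stabbed_filters(root, packet):
    """Filters stabbed by packet: l_i <= v_i and r_i >= v_i for every i."""
    k = len(packet)
    lows = [float("-inf")] * k + list(packet)
    highs = list(packet) + [float("inf")] * k
    out = []
    range_search(root, lows, highs, out)
    return sorted(out)


filters = [[(10, 20), (80, 80)], [(0, 100), (0, 1023)], [(50, 60), (0, 65535)]]
root = build_from_filters(filters)
```

The highest priority filter is then chosen among the ids returned; a practical version would prune the search by priority as well.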
While KD-trees are simple, efficient, and "guaranteed" to perform better than the naive linear-scan algorithm, this is a far cry from the theoretically exciting bounds provided by multi-dimensional segment trees (J. L. Bentley, 1977; M. de Berg et al., 1997), especially for K=1, 2.
To sum up: multi-dimensional segment trees provide an excellent solution to the problem for K=1, 2, but for larger values of K both their space and query time become unacceptable. On the other hand, KD-trees, which provide generally good performance for $K \le 6$, are not expected to perform as well as the segment tree for K=1, 2.
There is therefore a widely recognized need for an algorithm that would overcome the disadvantages of presently known methods as described above, and that quickly (sub-linearly) classifies a packet with a relatively large number of parameters (e.g., 5 or 6) while requiring a near linear amount of space.
In this invention we present a simple and fast algorithm for multi-dimensional packet classification, solving the best matching filter problem.
Our simulated results demonstrate that in practice, our proposed data structure can handle a lot of "difficult" filters quickly and efficiently.
We provide a general method for classifying a packet according to a set of filters which comprises the steps of: (a) providing at least two classification parameters in the packet, each of said classification parameters having a value; (b) providing each filter with an allowable range for each of the values; and, (c) seeking among said filters, at least one filter that is stabbed by the values, using a KD-tree data structure.
The object of the invention is to provide a general algorithm that quickly (sub-linearly) classifies a packet with a relatively large number of parameters (e.g., 5 or 6) while requiring a near linear amount of space. The algorithm combines, integrates and fine-tunes several classical data structures known in computational geometry into an engine for fast packet classification. Part of the integration exploits special properties of IP traffic to facilitate the classification.