1. The Field of the Invention
The field of the present invention is the classification of network communication packets processed in a network stack. More particularly, the invention presents a generalized packet classifier that may be used to classify network communication packets from different software components, such as drivers, that each may have different purposes for making the classification.
2. Present State of the Art
Over time, the usefulness and benefits of stand-alone computing devices, such as the ubiquitous personal computer, have been leveraged by allowing many of such computing devices to communicate one with another over a communications network. Network communication between computers allows many different kinds of applications to exist that are otherwise not possible with a stand-alone computing device. One of the more common and useful applications is simple messaging that many people use in order to communicate by electronic mail, also known as email.
For communicating over a network, information that is to be transported from one computer to another is divided into a number of network communication packets. These network communication packets (also known simply as "packets") will eventually be transported across the physical communications network. In the PC environment, transmission is handled by a network interface card residing in the personal computer. Throughout this application, the PC environment will be assumed though the discussion and application of the concepts apply to many different network computing environments as will be appreciated by those skilled in the art.
The information originating at an application program running on a PC becomes packetized into network communication packets by passing through various software components before arriving at the network interface card for transmission on the physical communications network. The software components are typically layered drivers interconnected as appropriate to form what is known as the network stack.
Each driver or other processing component processes the data in succession, and the original data is recast into a different packetizing scheme at each level down the network stack until it is formed into packets that are transmitted by the network interface. The term "network communication packets" refers to any of the data packets used in the network protocol stack regardless of actual format. The original data is progressively packetized and formatted as it progresses down through the driver layers. For example, a TCP layer encapsulates the data in TCP packets, each of which may be further fragmented into multiple IP packets by an IP layer.
Typically, each driver has a particular function specified by a certain protocol or other constraint. For example, one driver may manage a network protocol such as the internet protocol (IP) while another may manage the actual network interface card. Such modularity allows a variety of different user configurations to be made without rewriting code. For example, the network protocol may be disassociated from any particular physical transmission system. In other words, by using interconnected drivers in the network stack, the same network protocol driver may be used with various physical interface drivers according to the particular physical configuration.
In order to send an email message from one computer to another, the text of the information is placed in a higher level protocol packet along with some packet header information and passed from the application into the network stack. Each element of the network stack may add additional header information and make processing decisions based on the information in the packet itself or any of the header information previously created by higher level drivers. Furthermore, packets at one level of the network stack may be broken down or recast into multiple packets at another level of the network stack. Eventually, all data will be packetized into packets suitable for transmission over the network interface.
Sending an email message may be viewed analogously to sending a letter by regular mail. The body of the message itself is created in both instances by the user. There are a number of processing steps for mailing a letter by regular mail before it is placed into the custody of the postal service (analogous to sending a packet over a communications network). An envelope must be procured, an address written on the envelope, and postage affixed to the envelope prior to placing the letter into the mail box. Each intermediate step may be thought of as header information (addressing, envelope, etc.) created by individual drivers processing an email message for delivery over a communications network between computers.
At the receiving end, packets are passed up the network stack. Each element of the network stack may remove a portion of the header information and make processing decisions based on the information in the packet itself or any of the header information not removed by lower level drivers. Furthermore, multiple packets at one level of the network stack may be combined into aggregate packets at another level of the network stack.
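By way of illustration only, the adding of header information on the way down the network stack and its removal on the way up may be sketched as follows. The layer names follow the drivers of FIG. 1, while the bracketed header format and the function names are assumptions made for this sketch and are not taken from any actual driver.

```python
# Layers listed from the top of the stack to the bottom, as in FIG. 1.
LAYERS = ["A", "B", "C", "D"]


def header_for(layer):
    """Hypothetical header: the layer name in brackets, e.g. b"[A]"."""
    return f"[{layer}]".encode()


def send_down(payload):
    """Each layer, from top to bottom, prepends its own header."""
    packet = payload
    for layer in LAYERS:
        packet = header_for(layer) + packet
    return packet


def receive_up(packet):
    """Each layer, from bottom to top, removes the header its peer added."""
    for layer in reversed(LAYERS):
        header = header_for(layer)
        if not packet.startswith(header):
            raise ValueError(f"missing header for layer {layer}")
        packet = packet[len(header):]
    return packet


wire_packet = send_down(b"hello")
print(wire_packet)              # outermost header belongs to the lowest layer
print(receive_up(wire_packet))  # original payload restored
```

The outermost header is the one added last, so the lowest driver on the receiving side sees its own header first.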
FIG. 1 shows an example of a network stack with an application program 20 at the highest level and a number of drivers at each successive level before reaching the network interface card 22, namely, driver A 24, driver B 26, driver C 28 and driver D 30. Each driver will perform processing in association with the packet before passing the packet on down to the next driver.
A packet is "classified" for certain processing in a given driver based on information about the packet that is contained in the headers or on information inside the data portion of the packet itself. A packet will be processed differently by the driver depending on its classification. As shown in FIG. 1, a driver that needs to classify packets, or to make decisions based on information in a packet, has a special portion of driver code called a packet classifier. In FIG. 1, driver A 24 will use packet classifier 32, driver C 28 will use packet classifier 34 and driver D 30 will use packet classifier 36. As mentioned previously, the packetization of the original data may be of a different format at each different driver. Note that driver B 26 will process packets in such a way that no packet classification is needed.
Each driver will perform different kinds of classification depending on the driver's purpose. In order to better appreciate the different classification scenarios, a number of different types of classification of network communication packets are now provided. The simplest form of classification involves comparing a certain value of a packet with a specific reference value, the classification being based upon matching that particular value, as shown in the example of Table 1 below.
TABLE 1
______________________________________
Classification    Destination Address
______________________________________
0                 11.22.33.44
1                 11.22.55.66
2                 11.22.77.88
3                 11.23.34.45
4                 11.23.45.67
5                 12.34.56.78
______________________________________
Table 1 illustrates six different possible classifications for a packet based on a packet's exactly matching one of the destination addresses listed in the table. For example, a packet will be classified as belonging to classification 0 if and only if its destination address is "11.22.33.44". Similarly, an exact match of the destination address field for a particular packet to be classified is necessary for each of the other five classifications.
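By way of illustration only, the exact-match classification of Table 1 may be sketched as a simple lookup. The reference values are copied from Table 1, while the names `TABLE_1` and `classify_exact` are hypothetical and chosen for this sketch.

```python
# Reference values copied from Table 1: destination address -> classification.
TABLE_1 = {
    "11.22.33.44": 0,
    "11.22.55.66": 1,
    "11.22.77.88": 2,
    "11.23.34.45": 3,
    "11.23.45.67": 4,
    "12.34.56.78": 5,
}


def classify_exact(destination_address):
    """Return the classification, or None when no entry matches exactly."""
    return TABLE_1.get(destination_address)


print(classify_exact("11.22.33.44"))  # matches classification 0
print(classify_exact("11.22.33.45"))  # no exact match, so None
```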
Another form of classification that becomes slightly more complex involves comparing two or more fields from a packet to be classified with specific reference values that all must be matched in order to achieve a classification. This is shown below in the example of Table 2.
TABLE 2
______________________________________
Classification    Destination Address    Destination Port
______________________________________
0                 11.22.33.44            1
1                 11.22.33.44            2
2                 11.22.33.44            3
3                 11.22.55.66            1
4                 11.22.55.66            2
5                 11.22.55.66            3
______________________________________
Table 2 illustrates six different multiple-field classifications, each having a value for the destination address and a value for the destination port. For example, a packet will be classified as belonging to classification 0 if and only if both its destination address is "11.22.33.44" and its destination port is "1". In like manner, for a packet to be classified according to any of the other five classifications shown in Table 2, both the destination address field and the destination port field found in the network packet must exactly match the corresponding entries.
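By way of illustration only, the multiple-field classification of Table 2 may be sketched by keying the lookup on both fields together, so that a match requires every field to agree. The reference values are copied from Table 2, while the names `TABLE_2` and `classify_two_fields` are hypothetical.

```python
# Reference values copied from Table 2:
# (destination address, destination port) -> classification.
TABLE_2 = {
    ("11.22.33.44", 1): 0,
    ("11.22.33.44", 2): 1,
    ("11.22.33.44", 3): 2,
    ("11.22.55.66", 1): 3,
    ("11.22.55.66", 2): 4,
    ("11.22.55.66", 3): 5,
}


def classify_two_fields(destination_address, destination_port):
    """Return the classification only when both fields match exactly."""
    return TABLE_2.get((destination_address, destination_port))


print(classify_two_fields("11.22.33.44", 1))  # both fields match
print(classify_two_fields("11.22.33.44", 4))  # port differs, so None
```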
In Table 3 below, wildcards (represented by "X" or "XX") are introduced into the reference specification to illustrate a more complex form of classification. The values in the table having wildcards may match more than one value.
TABLE 3
______________________________________
Classification    Destination Address    Destination Port
______________________________________
0                 11.22.33.44            1
1                 11.22.33.44            2
2                 11.22.55.66            X
3                 11.22.77.XX            1
4                 11.23.XX.XX            2
5                 11.24.45.XX            X
______________________________________
Of the six classifications shown in Table 3, those with a wildcard value may match more than one value or set of values from a packet to be classified. In other words, different values found in different packets may still receive the same classification. Classifications 0 and 1, on the other hand, contain no wildcards and thus are exact reference specifications, just like those found in Table 2, requiring an identical match in order for a classification to occur. However, classification 2 will be matched by all packets having a destination address value of "11.22.55.66" irrespective of the value of the destination port field in the packet. Similarly, classification 3 will be met by all network packets whose destination address begins with "11.22.77" and whose destination port field has a value of "1".
Essentially, wildcards allow a shorthand representation of a set of specific classifications and can be used advantageously in situations where all of the processing associated with a group of specific classifications is the same. Such a classification containing wildcards or otherwise allowing different values that will result in a match can be referred to as general classification.
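By way of illustration only, a general classification using the wildcards of Table 3 may be sketched as follows. The reference values are copied from Table 3. The treatment of each address octet as a separately wildcarded field, and the names used, are assumptions made for this sketch.

```python
WILDCARDS = {"X", "XX"}

# (classification, destination address, destination port) copied from Table 3.
TABLE_3 = [
    (0, "11.22.33.44", "1"),
    (1, "11.22.33.44", "2"),
    (2, "11.22.55.66", "X"),
    (3, "11.22.77.XX", "1"),
    (4, "11.23.XX.XX", "2"),
    (5, "11.24.45.XX", "X"),
]


def field_matches(reference, actual):
    """A wildcard matches anything; otherwise the values must be equal."""
    return reference in WILDCARDS or reference == actual


def entry_matches(ref_address, ref_port, address, port):
    """Compare the address octet by octet, then compare the port."""
    ref_octets = ref_address.split(".")
    octets = address.split(".")
    if len(ref_octets) != len(octets):
        return False
    return (all(field_matches(r, a) for r, a in zip(ref_octets, octets))
            and field_matches(ref_port, str(port)))


def classify_wildcard(address, port):
    """Return the first matching classification in table order, or None."""
    for classification, ref_address, ref_port in TABLE_3:
        if entry_matches(ref_address, ref_port, address, port):
            return classification
    return None


print(classify_wildcard("11.22.55.66", 9999))  # any port matches classification 2
print(classify_wildcard("11.22.77.5", 1))      # address prefix and port match classification 3
```

Returning the first match is adequate here only because no two entries of Table 3 can match the same packet.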
Because general classifications cover multiple different values, there exists the potential for overlapping classifications. In other words, a given network packet may be legitimately classified in more than one classification. Such overlapping classifications occur in two varieties: (1) a subsuming or hierarchical overlap where each and every value of one classification will be contained in another more general classification, and (2) partial overlap where one classification will share some, but not all, values with another classification. In a subsuming overlap, a packet that matches the more specific classification will by definition match the more general classification while in a partial overlap a given packet may or may not fit into both classifications depending on the actual values of the relevant packet fields.
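By way of illustration only, the two varieties of overlap may be distinguished as sketched below. Each reference specification is flattened into a tuple of fields (four address octets followed by the port), with a wildcard represented by `None`. The sample reference values, this representation and the function names are assumptions made for this sketch.

```python
WILDCARD = None


def to_fields(address, port):
    """Flatten 'a.b.c.d' plus a port into five fields; X or XX becomes WILDCARD."""
    parts = address.split(".") + [str(port)]
    return tuple(WILDCARD if p in ("X", "XX") else p for p in parts)


def overlaps(a, b):
    """True when at least one packet could match both specifications."""
    return all(x is WILDCARD or y is WILDCARD or x == y for x, y in zip(a, b))


def subsumes(general, specific):
    """True when every packet matching `specific` also matches `general`."""
    return all(g is WILDCARD or g == s for g, s in zip(general, specific))


def overlap_kind(a, b):
    if not overlaps(a, b):
        return "none"
    if subsumes(a, b) or subsumes(b, a):
        return "subsuming"
    return "partial"


print(overlap_kind(to_fields("11.22.33.44", 1), to_fields("11.22.33.44", "X")))
print(overlap_kind(to_fields("11.22.XX.XX", 2), to_fields("11.22.55.XX", "X")))
print(overlap_kind(to_fields("11.22.33.44", 1), to_fields("11.22.55.66", 1)))
```

In the second comparison each specification is wildcarded in a field where the other is specific, so neither contains the other and the overlap is only partial.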
Since classification generally requires that a single best classification category be returned, rules and other criteria must be implemented to return a single classification when an overlap condition results. It is often the case that, when a packet matches multiple classifications, the best match is considered to be that which is most specific. Therefore, in the case of a subsuming overlap, the most specific classification would generally be used.
In a partial overlap, a winning classification must be selected based on other criteria since neither classification is more specific than the other. One example criterion is an explicit and distinct priority attribute associated with each classification that has a partial overlap condition. Such a criterion could be used to arbitrate between the overlapping classifications so that a best match may be made in each instance.
Below, in Table 4, examples of subsuming overlapping and partial overlapping classifications are shown.
TABLE 4
______________________________________
Classification    Destination Address    Destination Port    Priority
______________________________________
0                 11.22.33.44            1                   0
1                 11.22.33.44            X                   0
2                 11.22.55.66            1                   0
3                 11.22.55.XX            1                   0
4                 11.22.XX.XX            2                   1
5                 11.22.55.XX            X                   0
______________________________________
Table 4 specifies six classifications, some general and some specific. A packet with a destination address of "11.22.33.44" and a destination port of "1" will match both classification 0 and classification 1. Since classification 0 and classification 1 are subsuming overlapping classifications, according to the general rule the best match would be the most specific classification, which in this case would be classification 0.
A packet having a destination address of "11.22.55.66" and a destination port of "2" will match both classification 4 and classification 5. This is a partial overlap situation since neither classification 4 nor classification 5 is more specific than the other, and the priority information associated with the overlapping classifications is therefore consulted in order to find the best match. Since classification 4 is shown to have a lower priority than classification 5 (a priority value of 1 as against 0 in Table 4, so that a smaller value apparently denotes a higher priority), the best match would be classification 5.
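By way of illustration only, the selection of a single best match among the overlapping classifications of Table 4 may be sketched as follows. The reference values and priorities are copied from Table 4. The sketch assumes that a smaller priority value wins, which is inferred from classification 5 (priority 0) being preferred over classification 4 (priority 1). The field representation and function names are likewise assumptions.

```python
WILDCARD = None


def to_fields(address, port):
    """Flatten 'a.b.c.d' plus a port into five fields; X or XX becomes WILDCARD."""
    parts = address.split(".") + [str(port)]
    return tuple(WILDCARD if p in ("X", "XX") else p for p in parts)


# (classification, reference fields, priority value) copied from Table 4.
TABLE_4 = [
    (0, to_fields("11.22.33.44", 1), 0),
    (1, to_fields("11.22.33.44", "X"), 0),
    (2, to_fields("11.22.55.66", 1), 0),
    (3, to_fields("11.22.55.XX", 1), 0),
    (4, to_fields("11.22.XX.XX", 2), 1),
    (5, to_fields("11.22.55.XX", "X"), 0),
]


def subsumes(general, specific):
    """True when everything matching `specific` also matches `general`."""
    return all(g is WILDCARD or g == s for g, s in zip(general, specific))


def best_match(address, port):
    packet = to_fields(address, port)
    matches = [entry for entry in TABLE_4 if subsumes(entry[1], packet)]
    if not matches:
        return None
    # Subsuming overlap: discard any match that strictly contains another match.
    survivors = [
        m for m in matches
        if not any(o[1] != m[1] and subsumes(m[1], o[1]) for o in matches)
    ]
    # Partial overlap: assumed rule, the smallest priority value wins.
    return min(survivors, key=lambda entry: entry[2])[0]


print(best_match("11.22.33.44", 1))  # subsuming overlap of classifications 0 and 1
print(best_match("11.22.55.66", 2))  # partial overlap of classifications 4 and 5
```

If two surviving classifications carried the same priority value, this sketch would simply return whichever appears first in the table. A real classifier would need a further tie-breaking rule.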
In order to simplify the discussion of classification, certain terms are used throughout this application. A "pattern" is all the classification criteria concatenated together in a certain order. For example, a reference pattern would be the concatenation of the destination address and the destination port as found in Table 4 for each classification. In other words, Table 4 would contain six reference patterns that may be matched by a corresponding classification pattern created from the actual values of the fields in a network communication packet. Again, a classification pattern is created by placing actual values taken from the packet into the prescribed order so that it may be compared with a number of reference patterns in order to arrive at a particular classification should a match occur.
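By way of illustration only, the forming of patterns by concatenation may be sketched as follows. The field widths (four address bytes followed by a two-byte port in network byte order), the use of a mask to express wildcards and all names are assumptions made for this sketch. The text above specifies only that the criteria are concatenated in a certain order.

```python
import struct


def make_classification_pattern(address, port):
    """Concatenate the packet's actual values in the prescribed order."""
    return bytes(int(octet) for octet in address.split(".")) + struct.pack("!H", port)


def make_reference_pattern(address, port):
    """Return (value, mask); wildcarded bytes have a zero value and a zero mask."""
    value, mask = bytearray(), bytearray()
    for octet in address.split("."):
        if octet in ("X", "XX"):
            value.append(0)
            mask.append(0x00)
        else:
            value.append(int(octet))
            mask.append(0xFF)
    if str(port) == "X":
        value += b"\x00\x00"
        mask += b"\x00\x00"
    else:
        value += struct.pack("!H", int(port))
        mask += b"\xff\xff"
    return bytes(value), bytes(mask)


def pattern_matches(pattern, reference):
    """Compare only the bytes that the mask marks as significant."""
    value, mask = reference
    return all((p & m) == v for p, m, v in zip(pattern, mask, value))


reference = make_reference_pattern("11.22.55.XX", "X")
print(pattern_matches(make_classification_pattern("11.22.55.66", 2), reference))
print(pattern_matches(make_classification_pattern("11.22.33.44", 1), reference))
```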
Generally, packet classifiers are developed independently in each driver as part of the driver code development. Because packet classification is similar for all drivers in many respects, this represents a duplication of effort and added complexity in driver code development. This inefficiency exhibits itself in the form of the extra time taken for the packet classification code development as well as the added time for debugging and maintaining the extra code.
Another problem is the repeated classification of the same packet by each driver during run time. Since each driver may perform the same classification for each packet as it passes the packet up or down the protocol stack, redundant processing commonly results.
What is needed is a centralized packet classifier that is accessible by all drivers, or other clients, and that can be used by each individual driver according to the specific purposes of the particular driver. A generalized and centralized packet classifier will reduce code development for drivers requiring packet classification and will further allow features added to the centralized packet classifier to be immediately available to driver developers.
Two main problems exhibit themselves and must be solved for a centralized packet classifier to have any meaningful acceptance by driver developers. First, performance for the actual classification for the generalized and centralized packet classifier must be adequate so that the driver may accomplish its purpose within adequate time criteria. A centralized packet classifier that is used by many drivers will tend to have a larger database of potential classifications or reference patterns than an individualized classifier for a single driver, and the larger reference pattern database may impact the efficiency of classification.
Another problem associated with a generalized and centralized packet classifier is that it may not be flexible enough to meet the customized needs of a particular driver. When making individualized packet classifiers as part of driver development, the driver developer may customize and tune the driver code for optimum performance and applicability to the desired purpose. Because it must provide a clean interface to the classification services for all drivers or clients, a centralized packet classifier may not be adequately flexible for all drivers.
What is needed is a generalized and centralized packet classifier that may provide classification services to drivers or other clients in a manner that provides enough flexibility for clients with different needs and purposes to benefit therefrom on a functional basis. Furthermore, such a centralized packet classifier must of necessity have adequate performance characteristics and in many instances must be comparable in performance to classification that may be achieved by an individualized packet classifier.