Classification of traffic traveling around a data network makes it possible to decide on behaviors to be adopted for each traffic flow as a function of its classification.
For example, in a firewall, the security configuration generally relies on recognition of protocol properties so as to prevent certain transfers.
Again for example, equipment for managing quality of service allocates priorities to data as a function of complex rules which describe scenarios. Establishing a correspondence between these scenarios and the data packets conveyed within connections relies on techniques for classifying these connections.
Again for example, network monitoring equipment produces statistics for measuring and controlling the state of the network at a particular point. This requires classification and recognition of the various streams which flow through this point.
Again for example, classification of various streams is useful for billing services, since the costs vary depending on whether these services are of audio, video, electronic messaging or database enquiry type. Moreover, it is often essential to correctly identify users of these services in order to guarantee the billing thereof.
The operations for controlling and managing networks thus require classification of connections between various senders and receivers which generate digital data streams over these networks. This requires powerful and reliable methods of classification.
According to the known state of the art, a data packet observation task is assigned to a node of the network, such as a proxy server, through which pass the connections that generate these data packets.
Patent application WO 0101272 discloses a procedure and an apparatus for monitoring traffic in a network. Pattern recognition techniques (also known as pattern matching) applied to predetermined fields of analyzed data packets make it possible to identify a protocol which follows a protocol previously identified in a connection protocol stack, on condition that the protocol previously identified makes it possible to determine the fields and the patterns or values to be recognized therein to identify the following protocol or protocols.
Among such explicit protocols is found the Ethernet protocol for which the packet header specifies whether the following protocol in the protocol stack is for example the LLC protocol or the IP protocol possibly together with its version. Likewise the packet header under IP protocol specifies whether the following protocol in the protocol stack is for example the TCP, UDP or ICMP protocol.
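The explicit chaining described above can be illustrated with a minimal Python sketch. It is not the procedure of the cited application. The function name `explicit_stack` and the two lookup tables are assumptions made for illustration, and only a handful of well-known EtherType and IP protocol values are listed. The sketch reads the next-protocol field of each header in turn and stops as soon as no header announces what follows.

```python
import struct

# A few well-known values; real tables are much longer.
ETHERTYPES = {0x0800: "IPv4", 0x86DD: "IPv6"}
IP_PROTOCOLS = {1: "ICMP", 6: "TCP", 17: "UDP"}


def explicit_stack(frame: bytes) -> list[str]:
    """Follow the explicit next-protocol fields, starting from Ethernet."""
    stack = ["Ethernet"]
    if len(frame) < 14:
        return stack
    # Bytes 12-13 hold the EtherType. Values up to 1500 are an IEEE 802.3
    # length instead, in which case an LLC header follows.
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype <= 1500:
        stack.append("LLC")
        return stack
    name = ETHERTYPES.get(ethertype)
    if name is None:
        return stack
    stack.append(name)
    if name == "IPv4" and len(frame) >= 14 + 20:
        # Byte 9 of the IPv4 header is the Protocol field.
        proto = frame[14 + 9]
        stack.append(IP_PROTOCOLS.get(proto, f"IP protocol {proto}"))
    return stack
```

Nothing in the TCP or UDP header plays the same role for the application layer, which is where the recognition stops in this sketch.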
A problem which arises is that of the recognition of implicit protocols. A protocol is said to be implicit when it is not explicitly identifiable in a definite manner by a protocol header which precedes it in the protocol stack. Such is the case for numerous application-level protocols such as Pointcast or Kazaa, whose use in the protocol stack of a connection depends on the connection's context. This context is generally established by prior negotiations, which are difficult to compile while scanning in real time, in step with the flow, the packets traveling over the connection.
Certain known protocols such as HTTP, Telnet and FTP today sit on the boundary between explicit and implicit protocols. These protocols may be regarded as explicit when a reserved port number in a TCP header gives a destination indicator which identifies in a definite manner the protocol being transported, for example number 80 for HTTP, number 23 for Telnet and number 21 for FTP. Under TCP, a client station uses for example port number 80 to establish an HTTP enquiry connection with a server station, allotting a dynamic port number to a peer connection which allows the server station to respond to the client station. It will be noted here that the explicit nature of the HTTP protocol over the peer connection, which conveys the responses of the server station to the client station, is lessened by the dynamic allocation of a port number related to the context of the enquiry connection. Moreover, nothing today prevents a client station from negotiating beforehand with the server station a port number other than 80 for the HTTP enquiry connection. In this case, the HTTP protocol is more implicit than explicit. The same holds for other protocols. Moreover, an enquiry connection under the FTP protocol engenders, in a known manner, other dynamic connections for the actual transfer of the files, the enquiry connection and its peer connection being used for the transfer of control information. Within the dynamic connection or connections so engendered, the port numbers do not make it possible to explicitly recognize the FTP protocol. Applying filters to the TCP port number field therefore does not make it possible to identify the transported protocol in a definite manner.
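The limits of port-based filtering can be seen in a small Python sketch. The function `classify_by_port` and the sample port pairs are hypothetical and serve only as an illustration. The three reserved port numbers are those named in the paragraph above.

```python
# Reserved port numbers mentioned in the text.
WELL_KNOWN_PORTS = {80: "HTTP", 23: "Telnet", 21: "FTP"}


def classify_by_port(src_port: int, dst_port: int) -> str:
    """Guess the application protocol from the TCP port numbers alone."""
    for port in (dst_port, src_port):
        if port in WELL_KNOWN_PORTS:
            return WELL_KNOWN_PORTS[port]
    return "unknown"


# Hypothetical connections, given as (source port, destination port).
print(classify_by_port(49152, 80))     # HTTP enquiry on the reserved port
print(classify_by_port(49152, 8080))   # HTTP negotiated on another port
print(classify_by_port(50001, 50002))  # FTP data connection on dynamic ports
```

Only the first connection is labeled; the other two come back as "unknown" although they carry HTTP and FTP traffic respectively. The same filter would also label as HTTP any other traffic that happens to use port 80.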
Another problem which arises is that of the recognition of protocols whose implementation varies both through the architecture of their use and through the incessant creation of new protocols.
For example, a conventional architecture for using the Telnet protocol is known which stacks the ordered sequence of protocols Ethernet, IP, TCP, Telnet. Other architectures are possible, stacking the ordered sequence of protocols Ethernet, IP, TCP, HTTP, Telnet or, to manage roaming, Ethernet, IP, IP, TCP, HTTP, Telnet.
The systems of the prior art find it hard to accommodate changes of protocol architecture, that is to say modifications of the dependency links between existing or new protocols, when these systems identify the protocols used by recognizing patterns in fields determined by these dependency links. This drawback is particularly apparent in hardware systems, for which any encounter with connections established according to an unforeseen protocol architecture requires a reconstruction for the sake of efficiency.
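The rigidity of fixed dependency links can be illustrated with a short Python sketch, applied to the three Telnet architectures given above. The table `DEPENDENCY_LINKS` and the function `recognized_prefix` are hypothetical and do not reproduce any particular prior-art system. Each protocol lists the protocols it is assumed to be able to announce as the next layer, and recognition stops at the first link missing from the table.

```python
# Hypothetical dependency links: for each protocol, the set of protocols
# that the system knows how to recognize directly above it.
DEPENDENCY_LINKS = {
    "Ethernet": {"IP", "LLC"},
    "IP": {"TCP", "UDP", "ICMP"},
    "TCP": {"HTTP", "Telnet", "FTP"},
}


def recognized_prefix(stack: list[str]) -> list[str]:
    """Return the part of the stack reachable through the known links."""
    recognized = stack[:1]
    for upper in stack[1:]:
        # Stop at the first layer the current protocol cannot announce.
        if upper not in DEPENDENCY_LINKS.get(recognized[-1], set()):
            break
        recognized.append(upper)
    return recognized
```

With this table the conventional stack Ethernet, IP, TCP, Telnet is recognized in full. The stack Ethernet, IP, TCP, HTTP, Telnet is recognized only up to HTTP, and the roaming stack Ethernet, IP, IP, TCP, HTTP, Telnet only up to the first IP layer. Supporting either one means adding links to the table, which in a hardware system amounts to rebuilding it.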