When traffic flow over data networks is analyzed for purposes such as billing, security, QoS (Quality of Service), and usage reporting, it is essential to have clear and precise information about the classification of the traffic transmitted in a digital communication network (e.g., the Internet) between a source computer (server, router, personal computer, etc.) and a destination computer (end user terminal, server, router, etc.).
Monitoring and classifying traffic flow can consume considerable processing time and power, and both the volume and the speed of traffic data, especially Internet traffic data, are increasing rapidly. Systems for traffic flow analysis very often encounter obstacles at the point where the traffic flow passes through, owing to the various types of heavy processing required to obtain a semantically meaningful, reliable, and useful classification of network traffic.
Classifying the traffic travelling over a communications network makes it possible to decide which behaviour to adopt for each traffic flow as a function of its classification. That is, before a data packet can be adequately processed, the network components must classify it according to its characteristics and the information it contains. Once the packet is classified, the network components can determine how to handle and process it properly. Accurate and efficient data processing therefore depends largely on reliable methods of packet classification.
For example, a security system such as a firewall generally relies on recognition of protocol properties to prevent certain transfers, and devices for managing quality of service allocate priorities to data as a function of complex rules that describe various scenarios. Establishing the correspondence between these scenarios and the data packets conveyed within connections relies on techniques for classifying those connections.
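By way of illustration only, the relationship between classification and handling described above can be sketched as follows. The sketch assumes a deliberately simplified classifier keyed on transport protocol and destination port; the names Packet, PORT_CLASSES, POLICIES, classify, and decide are hypothetical and do not appear in any existing system, and a practical classifier would rely on far richer protocol properties than port numbers.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Packet:
    """Hypothetical, minimal view of a packet header."""
    src: str
    dst: str
    dst_port: int
    transport: str  # "tcp" or "udp"


# Assumed mapping from (transport, destination port) to a traffic class.
PORT_CLASSES = {
    ("tcp", 80): "web",
    ("tcp", 443): "web",
    ("udp", 53): "dns",
    ("udp", 5060): "voip",
}

# Assumed policy table: traffic class -> (action, priority).
# A lower priority number means the traffic is served first.
POLICIES = {
    "voip": ("allow", 0),
    "dns": ("allow", 1),
    "web": ("allow", 2),
    "unknown": ("drop", None),
}


def classify(packet: Packet) -> str:
    """Return the traffic class of a packet, or "unknown" if no rule matches."""
    return PORT_CLASSES.get((packet.transport, packet.dst_port), "unknown")


def decide(packet: Packet) -> Tuple[str, Optional[int]]:
    """Look up the handling behaviour that corresponds to the packet's class."""
    return POLICIES[classify(packet)]
```

In this sketch the firewall behaviour (allow or drop) and the quality of service behaviour (priority) are both obtained from the class alone, which is why an unreliable classification leads directly to incorrect handling.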
Furthermore, analysis and classification of packets often involve the complex task of constructing protocol attributes, i.e., determining the ordered sequence of protocol names used in the semantic stream of data and the parameter names carried by each protocol. Building a graph or knowledge base that recognizes the different protocols is a very heavy task because of the growing number of new protocols used in packet communication networks, as well as the number of protocol modifications and new dependency links.
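As a minimal sketch of what such a graph may look like, the following example stores, for each protocol name, the protocols that may be carried directly on top of it, and checks whether an ordered sequence of protocol names follows known dependency links. The graph contents and the names PROTOCOL_GRAPH, is_known_path, and add_dependency are assumptions made for illustration; a real knowledge base would be much larger and would also record the parameter names carried by each protocol.

```python
# Assumed dependency links: protocol name -> protocols it may carry directly.
PROTOCOL_GRAPH = {
    "ethernet": {"ip"},
    "ip": {"tcp", "udp"},
    "tcp": {"http"},
    "udp": {"dns"},
    "http": set(),
    "dns": set(),
}


def is_known_path(path, root="ethernet"):
    """Return True if every consecutive pair in path is a known dependency link."""
    if not path or path[0] != root:
        return False
    for parent, child in zip(path, path[1:]):
        if child not in PROTOCOL_GRAPH.get(parent, set()):
            return False
    return True


def add_dependency(parent, child):
    """Record a new dependency link, creating graph nodes as needed."""
    PROTOCOL_GRAPH.setdefault(parent, set()).add(child)
    PROTOCOL_GRAPH.setdefault(child, set())
```

Each new protocol or protocol modification requires at least one call such as add_dependency("udp", "quic"), which illustrates why keeping the graph current becomes burdensome as protocols multiply.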
Generally, the analysis of traffic flow on such networks is supported by inserting traffic analyzers at specific locations of the communication link. That is, the task of observing data packets is assigned to a node of the network, for example a proxy server, through which the connections that generate these data packets pass. Existing traffic flow analyses can thus be performed in networked computer systems where, generally, a communication link connects (1) terminals running applications and processing user requests; (2) access points interfacing the workstation and the network, which are commonly modems associated with processing entities of the “set top box” type; (3) a concentrator, which collects the access links of a number of users; (4) a transmission network providing the data transfer service; and (5) a server providing the data to the users. An additional problem is that if traffic is encrypted, packet inspection is impossible unless the access points classify the packets before any of the encryption steps.
This type of architecture is used in popular transmission systems such as DSL, cable, or FTTH (fiber to the home). Other existing transmission networks, such as mobile network systems, can have similar architectures.
The locations at which the traffic analyzers are inserted can be chosen to be representative of the global traffic in the network. However, in addition to posing accuracy and efficiency issues, this approach requires a system whose processing power grows continually to keep pace with the increase in traffic. In other words, processing power cannot be scaled to the requirements of a single user or workstation, because network configurations built around such analyzers demand a large amount of processing power and use it inefficiently. Moreover, the treatment of encrypted traffic must be addressed.
Therefore, it would be desirable to implement a new method and system that address the measurement inaccuracy and efficiency problems by distributing the function of the analyzer among several components of a distributed traffic analysis system. Distributing the analysis function in this way would address the processing power issues, improve the quality of traffic analysis regardless of the transmission speed, and provide a flexible and extensible system and method for recording accurate network performance and behaviour.
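One possible way to picture such a distribution, offered purely as a sketch and not as the method itself, is for each component to summarize only the traffic it observes and for a collector to merge the resulting summaries. The function names local_summary and merge_summaries, and the choice of bytes per traffic class as the recorded measure, are assumptions introduced for this example.

```python
from collections import Counter


def local_summary(observations):
    """Run on one component: total the bytes seen for each traffic class.

    observations is an iterable of (traffic_class, size_in_bytes) pairs.
    """
    summary = Counter()
    for traffic_class, size in observations:
        summary[traffic_class] += size
    return summary


def merge_summaries(summaries):
    """Run on a collector: combine the per-component summaries into one view."""
    total = Counter()
    for summary in summaries:
        total.update(summary)  # Counter.update adds counts rather than replacing them
    return total
```

Under these assumptions, each component processes only its own share of the traffic, and the collector handles compact summaries rather than the packets themselves.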