Detailed visibility into the individual users and business applications that use the global network is essential for optimizing performance and delivering network services to business users. In general, current network monitoring tools can collect a large amount of data from various information sources distributed throughout the network. For example, the Stanford Information Filtering Tool (SIFT) uses an information dissemination server that accepts long-term user queries, collects new documents from information sources, matches the documents against the queries, and continuously updates the users with relevant information. SIFT is able to process over 40,000 worldwide subscriptions and over 80,000 daily documents.
Also, tracking and monitoring traffic in communication networks is particularly relevant for network vendors who wish to provide access to information on their high-end routers; they must therefore devise scalable and efficient algorithms that cope with the limited per-packet processing time available. Traffic monitoring tools are also useful to network providers, as they allow them to filter information relevant to implementing cost-saving measures, such as optimizing network resource utilization, detecting high-cost network traffic, or tracking down anomalous activity in a network. For example, to protect their networks and systems, network providers today deploy a layered defense model, which includes firewalls, anti-virus systems, access management, and intrusion detection systems (IDS). Detecting the propagation of malware as quickly as possible, and reacting efficiently to ongoing attacks inside the network in order to protect the network infrastructure, is becoming a real challenge for network operators.
Network performance monitoring mechanisms need to perform traffic analysis in a way that is non-invasive with respect to the observed networking environment. Detecting attacks and peer-to-peer traffic is a major problem for network managers who want to better utilize and protect their networks. Providing information that helps them do this at minimal cost may be a key differentiator between the services a network offers to users.
From a security point of view, a relevant metric for detecting malware is the number of distinct sources sending traffic to a monitored destination, referred to as “node fan-in”. Destinations with an abnormally large fan-in are likely to be the target of an attack, or to be downloading large amounts of material with a peer-to-peer application (e.g. BitTorrent). The same problem, with the roles of source and destination interchanged, is that of determining the sources that send traffic to the largest number of distinct destinations; this count is known as “node fan-out”. Sources with an abnormally large fan-out may be attempting to spread a worm or virus.
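The two metrics can be illustrated with a minimal Python sketch. It is only an illustration of the definitions, not a method proposed here: it keeps an exact set of peers per node, and the function names (`fan_in`, `fan_out`, `top_k`) and the sample addresses are hypothetical.

```python
from collections import defaultdict


def fan_in(packets):
    """Return {destination: number of distinct sources that sent to it}.

    `packets` is an iterable of (source, destination) pairs; this input
    format is an assumption made for the example.
    """
    sources_by_dst = defaultdict(set)
    for src, dst in packets:
        sources_by_dst[dst].add(src)  # repeated (src, dst) pairs count once
    return {dst: len(srcs) for dst, srcs in sources_by_dst.items()}


def fan_out(packets):
    """Return {source: number of distinct destinations it sent to}.

    Fan-out is fan-in with the roles of source and destination swapped.
    """
    return fan_in((dst, src) for src, dst in packets)


def top_k(counts, k):
    """Return the k nodes with the largest count (ties broken by node name)."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


# Hypothetical sample traffic: three distinct sources reach 10.0.0.9.
packets = [
    ("10.0.0.1", "10.0.0.9"),
    ("10.0.0.2", "10.0.0.9"),
    ("10.0.0.3", "10.0.0.9"),
    ("10.0.0.1", "10.0.0.9"),  # duplicate pair, does not raise the fan-in
    ("10.0.0.1", "10.0.0.8"),
]

print(top_k(fan_in(packets), 1))   # destination with the highest fan-in
print(top_k(fan_out(packets), 1))  # source with the highest fan-out
```

Because this sketch stores one set of peers per node, its memory use grows with the number of distinct source/destination pairs observed.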
Some of the tools used today for establishing the node fan-in or fan-out monitor all packets arriving at a node. These tools require that the respective node be equipped with sophisticated hardware/software for packet inspection at high speed. In addition, these tools require a large amount of memory for maintaining the tables with destination/source information for each packet. Evidently, inspecting every packet arriving at a node is not practical for large traffic volumes or for nodes that are not equipped with sophisticated, expensive hardware components.
Other current methods of traffic monitoring include, for example, “linear counting” (described by Whang, K.-Y., Zanden, B. T. V., and Taylor, H. M. in “A linear-time probabilistic counting algorithm for database applications”), “loglog counting” (see details at http://algo.inria.fr/flajolet/Publications/DuFI03-LNCS.pdf), and “Superspreader algorithms” (see details at http://reports-archive.adm.cs.cmu.edu/anon/2004/CMU-CS-04-142.pdf), to list the most relevant. However, all these tools and algorithms have a number of drawbacks that discourage their use on a large scale: they do not necessarily work with sampled data, are complicated, and require extensive additional programming.
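For orientation, the basic idea behind linear counting can be sketched as follows: each item is hashed to one position of an m-bit bitmap, and the number of distinct items is estimated from the fraction of positions left at zero as n ≈ -m ln(zeros/m). The sketch below is a simplified illustration under stated assumptions rather than the cited authors' implementation: the choice of SHA-256 as the hash function, the default bitmap size, and the name `linear_count` are all assumptions made for the example.

```python
import hashlib
import math


def linear_count(items, m=4096):
    """Estimate the number of distinct items using an m-slot bitmap.

    Assumptions for this sketch: items are converted with str() and hashed
    with SHA-256; m is chosen well above the expected distinct count.
    """
    bitmap = bytearray(m)  # one byte per slot, for simplicity
    for item in items:
        digest = hashlib.sha256(str(item).encode("utf-8")).digest()
        slot = int.from_bytes(digest[:8], "big") % m
        bitmap[slot] = 1  # duplicates hit the same slot, so they count once
    zeros = m - sum(bitmap)
    if zeros == 0:
        raise ValueError("bitmap is full; a larger m is needed")
    # Estimate from the fraction of slots that are still empty.
    return -m * math.log(zeros / m)


# Hypothetical usage: 200 distinct source addresses, each seen three times.
sources = [f"192.0.2.{i % 200}" for i in range(600)]
print(round(linear_count(sources)))
```

Applied to fan-in, one such bitmap would be needed per monitored destination, which is one reason memory use remains a concern for this family of methods.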
A need has arisen for both users and network operators to have better mechanisms to monitor network performance, filter network traffic, and troubleshoot network congestion, without introducing any additional traffic on the communication network. This is especially relevant to Internet providers that must comply with the SLAs (Service Level Agreements) provided to customers. As the Internet architecture evolves, SLAs now include requirements on quality of service, such as jitter, throughput, one-way packet delay, and packet loss ratio. Additionally, the need to monitor network traffic applies broadly to the underlying Internet protocols that enable the World Wide Web.
In particular, there is a need to provide a tool for estimating the destinations with the highest fan-in and/or the sources with the highest fan-out that operates with high accuracy and provides instant feedback. Such a tool also needs to operate in high-speed routers at line speed, without the need for additional complex HW/SW at the network nodes. There is also a need to provide a solution that is extremely simple to implement and exhibits excellent performance on real network traces.