1. Technical Field
The present disclosure relates to systems and methods for identifying top spreaders in high-speed networks.
2. Related Prior Art
Efficiently and accurately identifying the hosts that spread the largest number of flows during an interval of time, the so-called top spreaders, is important for managing a network and for studying host behavior at the application level, with uses that include detecting DDoS attacks, worm propagation, peer-to-peer hot spots, and flash crowds. No previous work has been able to efficiently and accurately identify the top spreaders at very high link speeds, for example, 10 to 40 Gbps.
There has been considerable work on measuring traffic statistics for network management, for security, and for a better understanding of the Internet and its evolution. The size distribution and the matrices of flows may help with network provisioning and traffic engineering. Finding flows that have a large number of packets is useful in billing and accounting. It has also been shown that flow-level communication patterns may further reveal the application-level behavior of each host.
According to one known method in the art, deciding whether a host is a top spreader requires testing whether the flow count of the host is above a threshold. However, it is difficult to fix such a threshold. Even if a threshold can be fixed, it may yield either too many or too few top spreaders. In most situations, only a few top spreaders and their accurate flow counts are of interest. However, no previous work has been able to accurately identify top spreaders on very high speed links in a large network, for example, at speeds of 10 to 40 Gbps, where the total number of hosts is on the order of hundreds of thousands and the total number of flows is on the order of several million, as occurs on ISP backbone links.
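The difference between the threshold test and a top-k selection can be illustrated with a minimal sketch. It is not the disclosed method: it keeps an exact set of peers per host, which would not scale to the link speeds discussed above. It assumes that a flow is a distinct (source, destination) pair, and the function names and sample packets are hypothetical.

```python
import heapq
from collections import defaultdict


def flow_counts(packets):
    """Count distinct flows per source host.

    Assumption: a flow is a distinct (src, dst) pair.
    """
    peers = defaultdict(set)
    for src, dst in packets:
        peers[src].add(dst)
    return {src: len(dsts) for src, dsts in peers.items()}


def spreaders_above(counts, threshold):
    """Threshold test: every host whose flow count exceeds the threshold."""
    return sorted(host for host, count in counts.items() if count > threshold)


def top_k_spreaders(counts, k):
    """Top-k selection: the k hosts with the largest flow counts."""
    return heapq.nlargest(k, counts.items(), key=lambda item: item[1])


# Hypothetical sample traffic as (source, destination) pairs.
packets = [
    ("a", "x"), ("a", "y"), ("a", "z"), ("a", "x"),
    ("b", "x"), ("b", "y"),
    ("c", "x"),
]
counts = flow_counts(packets)

too_many = spreaders_above(counts, 0)   # a low threshold reports every host
too_few = spreaders_above(counts, 5)    # a high threshold reports no host
top_one = top_k_spreaders(counts, 1)    # always exactly k hosts, with counts
```

With the threshold test, the size of the result depends entirely on how well the threshold matches the traffic, whereas the top-k selection returns a fixed number of hosts together with their flow counts.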