Enterprises typically use traffic analysis tools that focus on traffic volume. Such tools identify the heavy hitters (e.g., flows that exchange large amounts of data), yet fail to identify the structure implicit in network traffic: do certain flows happen before, after, or along with each other repeatedly over time? Since most traffic is generated by applications (e.g., web browsing, email, p2p), network traffic tends to be governed by a set of underlying rules. Malicious traffic, such as bot scans (e.g., low-volume probes for vulnerable machines), also exhibits distinct patterns.
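By way of illustration only, the contrast between volume-based analysis and structure-based analysis can be sketched as follows. The flow records, the labels, the fixed time windows, and the function names heavy_hitters and cooccurring_pairs are all hypothetical and chosen for this sketch; simple co-occurrence counting stands in here for whatever pattern-identification technique is actually employed.

```python
from collections import Counter
from itertools import combinations

# Hypothetical flow records: (time_window, flow_label, bytes_exchanged).
flows = [
    (0, "client->dns", 120),
    (0, "client->web", 50_000),
    (1, "client->dns", 110),
    (1, "client->web", 48_000),
    (2, "client->dns", 130),
    (2, "client->web", 52_000),
    (2, "client->backup", 900_000),
]


def heavy_hitters(records, top=1):
    """Volume-based view: rank flow labels by total bytes exchanged."""
    volume = Counter()
    for _, label, nbytes in records:
        volume[label] += nbytes
    return [label for label, _ in volume.most_common(top)]


def cooccurring_pairs(records, min_windows=2):
    """Structure-based view: count how often two flows share a time window."""
    by_window = {}
    for window, label, _ in records:
        by_window.setdefault(window, set()).add(label)

    pair_counts = Counter()
    for labels in by_window.values():
        # Sort so each unordered pair is always counted under the same key.
        for pair in combinations(sorted(labels), 2):
            pair_counts[pair] += 1

    # Keep only pairs that recur in at least min_windows windows.
    return {pair: n for pair, n in pair_counts.items() if n >= min_windows}


print(heavy_hitters(flows))
print(cooccurring_pairs(flows))
```

In this toy data, the volume ranking surfaces only the single large backup flow, whereas the co-occurrence count surfaces the low-volume DNS lookup that repeatedly accompanies the web flow, which is the kind of implicit structure a volume-focused tool would miss.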
In light of the above and the ever-increasing amounts of data, networks and/or enterprises can be increasingly complex to manage, evaluate, and/or troubleshoot. For example, networks (e.g., enterprise networks, educational networks, etc.) can be built from multiple applications, protocols, servers, etc., which interact in unpredictable ways. Once a network is set up, administrative tracking of the network can be overwhelming. For instance, configuration errors can seep in, software upgrades can occur, or servers can be phased out. These are just a few of the occurrences that make managing networks and evaluating traffic challenging. Conventional techniques, such as scripting cron jobs and correlating server logs, are tedious and do not scale well.