The Internet has emerged as a critical communication infrastructure, carrying traffic for a wide range of important scientific, business and consumer applications. Network service providers and enterprise network operators need the ability to detect anomalous events in the network, e.g., for performing network management and monitoring functions, reliability analysis, security and performance evaluations, and the like. While some traffic anomalies are relatively benign and tolerable, others can be symptomatic of potentially serious problems, such as performance bottlenecks due to network element failures, or malicious activities such as denial-of-service (DoS) attacks and worm propagation. It is therefore important to be able to detect traffic anomalies accurately and in near real time, to enable timely initiation of appropriate mitigation steps.
An important property of effective anomaly detection is the ability to characterize, and therefore to isolate, the anomaly. For network service providers and enterprise network operators, characterization might be performed by identifying one or more routers, one or more originating Internet Protocol (IP) addresses, one or more terminating IP addresses, packet type, and other characteristics taken from the packet header and packet payload.
One of the main challenges of detecting anomalies is the sheer volume of traffic and measured statistics. For example, in a system that gathers data at regular intervals to obtain the events that are the basis of the anomalies, the events can impact multiple categories or classifications. The system needs to determine, in real time or near real time, whether the current data is anomalous relative to historical patterns and to current overall statistics for all the categories, and then initiate mitigation steps. Given today's traffic volume and link speeds, the input data stream can easily contain millions of concurrent flows or more, so it is often impossible or too expensive to maintain the entire previously collected data stream. Methods designed for static analysis require adjusting the parameters used for estimation based on the entire collected data and are therefore prohibitively expensive in this setting.
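By way of illustration only, the constraint described above (judging each new interval against history without retaining the full data stream) can be sketched with a generic exponentially weighted moving average kept per category. This is a minimal sketch under stated assumptions and is not the method of the present disclosure: the class name `StreamingAnomalyDetector`, the parameters `alpha`, `threshold` and `warmup`, and the example category key are all hypothetical.

```python
import math


class StreamingAnomalyDetector:
    """Hypothetical per-category detector using constant memory per category.

    Only a running mean, a running variance and an observation count are
    stored for each category; no past measurements are retained.
    """

    def __init__(self, alpha=0.1, threshold=3.0, warmup=5):
        self.alpha = alpha          # weight given to the newest interval (assumed value)
        self.threshold = threshold  # deviation limit, in standard deviations (assumed value)
        self.warmup = warmup        # intervals to observe before flagging anything
        self.mean = {}
        self.var = {}
        self.seen = {}

    def update(self, category, count):
        """Return True if `count` looks anomalous for `category`, then update state."""
        n = self.seen.get(category, 0)
        if n == 0:
            # First observation for this category: initialize and do not flag.
            self.mean[category] = float(count)
            self.var[category] = 0.0
            self.seen[category] = 1
            return False

        mean = self.mean[category]
        var = self.var[category]
        deviation = count - mean
        std = math.sqrt(var)
        anomalous = (
            n >= self.warmup
            and std > 0
            and abs(deviation) > self.threshold * std
        )

        # Incremental exponentially weighted updates of mean and variance.
        # For simplicity the new value is folded in even when it is flagged.
        self.mean[category] = mean + self.alpha * deviation
        self.var[category] = (1 - self.alpha) * (var + self.alpha * deviation ** 2)
        self.seen[category] = n + 1
        return anomalous


# Example: event counts per interval for one hypothetical (router, packet type) category.
detector = StreamingAnomalyDetector()
counts = [100, 102, 98, 101, 99, 100, 101, 99, 500]
flags = [detector.update(("router-1", "TCP"), c) for c in counts]
```

In this sketch the memory used grows with the number of categories rather than with the length of the stream, which is the property that methods designed for static analysis lack.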
Therefore, a need exists for a method and apparatus for near real-time detection of anomalies in streaming cross-classified event data for networks, e.g., data, streaming media, VoIP or SoIP networks.