Many networked computer systems include one or more mechanisms for reporting on events occurring thereon. For example, many network communications devices (e.g., routers, bridges and switches) produce and transmit a notification (or “message”), for diagnostic and debugging purposes, upon processing a network-based event. The notification may, for example, describe the event and how it was processed by the device. The notification may be transmitted over a network protocol, such that any device “listening for” notifications on that protocol is informed that the event was processed by the device. Examples of common network event notifications include “SYSLOG” messages, Simple Network Management Protocol (SNMP) messages, NetFlow messages, and raw Transmission Control Protocol (TCP) packets, among other notification types.
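By way of illustration only, a device “listening for” notifications may be sketched as a process that binds a datagram socket and records each message together with the address of its sender. The sketch below is not drawn from any particular system: the function names `open_listener` and `receive_notifications` are hypothetical, and the use of UDP on an operating-system-assigned port is an assumption made so the example runs without special privileges (SYSLOG messages are conventionally carried over UDP port 514).

```python
import socket


def open_listener(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Bind a UDP socket; port 0 asks the OS for any free port (an assumption
    made here for convenience, not a convention of any notification protocol)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_notifications(sock: socket.socket, count: int, timeout: float = 2.0):
    """Collect `count` datagrams as (sender_ip, text) pairs.

    Raises socket.timeout if a datagram does not arrive within `timeout` seconds.
    """
    sock.settimeout(timeout)
    received = []
    for _ in range(count):
        data, (sender_ip, _sender_port) = sock.recvfrom(4096)
        # Undecodable bytes are replaced rather than rejected, so no message is lost.
        received.append((sender_ip, data.decode("utf-8", errors="replace")))
    return received
```

A caller would open the listener once, read the assigned port with `getsockname()`, and then call `receive_notifications` repeatedly; every notification sent to that address is captured regardless of which device produced it.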
A network event notification may contain the IP address of the device which produced it, and a hexadecimal code which indicates the result of processing the event. The code may indicate, for example, that a requested connection was established, or that a processing error occurred. Because every event processed by every device on a network typically yields at least one notification, the notifications may become voluminous if collected over time.
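As a minimal sketch of the structure just described, a notification carrying a source IP address and a hexadecimal result code may be parsed as follows. The wire format (`<ip> <hex code>` separated by whitespace), the `Notification` type, the `parse_notification` function, and the code values in `RESULT_CODES` are all hypothetical and chosen for illustration; actual devices define their own formats and codes.

```python
import ipaddress
from dataclasses import dataclass

# Hypothetical mapping of result codes to meanings; real devices define their own.
RESULT_CODES = {
    0x0001: "connection established",
    0x00FF: "processing error",
}


@dataclass(frozen=True)
class Notification:
    source_ip: str  # address of the device that produced the notification
    code: int       # result code, parsed from hexadecimal
    meaning: str    # human-readable meaning, or "unknown"


def parse_notification(line: str) -> Notification:
    """Parse an assumed '<ip> <hex code>' line into a Notification."""
    ip_text, code_text = line.split()
    # ip_address() raises ValueError when the text is not a valid IP address.
    source_ip = str(ipaddress.ip_address(ip_text))
    # Base 16 accepts the code with or without a leading '0x'.
    code = int(code_text, 16)
    return Notification(source_ip, code, RESULT_CODES.get(code, "unknown"))
```

For example, `parse_notification("192.0.2.10 0x0001")` yields a record whose `meaning` is `"connection established"`, while a code absent from the table is retained with the meaning `"unknown"` rather than discarded.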
A number of systems exist for monitoring and analyzing network activity, including those which capture notifications, as well as other indications of network activity. These systems are typically designed to detect network events, load information relating to the events into a database, and provide an interface with which a user may analyze the information. However, the volume of network event notifications often significantly hinders these systems. Specifically, because loading any form of data into a conventional database (e.g., a relational database) can inflate the data significantly, the hardware and software components required to store data indicating network activity (particularly for a large-scale network) can be prohibitively costly. Moreover, as a database grows in size, the time and processing capacity required to access information stored therein typically grow geometrically rather than linearly. As a result, many network monitoring systems attempt to minimize the amount of data loaded into a database by summarizing, normalizing, or otherwise abridging it. This may become problematic because, while not all network activity data has equal significance, different portions may be meaningful at different times and in unpredictable ways. Thus, abridging the data may remove a portion which has great significance to diagnosing a particular network issue.