1. Field of the Invention
This invention relates to computer network monitoring. More particularly, it relates to handling the log data generated by such log-producing devices and processes as network firewalls, routers, file servers, VPN servers, operating systems, software applications and the like.
2. Description of the Related Art
Computer networks in general, and private networks such as Local Area Networks (LANs) and intranets in particular, require security devices and processes to protect them from unauthorized access and/or manipulation. A computer firewall is one such device. At the simplest level, it may comprise hardware and/or software that filters the information coming through a network connection (most commonly an Internet connection) into a private network or computer system. If an incoming packet of information is flagged by the filters, it is not allowed to pass through the firewall.
A firewall can implement security rules. For example, a network owner/operator might allow only one particular computer on a LAN to receive public File Transfer Protocol (FTP) traffic. FTP is used to download and upload files. Accordingly, the firewall would allow FTP connections only to that one computer and block them on all others. The administrator of a private network can set up rules such as this for FTP servers, Web servers, Telnet servers, and the like.
Typically, firewalls use one or more of the following methods to restrict the information coming in and out of a private network:
packet filtering: data packets that meet the criteria set by the filter are allowed to proceed to the requesting system while those that do not are blocked from further transmission.
proxy service: information from an external network (such as the Internet) is retrieved by the firewall and subsequently sent to the requesting system. The effect of this action is that the remote computer on the external network does not establish direct communication with any computer on the private network other than the proxy server.
stateful inspection: a comparison of certain key parts of data packets to a database of trusted information. Data going from the private network to the public network is monitored for specific defining characteristics and incoming information is compared to those characteristics. If the comparison is a match within defined parameters, the data is allowed to pass through the firewall.
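By way of illustration only, the packet filtering method can be sketched for the FTP rule described above. The sketch assumes FTP control traffic arrives on TCP port 21; the `Packet` type, the `allow` function and the address `192.168.1.10` are hypothetical names chosen for the example, and the sketch is not the filtering logic of any particular firewall.

```python
from dataclasses import dataclass

FTP_PORT = 21                    # standard FTP control port
FTP_HOST = "192.168.1.10"        # hypothetical: the one LAN computer allowed to receive FTP


@dataclass(frozen=True)
class Packet:
    """Minimal, hypothetical view of an incoming packet."""
    src_ip: str
    dst_ip: str
    dst_port: int


def allow(packet: Packet) -> bool:
    """Return True if the packet may pass the filter.

    Only one rule is modeled: FTP traffic is accepted solely when it is
    addressed to FTP_HOST. All other traffic is passed unchanged, which is
    a simplification made for this example.
    """
    if packet.dst_port == FTP_PORT:
        return packet.dst_ip == FTP_HOST
    return True
```

Under these assumptions, an FTP packet addressed to `192.168.1.10` passes the filter, while an FTP packet addressed to any other computer on the LAN is blocked.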
A company might also use a firewall to block all access to certain IP addresses or allow access only to specific domain names. Protocols define how a client and server will exchange information. Common protocols include: Internet Protocol (IP), the main protocol of the Internet; Transmission Control Protocol (TCP), used to disassemble and assemble information that travels over the Internet; Hypertext Transfer Protocol (HTTP), used for Web pages; File Transfer Protocol (FTP), used to download and upload computer files; User Datagram Protocol (UDP), used for information that does not require a response such as streaming audio and video; Internet Control Message Protocol (ICMP), used by a router to exchange information with another router; Simple Mail Transfer Protocol (SMTP), used to send text e-mail; Simple Network Management Protocol (SNMP), used to obtain system information from a remote computer; and Telnet, which is used to execute commands on a remote computer.
A company might use a firewall or a router to enable one or two computers on its private network to handle a specific protocol and prohibit activity using that protocol on all of its other networked computers.
Similarly, a firewall may be used to block access to certain ports and/or permit port [#] access only on a certain computer.
Firewalls can also be set to “sniff” each data packet for certain words or phrases. For example, a firewall could be set to exclude any packet containing the word “nude.” Alternatively, a firewall may be set up such that only certain types of information, such as e-mail, are allowed to pass through.
Many IT devices and processes produce a log of their activities (hereinafter “raw log data”). One particular type of raw log data is known as “syslog data.” Log data from VPN servers, firewalls and routers commonly comprises date and time information along with the IP addresses of the source and destination of data packets and a text string indicating the action taken by the log-producing device, e.g., “accept” or “deny” or “TCP connection dropped.” An example of raw log data from a Virtual Private Network (VPN) server is reproduced in Table I. Log data from other sources comprises information relevant to the providing source. An example of raw log data from an e-mail server (“sendmail” log data) is reproduced in Table II.
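By way of illustration only, the fields described above (date and time, source address, destination address and action string) might be extracted from a single log line as sketched below. The line layout used here is a hypothetical one invented for the example; it is not the format of Table I or of any particular device, and the names `LINE_RE` and `parse_line` are likewise assumptions.

```python
import re
from datetime import datetime

# Hypothetical layout: "YYYY-MM-DD HH:MM:SS src=<ip> dst=<ip> action=<text>"
LINE_RE = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"src=(?P<src>\d{1,3}(?:\.\d{1,3}){3}) "
    r"dst=(?P<dst>\d{1,3}(?:\.\d{1,3}){3}) "
    r"action=(?P<action>.+)"
)


def parse_line(line):
    """Return a dict of fields for one raw log line, or None if it does not match."""
    match = LINE_RE.fullmatch(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    # Convert the timestamp text into a datetime for later sorting or filtering.
    record["timestamp"] = datetime.strptime(record["timestamp"], "%Y-%m-%d %H:%M:%S")
    return record
```

Lines that do not fit the assumed layout yield `None` rather than an error, so that a stream of mixed log data can be processed without interruption.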
It will be appreciated that periods of high network activity generate large quantities of log data. During an attempted security breach, it may be necessary for network administrators to access the log data to determine the nature of the attack and/or adjust the security parameters in order to better defend against the attack. Although systems may provide a means for viewing the log data in real time or near real time, the sheer quantity of data generated makes it largely impractical to manually glean useful information from raw log data. Accordingly, systems and methods have been developed for parsing and summarizing log data in databases upon which queries may be run in near real time to retrieve relevant information.
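By way of illustration only, one simple form of summarization is to count parsed log records by source address and action, so that an administrator can see at a glance which sources are most often denied. The sketch below assumes records are dictionaries with hypothetical `src` and `action` keys; it is not the summarizing method of the applications referenced herein.

```python
from collections import Counter


def summarize(records):
    """Count parsed log records by (source address, action).

    Each record is assumed to be a dict with "src" and "action" keys;
    these key names are hypothetical and chosen for this example.
    """
    return Counter((record["src"], record["action"]) for record in records)


# Example records, invented for illustration.
records = [
    {"src": "203.0.113.5", "action": "deny"},
    {"src": "203.0.113.5", "action": "deny"},
    {"src": "198.51.100.7", "action": "accept"},
]
summary = summarize(records)
```

A query such as `summary.most_common(1)` then returns the most frequent source and action pair without any need to rescan the raw log data.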
A system and method for parsing log data is disclosed in commonly-owned U.S. provisional patent application Ser. No. 60/525,465 filed Nov. 26, 2003, and a system and method for summarizing log data is disclosed in commonly-owned U.S. provisional patent application Ser. No. 60/525,401 filed Nov. 26, 2003, both of which are hereby incorporated by reference.
Although parsed and summarized data is often more useful and convenient for monitoring network performance, real-time network troubleshooting and the optimization of security parameters, regulatory compliance and/or company policy may necessitate the storage of raw log data. Inasmuch as the above-described systems stored parsed log data and only later forwarded the raw log data, the reliability of the full raw log data streams was reduced. Furthermore, delay issues complicated the raw log data storage and the growing volume of log data created logistical problems. The present invention solves these problems.