1. Field of the Invention
Embodiments of the present invention generally relate to data mining and, more particularly, to a method and apparatus for recursively analyzing log file data in a network.
2. Description of the Related Art
Presently, elements in a network employ various data logging processes to automatically record events within a defined scope in order to provide an audit trail. The network operator may use the audit trail for various purposes, such as diagnosing problems, tracking network access among users, and the like. In particular, a server in a network typically creates and maintains one or more server log files that contain a record of activity performed by the server for client devices. A typical example is a web server that maintains a history of requests for web content received from client devices. The data in a log file may be analyzed to obtain various types of statistics related to the activity of the particular network element.
In one type of log file analysis, network operators track security-related statistics, for example by monitoring Internet access by client devices to detect requests for inappropriate or illicit content. Such Internet access monitoring is typically employed in an enterprise setting. Conventional analysis tools for detecting inappropriate Internet use rely on detecting, in log file entries, particular words or phrases indicative of content that has been deemed inappropriate or illicit for the particular environment. Entries containing such words or phrases are copied and stored in a result file. Such analysis tools, however, generate a substantial number of false matches. In addition, the result file includes an arbitrary sequence of entries without any useful organization of data. Accordingly, there exists a need in the art for an improved method and apparatus for analyzing log file data in a network.
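By way of illustration only, the conventional keyword-matching analysis described above may be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation of any particular tool: the function name `naive_keyword_scan`, the term list, and the sample log entries are all hypothetical.

```python
from typing import Iterable, List

# Hypothetical list of flagged terms; a real environment would define its own.
FLAGGED_TERMS = ("adult", "casino")


def naive_keyword_scan(entries: Iterable[str],
                       terms: Iterable[str] = FLAGGED_TERMS) -> List[str]:
    """Return every entry containing any flagged term, in input order."""
    lowered = [t.lower() for t in terms]
    return [e for e in entries if any(t in e.lower() for t in lowered)]


# Hypothetical web server log entries (illustrative only).
SAMPLE_LOG = [
    '10.0.0.5 - - [01/Jan/2024:09:00:01 +0000] '
    '"GET http://example.com/casino/slots HTTP/1.1" 200 512',
    '10.0.0.7 - - [01/Jan/2024:09:00:02 +0000] '
    '"GET http://example.org/courses/adult-education HTTP/1.1" 200 1024',
    '10.0.0.5 - - [01/Jan/2024:09:00:03 +0000] '
    '"GET http://example.net/news HTTP/1.1" 200 2048',
]

# The "result file" of the conventional approach: matching entries are
# simply copied out in the order in which they were encountered.
results = naive_keyword_scan(SAMPLE_LOG)
for entry in results:
    print(entry)
```

In this sketch the second entry is copied to the results solely because its URL contains the substring "adult", although the request concerns an education course, which illustrates a false match. The results also remain in log order and are not grouped by client device or otherwise organized.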