Technical Field
The present invention generally relates to event identifiers and more particularly to automatically detecting event identifier (ID) content from heterogeneous logs and utilizing event ID content for log sequence analysis.
Description of the Related Art
Because each event ID is unique and is therefore rarely frequent on its own, mining frequent patterns together with their event IDs, and reporting the records in which those patterns occur, provides an efficient way to mine frequent patterns in many types of databases, including multi-table and distributed databases. Some techniques propose a set of algorithms for mining frequent patterns with their event IDs in a single transaction database, in a multi-table database, and in a distributed database. However, those techniques require the event ID attributes in the database to be specified manually, and therefore do not apply to heterogeneous system logs, which are unstructured and have no attribute labels.
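By way of illustration only, the idea of reporting each frequent pattern together with the records in which it occurs may be sketched as follows. The sketch is a brute-force enumeration written for clarity and is not the algorithm of any of the techniques referred to above; the function name `frequent_patterns_with_ids`, the parameters `min_support` and `max_size`, and the sample records are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def frequent_patterns_with_ids(records, min_support, max_size=2):
    """Map each frequent itemset to the sorted record IDs that contain it.

    records: dict of record ID -> list of items (hypothetical input shape).
    min_support: minimum number of records a pattern must occur in.
    max_size: largest itemset size enumerated (brute force, assumed small).
    """
    occurrences = defaultdict(set)
    for record_id, items in records.items():
        unique_items = sorted(set(items))
        for size in range(1, max_size + 1):
            for pattern in combinations(unique_items, size):
                occurrences[pattern].add(record_id)
    # Keep only patterns that occur in at least min_support records.
    return {
        pattern: sorted(ids)
        for pattern, ids in occurrences.items()
        if len(ids) >= min_support
    }


# Hypothetical sample data: the record IDs are specified manually here,
# which is exactly the limitation noted for unstructured logs.
records = {
    101: ["login", "read"],
    102: ["login", "write"],
    103: ["login", "read", "logout"],
}
result = frequent_patterns_with_ids(records, min_support=2)
```

In this sketch the record IDs are supplied as dictionary keys, which mirrors the manual specification of ID attributes that prevents such techniques from being applied directly to unlabeled logs.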
Other techniques propose a general methodology for mining heterogeneous system logs to automatically detect system runtime problems. These techniques first parse console logs by combining source code analysis with information retrieval to create composite features, and then analyze these features using machine learning to detect operational problems. With regard to event IDs in particular, they propose an algorithm that first automatically discovers identifiers, then groups together messages having the same identifier values, and creates a vector per group.
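The three steps described above (discovering identifiers, grouping messages by identifier value, and creating a vector per group) may be sketched as follows. This is a minimal sketch under stated assumptions, not the cited technique itself: it assumes the logs have already been parsed into pairs of a message type and named variables, and it uses a hypothetical heuristic under which a variable is treated as an identifier if every one of its values is reported by at least `min_types` distinct message types. The function names and sample messages are likewise hypothetical.

```python
from collections import Counter, defaultdict


def discover_identifiers(messages, min_types=2):
    """Return names of variables whose every value appears in at least
    min_types distinct message types (an assumed heuristic)."""
    types_per_value = defaultdict(set)
    for msg_type, variables in messages:
        for name, value in variables.items():
            types_per_value[(name, value)].add(msg_type)
    type_counts_by_name = defaultdict(list)
    for (name, _value), types in types_per_value.items():
        type_counts_by_name[name].append(len(types))
    return {
        name
        for name, counts in type_counts_by_name.items()
        if min(counts) >= min_types
    }


def count_vectors(messages, identifier_names):
    """Group messages by (identifier name, value) and build one
    message-type count vector per group."""
    type_order = sorted({msg_type for msg_type, _ in messages})
    counts = defaultdict(Counter)
    for msg_type, variables in messages:
        for name in identifier_names:
            if name in variables:
                counts[(name, variables[name])][msg_type] += 1
    vectors = {
        key: [counter[t] for t in type_order]
        for key, counter in counts.items()
    }
    return type_order, vectors


# Hypothetical pre-parsed messages: (message type, variables).
messages = [
    ("open", {"block": "blk_1", "size": "64"}),
    ("write", {"block": "blk_1", "size": "128"}),
    ("close", {"block": "blk_1"}),
    ("open", {"block": "blk_2", "size": "32"}),
    ("close", {"block": "blk_2"}),
]
identifier_names = discover_identifiers(messages)
type_order, vectors = count_vectors(messages, identifier_names)
```

With the sample messages, `block` is selected as an identifier because each block value is reported by several message types, whereas `size` is not. Each resulting vector counts, per message type, the messages that mention a given identifier value.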