Technical Field
The present invention relates to data processing, and more particularly to heterogeneous log analysis.
Description of the Related Art
Existing work on homogeneous log analysis is typically customized to a single specific application or service. In particular, domain knowledge about the application or service, including the log formats and semantics, is completely available, and homogeneous log analysis tools fully utilize such knowledge. The problem with such tools is that whenever the system is updated, the tools have to be manually updated as well. In addition, they generalize poorly to other arbitrary systems and applications.
Some homogeneous log analysis tools rely largely on mining algorithms to identify the most frequent log sequence patterns in log data, and use such frequent patterns as a model of normal behavior for anomaly detection. Such methods typically suffer from scalability issues and cannot be applied to arbitrarily large systems. In addition, their anomaly detection performance is highly sensitive to the system parameters, which makes system configuration difficult.
Some homogeneous log analysis tools provide analysis of the system but with a strong bias regarding the nature of the system behaviors (e.g., sequential ordering of certain events or causality relations among events). Typically, prior knowledge about the system is accessible, and the analysis is therefore designed around, or to conform to, such knowledge. Such tools are thus also limited in their applicability to systems of a different or unknown nature.