Event-based systems, and particularly the concept of Complex Event Processing (CEP), have been developed and used to control business processes with loosely coupled systems. CEP enables monitoring, steering and optimizing business processes with minimal latency. It facilitates automated, near real-time closed-loop decision making at an operational level to discover exceptional situations or business opportunities. Typical application areas are financial market analysis, trading, security, fraud detection, logistics such as shipment tracking, compliance checks, and customer relationship management.
In an event-based system, any notable state change in the business environment is captured in the form of an event. Events are data capsules holding data about the context of the state change in so-called event attributes. Chains of semantically or temporally correlated events reflect complete business processes, sequences of customer interactions or any other sequence of related incidents.
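The notions of an event, its event attributes and a chain of correlated events can be illustrated with a minimal sketch. The names used here (`Event`, `correlate`, `order_id`) and the sample data are hypothetical and chosen only for illustration; they do not describe any particular CEP product or the method discussed later.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Event:
    """A data capsule describing one notable state change."""
    event_type: str                  # e.g. "OrderReceived" (hypothetical)
    timestamp: float                 # time of the state change
    attributes: Dict[str, Any] = field(default_factory=dict)  # event attributes


def correlate(events: List[Event], key: str) -> Dict[Any, List[Event]]:
    """Group events that share the value of one attribute into
    chronologically ordered chains (assumed correlation rule)."""
    chains: Dict[Any, List[Event]] = defaultdict(list)
    for ev in sorted(events, key=lambda e: e.timestamp):
        if key in ev.attributes:     # events lacking the attribute are skipped
            chains[ev.attributes[key]].append(ev)
    return dict(chains)


# Illustrative data: attributes may have arbitrary types (str, float, bool).
events = [
    Event("ShipmentCreated", 2.0, {"order_id": "A1", "carrier": "X"}),
    Event("OrderReceived", 1.0, {"order_id": "A1", "amount": 99.5}),
    Event("OrderReceived", 1.5, {"order_id": "B7", "amount": 12.0}),
    Event("ShipmentDelivered", 5.0, {"order_id": "A1", "signed": True}),
]
chains = correlate(events, "order_id")
sequence_a1 = [ev.event_type for ev in chains["A1"]]
```

In this sketch the three events sharing `order_id` "A1" form one event sequence that reflects a single order process, ordered by timestamp rather than by arrival order.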
For the analysis of historical event data, but also for an operational event-based system, one question is of particular interest: given an event sequence, which other sequences are similar to it? For data analysis, answering this question helps to search the historical data for incidents and event patterns similar to a known reference pattern. In the operational system, the discovery of similarities can be integrated into the decision processes so that automated system decisions react in near real-time to certain event patterns. In addition, it can be used for forecasting events or process measures based on similar historical incidents.
Current approaches to similarity searching in event sequences are limited in various ways. Time-series similarity allows for the discovery of similarities in sequences of numeric values. Yet inhomogeneous event data, consisting of attributes of arbitrary data types, can only partially be processed, and the discovery of matching sub-sequences is not possible. In addition, these approaches offer no flexibility in modelling and constraining the comparison process, and the comparison of attributes is limited.
Accordingly, there is a need for an improved method for detecting reference sequences in arbitrary sample sequences, which allows for the processing of inhomogeneous event data, the detection of sub-sequences, and/or flexibility in modelling and constraining the detection process, with consistent and reliable results.