Event streams, which are sequences of discrete events in continuous time, arise in many applications, such as system error logs and web search query logs. Learning a model of the temporal dependencies among events in a system provides a useful way to understand how the components of that system interact. For example, a datacenter administrator may want to know how failures on certain machines or machine types affect failures elsewhere in the datacenter, so that maintenance can be scheduled preemptively.
The inherent temporal nature of the problem raises many algorithmic and statistical challenges. In many cases, the model must capture long-range dependencies, for which examining only a few events in a particular time interval is insufficient. For example, what a user queries for at a particular time depends not only on what they queried for in the previous day, but also on what they have previously queried for on that day of the week and at that time of day, and on the topics they have queried for in the past. Similarly, for datacenter logs, the likelihood of a machine failing may depend on many other failures, warnings, and repairs that have accumulated over time, not simply on the machine's status the day before the possible failure. Moreover, not all machines are equally likely to fail, and not all users are equally likely to search for the same topics, so events from different machines or users may be modeled poorly if they are assumed to behave identically given the same history.
While graphical models such as Bayesian networks and dependency networks are widely used to model dependencies between variables, they do not model temporal dependencies. Dynamic Bayesian Networks (DBNs) allow temporal dependencies to be modeled in discrete time. However, it is not clear how continuous-time data should be discretized in order to apply the DBN approach. At a minimum, too slow a sampling rate represents the data poorly, while too fast a sampling rate increases the number of samples, making learning and inference more costly. In addition, allowing long-term dependencies requires conditioning on multiple steps in the past, and a faster sampling rate increases the number of such steps that must be conditioned on. Further, since the number of different event types is typically very large, these models can quickly become intractable.
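The discretization trade-off can be made concrete with a small sketch. The code below is only an illustration under stated assumptions, not part of any DBN implementation: the timestamps, the bin widths, and the helper names `discretize` and `summarize` are all hypothetical. It bins one continuous-time stream at three widths and reports the number of time steps, the number of events that share a bin with an earlier event (and so lose their exact timing and relative order), and the number of past steps needed to cover a fixed history window.

```python
import math


def discretize(timestamps, bin_width, horizon):
    """Count events per fixed-width bin over [0, horizon) (all in minutes)."""
    n_bins = math.ceil(horizon / bin_width)
    counts = [0] * n_bins
    for t in timestamps:
        # Clamp so an event exactly at the horizon falls in the last bin.
        idx = min(t // bin_width, n_bins - 1)
        counts[idx] += 1
    return counts


def summarize(timestamps, bin_width, horizon, history_window):
    """Report the cost and the information loss of one choice of bin width."""
    counts = discretize(timestamps, bin_width, horizon)
    return {
        # Number of discrete time steps the model must handle.
        "n_steps": len(counts),
        # Events sharing a bin with an earlier event; their order is lost.
        "collided_events": sum(c - 1 for c in counts if c > 1),
        # Past steps to condition on in order to cover the history window.
        "lag_steps": math.ceil(history_window / bin_width),
    }


# Hypothetical event times in minutes over a 12-hour log.
events = [30, 42, 192, 198, 204, 594]

for width in (240, 60, 3):
    print(width, summarize(events, width, horizon=720, history_window=360))
```

With these illustrative numbers, the coarsest width merges most events into a single bin, while the finest width separates every event but requires far more time steps and far more past steps to cover the same six-hour history.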