Determining a classification associated with an entity, such as a person, an organization, or an object, can have tremendous importance and numerous applications. For example, if, upon admission to a hospital, a person can be classified as having certain risk factors for developing certain diseases, the risk factors can be used during diagnosis of that person's medical conditions. Of similar value is predicting the class label that an entity or an object will have in the future, because knowledge of the predicted class label allows for planning (e.g., forecasting tasks). Such tasks are commonly accomplished by separate families of techniques. For example, traditional time series forecasting focuses on predicting the value of a future point in a time series. Similarly, one of the goals of relational learning, also known as statistical relational learning, is classifying an object based on the object's attributes and relations to other objects.
While the two families of techniques can be applied to data represented as a graph, the techniques have drawbacks that limit their usefulness. For instance, traditional time series forecasting techniques, such as those described in Box, G. E., G. M. Jenkins, and G. C. Reinsel. “Time Series Analysis: Forecasting and Control.” John Wiley & Sons (2013), the disclosure of which is incorporated by reference, only consider a single time series. In the context of data represented as a graph, such techniques consider only a single node of the graph, representing a single entity, without considering edges that represent the connections of that entity to other entities. In other words, these techniques assume independence among the time series. Multiple possible reasons exist for this approach, such as the amount of observed data being limited and only a single time series being available. Further, in many situations, the dependence between the time series is unknown or unobservable. For example, such dependence may not be observable when data points in a time series are collected independently from each other, such as when the data points represent distinct variables such as wind speed and temperature.
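The independence assumption described above can be illustrated with a minimal sketch. The sketch is not taken from the cited reference; it assumes a simple first-order autoregressive model without an intercept, and the names `fit_ar1`, `forecast_independently`, `node_series`, and `edges` are hypothetical. Each node's next value is forecast from that node's own history only, and the edge list is never consulted.

```python
def fit_ar1(series):
    """Least-squares estimate of phi in x[t] = phi * x[t-1] (no intercept).

    Illustrative assumption: a first-order autoregressive model stands in
    for any single-series forecasting technique.
    """
    num = sum(prev * cur for prev, cur in zip(series[:-1], series[1:]))
    den = sum(prev * prev for prev in series[:-1])
    return num / den if den else 0.0


def forecast_independently(node_series):
    """Forecast the next value of every node from its own history alone."""
    forecasts = {}
    for node, series in node_series.items():
        phi = fit_ar1(series)
        forecasts[node] = phi * series[-1]
    return forecasts


# Hypothetical data: two nodes, each with its own time series.
node_series = {
    "a": [1.0, 2.0, 4.0, 8.0],
    "b": [3.0, 3.0, 3.0, 3.0],
}
# The graph has an edge between the nodes, but the forecaster ignores it.
edges = [("a", "b")]

forecasts = forecast_independently(node_series)
print(forecasts)
```

Because `edges` plays no role in the computation, any influence that one entity has on another is lost, which is the drawback noted above.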
Likewise, traditional multivariate time series forecasting techniques, which account for interrelatedness of time series, also have limited use. Most existing models are based on a fundamental assumption that the time series being processed are pairwise dependent or strongly correlated with each other. Thus, these models assume that each of the time series represents a node in a graph and that each node has an edge to every other node in the graph, forming a clique whose size equals the number of nodes in the graph. When the assumption is incorrect, the results produced by such techniques can be inaccurate.
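The clique assumption can be made concrete with a short sketch. The node names, the observed edge set, and the helper `clique_edges` are hypothetical and serve only as illustration. The sketch compares the edges implied by full pairwise dependence with the edges actually present in a sparse graph.

```python
from itertools import combinations


def clique_edges(nodes):
    """Return every undirected pair of nodes, i.e. the edges of a clique."""
    return {frozenset(pair) for pair in combinations(nodes, 2)}


# Hypothetical graph: four entities, only two observed relationships.
nodes = ["a", "b", "c", "d"]
observed = {frozenset(("a", "b")), frozenset(("c", "d"))}

assumed = clique_edges(nodes)   # n * (n - 1) / 2 pairs
spurious = assumed - observed   # dependencies assumed but not observed

print(len(assumed), len(spurious))
```

For n nodes, the clique has n(n-1)/2 edges, so in a sparse graph most of the dependencies that such a model assumes do not correspond to any observed relationship.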
On the other hand, statistical relational learning techniques, such as those described by Taskar, Ben, and Lise Getoor. “Introduction to statistical relational learning.” MIT Press (2007) and Rossi, Ryan A., et al. “Transforming graph data for statistical relational learning.” Journal of Artificial Intelligence Research 45.1 (2012): 363-441, the disclosures of which are incorporated by reference, generally focus on static graphs, that is, graphs representing connections between entities at a single time point, and ignore any temporal relational information. Such techniques cannot predict a future classification of an entity represented by a node in a graph.
Accordingly, there is a need for a way to assign a classification at multiple time points to a data item included as part of multiple types of graphs. There is a further need for improved ways to perform relational and non-relational classification of data items.