Field
Embodiments described herein generally relate to managing temporal data, and more particularly to supporting a time dimension for managing temporal data.
Background
Temporal data management refers to managing and analyzing data in its current form together with its history. Temporal data management is of increasing importance for business analytics, where past data is used to predict future trends. However, support for temporal queries in commercial databases remains constrained and has not yet reached the level of maturity available for conventional queries.
This may be due to a number of challenges. First, adding a temporal dimension increases the amount of data that needs to be managed, because older versions of data are not overwritten but are retained while newer versions are appended. Second, query performance often decreases as the history of data grows. Third, the representation of time as intervals of varying size can lead to complex and hard-to-evaluate expressions.
At the physical storage layout level, there is no obvious order in which data can be arranged, because at least two sort dimensions are needed: one for starting times and one for ending times. Since temporal predicates are often not very selective, standard multi-dimensional indexes (for example, R-trees) may not be a viable option. Similarly, ad-hoc re-sorting or replicating data with different orderings is not helpful, both because of the associated overhead and because neither approach supports both dimensions efficiently in the same query.
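By way of illustration only, the two-dimensional nature of a temporal predicate can be sketched as follows. The sketch assumes a hypothetical in-memory list of versions, each carrying a half-open validity interval [start, end); the names `Version`, `visible_at`, and `FOREVER` are illustrative and do not appear in the embodiments. Sorting by starting time lets a binary search bound the candidates on one dimension, but the ending time of every candidate may still need to be checked.

```python
from bisect import bisect_right
from typing import NamedTuple

FOREVER = 2**63 - 1  # illustrative sentinel for "still current"

class Version(NamedTuple):
    key: str
    value: int
    start: int  # time at which the version became valid (inclusive)
    end: int    # time at which the version was superseded (exclusive)

def visible_at(versions_by_start, t):
    """Return the versions valid at time t and the number of candidates examined.

    versions_by_start must be sorted by start time.
    """
    starts = [v.start for v in versions_by_start]
    # The start dimension is ordered, so binary search bounds it.
    hi = bisect_right(starts, t)
    candidates = versions_by_start[:hi]
    # The end dimension is not ordered, so each candidate is checked.
    return [v for v in candidates if v.end > t], len(candidates)

history = sorted(
    [
        Version("a", 1, 0, 5),
        Version("b", 7, 1, 3),
        Version("a", 2, 5, FOREVER),
        Version("b", 8, 3, 9),
        Version("c", 4, 6, FOREVER),
    ],
    key=lambda v: v.start,
)

rows, scanned = visible_at(history, 7)
```

In this small example a query at time 7 examines all five versions to return three of them, which reflects the low selectivity of the starting-time predicate when most of the history begins before the queried time.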
Data that is physically ordered by transaction start time keeps update costs low and can also support queries in the time dimension. However, such an ordering may not support common column-store optimizations that are based on re-sorting and compression. Further, the existing data structures for temporal data provide only partial solutions. The existing data structures may have been developed to support a single type of temporal query, such as time travel, temporal aggregation, or temporal join, and may require a different data structure depending on the aggregation function. Furthermore, most of these data structures have been designed with disk-based row stores in mind, optimizing for block I/O and constraining themselves to variants of transaction start time order.
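As a minimal sketch of why ordering by transaction start time keeps updates inexpensive, consider the following hypothetical append-only table. The class `AppendOnlyTable` and its fields are assumptions made for illustration, and the sketch further assumes that the transaction time `now` passed to successive calls never decreases. An update closes the previous version of a key and appends the new version at the end, so no existing rows need to be moved.

```python
FOREVER = float("inf")  # illustrative sentinel for "still current"

class AppendOnlyTable:
    """Hypothetical version store kept in transaction start time order."""

    def __init__(self):
        self.rows = []     # each row: [key, value, start, end]
        self.current = {}  # key -> index of that key's open version

    def put(self, key, value, now):
        # Assumes `now` is non-decreasing across calls.
        old = self.current.get(key)
        if old is not None:
            self.rows[old][3] = now  # close the previous version
        # The new version is appended; the start-time order is preserved.
        self.rows.append([key, value, now, FOREVER])
        self.current[key] = len(self.rows) - 1

table = AppendOnlyTable()
table.put("a", 1, now=10)
table.put("b", 7, now=11)
table.put("a", 2, now=12)
```

The rows remain in the order in which they were written, which is why re-sorting them for compression, as is common in column stores, would conflict with this layout.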