Time-series data, which is typically stored as a time column paired with an associated data column, can be indexed by, for example, fully inverted indexes that list all occurrences of each value in the data column.
The time column can also benefit from simple but memory-intensive index structures. Linear Run Length Encoding (RLE), which combines contiguous runs of values, can be scanned for a specific value or a range of values by examining each run rather than each value. A run is defined by its start position, start value, and length. Within a run, the position of a contained value is the run's start position plus the value's offset inside the run, where the offset is the difference between the value itself and the run's start value.
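The run-based lookup described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes integer time values that increase by exactly one within a run (which is what the position calculation above implies), and the names `Run`, `encode_linear_rle`, `find_positions`, and `find_range` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Run:
    start_pos: int    # position of the run's first value in the column
    start_value: int  # first value of the run
    length: int       # number of values in the run


def encode_linear_rle(values: List[int]) -> List[Run]:
    """Combine consecutive values that each increase by one into runs.

    Assumption: a step of exactly one between neighbouring values.
    """
    runs: List[Run] = []
    for pos, value in enumerate(values):
        if runs and value == runs[-1].start_value + runs[-1].length:
            runs[-1].length += 1  # value continues the current run
        else:
            runs.append(Run(pos, value, 1))  # value starts a new run
    return runs


def find_positions(runs: List[Run], value: int) -> List[int]:
    """Return every column position holding `value`, checking one run at a time."""
    positions = []
    for run in runs:
        offset = value - run.start_value  # position of the value inside the run
        if 0 <= offset < run.length:
            positions.append(run.start_pos + offset)
    return positions


def find_range(runs: List[Run], lo: int, hi: int) -> List[Tuple[int, int]]:
    """Return (first_pos, last_pos) per run for values in the closed range [lo, hi]."""
    spans = []
    for run in runs:
        first = max(lo, run.start_value)
        last = min(hi, run.start_value + run.length - 1)
        if first <= last:
            spans.append((run.start_pos + first - run.start_value,
                          run.start_pos + last - run.start_value))
    return spans


# Example time column with three runs: 0..3, 0..2, 5..6
time_column = [0, 1, 2, 3, 0, 1, 2, 5, 6]
runs = encode_linear_rle(time_column)
```

With this example column, `find_positions(runs, 2)` inspects three runs instead of nine values and returns the positions 2 and 6. The cost of a lookup grows with the number of runs, not with the number of values.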
A well-sorted time series contains exactly one run per series in the time column. Thus, a value of interest can be easily found in each time series, if it is present. Accordingly, an additional index is not needed. However, in practice, data is typically loaded in bulk, for example, in groups where the time column entries repeat or reset. In this case, the data contains more than one run. Depending on the bulk load frequency, the number of runs in the time column can increase arbitrarily.
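The effect of bulk loading on the number of runs can be illustrated with a small sketch. It assumes, for illustration only, integer time values with a step of one and bulk loads in which the time entries reset to zero; the function name `count_linear_runs` and the sample data are hypothetical.

```python
from typing import List


def count_linear_runs(time_column: List[int]) -> int:
    """Count runs in which each value is exactly one greater than its predecessor."""
    runs = 0
    previous = None
    for value in time_column:
        if previous is None or value != previous + 1:
            runs += 1  # a reset or gap starts a new run
        previous = value
    return runs


# A well-sorted series: one run covering the whole time column.
sorted_series = list(range(8))

# The same kind of data loaded in three bulks whose time entries reset each time.
bulk_loaded = [0, 1, 2, 3] * 3

sorted_run_count = count_linear_runs(sorted_series)
bulk_run_count = count_linear_runs(bulk_loaded)
```

Here the well-sorted column yields a single run, while the bulk-loaded column yields one run per bulk load, so a run-by-run scan has to inspect more entries as further bulks are appended.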