The present invention relates generally to the field of time series databases, and more particularly to the process of data ingestion for time series databases.
A time series database (TSDB) is designed to handle “time series data.” Time series data generally takes the form of arrays of numbers indexed by time (for example, a datetime or a datetime range). TSDBs impose a data model built specifically around time series data, and this specialization allows TSDBs to better handle such data. A TSDB allows users to create, enumerate, update and destroy various time series and to organize them in some fashion. These series may: (i) be organized hierarchically; and (ii) have companion metadata available with them. A TSDB server often supports a number of basic calculations that work on a series as a whole, such as multiplying, adding, or otherwise combining various time series into a new time series. TSDBs can also generally filter on arbitrary patterns, such as patterns defined by the day of the week, low value filters, and high value filters, or can even use the values of one series to filter another. Some TSDBs also include additional statistical functions targeted to time series data. The process of taking data into a time series database is called “data ingestion,” or, more simply, “ingestion.”
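By way of illustration only, the whole-series calculations and filters described above can be sketched in a few lines of Python. The sketch assumes that a series is modeled as a dictionary mapping a datetime to a number; the function names (combine, filter_weekdays, filter_by_other) and the sample series are hypothetical and do not correspond to the interface of any particular TSDB.

```python
from datetime import datetime, timedelta

# Assumption: a series is a dict mapping datetime -> float.

def combine(a, b, op):
    """Combine two series point by point on the timestamps present in both."""
    return {t: op(a[t], b[t]) for t in sorted(a.keys() & b.keys())}

def filter_weekdays(series, weekdays):
    """Keep only points whose weekday (Monday = 0) is in `weekdays`."""
    return {t: v for t, v in series.items() if t.weekday() in weekdays}

def filter_by_other(series, other, predicate):
    """Keep points of `series` where the matching point of `other` passes `predicate`."""
    return {t: v for t, v in series.items() if t in other and predicate(other[t])}

# Hypothetical sample data: one point per day for a week starting on a Monday.
start = datetime(2024, 1, 1)
temps = {start + timedelta(days=i): 10.0 + i for i in range(7)}
loads = {start + timedelta(days=i): 100.0 * (i + 1) for i in range(7)}

total = combine(temps, loads, lambda x, y: x + y)      # new series from two series
weekend_temps = filter_weekdays(temps, {5, 6})         # day-of-week pattern filter
high_load_temps = filter_by_other(temps, loads, lambda v: v >= 500.0)  # one series filters another
```

In this sketch each operation returns a new series and leaves its inputs unchanged, which mirrors the description of combining various time series into a new time series.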
“Sliding window” is a known technique used in various applications (for example, in data compression, the Data Link Layer (OSI model) and the Transmission Control Protocol (TCP)). A criterion is defined that quantifies the “window,” depending on the application. In compression algorithms, the window-defining criterion is typically based on a number of bytes (for example, the window covers the last 64 kB (kilobytes) that were received or processed). In other known sliding window schemes, the window-defining criterion is based on a time interval (for example, the 2 hours prior to the most current data received). The start of the window, at any given point in time, is the most recent datum along the dimension in which the window-defining criterion is expressed. A sliding window effectively partitions data into two classes: (i) data that are within the window criterion; and (ii) data that have already fallen outside the window criterion. Using the example of a sliding window based on the time of receipt of new data, the “end” of the window will advance as newer data is received, because the window has a given temporal length which is defined to start at the time of receipt of the most recently received data. Typically, sliding window size is fixed, but it may be possible to have a sliding window that varies in size depending upon operating conditions and the like.
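As a minimal sketch of the time-based case described above, the following Python fragment keeps a fixed-length window whose start is the most recently received datum and whose end trails it by the window length. The class name TimeSlidingWindow, its method names, and the 2-hour length are assumptions chosen for illustration, and the sketch further assumes that data arrive in non-decreasing time order.

```python
from collections import deque
from datetime import datetime, timedelta

class TimeSlidingWindow:
    """Fixed-length, time-based sliding window (illustrative only)."""

    def __init__(self, length):
        self.length = length      # temporal length of the window
        self.items = deque()      # (timestamp, value) pairs, oldest first

    def add(self, timestamp, value):
        """Add a datum and return the data that have fallen out of the window.

        Assumes timestamps arrive in non-decreasing order.
        """
        self.items.append((timestamp, value))
        cutoff = timestamp - self.length   # "end" of the window
        expired = []
        while self.items and self.items[0][0] < cutoff:
            expired.append(self.items.popleft())
        return expired

# Hypothetical usage with a 2-hour window.
window = TimeSlidingWindow(timedelta(hours=2))
t0 = datetime(2024, 1, 1, 12, 0)
out_1 = window.add(t0, 1)
out_2 = window.add(t0 + timedelta(hours=1), 2)
out_3 = window.add(t0 + timedelta(hours=2, minutes=30), 3)  # pushes the first datum out
```

Each call to add partitions the data into the two classes noted above: the pairs still held in items are within the window criterion, and the returned list holds the data that have fallen out of it.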