With the development of technologies such as computing, data communication, and real-time monitoring, time series databases have been applied in various areas such as equipment monitoring, production line management, and financial analysis. A time series refers to a set of measured values arranged in chronological order. A time series database refers to a database for storing these measured values. Measured values may include various types of data. For example, in a bridge monitoring application environment, the collected data may include pressure intensity data collected by a given sensor; in a weather forecasting application environment, the collected data may include temperature, humidity, pressure, and wind (e.g., force and direction); and in a city's power grid monitoring system, the collected data may include measured values of the power consumption of each household in the city.
Generally speaking, as application environments differ, the data collection frequency and the number of collection points may differ enormously. For massive time series data, the following circumstances may exist: 1) data need to be collected at a high frequency, whereas the number of measurement points is small; 2) data need to be collected at a low frequency, whereas the number of measurement points is huge. Since the amount of data collected per unit time is the product of the collection frequency and the number of collection points, both of the above circumstances result in massive time series data.
Specifically, in the application environment of monitoring bridge safety, for example, it is possible to deploy sensors (e.g., dozens) at important locations on the bridge and collect pressure intensity data at each location at a frequency of 10 times per second; in the power grid monitoring system, it is possible to deploy sensors (e.g., tens of millions) at the households and collect the power consumption of each household at a frequency of once every 15 minutes. In both cases, the amount of time series data in a conventional application environment is huge.
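By way of illustration only, the relationship between collection frequency, number of collection points, and overall data amount may be sketched as follows. The sensor counts used here (50 for the bridge, 10,000,000 for the power grid) are assumptions chosen to stand in for "dozens" and "tens of millions"; the function name samples_per_day is likewise hypothetical.

```python
from fractions import Fraction

SECONDS_PER_DAY = 24 * 60 * 60


def samples_per_day(num_points: int, frequency_hz: Fraction) -> int:
    """Measured values per day = collection points x frequency x seconds per day."""
    return int(num_points * frequency_hz * SECONDS_PER_DAY)


# Assumed sensor counts: the text gives only "dozens" and "tens of millions".
# Bridge: high frequency (10 samples per second), few collection points.
bridge = samples_per_day(num_points=50, frequency_hz=Fraction(10))
# Power grid: low frequency (one sample per 900 seconds), many collection points.
grid = samples_per_day(num_points=10_000_000, frequency_hz=Fraction(1, 900))

print(f"bridge: {bridge:,} values per day")  # 43,200,000
print(f"grid:   {grid:,} values per day")    # 960,000,000
```

Under these assumed counts, both circumstances yield tens of millions to roughly a billion measured values per day, even though one is driven by frequency and the other by the number of collection points.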
In addition, as application environments differ, the operations performed on collected data may also differ. For example, query operations may exist besides inserting collected data into the database. In the application environment of monitoring bridge safety, a conventional query operation is a history query, which queries data collected from specific sensors within a certain time range (e.g., one hour). As another example, in the power grid monitoring system, a conventional query operation is a slice query; i.e., when a power company compiles statistics on the power consumption of each consumer, it may query, in parallel, the sensors deployed at the various households over a shorter time range (e.g., a couple of minutes).
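The difference between the two query shapes may be sketched, for illustration only, over a small in-memory list of measured values. The names Sample, history_query, and slice_query, as well as the sample data, are hypothetical and do not correspond to any particular time series database; the sketch also scans sequentially rather than querying sensors in parallel.

```python
from typing import Dict, Iterable, List, NamedTuple


class Sample(NamedTuple):
    sensor_id: str
    timestamp: int  # seconds since an arbitrary start time
    value: float


def history_query(samples: Iterable[Sample], sensor_ids: Iterable[str],
                  start: int, end: int) -> List[Sample]:
    """History query: specific sensors, longer time range [start, end)."""
    wanted = set(sensor_ids)
    return [s for s in samples
            if s.sensor_id in wanted and start <= s.timestamp < end]


def slice_query(samples: Iterable[Sample],
                start: int, end: int) -> Dict[str, List[Sample]]:
    """Slice query: every sensor, short time range [start, end)."""
    result: Dict[str, List[Sample]] = {}
    for s in samples:
        if start <= s.timestamp < end:
            result.setdefault(s.sensor_id, []).append(s)
    return result


# Hypothetical data: three sensors, one value every 15 minutes (900 seconds).
samples = [
    Sample(sensor_id, t, t / 900)
    for sensor_id in ("s1", "s2", "s3")
    for t in (0, 900, 1800, 2700, 3600)
]

history = history_query(samples, ["s1"], start=0, end=3600)  # one sensor, one hour
window = slice_query(samples, start=900, end=1200)           # all sensors, five minutes
```

A history query thus touches few sensors over many timestamps, whereas a slice query touches many sensors over few timestamps, which is why a storage layout that favors one may serve the other poorly.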
Since time series data are massive and time series databases are applied ever more widely across all sectors of society, how to reduce the resource overhead involved in storing and querying data, and how to increase the storage and query efficiency of time series data, has become a hot research issue.