The present disclosure relates to controlling data storage and retrieval in database management systems.
Many recent applications produce data streams rather than a persistent dataset. A persistent dataset is characterized by infrequent updates, and most of its data are queried repeatedly. In contrast, data streams change constantly through the continuous generation of new elements. Data from the data streams need to be stored at least partially (temporarily) to permit analysis. Hence, the amount of stored data can grow large. Furthermore, the more recent information in the data is generally considered more important for applications; for example, queries may be executed more often on recent information.
Traditional database management systems (DBMS) were originally designed specifically for storing a persistent dataset. Many of their underlying assumptions are not compatible with the characteristics of data streams. In data streams, relations between pieces of information are not essential. It is more important that the storage system be able to store and retrieve a huge amount of data. Storage systems that are able to manage a huge amount of data have been introduced; the term “BigData stores” has been coined to refer to such storage systems. They focus on availability and scalability, while relaxing consistency.