Embodiments of the present invention relate to memory management, and more specifically to techniques for spilling data from memory to a persistent store based upon an evict policy.
In applications such as stock quote monitoring, automobile traffic monitoring, and data sensing, data is typically generated in the form of a stream of events over time. A data stream, also referred to as an event stream, is a real-time, continuous sequence of events. Examples of sources that generate data streams include sensors and probes (e.g., RFID sensors, temperature sensors) configured to send a sequence of sensor readings; financial tickers; network monitoring and traffic management applications sending network status; click stream analysis tools; and others. The terms “event” and “tuple” are used interchangeably. As used herein, all tuples of a given stream have the same set of attributes. Each tuple is also associated with a particular time. A tuple may be considered to be logically similar to a single row or record in a relational database.
Processing the data streams is often referred to as “stream processing.” The data streams may be processed to detect complex patterns, event correlations, relationships between events, etc. For example, a sensor placed at a particular section of a highway may output a data stream comprising information detected by the sensor about automobiles that pass the particular section. A data stream output by the sensor may include information such as the type of automobile, the speed of the automobile, the time that the automobile was on the particular section, and other like information. This data stream may then be processed to determine heavy traffic congestion conditions (indicated by slow average speeds of automobiles), and other traffic related conditions or patterns.
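By way of illustration only, the highway example above may be sketched in Python as follows. The sketch is not taken from any described embodiment; the names `CarEvent` and `congestion_alerts`, the 60-second window, and the 30 mph threshold are assumptions chosen for the example. Each tuple carries the same attributes and a timestamp, and a sliding time window over the stream is used to flag slow average speeds.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Iterable, Iterator, Tuple


@dataclass(frozen=True)
class CarEvent:
    """One tuple of the stream (hypothetical schema for illustration)."""
    timestamp: float   # time associated with the tuple, in seconds
    car_type: str      # type of automobile detected by the sensor
    speed_mph: float   # speed of the automobile


def congestion_alerts(
    events: Iterable[CarEvent],
    window_seconds: float = 60.0,   # assumed window length
    threshold_mph: float = 30.0,    # assumed congestion threshold
) -> Iterator[Tuple[float, float]]:
    """Yield (timestamp, average speed) whenever the average speed over
    the most recent window falls below the threshold.

    Events are assumed to arrive in non-decreasing timestamp order.
    """
    window: Deque[CarEvent] = deque()
    for event in events:
        window.append(event)
        # Drop tuples that have aged out of the time window.
        while window[0].timestamp <= event.timestamp - window_seconds:
            window.popleft()
        average = sum(e.speed_mph for e in window) / len(window)
        if average < threshold_mph:
            yield (event.timestamp, average)
```

Because the function consumes an iterable and yields results as tuples arrive, it never needs the whole stream in memory; only the tuples inside the current window are retained.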
In traditional database systems, data is stored in a database, for example in tables in a database. The data stored in a database represents a bounded, finite data set against which queries and other data manipulation operations may be performed using a data management language such as SQL. SQL and other traditional database management tools and algorithms are designed based upon the assumption that they are executed against a finite collection of data. Such traditional tools and algorithms are not conducive to handling data streams, as described above, due to the possibly continuous and unbounded nature of data received via the data streams. Further, storing event data in a table is impractical due to the large amounts of data that are continually received and the high frequency at which the data may be received. Due to the ever-increasing number of applications that transmit data in the form of a data stream, the ability to process such data streams has become important.