Stored data, such as data stored and maintained by a relational Database Management System (DBMS), can be accessed more flexibly when indices into the stored data are maintained. An index into a stored dataset often does not access the data in the order in which the data is stored in the dataset. Performing multiple scanning operations through an index of a dataset therefore requires non-sequential access to the dataset. Determining future data accesses in an index scan is difficult because the index structure may not be transparent to application programs or to data caching determinations, which in turn makes effective reuse of cached data for indexed data scanning difficult.
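The difference between the storage order and the index order can be illustrated with a minimal sketch. The page size, the key values, and all function names below are hypothetical and chosen only for illustration; they are not taken from any particular DBMS.

```python
# Illustrative sketch only: the layout and the names are assumptions.
ROWS_PER_PAGE = 4

# Keys listed in the order their rows are physically stored (row id = position).
KEYS_IN_STORAGE_ORDER = [42, 7, 19, 88, 3, 61, 25, 70, 14, 55, 96, 31]


def page_of(row_id):
    """Return the page that holds the given row."""
    return row_id // ROWS_PER_PAGE


def table_scan_pages(num_rows):
    """Pages touched, in order, when rows are read in storage order."""
    return [page_of(row_id) for row_id in range(num_rows)]


def index_scan_pages(keys):
    """Pages touched, in order, when rows are read in key order via an index."""
    # A sorted list of (key, row_id) pairs stands in for the index.
    index = sorted((key, row_id) for row_id, key in enumerate(keys))
    return [page_of(row_id) for _, row_id in index]


def count_page_switches(pages):
    """Number of times consecutive accesses land on different pages."""
    return sum(1 for prev, cur in zip(pages, pages[1:]) if prev != cur)


if __name__ == "__main__":
    sequential = table_scan_pages(len(KEYS_IN_STORAGE_ORDER))
    indexed = index_scan_pages(KEYS_IN_STORAGE_ORDER)
    print("table scan pages:", sequential, "switches:", count_page_switches(sequential))
    print("index scan pages:", indexed, "switches:", count_page_switches(indexed))
```

In this toy layout the table scan visits each page once in order, while the index scan jumps between pages on almost every access, which is the behavior that makes caching decisions for index scans hard to predict.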
Some database processing applications include database architectures that try to maximize reuse of partial query results, from the query level down to the page access level. These architectures achieve this by detecting overlaps among active query plan operators at query execution time and then exploiting those overlaps by pipelining one operator's results to all dependent operators where possible. Two operators that are able to incorporate this architecture are the table scan and index scan operators. For table scans, a single scan thread continuously scans all pages, and table scan operators can attach to and detach from this thread in order to share the scanned pages. While this approach works well for scans with similar speeds, in practice scan speeds can vary by large margins, and even the speed of a single scan is usually far from constant due to changes in predicate evaluation overhead. The benefit can therefore be lower than expected, as scans may start drifting apart. Techniques that prevent drift by automatically throttling faster scans and by scan-group based prioritization of buffer pages are generally applicable to table scans only.
In addition to cache or page buffer algorithm improvements, other methods to reduce disk access costs for multiple concurrent queries with overlapping data accesses have been investigated. These methods include multi-query optimization, which requires all queries to be known in advance, and query result caching. Because query result caching operates at a high level of the query execution hierarchy, it may miss sharing potential for queries that have very different predicates but still end up performing scans on the same table, for example. Smarter buffer managers may be used to optimize page replacement under multiple running queries in order to maximize buffer locality. Such approaches require significant modifications of the caching system.
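The shared table scan described above, in which scan operators attach to and detach from a single scanning thread, can be sketched as a small single-threaded simulation. The class name `SharedTableScan`, its methods, and the lock-step consumption are assumptions made for illustration; the sketch models only the idealized case in which all attached scans proceed at the same speed, not the drift that occurs when scan speeds differ.

```python
# Illustrative sketch only: a single-threaded model of a circular shared scan.
class SharedTableScan:
    """One scanner circles over the pages; consumers attach and detach."""

    def __init__(self, num_pages):
        self.num_pages = num_pages
        self.position = 0          # next page the shared scanner will read
        self.physical_reads = 0    # pages actually fetched by the scanner
        self.remaining = {}        # consumer name -> pages still needed
        self.seen = {}             # consumer name -> pages delivered so far

    def attach(self, name):
        """A scan operator joins at the scanner's current position."""
        self.remaining[name] = self.num_pages
        self.seen[name] = []

    def step(self):
        """Read one page and hand it to every attached consumer."""
        if not self.remaining:
            return
        page = self.position
        self.physical_reads += 1
        for name in list(self.remaining):
            self.seen[name].append(page)
            self.remaining[name] -= 1
            if self.remaining[name] == 0:
                del self.remaining[name]   # consumer has seen every page: detach
        self.position = (self.position + 1) % self.num_pages


def run_demo():
    scan = SharedTableScan(num_pages=6)
    scan.attach("A")
    scan.step()
    scan.step()
    scan.attach("B")               # B joins mid-scan and wraps around later
    while scan.remaining:
        scan.step()
    return scan


if __name__ == "__main__":
    result = run_demo()
    print("A saw:", result.seen["A"])
    print("B saw:", result.seen["B"])
    print("physical reads:", result.physical_reads, "vs 12 for two separate scans")
```

In this run the two scans together need fewer page reads than two independent full scans would, because the pages read after the second scan attaches are shared. The saving shrinks as the attach points, or the scan speeds, move further apart.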
Therefore, a need exists to overcome the problems with the prior art as discussed above.