1. Field of the Invention
The present invention relates to data structures, computer systems, methods, and computer programs for searching a database.
2. Description of Related Art
In a typical database system, data processing comprises reference, insertion, update, and deletion. In an insert-only database system, however, insertion, update, and deletion are all expressed as insertion processing. When such insertion processing is executed, existing data in the database is not rewritten in place; instead, new data is inserted into the database in a log-like format. The data is therefore stored in a format that allows the user to view the history of past data processing.
Each value stored in this history-preserving format is identified by an ID and a time stamp associated with the value. Each time stamp indicates the time at which the value becomes valid. The phrase “value becomes valid” means that the value becomes referable. When a future time is set as the time stamp, a value that becomes referable at that future time can be inserted into the database before that time is reached.
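The identification of values by ID and time stamp can be illustrated with a minimal Python sketch. The class InsertOnlyStore, its method names, the integer time stamps, and the sample account data are all hypothetical and are not part of the described system; the sketch only assumes that rows are appended and never rewritten, and that a reference returns the value with the latest time stamp not later than the designated time.

```python
import bisect
from collections import defaultdict


class InsertOnlyStore:
    """Hypothetical in-memory stand-in for an insert-only table."""

    def __init__(self):
        # id -> list of (time_stamp, seq, value), kept sorted; rows are never rewritten
        self._rows = defaultdict(list)
        self._seq = 0  # insertion counter, breaks ties between equal time stamps

    def insert(self, key, time_stamp, value):
        """Insertion and update are both expressed as adding a new row."""
        self._seq += 1
        bisect.insort(self._rows[key], (time_stamp, self._seq, value))

    def lookup(self, key, as_of):
        """Return the value that is valid (referable) at time `as_of`, or None."""
        rows = self._rows.get(key, [])
        # Position just past the last row whose time stamp is <= as_of.
        i = bisect.bisect_right(rows, (as_of, float("inf")))
        return rows[i - 1][2] if i else None


store = InsertOnlyStore()
store.insert("acct-1", 100, "rate=1.0")
store.insert("acct-1", 200, "rate=1.5")   # an update, expressed as an insertion
store.insert("acct-1", 900, "rate=2.0")   # future-dated: not referable before time 900

print(store.lookup("acct-1", 150))  # value valid at time 150
print(store.lookup("acct-1", 250))  # value valid at time 250
```

In this sketch the row inserted with time stamp 900 is already stored but is ignored by any reference whose designated time is earlier than 900, which corresponds to inserting a value before it becomes valid.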
The insert-only database system is used, for example, in an account system, which is required to process a large volume of transactions at high speed.
One technique for speeding up the data access processing included in a transaction is deferred update, in which an update transaction is committed as soon as the system completes writing the transaction log. The system then reflects all of the updated data in the database together, asynchronously to execution of the transaction. Deferred update is known to improve the throughput of update transactions.
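The deferred update described above can be sketched as follows. This is an illustrative simplification, not an implementation taken from any particular system: the class DeferredUpdateStore and its members are hypothetical, a Python list stands in for the durable transaction log, and the asynchronous reflection step is invoked explicitly so that the example stays deterministic.

```python
class DeferredUpdateStore:
    """Hypothetical sketch: commit on log write, reflect in the database later."""

    def __init__(self):
        self.log = []      # stands in for the durable transaction log
        self.pending = []  # committed updates not yet reflected in the table
        self.table = {}    # stands in for the database proper

    def commit(self, key, value):
        """The transaction counts as committed once its log record is written."""
        record = (key, value)
        self.log.append(record)
        self.pending.append(record)
        return True

    def apply_pending(self):
        """Reflect all pending updates together; a real system would run this
        asynchronously to the transactions (assumption: called explicitly here)."""
        applied = len(self.pending)
        for key, value in self.pending:
            self.table[key] = value
        self.pending.clear()
        return applied


store = DeferredUpdateStore()
store.commit("acct-1", 500)
store.commit("acct-2", 750)
committed_but_unapplied = dict(store.table)  # still empty at this point
applied_count = store.apply_pending()        # both updates reflected in one batch
```

The throughput benefit comes from commit returning after a single log append, while the more expensive table writes are batched.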
For reference transactions, a known technique is to cache search results in the application. Caching reduces the number of database accesses, which is the bottleneck in the processing time involved in accessing a database. As a result, latency in data access processing can be reduced.
When a reference transaction including reference processing is executed in an insert-only database system, it is necessary to designate a time stamp. The most frequently issued query in reference transactions is one that returns the latest value among the data valid at the time the query is issued. The current time is therefore usually designated as the time stamp, so the value of the time stamp changes with every query. As a result, even if the previous query result is cached and the value to be returned is the same as the previous query result, the difference in time stamp leads to a determination that a cache miss has occurred. Thus, even when the value to be obtained is stored in the cache, the value cannot be obtained from the cache, and the database is accessed each time.
This determination of a cache miss is made because the time stamp field is not specifically recognized as a time data field and is recognized as merely a data space.
The SQL shown below is an example of a query in which the time stamp field is recognized as merely a data space. In this example, an ID and a time stamp (EffectiveDate) are designated to obtain data from a table CD. In response to the designation, the latest value as of the designated time stamp is returned.
SELECT * FROM CD
WHERE ID=? AND EffectiveDate<=?
ORDER BY EffectiveDate DESC FETCH FIRST 1 ROW ONLY
In the SQL described above, the time stamp field is not treated as a time data space; it is treated in the same manner as a data space for other data, such as an ID.
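The effect can be reproduced with a short Python sketch that uses the standard sqlite3 module. Several details are assumptions made for illustration only: the column Value, the integer time stamps, the sample rows, and the dictionary cache keyed on the query parameters are hypothetical, and LIMIT 1 replaces FETCH FIRST 1 ROW ONLY because SQLite does not accept the latter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CD (ID INTEGER, EffectiveDate INTEGER, Value TEXT)")
conn.executemany(
    "INSERT INTO CD VALUES (?, ?, ?)",
    [(1, 100, "old"), (1, 200, "current")],
)

# Same shape as the query above; LIMIT 1 is the SQLite spelling (assumption).
QUERY = (
    "SELECT * FROM CD WHERE ID=? AND EffectiveDate<=? "
    "ORDER BY EffectiveDate DESC LIMIT 1"
)

cache = {}                          # naive result cache keyed on the raw parameters
stats = {"hits": 0, "misses": 0}


def lookup(row_id, time_stamp):
    """Return the latest row as of time_stamp, consulting the naive cache first."""
    key = (row_id, time_stamp)      # the time stamp is just another key component
    if key in cache:
        stats["hits"] += 1
        return cache[key]
    stats["misses"] += 1
    row = conn.execute(QUERY, (row_id, time_stamp)).fetchone()
    cache[key] = row
    return row


first = lookup(1, 1000)   # "current time" of the first query
second = lookup(1, 1001)  # one tick later: same answer, different cache key
```

Both calls return the same row, yet the second call cannot reuse the cached result because its key differs only in the time stamp, so the database is accessed twice.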
Japanese Unexamined Patent Application Publication No. 2006-511876 describes an effective caching technique used for an SQL range search in an edge server. This caching technique checks a containment relationship between designated ranges of queries to reuse a reusable cache. For example, assume that a result of the query
SELECT employee.id FROM employee WHERE employee.age<25
is stored in a cache. In this case, if the query
SELECT employee.id FROM employee WHERE employee.age<30
is issued, data satisfying the range condition, employee.age<25, is obtained from the cache and data satisfying the range conditions, employee.age>=25 and employee.age<30, is obtained from a database.
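The containment check in this example can be sketched in Python as follows. This is a simplified illustration and not the method of the cited publication: the names EMPLOYEES, fetch_rows, and RangeCache are hypothetical, ages are assumed to be non-negative integers, and the cache is assumed to keep each row's age alongside its id so that narrower queries can also be answered from it.

```python
EMPLOYEES = [(1, 22), (2, 24), (3, 27), (4, 29), (5, 41)]  # hypothetical (id, age) rows


def fetch_rows(low, high):
    """Stand-in for the database: rows with low <= age < high."""
    return [(emp_id, age) for emp_id, age in EMPLOYEES if low <= age < high]


class RangeCache:
    """Caches the result of 'age < upper' and fetches only the uncovered remainder."""

    def __init__(self):
        self.upper = 0        # cache currently covers ages in [0, upper)
        self.rows = []        # cached (id, age) rows
        self.db_ranges = []   # age ranges actually requested from the database

    def ids_younger_than(self, bound):
        if bound > self.upper:
            # Only the part of the range not contained in the cache goes to the database.
            self.db_ranges.append((self.upper, bound))
            self.rows += fetch_rows(self.upper, bound)
            self.upper = bound
        return sorted(emp_id for emp_id, age in self.rows if age < bound)


cache = RangeCache()
under_25 = cache.ids_younger_than(25)  # whole range comes from the database
under_30 = cache.ids_younger_than(30)  # ages 25..29 from the database, rest from cache
```

After the second call, db_ranges holds (0, 25) and (25, 30), which mirrors the split between cached data and database data described above.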
In the caching technique described above, a time stamp is used to determine whether each item of data stored in the cache is the latest data. If a query is issued whose search range includes data cached later than the time when each item of data was cached, the cached data is verified. However, in this caching technique, the time stamp is used merely to verify cached data and is not used to process a designated query at high speed.
In a database system in which a reference transaction is issued with a designated time stamp, the current time is set as the time stamp for each query. Therefore, even when a result of the query is present in a cache and the value to be returned is the same as the cached result, the difference in time stamp leads to a determination that a cache miss has occurred. As a result, data cannot be obtained from the cache, and the database is accessed each time a query is issued. Accordingly, there is a demand for a method that can speed up a transaction even when the time stamps differ as described above.