This invention relates to caches and, more particularly, to methods and systems for providing a scalable, synchronized data cache using partitioned data and invalidation triggers.
Online transaction processing involves systems for conducting transactions, such as stock trades or electronic payments, over a network. A common architecture to support online transactions includes a user at a computer in communication with a server via a network. When the user sends a request for a transaction to the server, the server fulfills the request and, in some configurations, returns a confirmation to the user. To process the transaction, the server may access data about an account of the user or about products that are available to fulfill the request. The server may retrieve such data from a persistent data source, such as a database.
As more users are added to such a system, additional servers can be added to process the increased number of requests. However, when these servers must access the same data source to fulfill the requests, a delay can result if one server must wait while another server accesses the data source. As the system grows and servers are added, the delays are compounded and system performance suffers. As the time required per transaction grows, users may become frustrated and stop using the online transaction processing system.
A common solution to this problem is to employ a data cache to store a temporary copy of data from the data source. If multiple servers are used, each can maintain a cache of data retrieved from the data source. To process a request, a server would first search its data cache for the required data. If the required data is found in the cache, the server can process the request without accessing the data source. Only if the required data is not found in the cache must the server access the shared data source. Once the required data is retrieved from the data source, the server can store the data in its data cache for future use. CORBA and Enterprise JavaBeans (EJB) are well-known architectures that provide support for such a distributed transaction processing system.
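The lookup sequence described above can be illustrated with a minimal sketch. The class name `ServerCache`, the `load_from_source` callback, and the `source_reads` counter are assumptions introduced only for illustration; they are not drawn from CORBA, EJB, or any particular system.

```python
class ServerCache:
    """Per-server cache that falls back to a shared data source on a miss."""

    def __init__(self, load_from_source):
        # load_from_source is a hypothetical callable standing in for a
        # query against the shared, persistent data source.
        self._load = load_from_source
        self._entries = {}
        self.source_reads = 0  # counts how often the data source was accessed

    def get(self, key):
        # 1. Search the local cache first.
        if key in self._entries:
            return self._entries[key]
        # 2. Only on a miss, access the shared data source.
        value = self._load(key)
        self.source_reads += 1
        # 3. Keep a temporary copy for future requests.
        self._entries[key] = value
        return value


# Usage: a plain dict stands in for the data source.
shared_source = {"acct-1": 100}
cache = ServerCache(shared_source.__getitem__)
first = cache.get("acct-1")   # miss: read from the data source
second = cache.get("acct-1")  # hit: served from the cache
```

In this sketch the second request is served without touching the data source, which is the benefit that motivates caching in the first place.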
Even with the addition of data caching, traditional online transaction systems suffer problems with scalability and latency. As a system grows very large and employs many servers, delays will occur as those servers must access the data source to obtain data not found in a data cache. Furthermore, successful data caching relies on the accuracy of the data stored in a data cache. When a data item changes in the data source, delays in communicating the change to each data cache cause errors if a server uses outdated data to process a request.
It is therefore desirable to provide a data cache that is highly scalable so that performance does not degrade as the size of a data source increases. It is also desirable to provide a data cache that is synchronous with the data source so that there is no discrepancy between the data in the data source and the data available in the cache.
Methods and systems consistent with the present invention provide a data caching technique that is highly scalable while being synchronous with a persistent data source, such as a database management system. Consistent with the present invention, data is partitioned by, for example, account, so that a data cache stores mostly unique information and receives only invalidation messages necessary to maintain that data cache.
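One way to picture account-based partitioning together with targeted invalidation is the sketch below. It is an illustration under stated assumptions, not the claimed implementation: the names `PartitionedCache`, `InvalidationRouter`, and `on_source_change` are hypothetical, and the trigger that would call the router when the data source changes is not modeled.

```python
from collections import defaultdict


class PartitionedCache:
    """A data cache responsible for a fixed set of accounts (assumed design)."""

    def __init__(self, name, accounts):
        self.name = name
        self.accounts = set(accounts)
        self.entries = {}  # (account, key) -> cached value
        self.invalidations_received = 0

    def invalidate(self, account, key):
        # Drop the stale copy; the next lookup would reload it from the source.
        self.invalidations_received += 1
        self.entries.pop((account, key), None)


class InvalidationRouter:
    """Delivers an invalidation only to caches that hold the affected account."""

    def __init__(self, caches):
        self._owners = defaultdict(list)
        for cache in caches:
            for account in cache.accounts:
                self._owners[account].append(cache)

    def on_source_change(self, account, key):
        # Hypothetical hook, assumed to be called when a data item changes.
        for cache in self._owners.get(account, []):
            cache.invalidate(account, key)


# Usage: two caches, each holding a different account partition.
cache_a = PartitionedCache("A", ["1001"])
cache_b = PartitionedCache("B", ["2002"])
cache_a.entries[("1001", "balance")] = 50
cache_b.entries[("2002", "balance")] = 75

router = InvalidationRouter([cache_a, cache_b])
router.on_source_change("1001", "balance")
```

After the change to account 1001, only `cache_a` receives a message; `cache_b` keeps its entry and sees no invalidation traffic.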
In accordance with an aspect of the invention, a system is provided to process transactions for a user. The system includes at least one application server that receives a query including an account number from the user via a request distributor, processes the query to determine a result, and returns the result to the user via the request distributor. The system further includes at least one data store configured to store account data corresponding to the account number in a table. The system further includes at least one data cache that maintains a cache partition corresponding to the account number, and, in response to the query processed by the at least one application server, searches for the result in the cache partition. If the result is not found in the cache partition, the data cache obtains the result from the data store, stores the result in the cache partition, and returns the result to the application server.
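The interaction among the application server, the data cache, and the data store described above can be sketched as follows. All class names, the dictionary-shaped query, and the `field` key are assumptions made for illustration; the request distributor is omitted, and a nested dictionary stands in for the table in the data store.

```python
class DataStore:
    """Stands in for the persistent store holding account data in a table."""

    def __init__(self, table):
        self._table = table  # account number -> {field: value}
        self.reads = 0

    def fetch(self, account_number, field):
        self.reads += 1
        return self._table[account_number][field]


class DataCache:
    """Maintains one cache partition per account number."""

    def __init__(self, store):
        self._store = store
        self._partitions = {}  # account number -> {field: cached result}

    def lookup(self, account_number, field):
        partition = self._partitions.setdefault(account_number, {})
        if field not in partition:
            # Not found in the partition: obtain the result from the data
            # store and keep it in the partition for later queries.
            partition[field] = self._store.fetch(account_number, field)
        return partition[field]


class ApplicationServer:
    """Processes a query by consulting the data cache."""

    def __init__(self, cache):
        self._cache = cache

    def handle(self, query):
        return self._cache.lookup(query["account_number"], query["field"])


# Usage: the second identical query is answered from the cache partition.
store = DataStore({"1001": {"balance": 250}})
server = ApplicationServer(DataCache(store))
query = {"account_number": "1001", "field": "balance"}
result_1 = server.handle(query)
result_2 = server.handle(query)
```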
According to the present invention, a data processing method maintains a set of cache partitions, each identified by an account number. When a query including a value is received, a particular cache partition corresponding to the query is identified from among the set of cache partitions, based on a relationship between the value of the query and the account number used to identify each cache partition in the set. A result to the query is then provided based on that identification.
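The partition-identification step can be sketched as below. The text does not specify the relationship between the query value and the account number, so this example assumes a purely hypothetical one: the query value has the form "<account number>-<order number>". The function names `find_partition` and `account_of` are likewise illustrative.

```python
def find_partition(partitions, query_value, to_account_number):
    """Return the cache partition matching the query value, or None.

    to_account_number encodes the relationship between a query value and
    the account number that identifies a partition (an assumed callback).
    """
    account_number = to_account_number(query_value)
    return partitions.get(account_number)


def account_of(query_value):
    # Hypothetical relationship: the account number is the text before
    # the first hyphen of the query value.
    return query_value.split("-", 1)[0]


# Usage: partitions keyed by account number.
partitions = {
    "1001": {"balance": 250},
    "2002": {"balance": 75},
}
hit = find_partition(partitions, "1001-0007", account_of)
miss = find_partition(partitions, "3003-0001", account_of)
```

Here `hit` is the partition for account 1001, while `miss` is `None` because no partition is identified by account 3003.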
Additional features of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.