A problem associated with current client data cache systems is illustrated in FIG. 1. Data existing in a data depository 12 is accessed and distributed by a software application 14 resident within server 16. This software application 14 distributes data to various clients 18 (represented as 18′, 18″, 18″′, in FIG. 1), allowing multiple client users to access data resident in data depository 12. The request for the data is created by applications 20, resident at individual clients 18 (e.g., 18′, 18″, 18″′, in FIG. 1). In order to decrease network traffic between server 16 and individual clients 18 (e.g., 18′, 18″, 18″′, in FIG. 1), it is desirable to store data within a local cache 22. Thus, when frequently accessed data is requested by application 20, redundant data requests to server 16 are eliminated because this data is accessed locally from cache 22. A problem exists when multiple clients access or modify data residing in their local caches 22: multiple clients 18 (e.g., 18′, 18″, 18″′, in FIG. 1) may access divergent data or continue to access unsynchronized data from their local caches 22.
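The divergence described above can be illustrated with a minimal sketch. The `Server` and `Client` classes and their methods are hypothetical names chosen for illustration and are not part of the system of FIG. 1; the sketch assumes a simple read-through cache with no invalidation of any kind.

```python
class Server:
    """Stands in for server 16 fronting data depository 12 (hypothetical)."""

    def __init__(self, data):
        self.data = dict(data)
        self.requests = 0  # counts read requests that reach the server

    def read(self, key):
        self.requests += 1
        return self.data[key]

    def write(self, key, value):
        self.data[key] = value


class Client:
    """Stands in for a client 18 with its own local cache 22 (hypothetical)."""

    def __init__(self, server):
        self.server = server
        self.cache = {}

    def read(self, key):
        # Serve from the local cache when possible; otherwise ask the server.
        if key not in self.cache:
            self.cache[key] = self.server.read(key)
        return self.cache[key]

    def write(self, key, value):
        # Update the local copy and the server; other clients are not told.
        self.cache[key] = value
        self.server.write(key, value)


server = Server({"x": 1})
client_a = Client(server)
client_b = Client(server)

client_a.read("x")
client_b.read("x")
client_a.read("x")        # repeated read is served locally, no server request
client_a.write("x", 2)    # client_b's cached copy is now out of date
stale_value = client_b.read("x")
```

In this sketch the repeated read avoids a server request, which is the intended benefit, but the second client keeps returning its old copy after the first client's write, which is the unsynchronized access described above.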
There are many systems for maintaining data coherency in cache memory. One such system periodically broadcasts a single invalidation report from the server to all of the clients. In this instance, the server and the client operate on the basis of a predetermined window. Periodically, within this window, the server broadcasts all data changes made since the beginning of the previous window. When clients are online, the clients receive the broadcast. If the client has been offline for at least one windowing period, the client invalidates the entire cache. Otherwise, only selected data is invalidated. It would be desirable not to depend on the use of broadcasting, on all changes originating from the server, or on the assumption of a predetermined, synchronized window of time.
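The client-side handling of such an invalidation report might be sketched as follows. This is a minimal illustration under stated assumptions: windows are assumed to carry consecutive sequence numbers, and the names `InvalidationReport`, `ClientCache`, and `apply_report` are hypothetical rather than taken from any particular system.

```python
from dataclasses import dataclass, field


@dataclass
class InvalidationReport:
    window: int          # assumed sequence number of the broadcast window
    changed_keys: set    # keys changed since the start of the previous window


@dataclass
class ClientCache:
    entries: dict = field(default_factory=dict)
    last_window: int = 0  # last window whose report this client received

    def apply_report(self, report):
        missed_windows = report.window - self.last_window - 1
        if missed_windows >= 1:
            # Offline for at least one window: changes may have been missed,
            # so the entire cache is invalidated.
            self.entries.clear()
        else:
            # No window missed: drop only the entries named in the report.
            for key in report.changed_keys:
                self.entries.pop(key, None)
        self.last_window = report.window


cache = ClientCache(entries={"a": 1, "b": 2, "c": 3}, last_window=4)
cache.apply_report(InvalidationReport(window=5, changed_keys={"b"}))
after_selective = dict(cache.entries)
cache.apply_report(InvalidationReport(window=7, changed_keys={"a"}))
after_missed_window = dict(cache.entries)
```

The sketch makes the limitations noted above visible: the client depends on receiving every broadcast, and on the server and client agreeing on window boundaries.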
Another method of addressing cache coherency uses a set of consistency-action matrices. These matrices map consistency actions to the state of a particular cache entry. For example, when a cache entry is in a modify or exclusive state, it is associated with a matrix that ties write access attempts to a consistency action.
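A consistency-action matrix of this kind could be pictured, in very rough terms, as a lookup from entry state and access type to an action. The sketch below is an assumption-laden illustration only: the state names beyond modify and exclusive, the access types, and the `NOTIFY_OTHER_CACHES` action are all hypothetical and are not drawn from the method described.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "modified"
    EXCLUSIVE = "exclusive"
    SHARED = "shared"        # assumed additional state, for contrast only


class Access(Enum):
    READ = "read"
    WRITE = "write"


class Action(Enum):
    NONE = "none"
    NOTIFY_OTHER_CACHES = "notify_other_caches"  # hypothetical action


# One small matrix per cache-entry state; each ties an access attempt
# to a consistency action. Only the modify/exclusive write case comes
# from the description; everything else defaults to no action.
MATRICES = {
    State.MODIFIED: {Access.WRITE: Action.NOTIFY_OTHER_CACHES},
    State.EXCLUSIVE: {Access.WRITE: Action.NOTIFY_OTHER_CACHES},
}


def consistency_action(state, access):
    """Look up the action for an access attempt on an entry in a given state."""
    return MATRICES.get(state, {}).get(access, Action.NONE)
```

For example, `consistency_action(State.EXCLUSIVE, Access.WRITE)` returns the hypothetical notify action, while a read in the same state returns `Action.NONE`.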
Another such method addresses the ability to view file contents on a remote machine and see changes occur to it as they happen.
The present invention addresses the issue of a client request for data from a file server where that file server transmits updates to clients over time. It would be desirable to have the server only transmit the data when requested, thus reducing network traffic. Furthermore, it would be desirable to address partial files and not just entire files. Furthermore, a protocol must be established by which this happens. This requires that a system of versioning (through the protocol) be established that successfully takes advantage of potentially stale but consistent pages.
Another method has been largely concerned with rapid cache propagation. Rapid cache propagation introduces the concept of distributed directories, which act as distributed coherency managers. These peer-to-peer mechanisms lack identified protocols or versioning. Typically, the server notifies one peer, which is then responsible for locating the correct distributed directory, which in turn must notify all clients.
A mechanism for maintaining a concurrent coherent database cache distributed across both database servers 16 and clients 18 (e.g., 18′, 18″, 18″′, in FIG. 1) would be extremely valuable in addressing this problem. Such a mechanism would enable significant performance enhancements for client/server database read access by allowing remote clients 18 (e.g., 18′, 18″, 18″′, in FIG. 1) to access recently accessed server data locally with no required network communication or server interaction, while providing a measure of protection against stale or latent data.
Modern database management systems also use portions of the main memory to hold or cache recently read data from disk storage. Differences in access time between local memory and network storage systems in database operations necessitate that operations that can be satisfied from cache be satisfied at the local level. Cache access leads to dramatic improvements in both local application performance and network performance.
Client/server applications offer many advantages for centralized data management and workload partitioning. However, this architecture, as shown in FIG. 1, introduces a source of data access latency by separating the client application from the data repository by a network connection. In the case where disk access is required to satisfy client application request 24, the network latency imposes a minimal penalty relative to accessing the disk. However, for the cases where the request is satisfied from the network resources, network latency is significant.
Typically, any piece of data is not arrived at directly. Usually several links locate and direct users to the desired data. A simple example is when an individual looks in an index. The index in turn directs the user to a specific page, where the page contains the desired data. Therefore, at a minimum, two pieces of data are required to gather the desired data. These pieces of data are the location in the index that directs the user to a specific page, and the page containing the desired data. Users must verify the index and link in order to avoid any latency problems associated with data, thus ensuring that the index or link points to the most current data (i.e., the data has not been altered or replaced). For notation purposes, this data, D1, may comprise D1A and D1B. D1A may be viewed as the index and D1B as the page containing the requested data. A latency problem arises when D1A is current and points to a non-existent piece of data D1B. Conversely, a stale D1A may point to stale or nonexistent data D1B. In either case, users are unable to properly access required data, and the link from D1A to D1B may be completely useless, causing data corruption. It would be extremely valuable to have a data system that guarantees data integrity. To further complicate matters, typically there are many more links and pieces of data required than the two shown in the previous example.
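The two-link example can be sketched as a check that an index entry (D1A) and the page it points to (D1B) are mutually consistent before the data is returned. This is an illustrative sketch only: the use of version numbers to detect a stale link, and the names `IndexEntry`, `Page`, and `resolve`, are assumptions introduced here, not a mechanism stated above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IndexEntry:        # plays the role of D1A
    page_id: str
    page_version: int    # assumed: version of the page the index expects


@dataclass(frozen=True)
class Page:              # plays the role of D1B
    version: int
    data: str


def resolve(key, index, pages) -> Optional[str]:
    """Follow index -> page and return the data only if the link is consistent."""
    entry = index.get(key)
    if entry is None:
        return None                      # no index entry for this key
    page = pages.get(entry.page_id)
    if page is None:
        return None                      # D1A points to a non-existent D1B
    if page.version != entry.page_version:
        return None                      # D1A and D1B are out of step
    return page.data


index = {"k": IndexEntry(page_id="p1", page_version=2)}
pages = {"p1": Page(version=2, data="hello")}
```

With these values, `resolve("k", index, pages)` returns the page data, while a missing page or a page whose version differs from the one recorded in the index yields `None` rather than inconsistent data. A real chain would typically involve more than two links, each needing the same kind of check.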
There are two issues not addressed within existing systems. First, it is not clear when locally cached data has become stale or compromised. Second, it is important to verify the consistency of the set of links that together make up an individual piece of data.