The approaches described in this section could be pursued, but are not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
Geographically dispersed enterprises often deploy distributed computer systems to enable information sharing throughout the enterprise. Such distributed systems generally comprise an enterprise network, which includes a number of Local Area Networks (LANs) that are connected over one or more Wide Area Network (WAN) communication links. An enterprise network generally includes one or more servers that store enterprise information in one or more data resources. The servers supply the data from the resources upon receiving requests for the data from other enterprise servers or clients. The requesting servers or clients may be established in the LAN that is local to the data resource or in a LAN that is located across the WAN.
For example, the business structure of an enterprise may comprise a main office, and one or more branch offices. To support this business structure, the enterprise typically employs a local LAN for each of the main and branch offices, and one or more WAN communication links that connect the LANs at the branch offices with the LAN at the main office. This network infrastructure enables users at the branch offices, who run software applications locally on their workstations, to access files that are located at the main office.
While this network infrastructure allows for greater sharing of information for users throughout the enterprise, it also has a significant disadvantage because software applications that access data resources are primarily designed to access the data resources over a relatively high-speed LAN. Usually, significant latency and performance degradation are observed when a software application accesses a data resource that is located across the WAN in a remote LAN. In a typical example, an enterprise user in a branch office uses a word-processing application to access and modify files. Usually, operations on files that are in the LAN local to the user are relatively quick, while operations on files that are located across the WAN are relatively slow and sometimes unreliable.
One of the reasons for the above-described performance degradation is the high number of lock operations performed by a software application before the user is provided access to the requested data resource. The software application, usually in cooperation with a server or an operating system that manages the data resource, needs to perform the lock operations in order to provide greater sharing of the data resource while properly managing updates and changes to the resource. When the software application accesses the data resource over a LAN connection, it performs the lock operations relatively quickly because of the low latency and high connectivity speed of the connection, and the user usually gets fast response times. However, when the software application accesses the data resource over a slow WAN connection, the software application must perform each required lock operation remotely by sending sequences of messages over a slow communication link, and as a result the latency in response times for the user is relatively high. Even if the WAN link has sufficient bandwidth, the WAN latency still causes the lock message exchange to be slow.
For example, in order to open a file for read-only access, a Microsoft Word® word-processing application exchanges between thirteen and sixteen round trip lock operation messages with a file server that controls access to the file. When the client on which the user executes the word-processing application and the server that controls access to the file are coupled over a WAN connection, network latency may introduce an undesirable or even unacceptable delay in completing the file open operation because of the time needed to communicate so many lock operation messages. From the user's perspective, opening the file in the word-processing application takes too long. Similarly, in another example, saving a file with Microsoft Word® involves between forty and fifty round trip lock operation messages between the client and the file server that controls access to the file. In both examples, the lock request messages are synchronous; that is, the client does not send the next lock request message until the response to the previous lock request message has arrived, so the round trip delays accumulate into an unacceptable total delay. Furthermore, while the application is waiting for the exchange of lock messages to complete, the network link is underutilized by the application, further delaying the transfer of data between the client and the server.
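The cumulative effect of synchronous lock messages can be illustrated with simple arithmetic. The following is a minimal sketch only: the round trip counts are those given above, while the LAN and WAN round trip times (1 ms and 100 ms) and all function and constant names are assumptions chosen for illustration, not measured values.

```python
# Illustrative estimate of lock-exchange delay for synchronous round trips.
# The RTT constants below are assumed values, not measurements.

LAN_RTT_MS = 1.0    # assumed round trip time on a local LAN
WAN_RTT_MS = 100.0  # assumed round trip time across a WAN link

# (minimum, maximum) round trip lock operation messages per file operation,
# as stated in the text above.
OPERATIONS = {
    "open (read-only)": (13, 16),
    "save": (40, 50),
}


def lock_exchange_delay_ms(round_trips: int, rtt_ms: float) -> float:
    """Total wait when each request is sent only after the previous reply arrives."""
    return round_trips * rtt_ms


def delay_range_ms(operation: str, rtt_ms: float) -> tuple[float, float]:
    """Return the (low, high) delay estimate for one file operation."""
    low, high = OPERATIONS[operation]
    return (lock_exchange_delay_ms(low, rtt_ms),
            lock_exchange_delay_ms(high, rtt_ms))


if __name__ == "__main__":
    for name in OPERATIONS:
        lan = delay_range_ms(name, LAN_RTT_MS)
        wan = delay_range_ms(name, WAN_RTT_MS)
        print(f"{name}: LAN {lan[0]:.0f}-{lan[1]:.0f} ms, "
              f"WAN {wan[0]:.0f}-{wan[1]:.0f} ms")
```

Under these assumed round trip times, the delay scales linearly with the number of lock messages and is independent of link bandwidth, which is consistent with the observation that sufficient WAN bandwidth alone does not make the lock message exchange fast.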
One general approach to improve the response times for performing operations on a remote data resource involves the technique of data replication. Replication entails maintaining multiple identical copies, also referred to as replicas, of the data resource in multiple locations throughout the enterprise network. Requests from clients for the data resource are manually or automatically re-directed to the local or topologically closest replica of the data resource.
The principal disadvantage of replication is that it requires high bandwidth network connections to keep all the replicas up to date and to ensure consistency between the replicas. Another disadvantage is that the number and size of the required replicas and the complexity and size of the network often limit the level of consistency that can be achieved between the replicas. Yet another disadvantage of replication is the “loss of single copy semantics.” In other words, when clients change different replicas, or use locks to coordinate access to different replicas, the semantics of operating on a single data resource are lost because there is no single master copy of the data resource. Furthermore, this approach does not address the specific problem created by software applications that require the performance of a high number of lock operations over a slow communication link before providing access to a data resource.
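The loss of single copy semantics can be illustrated with a toy model. The following is a hypothetical sketch, not a description of any particular replication system: the `Replica` class, its methods, and the client and office names are all assumptions introduced for illustration. Each replica tracks its own lock independently, so two clients can each acquire an "exclusive" lock on what is logically the same data resource.

```python
# Toy model: each replica keeps its own lock state, with no master copy.
# All names here are hypothetical and exist only for illustration.

class Replica:
    def __init__(self, name: str) -> None:
        self.name = name
        self.lock_holder = None   # client currently holding the lock, if any
        self.content = "v0"       # every replica starts from the same content

    def try_lock(self, client: str) -> bool:
        """Grant the lock only if no client holds it on THIS replica."""
        if self.lock_holder is None:
            self.lock_holder = client
            return True
        return False

    def write(self, client: str, content: str) -> None:
        """Allow a write only by the holder of this replica's lock."""
        if self.lock_holder != client:
            raise PermissionError(f"{client} does not hold the lock on {self.name}")
        self.content = content


def demo():
    main = Replica("main-office")
    branch = Replica("branch-office")

    # Each client locks its topologically closest replica; both succeed.
    a_locked = main.try_lock("client_a")
    b_locked = branch.try_lock("client_b")

    # Both clients now modify "the same" resource without conflict detection.
    main.write("client_a", "edit by A")
    branch.write("client_b", "edit by B")
    return a_locked, b_locked, main.content, branch.content


if __name__ == "__main__":
    print(demo())
```

With a single copy, the second lock request would be refused; with independent replicas, both requests are granted and the replicas diverge until some separate mechanism reconciles them.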
Based on the foregoing, there is a clear need for a technique providing lock optimization and lock prediction to reduce the number of lock operation messages exchanged over a slow communication link in relation to file operations.