1. Field of the Invention
The present invention relates to data access in a file/object oriented network system. More particularly, the present invention is directed to a client-agent-server utility which increases the speed at which data in the form of files, objects and directories are accessed across slow link communications via remote node caching and provides verification, selective object compression, selective prefetch and concatenation of fresh objects and indicators of cache correctness.
2. Related Art
Many operating systems are equipped to handle caching and verifying of data. Traditionally, in a remote client's caching system, optimization in retrieving data is limited to prefetching. In other words, an application program in a remote client requests from a file server transmission of a predetermined number of bytes of information (e.g., x bytes) and the operating system on the client prefetches the requested data plus an additional number of bytes of information (e.g., x+y bytes in total). Thus, when the application requests the additional bytes, they already exist in its readily accessible memory (cache).
In addition, there also exist problems with verification of directories in existing systems. It has been found, for example, that two remote clients concurrently accessing data and attempting to verify a directory will not necessarily obtain the same data because the file server computer will not necessarily send out the data in the same order to each of the remote clients. Thus, there is no clear indication whether the directory data is current.
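The prefetching behavior described above can be illustrated by a minimal sketch. The class name PrefetchingReader, the fetch callback and the extra_bytes parameter are hypothetical and are introduced only for illustration; they do not describe any particular operating system's redirector.

```python
class PrefetchingReader:
    """Client-side read-ahead cache over a hypothetical remote fetch function."""

    def __init__(self, fetch, extra_bytes):
        self.fetch = fetch              # fetch(offset, length) -> bytes; stands in for the file server
        self.extra_bytes = extra_bytes  # the "y" bytes prefetched beyond each request
        self.cache_start = 0            # file offset of the first cached byte
        self.cache = b""                # bytes currently held in readily accessible memory
        self.server_requests = 0        # number of requests actually sent to the server

    def read(self, offset, length):
        end = offset + length
        cached_end = self.cache_start + len(self.cache)
        if not (self.cache_start <= offset and end <= cached_end):
            # Miss: ask the server for x + y bytes instead of only x.
            self.cache = self.fetch(offset, length + self.extra_bytes)
            self.cache_start = offset
            self.server_requests += 1
        start = offset - self.cache_start
        return self.cache[start:start + length]


# Hypothetical 1024-byte file standing in for data held by the file server.
remote_file = bytes(range(256)) * 4


def fetch(offset, length):
    return remote_file[offset:offset + length]


reader = PrefetchingReader(fetch, extra_bytes=64)
first = reader.read(0, 32)    # miss: 32 + 64 bytes are fetched
second = reader.read(32, 32)  # hit: served from the prefetched bytes
```

In this sketch the second read is satisfied without a further server request, which is the only optimization the traditional approach provides.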
In a desktop caching system, a high speed memory is used to cache data that is stored on a hard disk. While a desktop cache program, such as Microsoft's SmartDrive, is a useful tool to increase performance from the random access memory (RAM), this type of caching technique is not applicable to remote environments because of its inability to correctly handle multiple remote clients accessing the same data files concurrently, i.e., it is likely to corrupt the data.
File servers have employed caching techniques which parallel techniques of the desktop. Here, the file server deviates in protecting against multiple common data user access by implementing or providing a file locking service to clients.
Many object oriented network systems include web browsers, which commonly manifest themselves on the object retrieval side of the remote client, such as Netscape's Navigator or Lotus Notes clients, and web servers, which commonly manifest themselves on the object server side, such as Notes servers. These are equipped to maintain a cache of objects to avoid unnecessary retrieval of objects from a network of object providers. Cache correctness is determined through a given technique.
Many existing object oriented network systems employ inefficient data communication protocols to transfer object updates to replicas of an object collection. For example, during the replication process that takes place between a Lotus Notes™ client and server, each object update is requested separately, which results in extra packet exchanges and inefficiency.
Existing object oriented network systems often employ a client-agent-server utility (the “agent”) to further reduce unnecessary retrieval of objects from a network of object providers. These agents are often termed “proxy servers” since they retrieve objects from a network of object providers on behalf of a set of clients. In this situation, the agent maintains a cache of objects and monitors and responds to object retrieval requests from one or more remote clients. The agent may fulfill the request which emanates from a client by retrieving the object from its cache rather than forwarding the request to the network of object providers.
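By way of illustration only, the behavior of such an agent can be sketched as follows. The class ProxyAgent, the retrieve_from_provider callback and the providers dictionary are hypothetical names chosen for this example. The sketch makes no attempt to decide whether a cached object is still correct.

```python
class ProxyAgent:
    """Answers object requests from a cache, forwarding only on a miss."""

    def __init__(self, retrieve_from_provider):
        # retrieve_from_provider(name) -> bytes; stands in for the network of object providers
        self.retrieve_from_provider = retrieve_from_provider
        self.cache = {}      # object name -> object bytes
        self.forwarded = 0   # requests actually forwarded to the providers

    def get_object(self, name):
        if name in self.cache:
            # Fulfill the request from the agent's cache.
            return self.cache[name]
        # Cache miss: forward the request on behalf of the client.
        self.forwarded += 1
        obj = self.retrieve_from_provider(name)
        self.cache[name] = obj
        return obj


# Hypothetical provider content.
providers = {"index.html": b"<html>home</html>"}
agent = ProxyAgent(lambda name: providers[name])

from_client_1 = agent.get_object("index.html")  # forwarded to the providers
from_client_2 = agent.get_object("index.html")  # served from the agent's cache
```

Two clients requesting the same object cause only one retrieval from the providers in this sketch.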
As shown in FIG. 1, the related art includes a remote client computer having an operating system (OS) with a file system interface (FSI). Operatively connected to the FSI is a local file system (LFS) which in turn is operatively connected to a RAM based disk cacher (RBDC), disk driver (DD) and permanent storage disk (PSD). The PSD may include object retrieval application cache (ORAC) and object collection Replicas (OCRs).
Object retrieval applications (ORAs) exist in the remote client which have the ability to retrieve objects and to store OCRs into the PSD via the LFS via the FSI. These OCRs are retrieved through an Object Retrieval/Storage interface (ORSI) which employs an Object Retriever (OR).
Operatively connected to the FSI is a network file redirector (NFR) with prefetch capability, and a network transport layer (NTL) connected to a WAN driver. Aside from the OS, there exist application programs (AP) which employ the OS via the FSI. A communication server (CS) connects to the remote client computer and includes a WAN driver, routing layer and LAN driver. The CS connects through a LAN link to a file server computer.
The file/object server computer has an OS. The file/object server computer OS includes an NTL connected to a LAN driver and an FSI connected to LFS which in turn is connected to an RBDC, a DD and a PSD. Aside from the OS, there exists a file/object server application which employs the OS via the FSI.
An object proxy server (OPS) may also exist operatively connected to the communication server and the file/object server. The OPS includes an ORSI, an OR, NTL, LAN driver, FSI, RBDC and DD as shown in FIG. 1. The OPS maintains an object cache on a PSD via an FSI. The OPS retrieves objects via an ORSI which is operatively connected to an Object Retriever (OR).
A further problem associated with these prior systems is their inability to provide a remote client user with greater speed of access to object collection updates because of inefficient or “chatty” data communication protocols. This chattiness usually manifests itself in extra packet exchanges to accomplish the communication of the object collection updates by requesting each object update individually. In a satellite based communication link, latency is an important factor where the send/receive acknowledgment cycle of even the smallest data unit can take several seconds to accomplish.
The problem associated with these prior systems is their inability to provide a remote client user with greater speed of access to file/object server data and/or file/object server directories. This is especially so because of the type of link through which the remote client may be accessing the data, such as a modem phone link. In the context of the present invention, “remote client” is defined as a user accessing data over a relatively slow link, such as a modem phone link. A typical modem phone link provides a transfer rate of about 28.8 kilobits of information per second. This is contrasted with a link in a LAN connection which can transfer at about 10 Megabits per second. These remote clients are thus greatly limited in speed of access.
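The difference between the two link rates given above can be made concrete with a short calculation. The 1,000,000 byte file size is a hypothetical figure chosen for illustration, and the calculation ignores protocol overhead, latency and compression.

```python
def transfer_seconds(size_bytes, bits_per_second):
    """Idealized transfer time: payload bits divided by the link rate."""
    return size_bytes * 8 / bits_per_second


MODEM_BPS = 28_800       # about 28.8 kilobits per second
LAN_BPS = 10_000_000     # about 10 Megabits per second
FILE_SIZE = 1_000_000    # hypothetical file size in bytes

modem_time = transfer_seconds(FILE_SIZE, MODEM_BPS)
lan_time = transfer_seconds(FILE_SIZE, LAN_BPS)

print(f"modem: {modem_time:.1f} s, LAN: {lan_time:.1f} s, "
      f"ratio: {modem_time / lan_time:.0f}x")
```

Under these assumptions the same file takes roughly 278 seconds over the modem link and 0.8 seconds over the LAN link, a factor of about 347.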
The present invention overcomes the above described deficiencies which exist with remote clients accessing and verifying objects and data in files and directories from a file/object oriented network environment.
It is an object to increase the speed at which a remote client can access data and directories.
It is another object to maintain integrity of the accessed data and directory while increasing the speed at which the data is accessed.
A further object is to implement a cache verifying agent to act as a caching verifier between a remote client and a file server computer.
Still, another object is to add intelligence to a remote client in order to reduce the overall time in which a remote client accesses data.
Another object is to overcome the deficiencies of data transfer for a remote client.
Other objects and advantages will be readily apparent from reading the following description and viewing the drawings.
Accordingly, the present invention is directed to an apparatus for increased data access in a network, which includes a file/object server computer having a permanent storage memory, a cache verifying computer operably connected to the file/object server computer in a manner to form a network for rapidly transferring data, the cache verifying computer having an operating system, a first memory and a processor with means for performing an operation on data stored in the permanent storage memory of the file/object server computer to produce a signature of the data characteristic of one of a file, an object and directory, a remote client computer having an operating system, a first memory, a cache memory and a processor with means for performing an operation on data stored in the cache memory to produce a signature of the data, a communication server operably connected to the remote client computer, the cache verifying computer and the file/object server computer, and a comparator operably associated with the cache verifying computer and remote client computer for comparing the signatures of data with one another to determine whether the signature of data of the remote client is valid. The remote client computer includes means responsive to each comparison performed by the comparator on the data for generating and storing a validation ratio for the data in the first memory and for removing the data from the cache memory when the validation ratio drops below a predetermined value. The cache verifying computer includes means for recognizing a LOCK request from the remote client computer and for obtaining a lock on the data from the file server computer in response to the LOCK request.
The cache verifying computer includes means for performing compression/decompression operations on data, means for recognizing a REPLICATION-SYNCHRONIZE request from the remote client computer and performing an analysis of data to be streamed back to the remote client computer to fulfill the REPLICATION-SYNCHRONIZE request, means associated with the recognizing means for determining and retrieving data associated with the data to be streamed, and means for storing data into permanent storage. The data can be file or object oriented.
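The signature comparison and validation ratio described above can be sketched as follows, under several assumptions that are not taken from this description: SHA-256 stands in for the unspecified signature operation, the validation ratio is taken to be the fraction of comparisons that found the cached data valid, and the comparison is performed on the client side. The names CachedEntry, RemoteClientCache and min_ratio are hypothetical.

```python
import hashlib


def signature(data: bytes) -> str:
    # Assumption: SHA-256 stands in for the unspecified signature operation.
    return hashlib.sha256(data).hexdigest()


class CachedEntry:
    def __init__(self, data):
        self.data = data
        self.checks = 0        # comparisons performed so far
        self.valid_checks = 0  # comparisons that found the cached data valid

    def validation_ratio(self):
        # Assumption: ratio = valid comparisons / total comparisons.
        return self.valid_checks / self.checks if self.checks else 1.0


class RemoteClientCache:
    def __init__(self, min_ratio):
        self.min_ratio = min_ratio  # the predetermined value
        self.entries = {}

    def store(self, name, data):
        entry = self.entries.get(name)
        if entry is None:
            self.entries[name] = CachedEntry(data)
        else:
            entry.data = data

    def verify(self, name, server_signature):
        """Compare signatures, update the ratio, evict if it drops too low."""
        entry = self.entries[name]
        is_valid = signature(entry.data) == server_signature
        entry.checks += 1
        entry.valid_checks += is_valid
        if entry.validation_ratio() < self.min_ratio:
            del self.entries[name]
        return is_valid


cache = RemoteClientCache(min_ratio=0.5)
cache.store("report.txt", b"version 1")
fresh = cache.verify("report.txt", signature(b"version 1"))        # ratio 1/1
stale = cache.verify("report.txt", signature(b"version 2"))        # ratio 1/2, kept
stale_again = cache.verify("report.txt", signature(b"version 2"))  # ratio 1/3, evicted
```

Only the signatures cross the slow link in this sketch. Data that repeatedly fails validation is removed from the cache, so that cache space is not spent on frequently changing data.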
Terminology
“Caching” is the function of retrieving an object from a relatively high speed storage device from a list of most-recently-used objects.
“Cache” is a file which resides in permanent storage and contains the most-recently-used blocks of data read from a remote file/object server. “Data” referred to herein is inclusive of an object, directory and/or a file.
“File/object oriented distributed network,” as used in the present invention, will include a network wherein the file/object server computer data is accessed via the following set of file system or object retrieval primitives: OPEN, CREATE, READ, WRITE, SEEK, LOCK, UNLOCK, CLOSE, DIRECTORY REQUEST, GET OBJECT, and SYNCHRONIZE COLLECTION REPLICATION.
“File” means a collection of related data records treated as a basic unit of storage.
“File/Object Server Computer” is a computer which includes a processor with its associated memory, an operating system, and a permanent storage memory.
A cached object is considered “stale” if it is found to be incoherent with the actual object as stored on the file/object server.
A cached object is considered “fresh” if it is found to be coherent with the actual object as stored on the object server.
A “Handle” is the internal address of a unique data structure that describes characteristics about a file, object, object collection or object database.
An “Object” is a sequence of data of variable length.
An “Open Method” is an indicator of the actions that a program will take after opening a file or object database. The actions may be one or more of, but not limited to, read-only, write-only, open for program execution only, open exclusively, open with the intention of locking regions prior to update, etc.
“Permanent storage memory,” as used herein, includes, but is not limited to, disk drive, flash RAM or bubble memory, for example.
“Replication” is the process of exchanging modifications between replicas of a collection of objects.
A “Reverse Channel” is the means by which a response message is sent over the same network layer interface in which a request was received.
A “Sub-object” is a portion of an Object.
A “Validator” is a relatively short stream of data which is returned by an object server along with an object which is to be presented to the object server for purposes of validating the requestor's object cache.
A “chatty” replication data communication protocol is one where extra packet exchanges are used to request each object update from a set of object collection updates individually.
“Streaming” is the method of concatenating a collection of objects into a larger object for the purposes of more efficient data communications by eliminating the overhead packets and communication latency associated with the transfer of objects on an individual basis.
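Streaming, as defined above, can be sketched with a simple length-prefixed framing. The framing format (a 4-byte big-endian length before each object) and the function names are assumptions made for this example and are not specified by this description. The 2 second round-trip figure is likewise a hypothetical value for a high-latency link.

```python
import struct


def stream_objects(objects):
    """Concatenate objects into one larger object, each prefixed by its length."""
    parts = []
    for obj in objects:
        parts.append(struct.pack(">I", len(obj)))  # 4-byte big-endian length
        parts.append(obj)
    return b"".join(parts)


def split_stream(stream):
    """Recover the individual objects from a streamed object."""
    objects = []
    pos = 0
    while pos < len(stream):
        (length,) = struct.unpack_from(">I", stream, pos)
        pos += 4
        objects.append(stream[pos:pos + length])
        pos += length
    return objects


def exchange_seconds(round_trips, latency_seconds):
    """Latency cost alone, ignoring the time to transmit the payload."""
    return round_trips * latency_seconds


updates = [b"doc-1 v2", b"", b"doc-3 v7"]  # hypothetical object updates
stream = stream_objects(updates)
recovered = split_stream(stream)

# Hypothetical 2 second round trip.
chatty_cost = exchange_seconds(len(updates), 2.0)  # one request per update
streamed_cost = exchange_seconds(1, 2.0)           # one request for all updates
```

With these assumed figures, the chatty exchange spends 6 seconds on latency for three updates and the streamed exchange spends 2 seconds, and the gap widens as the number of updates grows.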