1. Field of Invention
The present invention relates to a distributed file processor and a distributed file processing method. More specifically, the present invention relates to a device and method for processing distributed files that enable users to efficiently access distributed data residing in, for example, network servers over a data communication network connecting a plurality of computers, and to extract processed data quickly.
2. Description of the Related Art
Recently, in the field of distributed file access systems operating over a network, demands for higher speed and accuracy have been increasing. Various distributed file processing systems have been proposed and tested. Some typical systems are explained below.
Related Art 1
As one general type of conventional distributed file access system, there is a system in which a client accesses a file through a proxy server or relay server (hereinafter called a proxy server) distributed along the network path connecting a file server that provides files and a client that accesses the files. The FTP mirror server is one such example.
Such a system using a proxy server has the advantage of lowering the load on the file server and reducing communication traffic on the network path by providing duplicates of files in advance.
Also, a snapshot system has been devised that dynamically creates a duplicate of a file accessed by a client and provides that duplicate to subsequent accesses, without the duplicate having been stored in the server beforehand. For example, Squid, the HTTP relay server program mentioned in RFC 2187, updates or nullifies a duplicate when it judges this to be necessary, by a calculation based on time stamp information such as the generation time of the duplicate or of the original file.
In this type of snapshot system, a file duplicate is managed, for example, by the following procedure. First, when a client requests access to a file and a duplicate of the requested file exists in the proxy server, the calculation based on the time stamp information mentioned above is executed, and the freshness of the duplicate is judged. If the duplicate is not judged to be adequately fresh, the time stamp of the original of the requested file and that of the duplicate are compared, and if the duplicate is fresher, the program returns the duplicate. If the duplicate is older, it is nullified, and the original file is copied again to create and return a new duplicate. With this method of creating duplicates dynamically, the management of duplicates can be automated, so that the management cost is reduced and the addition of proxy servers becomes easier, compared to a system that stores duplicates beforehand.
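The duplicate management procedure described above can be sketched as follows. This is a minimal illustration only, not the logic of Squid or of any actual proxy server: the names `Duplicate`, `serve`, `max_age`, and `fetch_origin` are hypothetical, a fixed maximum age stands in for the freshness calculation, and a duplicate whose recorded time stamp equals that of the original is assumed to be usable.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Duplicate:
    data: bytes
    fetched_at: float    # time at which the proxy created the duplicate
    origin_mtime: float  # time stamp of the original when it was copied


def serve(dup: Optional[Duplicate], now: float, max_age: float,
          origin_mtime: float,
          fetch_origin: Callable[[], bytes]) -> Tuple[bytes, Duplicate]:
    """Return the data to send and the duplicate to keep (illustrative only)."""
    # Step 1: freshness judged from time stamp information alone
    # (assumption: a fixed maximum age).
    if dup is not None and now - dup.fetched_at <= max_age:
        return dup.data, dup
    # Step 2: compare the original's time stamp with the duplicate's.
    if dup is not None and dup.origin_mtime >= origin_mtime:
        return dup.data, dup
    # Step 3: the duplicate is missing or older, so nullify it and copy again.
    new_dup = Duplicate(fetch_origin(), now, origin_mtime)
    return new_dup.data, new_dup
```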
However, for a file created on demand on a file server at the time of access, as typified by CGI on the WWW (for example, when today's weather is included in the information of a file), the time stamp information is assigned at the time of access. A system that manages duplicates by time stamp information alone cannot judge the validity of a duplicate, because it has no mechanism for determining an appropriate time stamp with which to compare the duplicate and the original. Therefore, it is difficult to utilize duplicates efficiently for files that are created dynamically.
This kind of technique will be hereinafter called the Related Art 1.
Related Art 2
Japanese Published Unexamined Patent Application No. Hei 8-292910 discloses a resource management device and method that interpret a qualified resource name composed of the resource name of a raw material, a procedure name representing a procedure to process the raw material, a parameter corresponding to the procedure, and a contextual identifier representing the device that interprets the procedure name and starts the procedure. The device and method then access the raw material by its resource name, start the procedure, and process the raw material.
The resource management device and method disclosed in Japanese Published Unexamined Patent Application No. Hei 8-292910 make it possible to decentralize the procedures and raw material resources named in the resource name, and to combine the procedures like a pipeline on the network. However, because procedures can be combined in multiple steps, and because a procedure with multiple inputs gives the pipeline a tree-like configuration, there is the flaw that the combined resource that is the product of the final process cannot be acquired when any part of the communication fails due to a communication error such as a burst on the network. When communication fails and no result is generated, the result could be acquired by a rerun. However, acquiring one combined resource requires multiple network communications. Accordingly, on a network with low reliability, there is a high probability that one of the communications will fail, which would require many reruns before the result is acquired, or could even make it impossible to acquire the combined resource at all.
This kind of technique will hereinafter be called the Related Art 2.
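The effect of an unreliable network on a pipeline such as that of the Related Art 2 can be illustrated with a simple calculation. The sketch below assumes, purely for illustration, that each of the n network communications needed for one combined resource succeeds independently with the same probability p. Neither assumption is stated in the cited application, and the function names are hypothetical.

```python
def pipeline_success(p: float, n: int) -> float:
    """Probability that all n communications succeed (independence assumed)."""
    return p ** n


def expected_runs(p: float, n: int) -> float:
    """Expected number of runs until one complete success; requires p > 0."""
    return 1.0 / pipeline_success(p, n)
```

Under these assumptions, ten communications that each succeed with probability 0.9 all succeed together only about 35% of the time, so roughly three runs would be needed on average.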
When attempting to reduce the load on file servers and the communication traffic on the network path by effectively utilizing created duplicates in distributed file processing, the conventional techniques mentioned above can cause the problems explained below.
First, the Related Art 1 has difficulty handling duplicates of documents generated dynamically on the server at the request of a client, although it provides some effects, such as reducing communication traffic by utilizing duplicates at the time of WWW access as mentioned above.
Also, the Related Art 2 achieves distributed file processing by means of procedures for distributed file access and can provide a service combining various kinds of processes; however, since a process result requires multiple combinations of pipelines, there is a high probability of failure on an unreliable network, which makes its practicability low.
The present invention has been made in view of these problems of the foregoing conventional techniques, and it is an object of the present invention to reduce the load on file servers and the communication traffic on the network paths by effectively utilizing duplicates of files.
In order to accomplish the foregoing object, the distributed file processor in a distributed computer system with computers connected through a network relating to the present invention comprises: a context unit that interprets a qualified file name configured by qualifying a raw material file name identifiable of a data file raw material held in the processor connected to the network with a procedure name representing a procedure to edit and process the raw material, a parameter of the procedure, the name of a computer that interprets the procedure name and operates the procedure, and a context name that defines operational environments, fetches the raw material file name and the procedure name, inputs data of a raw material file corresponding to the fetched raw material file name, starts a procedure corresponding to the fetched procedure name, and implements a process by the procedure having started the raw material file; a process result holding unit that holds a process result of the raw material file processed by the procedure started by the context unit; a result outputting unit that outputs to fetch the process result held by the process result holding unit; and a process result control unit that judges the validity of the process result held by the process result holding unit.
Further, in the distributed file processor of the present invention, the process result control unit comprises a configuration that judges the validity of the process result on the basis of the raw material file name corresponding to the process result held by the process result holding unit and the procedure name executed in the context unit.
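The relationship among the context unit, the process result holding unit, the result outputting unit, and the process result control unit can be sketched as follows. This is only an illustrative arrangement under stated assumptions: the class and field names are hypothetical, the qualified file name is represented as an already interpreted record because no concrete syntax is given here, and validity is judged solely by whether a result is held for the same raw material file name, procedure name, and parameter.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class QualifiedName:
    raw_name: str   # raw material file name
    procedure: str  # procedure name
    parameter: str  # parameter of the procedure
    host: str       # computer that interprets and runs the procedure
    context: str    # context name defining the operational environment


class ContextUnit:
    def __init__(self, files: Dict[str, bytes],
                 procedures: Dict[str, Callable[[bytes, str], bytes]]) -> None:
        self.files = files
        self.procedures = procedures
        # Stands in for the process result holding unit.
        self.held: Dict[Tuple[str, str, str], bytes] = {}
        self.runs = 0  # number of times a procedure was actually started

    def is_valid(self, q: QualifiedName) -> bool:
        # Stands in for the process result control unit.
        return (q.raw_name, q.procedure, q.parameter) in self.held

    def fetch(self, q: QualifiedName) -> bytes:
        # Stands in for the result outputting unit.
        key = (q.raw_name, q.procedure, q.parameter)
        if not self.is_valid(q):
            raw = self.files[q.raw_name]
            self.held[key] = self.procedures[q.procedure](raw, q.parameter)
            self.runs += 1
        return self.held[key]
```

In this sketch a second request with the same qualified file name is answered from the held result, so the procedure is started only once.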
Further, in the distributed file processor of the present invention, the process result control unit comprises a configuration that judges the validity of the process result on the basis of either a process execution date and time of the procedure process by the context unit, or a process request identifier inherent to a request that demanded to execute the procedure process by the context unit.
Further, in the distributed file processor of the present invention, the process result control unit comprises a configuration that compares an updating date and time of the raw material file with a process execution date and time of the procedure process by the context unit, and judges the validity of the process result on the basis of the comparison result.
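The comparison of the updating date and time with the process execution date and time can be sketched as follows. The direction of the judgment is an assumption made for illustration: a held process result is treated as valid when the raw material file has not been updated after the procedure was executed. The function name is hypothetical.

```python
from datetime import datetime


def result_is_valid(raw_updated_at: datetime, processed_at: datetime) -> bool:
    """Valid if the raw material was not updated after processing (assumed rule)."""
    return raw_updated_at <= processed_at
```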
Further, in the distributed file processor of the present invention, when the context unit judges that a process time specifier of the requested process result is included in the interpretation of the qualified file name, the process result control unit comprises a configuration that judges the validity of the process result associated with the process date and time based on whether or not the process date and time corresponds to a range of time indicated by the process time specifier.
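The judgment against a process time specifier can be sketched as follows. The representation of the specifier as a pair of optional start and end times, inclusive at both ends, is an assumption for illustration; the names `TimeRange` and `within_specifier` are hypothetical.

```python
from datetime import datetime
from typing import Optional, Tuple

# Hypothetical form of a process time specifier: (start, end), None = unbounded.
TimeRange = Tuple[Optional[datetime], Optional[datetime]]


def within_specifier(processed_at: datetime, spec: TimeRange) -> bool:
    """True if the process date and time falls in the specified range."""
    start, end = spec
    if start is not None and processed_at < start:
        return False
    if end is not None and processed_at > end:
        return False
    return True
```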
Further, in the distributed file processor of the present invention, the network comprises a configuration that connects a plurality of context units, a plurality of process result holding units provided correspondingly to the plurality of context units, and a plurality of process result control units provided correspondingly to the plurality of process result holding units, in which a plurality of the context units are each capable of the procedure process individually, and a first process result control unit that judges the validity of the process result corresponding to a process request outputted on the network comprises a configuration that executes a query as for the validity of the process result to a second process result control unit that controls a second process result holding unit on the network which holds the process result corresponding to the raw material file and the procedure specified by the process request.
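The query between process result control units can be sketched as follows. For brevity the sketch places all units in one program and replaces the network query with a direct method call, which is an assumption; the class and method names are hypothetical.

```python
from typing import List, Optional, Set, Tuple

Key = Tuple[str, str]  # (raw material file name, procedure name)


class ResultControlUnit:
    def __init__(self, name: str) -> None:
        self.name = name
        self.valid_keys: Set[Key] = set()           # results held and judged valid
        self.peers: List["ResultControlUnit"] = []  # other units on the network

    def is_valid_locally(self, key: Key) -> bool:
        return key in self.valid_keys

    def find_valid_holder(self, key: Key) -> Optional[str]:
        """Return the name of a unit holding a valid result, or None."""
        if self.is_valid_locally(key):
            return self.name
        for peer in self.peers:  # stands in for a query over the network
            if peer.is_valid_locally(key):
                return peer.name
        return None
```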
Further, in the distributed file processor of the present invention, the process result control unit comprises a configuration that executes a process of either invalidation or cancellation of the process result stored in the process result holding unit, on the condition of the analysis that a specific parameter indicating a designation of either invalidation or cancellation of the process result held in the process result holding unit is included in the interpretation of the qualified file name by the context unit.
Further, in the distributed file processor of the present invention, accompanied with the execution of a process by the process result control unit, which relates to either invalidation or cancellation of the process result stored in the process result holding unit, the context unit comprises a configuration that acquires the raw material file corresponding to the process result in which the invalidation or cancellation has been executed, and executes a new process to hold a process result thereof in the process result holding unit.
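Invalidation followed by a new process can be sketched as follows. The parameter value `invalidate` and the function `handle_request` are hypothetical; the form of the specific parameter is not defined here, and the held results are modeled as a simple dictionary.

```python
from typing import AbstractSet, Callable, Dict, Tuple

Key = Tuple[str, str]  # (raw material file name, procedure name)

INVALIDATE = "invalidate"  # hypothetical name of the specific parameter


def handle_request(held: Dict[Key, bytes], key: Key,
                   parameters: AbstractSet[str],
                   process: Callable[[], bytes]) -> bytes:
    """Drop the held result if requested, then process anew when none is held."""
    if INVALIDATE in parameters:
        held.pop(key, None)    # invalidate the held process result
    if key not in held:
        held[key] = process()  # acquire the raw material and process it again
    return held[key]
```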
Further, in the distributed file processor of the present invention, a process according to the procedure to the raw material file in the context unit is executed as an asynchronous process independent from a timing of a process request, a process result identifier is generated to a process result processed by the procedure, which corresponds to the process result, and the process result identifier is held in the process result holding unit to be associated with the process result.
Further, in the distributed file processor of the present invention, the context unit comprises a configuration that executes the procedure to the raw material file as an asynchronous process independent from the timing of a process request, on the condition of the judgment that specifying data indicating the process request being asynchronous is included in the interpretation of the qualified file name.
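Asynchronous processing with a process result identifier can be sketched as follows. The use of a thread pool and of a random hexadecimal identifier are assumptions chosen for illustration, and the class `AsyncContext` and its methods are hypothetical.

```python
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict, Optional


class AsyncContext:
    def __init__(self) -> None:
        self._pool = ThreadPoolExecutor(max_workers=2)
        # Process results (as futures) held under their identifiers.
        self._held: Dict[str, Future] = {}

    def submit(self, procedure: Callable[[bytes], bytes], raw: bytes) -> str:
        """Start the procedure in the background and return an identifier."""
        result_id = uuid.uuid4().hex
        self._held[result_id] = self._pool.submit(procedure, raw)
        return result_id

    def fetch(self, result_id: str, timeout: Optional[float] = None) -> bytes:
        """Return the process result for the identifier, waiting if necessary."""
        return self._held[result_id].result(timeout=timeout)

    def close(self) -> None:
        self._pool.shutdown(wait=True)
```

In this sketch the requester receives the identifier immediately and can fetch the process result later, independently of when the procedure actually runs.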
Further, in the distributed file processor of the present invention, when the context unit judges, in the interpretation of the qualified file name, that the process result holding unit holds a process result that does not completely coincide but partially coincides with a condition of a process request, the result outputting unit comprises a configuration that outputs the partially coinciding process result held in the process result holding unit.
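Output of a partially coinciding process result can be sketched as follows. What counts as partial coincidence is not defined here; the sketch assumes, for illustration only, that a result coincides partially when the raw material file name and the procedure name match but the parameter differs. The function name is hypothetical.

```python
from typing import Dict, Optional, Tuple

Key = Tuple[str, str, str]  # (raw material file name, procedure name, parameter)


def find_result(held: Dict[Key, bytes], raw: str, procedure: str,
                parameter: str) -> Tuple[Optional[bytes], bool]:
    """Return (result, is_exact); result is None when nothing coincides."""
    exact = held.get((raw, procedure, parameter))
    if exact is not None:
        return exact, True
    # Assumed rule: same raw material and procedure, different parameter.
    for (held_raw, held_proc, _), value in held.items():
        if held_raw == raw and held_proc == procedure:
            return value, False
    return None, False
```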
And, the distributed file processing method in a distributed computer system with computers connected through a network relating to the present invention comprises: a procedure processing step by a context unit that interprets a qualified file name configured by qualifying a raw material file name identifiable of a data file raw material held in the processor connected to the network with a procedure name representing a procedure to edit and process the raw material, a parameter of the procedure, the name of a computer that interprets the procedure name and operates the procedure, and a context name that defines operational environments, fetches the raw material file name and the procedure name, inputs data of a raw material file corresponding to the fetched raw material file name, starts a procedure corresponding to the fetched procedure name, and implements a process by the procedure having started the raw material file; a result holding step that holds in the process result holding unit a process result of the raw material file processed by the procedure started in the procedure processing step; a process result validity judgment step that judges the validity of the process result held by the process result holding unit; and a result outputting step that outputs to fetch the process result held by the process result holding unit, which is the process result that is judged as valid in the process result validity judgment step.
Further, in the distributed file processing method of the present invention, the process result validity judgment step judges the validity of the process result on the basis of the raw material file name corresponding to the process result held by the process result holding unit and the procedure name executed in the context unit.
Further, in the distributed file processing method of the present invention, the process result validity judgment step judges the validity of the process result on the basis of either a process execution date and time of the procedure process by the context unit, or a process request identifier inherent to a request that demanded to execute the procedure process by the context unit.
Further, in the distributed file processing method of the present invention, the process result validity judgment step compares an updating date and time of the raw material file with a process execution date and time of the procedure process by the context unit, and judges the validity of the process result on the basis of the comparison result.
Further, in the distributed file processing method of the present invention, a procedure process in the context unit to the raw material file is executed as an asynchronous process independent from the timing of a process request, a process result identifier is generated to a process result processed by the procedure, which corresponds to the process result, and the process result identifier is held in the process result holding unit to be associated with the process result.