It is common to run complex software applications on a distributed computing platform that comprises several machines, herein referred to as nodes, each node running a portion of the applications. The nodes communicate with one another through communication links using ATM or any other suitable communication protocol. Distributed processing is a powerful way to make software scalable and to allow a single system to have components in multiple locations. Many standards for distributed processing exist and are commonly used. As a specific example, the Common Object Request Broker Architecture (CORBA) provides a basic mechanism for transparently distributing software objects across multiple network nodes. CORBA software objects may be written in a variety of programming languages, and may run on a variety of computer processor types. The communication, and any needed translation between the different data formats of varying processor types and programming languages, is handled automatically by CORBA. CORBA is a standard defined by the Object Management Group (OMG). For more information on the CORBA standard the reader is invited to consult "The Essential CORBA: Systems Integration Using Distributed Objects", by Thomas J. Mowbray and Ron Zahavi, Object Management Group Inc, 1998, whose content is hereby incorporated by reference.
There are large numbers of software applications running in distributed networks. Of particular interest here is asynchronous software whose software objects communicate by exchanging independent messages. In a typical asynchronous interaction, one object A may send a message to another object B. Having sent its message, object A may continue processing, or may wait to receive messages. Though object A may expect object B to return a reply message, object A remains responsive to messages from any object in the system. In contrast, in a synchronous distributed software system a request message and its reply are treated as an indivisible whole, equivalent to a procedure call. If an object A communicates synchronously with an object B, a simple implementation of object A remains unresponsive to messages from other sources until it has received a reply from object B. If object A must remain responsive to messages from other sources while waiting for a synchronous interaction to complete, object A must make use of more than one processing thread. The use of multiple threads complicates the design of object A and requires additional system resources.
In a typical asynchronous interaction, a plurality of objects share a single processing thread. Control of the thread is given to the objects one at a time, typically when a new message is received. When a given object has control of the thread, the object performs its functions. Some of the functions require objects to send messages to other objects in the system. Typically, when an object is waiting for an answer to a message, it relinquishes control of the thread in order to allow other messages to be processed, either by that same object or by other objects which share the processing thread. This style of operation, which is commonly referred to as asynchronous run-to-completion operation, is commonly used because it makes efficient use of system resources.
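By way of illustration only, the run-to-completion style described above may be sketched as follows. The sketch is not taken from any particular system; the names Dispatcher, Requester, Responder and demo, and the message kinds "start", "request", "reply" and "note", are assumptions introduced solely for this example. A single loop stands in for the shared processing thread, and an object relinquishes control simply by returning from its handler.

```python
from collections import deque


class Dispatcher:
    """Hypothetical stand-in for one processing thread shared by several objects."""

    def __init__(self):
        self.queue = deque()   # pending (recipient name, message) pairs
        self.objects = {}      # recipient name -> object

    def register(self, name, obj):
        self.objects[name] = obj

    def send(self, recipient, message):
        self.queue.append((recipient, message))

    def run(self):
        # Control of the thread is given to one object at a time,
        # each time a message is taken from the queue.
        while self.queue:
            recipient, message = self.queue.popleft()
            self.objects[recipient].handle(message, self)


class Requester:
    """Object A: sends a request, then stays responsive to other messages."""

    def __init__(self):
        self.log = []

    def handle(self, message, dispatcher):
        kind, payload = message
        if kind == "start":
            dispatcher.send("B", ("request", "A"))
            self.log.append("sent request")
            # Returning here relinquishes the thread while the reply is pending.
        elif kind == "reply":
            self.log.append("got reply: " + payload)
        else:
            self.log.append("other: " + payload)


class Responder:
    """Object B: answers each request with a reply message."""

    def handle(self, message, dispatcher):
        kind, sender = message
        if kind == "request":
            dispatcher.send(sender, ("reply", "done"))


def demo():
    dispatcher = Dispatcher()
    a = Requester()
    dispatcher.register("A", a)
    dispatcher.register("B", Responder())
    dispatcher.send("A", ("start", ""))
    dispatcher.send("A", ("note", "unrelated"))
    dispatcher.run()
    return a.log
```

In this sketch object A processes the unrelated message before the reply from object B arrives, without requiring a second thread.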
In order to refer to individual software objects in a distributed software system, a persistent name is given to each software object. The persistent name allows software objects to communicate with other software objects in the software system irrespective of the software objects' location in the system. To communicate with a software object, the persistent name of that software object must be resolved into an actual software object reference. Amongst other information, the software object reference may comprise information about the node on which the software object is located and other information to help locate the software object in the system. A Name Service unit resolves a software object's persistent name into a software object reference that can be used to communicate with that software object.
The Name Service is an entity which is located at a well-known location in the distributed system and which keeps an entry for each visible software object in the system. Upon initialization, software objects in the software system register with the Name Service by sending their respective persistent name, software object reference, scope of registration and other relevant information to the Name Service in the form of a registration message. Typically, the Name Service can be a simple look-up table with a set of functions or methods to interact with the table. The Name Service tables may be distributed in the system. A particular example of a distributed Name Service includes a Central Name Service table, herein designated as Central Name Service, having entries for each visible software object in the system and subservient Name Service tables containing subsets of these entries. Generally, the subservient Name Services are associated with individual nodes (Local Name Service) or groups of nodes (Cluster Name Service). This particular Name Service structure will be used for the purpose of example throughout this specification.
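By way of illustration only, a Name Service of the simple look-up-table kind described above may be sketched as follows. The class names ObjectReference and NameServiceTable, the fields node and object_id, and the scope values "node" and "system" are assumptions introduced for this example; the actual content of a software object reference and of a registration message is implementation dependent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectReference:
    node: str        # node on which the software object is located
    object_id: int   # hypothetical identifier of the object on that node


class NameServiceTable:
    """Look-up table with methods to register and resolve entries."""

    def __init__(self, central=None):
        self.entries = {}        # persistent name -> ObjectReference
        self.central = central   # Central Name Service, if this table is subservient

    def register(self, persistent_name, reference, scope="node"):
        # Record the entry carried by a registration message.
        self.entries[persistent_name] = reference
        # Assumption: entries registered with system-wide scope are also
        # passed on to the Central Name Service.
        if scope == "system" and self.central is not None:
            self.central.register(persistent_name, reference, scope)

    def resolve(self, persistent_name):
        # Returns None when the name has no entry in this table.
        return self.entries.get(persistent_name)
```

Under these assumptions a Local Name Service holds only a subset of the entries held by the Central Name Service, which is why some look-ups cannot be satisfied locally.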
When sending a message toward a remote location, a software object queries the Name Service to obtain the software object reference of the message recipient. In a typical interaction, a software object in the system, herein referred to as a client, sends a message to the Name Service requesting the location of another software object. The message contains the persistent name of the software object sought. The Local Name Service upon receipt of the message attempts to locate the software object reference corresponding to the persistent name. If the software object reference is in its local table, the Local Name Service sends a reply message to the client containing the software object reference of the entity sought. Occasionally, the entity sought will not have an entry in the Local Name Service contacted by the client. In this situation, typically the Local Name Service will send a message to an off-node Name Service asking for the software object reference of the software object sought. The off-node Name Service locates the software object reference corresponding to the persistent name and sends an acknowledgement message to the Local Name Service along with the software object reference. The Local Name Service receives the acknowledgement message from the off-node Name Service and sends an acknowledgement containing the software object reference to the client. The total elapsed time for the look-up operation is highly variable and may be very long if the request must be passed to many different off-node components of the Name Service before it is satisfied.
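By way of illustration only, the look-up sequence described above may be sketched as follows. The class NameService, its parent attribute, the hop counter and the sample names and references are assumptions introduced for this example. The hop counter stands in for the off-node message exchanges that make the total elapsed time variable.

```python
class NameService:
    """Hypothetical Name Service component with an optional off-node parent."""

    def __init__(self, table, parent=None):
        self.table = dict(table)   # persistent name -> software object reference
        self.parent = parent       # off-node Name Service to consult on a miss

    def look_up(self, persistent_name, hops=0):
        # Local hit: reply immediately with the software object reference.
        if persistent_name in self.table:
            return self.table[persistent_name], hops
        # No further component to ask: the look-up fails.
        if self.parent is None:
            return None, hops
        # Local miss: forward the request off-node and relay the answer.
        return self.parent.look_up(persistent_name, hops + 1)
```

For example, a Local Name Service holding only the entry "printer" and backed by a Central Name Service holding both "printer" and "billing" resolves "printer" with zero off-node hops and "billing" with one.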
During the above-described look-up process, the client may behave in one of two fashions. In the first fashion, the client may treat the look-up operation in a synchronous fashion and wait for the reply to its message to arrive. During this time, the processing thread cannot be used to process any other message. If the client uses only one processing thread, this results in idle time and delays in the processing operations. If the client uses additional threads in order to remain responsive while waiting for the reply, the client consumes additional memory and processing resources and requires significantly more complex code to co-ordinate the actions of its multiple threads.
A possible solution for reducing delays for off-node synchronous look-up requests is to put complete copies of the Name Service table on every node in the system. This eliminates the need for off-node look-ups, thereby providing a substantially reduced delay time for look-up requests that would otherwise go off-node. However, this option substantially increases the amount of memory required to store the Name Service on each node and increases the number of registration messages sent every time a new software object registers with the system. Finally, the time required for searching the data in the Name Service increases as the size of the Name Service table increases, since more entries need to be processed.
In the second fashion, the client may treat each look-up with the Name Service as explicitly asynchronous. In a typical interaction, a client sends a message to the Name Service, starts working on other tasks and later receives the look-up result as a normal message on the client's message queue. This approach allows the Name Service to make any number of remote look-ups before eventually returning a result (success or failure) to its client. The disadvantage is that it complicates the programming of clients. The client must explicitly save the context of the query message sent to the Name Service, in such a way that when a reply eventually arrives from the Name Service, it will be possible to resume the interrupted processing. For each message that requires a Name Service look-up (which may be most messages) the client must save its context twice, once for the message to the Name Service and once for the desired message to the external object. This adds significant complexity to the code of the client, reducing the reliability and maintainability of the code.
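By way of illustration only, the double saving of context required of an explicitly asynchronous client may be sketched as follows. The class AsyncClient, the correlation identifiers, the state labels and the tuple layout of the messages are assumptions introduced for this example; a list named outbox stands in for the message transport.

```python
import itertools


class AsyncClient:
    """Hypothetical client that treats Name Service look-ups as asynchronous."""

    def __init__(self, outbox):
        self.outbox = outbox             # list standing in for the transport
        self.pending = {}                # saved contexts keyed by correlation id
        self._ids = itertools.count(1)
        self.results = []

    def start_request(self, target_name, payload):
        cid = next(self._ids)
        # First save: context for the query sent to the Name Service.
        self.pending[cid] = ("awaiting_reference", target_name, payload)
        self.outbox.append(("name_service", "look_up", cid, target_name))

    def on_message(self, kind, cid, body):
        # Restore the saved context so interrupted processing can resume.
        state, target_name, payload = self.pending.pop(cid)
        if kind == "look_up_result" and state == "awaiting_reference":
            if body is None:
                self.results.append((target_name, "not found"))
                return
            # Second save: context for the message to the external object.
            self.pending[cid] = ("awaiting_reply", target_name, payload)
            self.outbox.append((body, "request", cid, payload))
        elif kind == "reply" and state == "awaiting_reply":
            self.results.append((target_name, body))
        else:
            # Unexpected message for this state: keep the context unchanged.
            self.pending[cid] = (state, target_name, payload)
```

Even in this reduced form, the client carries a table of pending contexts and a small state machine for every outgoing request, which is the added complexity referred to above.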
Thus, there exists a need in the industry to provide an improved method and Name Service for locating a software object in an asynchronous distributed software system with the simplicity of synchronous interaction and a uniform response time comparable to a local Name Service look-up.