It is well known how to construct distributed, object-oriented applications, components of which (i.e., the application's constituent objects) execute on different machines and communicate (i.e., exchange messages) across machine boundaries. One such system is shown in FIG. 1, consisting of two machines M1, M2 and four processes A, B, C, D. Each process runs in a different address space in its respective host machine and includes one or more objects which perform the tasks associated with the process. For example, the process A includes three objects A1, A2, A3.
In a distributed object system, inter-object communications can be represented in the form: destination.message_name(). For example, a programmer can specify that the object C1 issue a message to the object A1 using the syntax: "A/A1.foo()", where "foo()" denotes the message ("foo" being the message name and "()" the arguments) and "A/A1" is the message destination (object A1 in process A). Note that in a typical distributed object system the programmer would not actually need to write the destination as "A/A1"; however, for the purposes of the present application this syntax is used to highlight the process and object to which a message is being sent.
Most distributed object systems have evolved to allow transparent message passing. Allowing distributed objects to communicate in a transparent fashion means that a distributed object system must support intra-process, inter-process and inter-machine communications between objects in a way that is transparent to the user, the programmer and the objects themselves. That is, transparency means that an object need not be strictly aware of other objects' locations when issuing messages. For example, if the distributed object system of FIG. 1 supported transparent messaging, the objects C1 and A2 could issue the message "foo()" to the object A1 using the same syntax: A1.foo().
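By way of illustration only, transparent messaging of this kind can be sketched as follows. The sketch assumes a hypothetical RemoteProxy class, a hypothetical registry keyed by the "A/A1" destination syntax used above, and a loopback_send function that stands in for whatever marshalling and transport a real system would use; none of these names come from any particular distributed object system.

```python
class A1:
    """Stands in for the object A1 of FIG. 1."""
    def foo(self):
        return "foo handled by A1"


class RemoteProxy:
    """Hypothetical stand-in for a reference held outside A1's process."""
    def __init__(self, send, destination):
        self._send = send                # function that delivers a message
        self._destination = destination  # e.g. "A/A1"

    def __getattr__(self, message_name):
        # Called only for names not found normally, i.e. message names.
        def invoke(*args):
            return self._send(self._destination, message_name, args)
        return invoke


# Hypothetical table mapping destinations to the objects that live there.
registry = {"A/A1": A1()}


def loopback_send(destination, message_name, args):
    # Assumption: a real system would marshal the message and transmit it
    # across a process or machine boundary; here it is delivered directly.
    return getattr(registry[destination], message_name)(*args)


local_ref = registry["A/A1"]                     # as held by A2 (same process)
remote_ref = RemoteProxy(loopback_send, "A/A1")  # as held by C1 (other machine)

# Both callers use the same syntax, regardless of where A1 resides.
local_result = local_ref.foo()
remote_result = remote_ref.foo()
```

In this sketch the caller's code is identical in both cases; only the kind of reference it holds differs.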
However, even in transparent distributed object systems, there are significant implementation differences between intra-process, inter-process and inter-machine communications that must be addressed. Intra-process communications are faster and more reliable than the other types of communications, consisting of the simple passing of local pointers. Inter-process messaging is also fast, since it occurs within a machine, but additionally requires that object addresses be translated between different processes. Inter-machine communications are much slower and less reliable than the other two types of messaging due to latency associated with issuing messages across an external communications channel and the relatively greater likelihood of channel failure. All of these issues are dealt with in one way or another by the prior art.
Object security is a significant issue raised by distributed object systems. Security problems arise due to the potential lack of trustworthiness of objects, processes and machines. For example, referring to FIG. 1, assume that the objects A1 and A2 are, respectively, a very powerful object and a misbehaving object. If A2 were somehow given access to the full power (i.e., methods) of A1, then A2 could disrupt process A using the full power of A1's methods. Similar security problems can arise between processes (e.g., when access to a process is given to an object in an untrustworthy process) or between machines (e.g., where a misbehaving machine issues unauthorized messages to an object running on another machine). Many distributed object systems have not attempted to deal with these security issues; other systems have provided incomplete solutions that deal with only a subset of the above-mentioned object, process and machine trust issues. However, the prior art includes one technique, called capability security, that addresses most of these problems, albeit only locally (i.e., within a process). Thus, there is a need to extend the ideas of capability security to distributed systems.
The basic tenet of capability security is that the right to do something to an object (i.e., invoke a particular object's methods) is represented solely by the holding of a reference to the particular object. To prevent the unauthorized exercise of rights by misbehaving objects, capability security only allows an object to acquire the capability (i.e., object reference) to access a particular object in one of the following ways:
(1) by receiving the capability from an object that already holds that right (through a message or during creation); and
(2) by being the object that created the particular object.
Thus, referring again to FIG. 1, in an object system that implements capability security, the object A1 could not pass to the object A3 a reference to the object D1 as A1 does not have that capability (in FIG. 1, a directed arrow represents the right to access the object at the end of the arrow).
Traditionally, capability security has been implemented using front end objects, as shown in FIG. 2. In this figure, the object A1 is a very powerful object whose respective methods (not shown) are accessed through the messages msg1, msg2, and msg3. The objects A2, A3 and A4 are less powerful front-end objects that only respond to a subset of the messages supported by A1. For example, the object A2 only responds to msg1. This means that, even though the object A2 can access the object A1, it only exercises the limited set of A1's powers corresponding to msg1. Therefore, by exporting references to different subsets of the front end objects, different capability subsets with respect to the object A1 can be created. For example, referring to FIG. 2, the Requestor only has the capability (an object has a capability if it (1) has a right and (2) knows the ID/location of the object for which it possesses the right) to access the object A2, which means that it can only invoke the methods of the object A1 that are triggered by msg1. The Requestor could, by receiving a reference to the object A3, also acquire the additional capability to cause msg2 to be issued to the object A1. Of course, the presence of a capability security system ensures that rights can only be passed by authorized objects.
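The front-end arrangement of FIG. 2 can be sketched roughly as follows. This is a minimal illustration only: the names PowerfulObject and make_front_end are hypothetical, and ordinary Python does not itself enforce capability security (a determined caller could inspect the closure to reach the underlying object), so the sketch shows the structure of the technique rather than a secure implementation.

```python
class PowerfulObject:
    """Stands in for A1: responds to msg1, msg2 and msg3."""
    def msg1(self):
        return "A1 handled msg1"

    def msg2(self):
        return "A1 handled msg2"

    def msg3(self):
        return "A1 handled msg3"


def make_front_end(target, allowed):
    """Return a front-end object that forwards only the messages in `allowed`."""
    allowed = frozenset(allowed)

    class FrontEnd:
        __slots__ = ()  # the front end carries no state of its own

        def __getattr__(self, name):
            if name not in allowed:
                raise AttributeError(f"front end does not respond to {name!r}")

            # Forward through a wrapper so that the caller is not handed
            # a bound method of the powerful object.
            def forward(*args, **kwargs):
                return getattr(target, name)(*args, **kwargs)
            return forward

    return FrontEnd()


a1 = PowerfulObject()
a2 = make_front_end(a1, {"msg1"})  # front end A2 of FIG. 2
a3 = make_front_end(a1, {"msg2"})  # front end A3 of FIG. 2
```

A Requestor holding only a2 can cause msg1 to be issued to A1; it gains the ability to cause msg2 to be issued only if some holder of a3 passes that reference to it.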
The traditional formulation of capability security does not make explicit all of the security problems that can arise in distributed object systems. Because capability security was not designed with distributed systems in mind, its inventors did not include in their formulation techniques for solving the problems of distributed systems that are due to possibly misbehaving processes and remote objects and insecure communications channels. The traditional definition of capability security does not explicitly forbid an imposter from interfering with the normal process of message reception and/or decoding so as to stand in for one of the capabilities transmitted. For example, referring again to FIG. 1, if the object C1 passed a reference for the object B1 to the object A1, but the object D1 interfered with the reception and/or decoding of the message by A1, A1 might then come to hold a capability to D1, thinking that it got this capability from C1. A1 might then send messages to D1 that it intended to send only to the object referred to it by C1. Therefore, there is a need for an extended definition of capability security for distributed object systems to indicate that the capabilities that a recipient receives correspond exactly to those that the sender sent.
Another possible problem that can arise in a distributed object system under the traditional definition of capability security is what we shall call "confusing the deputy". Referring again to FIG. 1, this is the situation where a first object (e.g., C1) that does not hold a reference to a second object (e.g., A3) tries to pass, to a third object to which it does have access (e.g., A1) and which itself has access to the second object (A3), a message that would fool the third object (A1) into believing that the first object (C1) does have access to the second object (A3). By doing this, there is a possibility that the first object (C1) could induce the third object (A1) to issue messages to the second object (A3) which the first object (C1) itself could not issue. Therefore, there is a need for an extended definition of capability security adaptable to distributed object systems that prevents the problem of confusing the deputy (in the preceding example, A1 is the deputy).
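One way the deputy could be confused, offered only as an illustrative assumption, is by a message that designates its target by name rather than by carrying a reference. In the sketch below the classes NaiveDeputy and StrictDeputy and the relay message are hypothetical; the naive deputy resolves the name using its own authority, whereas the strict deputy acts only on a reference that the sender was actually able to pass.

```python
class A3:
    """Stands in for the second object, which C1 cannot reach directly."""
    def __init__(self):
        self.log = []

    def write(self, text):
        self.log.append(text)


class NaiveDeputy:
    """A1 resolving target names with its own authority (the loophole)."""
    def __init__(self, held):
        self._held = held  # name -> reference that A1 itself holds

    def relay(self, target_name, text):
        # A1 never checks that the sender holds a reference to the target.
        self._held[target_name].write(text)


class StrictDeputy:
    """A1 acting only on references the sender actually passed."""
    def relay(self, target_ref, text):
        target_ref.write(text)


a3 = A3()

# C1 holds no reference to A3, yet by naming it C1 induces A1 to write to it.
naive = NaiveDeputy({"A3": a3})
naive.relay("A3", "written on behalf of C1")

# With the strict deputy, C1 has only the name to offer, and a name is not
# a capability: the attempt fails because a string has no write method.
strict = StrictDeputy()
try:
    strict.relay("A3", "written on behalf of C1")
    strict_refused = False
except AttributeError:
    strict_refused = True
```

The design point is that in the strict version the only way to designate A3 is to hold and pass a reference to it, so designation and authority cannot be separated.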
The above two problems point out loopholes in the traditional definition of capability security. Most implemented non-distributed capability security systems do not actually have these loopholes. However, these loopholes are more likely to be present in implementations of distributed capability security systems, which is why there is a need for a revised formulation of capability security when engineering distributed capability systems.
As the objects composing a distributed, object-oriented application, or system, can be distributed over multiple networked computers, system operations can be interrupted by communications or machine failures. For example, in the distributed object system of FIG. 1, just after the object C1 has sent an object reference message to the object A1, machine M2 or machine M1 could go down, or a network partition (failure) could occur. In the case of a machine failure, the state of the objects running on the failed machine, and the state of the executing program that includes those objects, could be lost. The potential loss of object and program state due to machine failure is disruptive to distributed applications and is referred to as a lack of persistence (note: persistent systems retain certain state information so that processing can resume after a system crash).
Some highly specialized prior art systems, such as distributed, fault tolerant computer systems, provide object persistence through fast, non-volatile memory, hardware redundancy and checkpointing. Briefly, in such systems, computations are performed redundantly on two machines. At critical points during the program's execution, the state of relevant program components (such as the stacks and all active objects) is checkpointed, or written to the non-volatile memory in each machine. Upon the failure of one of the redundant machines, the other machine continues executing the application, all the time checkpointing its state so that it can update the failed machine (i.e., its stack and object states) from the latest checkpoint whenever the failed machine is revived.
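The checkpoint-and-restore step at the heart of such systems can be sketched as follows. This is a simplified, single-machine illustration under stated assumptions: the function names checkpoint and restore are hypothetical, an ordinary file stands in for the fast non-volatile memory, and the redundant second machine is omitted entirely.

```python
import os
import pickle
import tempfile


def checkpoint(state, path):
    """Write `state` to non-volatile storage, replacing any earlier checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the bytes to the device
    # Swap the new checkpoint into place so that a crash in the middle of
    # writing leaves the previous checkpoint intact.
    os.replace(tmp_path, path)


def restore(path):
    """Reload the most recent checkpoint after a failure."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

A program would call checkpoint at its critical points, passing whatever representation of its stacks and active objects it maintains, and a revived machine would call restore to resume from the latest saved state.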
However, given the explosion of network and distributed computing, there are many distributed, object-oriented applications that are executed on generally available hardware (e.g., workstations and personal computers) not specifically designed to provide redundancy and object persistence. These systems lack the fault tolerant systems' fast, non-volatile memory and built-in redundancy. In fact, in many generally available machines, the only non-volatile memory is a slow, hard disk drive. As a result, it is not possible using generally available hardware to checkpoint the data at the rates needed to achieve transparent revival of failed machines in the absence of on-line redundant machines. Therefore, there is a need for a system that provides persistence for distributed object-oriented applications in the context of networks of general, non-redundant computers.
The other possible disruption to distributed applications is a network partition. When a network partition occurs, issued messages may or may not be received, meaning that, when network communications are re-established between two processes that were formerly in communication, those processes are likely to have inconsistent states. For example, a process that sent a message just prior to the partition might not receive (due to the partition) an acknowledgment transmitted by the message's recipient indicating that the message was received.
The specialized fault tolerant systems discussed above get around the problem of network partitions through built-in re-synchronization procedures, wherein the checkpointed data is used to establish a consistent state on the redundant machines executing the application that was interrupted.
There are also software-only systems designed to deal with these distributed failure issues in a totally transparent manner, but these are generally expensive and complex to implement, are not designed to function among potentially misbehaving participants (e.g., they operate by synchronizing checkpoint times), and make unrealistic assumptions about the impossibility of certain failures. In particular, all systems designed to provide fully transparent recovery, whether hardware or software, necessarily assume that all copies of a committed piece of data are never lost at once.
Because generally available hardware does not provide full redundancy or fast non-volatile memory, the approach of the specialized, fault-tolerant systems is not applicable to distributed, object-oriented applications running on networks of generally available computers. Therefore, there is a need for a system that provides persistent distributed objects in such a way that distributed processes can re-establish a consistent system state after a network partition. Also, there is a need for such a system to be able to accommodate changes in the locations of the formerly communicating processes after the partition is repaired.
Finally, given the importance of security in distributed applications, it is vital that any solution to the aforementioned problems not compromise the inter-process and inter-machine security provided by the underlying distributed system. For example, if the underlying objects and processes implement capability security, the system that provides persistence must uphold capability security.