1. Field of the Invention
This invention relates to computer systems, and more particularly to the persistence of application state in a distributed container environment.
2. Description of the Related Art
Distributed applications are often implemented as part of commercial and non-commercial business solutions for an enterprise. For example, a company may use an enterprise application that includes various databases distributed across multiple computers. Applications of this type, which support e-commerce, typically handle hundreds or thousands of sessions simultaneously during periods of peak utilization. For scalability and fault tolerance, the servers running such applications may be clustered.
In some application server cluster implementations, state and/or session information for an application running on a server may be stored only locally to that server and therefore may not be available to any other member of the cluster. Load balancing in such a clustered system may amount to nothing more than the round-robin assignment of new sessions to cluster members. Once a particular session is assigned to a given server, all requests associated with that session must be forwarded to the assigned server, which has sole access to the state data for that session. If the sessions assigned to one server of the cluster generate significantly more requests than the sessions assigned to another member of the cluster, then the actual workloads of the two nodes may be disparate, and the goal of load balancing within the cluster may not be achieved.
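The imbalance described above can be illustrated with a minimal Java sketch. The class StickyRoundRobin and its methods are hypothetical names introduced only for this illustration; they do not correspond to any particular application server or load balancer product. New sessions are assigned round-robin, and every later request for a session is pinned to the server that holds its state.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical load balancer: round-robin for new sessions, sticky afterwards.
class StickyRoundRobin {
    private final int serverCount;
    private final Map<String, Integer> sessionToServer = new HashMap<>();
    private final int[] requestsPerServer;
    private int next = 0;

    StickyRoundRobin(int serverCount) {
        this.serverCount = serverCount;
        this.requestsPerServer = new int[serverCount];
    }

    // Routes one request and returns the index of the server that handles it.
    int route(String sessionId) {
        Integer server = sessionToServer.get(sessionId);
        if (server == null) {
            // New session: pick the next server in round-robin order.
            server = next;
            next = (next + 1) % serverCount;
            sessionToServer.put(sessionId, server);
        }
        // Existing session: only the assigned server holds its state.
        requestsPerServer[server]++;
        return server;
    }

    int requestsHandledBy(int server) {
        return requestsPerServer[server];
    }

    public static void main(String[] args) {
        StickyRoundRobin balancer = new StickyRoundRobin(2);
        for (int i = 0; i < 50; i++) {
            balancer.route("busy-session");   // pinned to server 0
        }
        balancer.route("quiet-session");      // assigned to server 1
        System.out.println("server 0: " + balancer.requestsHandledBy(0));
        System.out.println("server 1: " + balancer.requestsHandledBy(1));
    }
}
```

In this sketch each server receives exactly one session, yet one server handles fifty requests while the other handles one, because requests cannot be redirected away from the server that holds the session state.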
Storing application state data only locally to a given server within a cluster may also have implications in the area of failure recovery. If a server in a cluster fails, state information for the applications running in that server may be lost. Another server in the cluster may be able to take the place of the failed server within the cluster configuration, but may not be able to resume processing the applications from the point where the server failure occurred. For example, client sessions handled by the failed server may have to be restarted from an initial point. If one or more users have spent non-negligible time and effort advancing their sessions to the state at which the server failed, the need to restart these sessions from scratch may be highly unsatisfactory. One solution to this problem may be to persist application state information to a persistent store that can be accessed by multiple cluster members.
Typically, application state persistence is achieved through serialization. Serialization allows an object graph to be written into a stream, which can be associated with a file. An instance is serialized by passing it as a parameter to the writeObject method of an ObjectOutputStream. The entire graph of objects reachable from the instance is then serialized into the stream. The object graph is later reconstructed by de-serializing the data from an ObjectInputStream.
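By way of illustration only, the serialization round trip described above may be sketched in Java as follows. The classes CartItem, SessionState, and SerializationRoundTrip are hypothetical and introduced solely for this example. As a further assumption, the stream is backed by an in-memory byte array rather than a file, so that the example is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical session-state classes, used only for illustration.
class CartItem implements Serializable {
    private static final long serialVersionUID = 1L;
    final String sku;
    final int quantity;
    CartItem(String sku, int quantity) { this.sku = sku; this.quantity = quantity; }
}

class SessionState implements Serializable {
    private static final long serialVersionUID = 1L;
    final String sessionId;
    final List<CartItem> items = new ArrayList<>();
    SessionState(String sessionId) { this.sessionId = sessionId; }
}

class SerializationRoundTrip {
    // Writes the root object and everything reachable from it into a byte array.
    static byte[] serialize(Serializable root) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(root);
        }
        return bytes.toByteArray();
    }

    // Reconstructs the object graph from previously serialized bytes.
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        SessionState state = new SessionState("session-1");
        state.items.add(new CartItem("sku-42", 2));
        SessionState copy = (SessionState) deserialize(serialize(state));
        System.out.println(copy.sessionId + " has " + copy.items.size() + " item(s)");
    }
}
```

Note that a single call to writeObject on the SessionState instance also writes the list and each CartItem reachable from it; the caller has no way to select a subset of the graph.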
Serialization lacks features that may be desirable for distributed application systems. For example, there is no support for transactions. Without concurrency control, there is nothing to prevent multiple application component instances from serializing to the same file, thus corrupting state data. Serialization also lacks the ability to perform queries against the data. The granularity of access is an entire object graph, making it impossible to access a single instance or a subset of the serialized data. Serialization includes no mechanism to determine when persistence updates should be performed. It therefore falls to the application developer to code the invocation points for serialization. Typically, serialization is invoked upon each request, which results in large and, for the most part, unnecessary transfers of data among cluster members. Serialization and the corresponding storing of data can be very time-consuming operations.
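The granularity limitation may be illustrated with the following Java sketch, in which the classes LineItem and WholeGraphWriteDemo and the cart size of 1000 are hypothetical choices made only for this example. A single field of a single instance is modified, yet persisting that change requires writing the entire object graph again.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical state class, used only for illustration.
class LineItem implements Serializable {
    private static final long serialVersionUID = 1L;
    final String sku;
    int quantity;
    LineItem(String sku, int quantity) { this.sku = sku; this.quantity = quantity; }
}

class WholeGraphWriteDemo {
    // Builds a cart of n items, each with quantity 1.
    static List<LineItem> buildCart(int n) {
        List<LineItem> cart = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            cart.add(new LineItem("sku-" + i, 1));
        }
        return cart;
    }

    // Serializes the whole cart and returns the number of bytes written.
    static int serializedSize(List<LineItem> cart) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(cart);
        }
        return bytes.size();
    }

    public static void main(String[] args) throws IOException {
        List<LineItem> cart = buildCart(1000);
        int before = serializedSize(cart);
        cart.get(0).quantity = 2;           // change one field of one instance
        int after = serializedSize(cart);   // the full graph is written again
        System.out.println("bytes before change: " + before);
        System.out.println("bytes after change:  " + after);
    }
}
```

If such a write is performed upon each request, the full serialized form is transferred every time, regardless of how little of the state actually changed.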
Also, a single serialization typically cannot store all the data needed by an application. Applications must manage multiple serializations, either in the same file or in multiple files. Serialization lacks support for identity and for the coordinated management of the instances in storage and in memory. Therefore, developers must take extreme care to avoid storing and loading redundant instances. If different parts of a large application read the same serialization more than once, multiple copies of the same instances will reside in memory. Redundant copies would make the coordination of separate updates extremely difficult. These issues collectively produce a high level of complexity, which often results in a lack of maintainability and can constrain scalability, which is crucial to most enterprise applications.
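The lack of identity support may be illustrated with the following Java sketch. The classes Account and IdentityLossDemo are hypothetical and introduced only for this example, and the serialized data is held in a byte array rather than a file. Reading the same serialization twice yields two independent copies, so an update applied to one copy is not reflected in the other.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical persistent class, used only for illustration.
class Account implements Serializable {
    private static final long serialVersionUID = 1L;
    int balance;
    Account(int balance) { this.balance = balance; }
}

class IdentityLossDemo {
    static byte[] write(Account account) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(account);
        }
        return bytes.toByteArray();
    }

    static Account read(byte[] stored) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(stored))) {
            return (Account) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] stored = write(new Account(100));

        // Two parts of an application load the same stored instance.
        Account first = read(stored);
        Account second = read(stored);

        first.balance -= 30;  // update made through one copy only

        System.out.println(first == second);   // prints "false"
        System.out.println(first.balance);     // prints "70"
        System.out.println(second.balance);    // prints "100"
    }
}
```

Nothing in the serialization mechanism itself relates the two copies to one another or to the stored data, so reconciling such divergent updates is left entirely to the application.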