1. Field of the Invention
The present invention relates to distributed computer processing systems, and more particularly, to a clustering model for plural computing units utilizing a virtual shared memory to provide real-time responsiveness and continuous availability.
2. Description of Related Art
With the constantly increasing complexity of scientific, engineering and commercial applications, there is a high demand for systems providing large amounts of computing power. For many such applications, mainframe computer systems represent a traditional solution in view of their ability to perform enormous numbers of computations at very high speeds. Such mainframe computers have significant drawbacks, chief among them their high cost, due in part to their use of highly customized hardware and software developed specifically for each particular application. Moreover, mainframe computers cannot be easily scaled to provide additional capacity as demand increases. An additional drawback of mainframe computers is that they represent a single point of failure. It is necessary to provide redundant computer systems for applications demanding a high degree of system availability, such as telecommunications applications, thereby further increasing the cost and complexity of such systems.
As an alternative to mainframe computer systems, distributed computing systems have been developed in which a plurality of computing units (e.g., personal computers or workstations) are connected to a client-server network. In a distributed computing system, the computational power of the overall system is derived from the aggregation of separate computing units. The primary advantages of such distributed systems are reduced cost and scalability, since each computing unit may be provided using standard commercial hardware and software, and the computing system may be expanded as necessary by simply adding more computing units to the network. A drawback of distributed computing systems is that it is difficult to develop software applications that can coordinate the disparate processes performed on the separate computing units. These processes include the sharing of data between the computing units, the creation of multiple execution units, the scheduling of processes, and the synchronization of the processes. Another drawback of distributed computing systems is the difficulty of providing fault tolerance. When the computing units are executing long-running parallel applications, the probability of a failure increases as execution time or the number of computing units increases, and the crash of a single computing unit may cause the entire execution to fail.
Various fault-tolerant parallel programming models have been developed to address these and other drawbacks of distributed computing systems. One such model is Linda, a parallel computation model based on a virtual shared memory. In Linda, processes in an application cooperate by communicating through the shared memory, referred to as "tuple space." Each "tuple" within the tuple space contains a sequence of typed data elements that may take any of various forms, including integers, floats, characters, arrays of data elements, and the like. Processes access tuple space using four basic operations, including: "out" for tuple creation; "eval" for process creation; "in" for destructive retrieval; and "rd" for non-destructive retrieval. An advantage of Linda is that communication and synchronization via the tuple space are anonymous in the sense that processes do not have to identify each other for interaction. A variant of Linda, known as Persistent Linda or PLinda, supports fault tolerance and is applicable for using idle computing units for parallel computation. PLinda adds a set of extensions to the basic Linda operations that provides fault tolerance by periodically checkpointing (i.e., saving) the tuple space to non-volatile memory (i.e., disk storage). This way, the tuple space can be restored in the event of a catastrophic system failure.
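By way of illustration only, the four basic tuple space operations may be sketched in Python as follows. This is a minimal single-machine sketch and not the Linda or PLinda implementation: the class name TupleSpace, the wildcard ANY, the method name in_ (used because "in" is a reserved word in Python), and the timeout parameter are all assumptions introduced for this example. The eval operation is simplified to a thread that computes a tuple and then places it in the space.

```python
import threading

ANY = object()  # hypothetical wildcard: matches any value in that field


class TupleSpace:
    """Illustrative in-process tuple space; names are assumptions."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def out(self, *fields):
        """Tuple creation: add a tuple and wake any waiting readers."""
        with self._cond:
            self._tuples.append(tuple(fields))
            self._cond.notify_all()

    @staticmethod
    def _matches(pattern, tup):
        # A pattern field is a wildcard, a type (matches by isinstance),
        # or a concrete value (matches by equality).
        if len(pattern) != len(tup):
            return False
        for p, v in zip(pattern, tup):
            if p is ANY:
                continue
            if isinstance(p, type):
                if not isinstance(v, p):
                    return False
            elif p != v:
                return False
        return True

    def _find(self, pattern):
        for tup in self._tuples:
            if self._matches(pattern, tup):
                return tup
        return None

    def _wait_for_match(self, pattern, timeout):
        # Caller must hold self._cond.
        found = self._cond.wait_for(
            lambda: self._find(pattern) is not None, timeout
        )
        if not found:
            raise TimeoutError("no matching tuple")
        return self._find(pattern)

    def rd(self, *pattern, timeout=None):
        """Non-destructive retrieval: return a match, leave it in place."""
        with self._cond:
            return self._wait_for_match(pattern, timeout)

    def in_(self, *pattern, timeout=None):
        """Destructive retrieval: return a match and remove it."""
        with self._cond:
            tup = self._wait_for_match(pattern, timeout)
            self._tuples.remove(tup)
            return tup

    def eval(self, fn, *args):
        """Process creation (simplified): run fn in a thread, out its result."""
        thread = threading.Thread(target=lambda: self.out(*fn(*args)))
        thread.start()
        return thread
```

Note that neither the producer nor the consumer of a tuple names the other; they share only the pattern of the tuple, which reflects the anonymous communication described above.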
While such fault-tolerant parallel programming models using virtual shared memory are advantageous for solving certain types of mathematical and/or scientific problems, they are impractical for many other real-time applications. Specifically, certain applications require a high level of computation accuracy, such as analysis of high energy physics data or calculation of pricing for financial instruments. For these applications, a lower level of system availability to accommodate periodic maintenance, upgrades and/or system failures is an acceptable trade-off as long as the computation results are accurate. The Linda or PLinda programming model is well suited for these applications. On the other hand, certain real-time applications require a high level of system availability and can therefore accept a somewhat lower level of computation accuracy. For example, it is acceptable for a telecommunications server to occasionally drop a data packet as long as the overall system remains available close to 100% of the time. Such highly demanding availability requirements allow only a very limited amount of system downtime (e.g., less than three minutes per year). As a result, it is very difficult to schedule maintenance and/or system upgrades, and any sort of global system failure would be entirely unacceptable.
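To put the exemplary downtime figure in perspective, the relationship between annual downtime and availability may be computed as in the following sketch. The function names are assumptions introduced for this example, and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a 365-day year


def availability(downtime_minutes_per_year):
    """Fraction of the year the system is up for a given annual downtime."""
    return 1.0 - downtime_minutes_per_year / MINUTES_PER_YEAR


def max_downtime_minutes(target_availability):
    """Annual downtime budget, in minutes, for a target availability."""
    return (1.0 - target_availability) * MINUTES_PER_YEAR


three_minute_availability = availability(3)      # roughly 0.999994
five_nines_budget = max_downtime_minutes(0.99999)  # roughly 5.26 minutes
```

Under these assumptions, three minutes of downtime per year corresponds to an availability of roughly 99.9994%, which is somewhat stricter than a 99.999% target that would permit roughly 5.26 minutes per year.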
Accordingly, a critical need exists for a distributed computing system having a fault-tolerant parallel-programming model that provides real-time responsiveness and continuous availability.
The present invention is directed to a distributed computing system that provides real-time responsiveness and continuous availability while overcoming the various deficiencies of the prior art.
An embodiment of the distributed computing system comprises a primary server having a primary virtual shared memory and a back-up server having a back-up virtual shared memory. The primary server periodically provides a state table to the back-up server in order to synchronize the primary virtual shared memory and the back-up virtual shared memory. A plurality of client computer resources are coupled to the primary server and the back-up server through a network architecture. The client computer resources further comprise plural worker processes each adapted to independently perform an operation on a data object disposed within the primary virtual shared memory without a predetermined assignment between the worker process and the data object. Upon unavailability of the primary server, the worker process performs the operation on the corresponding data object in the back-up virtual shared memory within the back-up server. The client computer resources further comprise plural input/output (I/O) ports adapted to receive incoming data packets and transmit outgoing data packets.
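The synchronization and failover behavior described above may be sketched, under stated assumptions, as follows. This is an illustrative in-process model only, not the claimed implementation: the names Server, ServerUnavailable, state_table, load_state, apply_operation and worker_apply are hypothetical, the state table is modeled as a plain copy of a dictionary, and the network architecture is omitted.

```python
class ServerUnavailable(Exception):
    """Raised when an operation is attempted on an unavailable server."""


class Server:
    """Hypothetical server holding a virtual shared memory as a dict."""

    def __init__(self, name):
        self.name = name
        self.available = True
        self.memory = {}  # object id -> data object

    def state_table(self):
        # Snapshot of the shared memory to send to the back-up server.
        return dict(self.memory)

    def load_state(self, table):
        # Replace the local shared memory with the received snapshot.
        self.memory = dict(table)


def apply_operation(server, object_id, operation):
    """Perform an operation on one data object held by the given server."""
    if not server.available:
        raise ServerUnavailable(server.name)
    server.memory[object_id] = operation(server.memory[object_id])
    return server.memory[object_id]


def worker_apply(primary, backup, object_id, operation):
    """Try the primary server first; fall back to the back-up server."""
    try:
        return apply_operation(primary, object_id, operation)
    except ServerUnavailable:
        return apply_operation(backup, object_id, operation)
```

In this sketch, the back-up server reflects only the most recent state table it received, so changes made on the primary server after the last synchronization are not present on the back-up server after failover.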
There are plural types of worker processes, and each worker process may be adapted to perform a distinct type of function. One type of worker process further comprises an input worker process adapted to retrieve an incoming data packet from an I/O port and place a corresponding data object on the primary virtual shared memory. Another type of worker process further comprises an output worker process adapted to remove a data object from the primary virtual shared memory and deliver a data packet to an I/O port. The remaining worker processes operate by grabbing a data object having a predefined pattern from the primary virtual shared memory, processing the data object in accordance with a predefined function, and returning a modified data object to the primary virtual shared memory.
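The three types of worker processes described above may be illustrated by the following minimal sketch. It is a single-threaded model under stated assumptions: the names SharedMemory, put, take, input_worker, processing_worker, output_worker and upper_case are hypothetical, data objects are modeled as dictionaries with a "stage" field serving as the predefined pattern, and I/O ports are modeled as double-ended queues.

```python
from collections import deque


class SharedMemory:
    """Hypothetical virtual shared memory holding dict data objects."""

    def __init__(self):
        self._objects = []

    def put(self, obj):
        self._objects.append(obj)

    def take(self, pattern):
        """Remove and return the first object matching every pattern field."""
        for index, obj in enumerate(self._objects):
            if all(obj.get(key) == value for key, value in pattern.items()):
                return self._objects.pop(index)
        return None


def input_worker(port, memory):
    """Retrieve one incoming packet and place a data object in memory."""
    if not port:
        return False
    memory.put({"stage": "received", "payload": port.popleft()})
    return True


def processing_worker(memory, pattern, function):
    """Grab an object matching the pattern, process it, and return it."""
    obj = memory.take(pattern)
    if obj is None:
        return False
    memory.put(function(obj))
    return True


def output_worker(memory, port):
    """Remove a processed object and deliver its payload as a packet."""
    obj = memory.take({"stage": "processed"})
    if obj is None:
        return False
    port.append(obj["payload"])
    return True


def upper_case(obj):
    # Example predefined function: modify the payload and advance the stage.
    return {**obj, "stage": "processed", "payload": obj["payload"].upper()}
```

No worker in this sketch is assigned to a particular data object in advance; each worker simply takes whichever object matches its pattern, so additional workers of any type could be added without changing the others.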