In distributed simulation, various techniques are used to describe, model and simulate a collection of physical subsystems and objects that comprise a physical system. A critical aspect of that simulation is understanding and representing the time-phased interaction of individual physical subsystems with one another within the physical system. Changes in the states of the physical subsystems result from these time-phased interactions. In distributed simulation, logical processes (LPs) are commonly used to characterize or mimic the time-phased interaction that occurs among physical subsystems in the physical system. This includes simulation by each LP of the output states of its associated or corresponding physical subsystem.
There are two general types of distributed simulation. One is typically called time-stepped, the other is typically called event-driven. In event-driven distributed simulation, the state of a physical subsystem changes as a result of interaction only at certain discrete instants, or events, and the occurrence of those events can be determined only through the simulation itself. The simulation is thus driven by these events, hence the term event-driven. The present invention is most concerned with the latter type, event-driven simulation.
Typically, a sequential event-driven simulation employs a queue of events, in which events are arranged in sequential order of the time at which they must happen. At simulation time T, the simulator iterates on the next unprocessed event with the lowest, that is, earliest, time stamp. At each iteration, the simulator picks the earliest event in the queue, and the LP for which that event is scheduled executes it, thereby updating the state of the LP based on the event. As a result of these event executions and the associated state changes by all LPs, new events are created, which are inserted in the queue according to the time at which they must be executed.
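The sequential loop described above can be sketched as follows. This is a minimal illustration only, not an implementation taken from any particular simulator: the class `RelayLP`, the function `run`, the tuple layout of an event, and the fixed forwarding delays are all hypothetical names and choices made for the example.

```python
import heapq
import itertools


class RelayLP:
    """Hypothetical LP: counts executed events and forwards each one to a peer."""

    def __init__(self, peer, delay):
        self.peer = peer      # name of the LP that receives the follow-up event
        self.delay = delay    # simulated time before the follow-up event happens
        self.state = 0        # illustrative state: number of events executed

    def execute(self, now, payload):
        # Update the LP state, then create one new event for the peer.
        self.state += 1
        return [(now + self.delay, self.peer, payload)]


def run(lps, initial_events, end_time):
    """Process events in time-stamp order until the queue is empty or end_time is passed."""
    tie_breaker = itertools.count()  # keeps ordering stable for equal time stamps
    queue = [(t, next(tie_breaker), target, payload)
             for t, target, payload in initial_events]
    heapq.heapify(queue)
    trace = []
    while queue and queue[0][0] <= end_time:
        # Pick the unprocessed event with the earliest time stamp.
        now, _, target, payload = heapq.heappop(queue)
        trace.append((now, target, payload))
        # The target LP executes the event; any new events are inserted by time.
        for t, new_target, new_payload in lps[target].execute(now, payload):
            heapq.heappush(queue, (t, next(tie_breaker), new_target, new_payload))
    return trace


lps = {"A": RelayLP(peer="B", delay=2.0), "B": RelayLP(peer="A", delay=3.0)}
trace = run(lps, [(0.0, "A", "token")], end_time=10.0)
```

Because a single queue orders every event, no LP in this sequential form can ever see an event from its own past; that guarantee is what is lost once the LPs run on separate processors.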
As previously discussed, to accurately simulate the physical system, events must be processed in sequential order of time. The difficulty is that, in a distributed simulation, an LP can easily receive, for inclusion in its queue, an event scheduled for a time that has already passed in its local simulation. This late event is termed a "straggler" and creates a "causality error." With a causality error, the simulated system improperly affects its own past. This is unacceptable and results in a flawed simulation.
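The straggler condition amounts to a comparison between an incoming time stamp and the receiving LP's local simulation time. The sketch below assumes a hypothetical `LocalClockLP` class and `CausalityError` exception, neither of which comes from the text; it only detects the error and makes no attempt to avoid or repair it.

```python
class CausalityError(Exception):
    """Raised when an event arrives for a time the LP has already passed."""


class LocalClockLP:
    """Hypothetical LP that tracks its own local simulation time."""

    def __init__(self):
        self.local_time = 0.0
        self.processed = []

    def is_straggler(self, timestamp):
        # A straggler is scheduled strictly before the LP's current local time.
        return timestamp < self.local_time

    def process(self, timestamp, payload):
        if self.is_straggler(timestamp):
            raise CausalityError(
                f"event at {timestamp} arrived after local time {self.local_time}"
            )
        self.local_time = timestamp
        self.processed.append((timestamp, payload))


lp = LocalClockLP()
lp.process(5.0, "a")
lp.process(8.0, "b")
late = lp.is_straggler(6.0)      # scheduled before local time 8.0
on_time = lp.is_straggler(8.0)   # same time stamp as local time is not late
```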
With a distributed simulation of a complex physical system comprising physical subsystems that interact in non-sequential fashion, provision must be made either to avoid causality errors or to accommodate and correct them. Multiple techniques are known in the art and used to avoid or accommodate stragglers. Conservative simulation, as known in the art, avoids stragglers, thus ensuring that the current state will never need to be rolled back. Optimistic simulation, as similarly known in the art of distributed simulation, ensures that, should a straggler message occur, the proper state is always recoverable.
In conservative simulations, a logical process is only allowed to advance in simulation time when there is no possibility of a straggler message. This constraint slows down the simulation, but there is no possibility of roll back and hence prior state information need not be saved.
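The text does not say how a conservative LP establishes that no straggler is possible. One common approach, assumed here purely for illustration, is to give each input channel a clock and to require that every channel delivers messages in non-decreasing time-stamp order; an event is then safe once its time stamp does not exceed the smallest channel clock. The class `ConservativeLP` and its methods are hypothetical names.

```python
import heapq


class ConservativeLP:
    """Hypothetical LP that executes an event only when no straggler can follow."""

    def __init__(self, channels):
        # Assumption: each input channel delivers in non-decreasing time-stamp order.
        self.channel_clock = {name: 0.0 for name in channels}
        self.pending = []       # min-heap of (timestamp, channel, payload)
        self.local_time = 0.0
        self.processed = []

    def receive(self, channel, timestamp, payload):
        if timestamp < self.channel_clock[channel]:
            raise ValueError("channel violated the ordering assumption")
        self.channel_clock[channel] = timestamp
        heapq.heappush(self.pending, (timestamp, channel, payload))

    def safe_time(self):
        # No channel can still deliver an event earlier than its own clock.
        return min(self.channel_clock.values())

    def advance(self):
        """Execute every pending event that is provably safe, then return them."""
        done = []
        while self.pending and self.pending[0][0] <= self.safe_time():
            timestamp, _, payload = heapq.heappop(self.pending)
            self.local_time = timestamp
            done.append((timestamp, payload))
        self.processed.extend(done)
        return done


lp = ConservativeLP(["A", "B"])
lp.receive("A", 4.0, "x")
first = lp.advance()    # blocked: channel B has not yet advanced past 0.0
lp.receive("B", 6.0, "y")
second = lp.advance()   # the event at 4.0 is now safe, the one at 6.0 is not
lp.receive("A", 9.0, "z")
third = lp.advance()    # the event at 6.0 becomes safe
```

The blocked first call shows the cost noted above: the LP holds a ready event until its slowest input channel catches up, but it never needs to keep earlier states.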
One simulation method uses a concept called time warp, based on the Virtual Time paradigm. With time warp, time is somewhat distorted with the simulation advancing based on the interaction of logical processes rather than rigid simulation clock time. In this manner, a simulation advances rapidly when there are few interactions, and more slowly when there are many interactions.
In optimistic simulations, a logical process is allowed to act on the events in its event queue until a straggler message is received and identified. On that occurrence, the logical process that received the straggler message typically broadcasts a rollback message to notify other logical processes of the straggler message and sends anti-messages to cancel improperly transmitted messages and stop erroneous computation. When straggler messages occur infrequently, optimistic simulation is very time efficient, allowing the distributed simulation to advance faster than it would under a conservative approach. As known in the field of optimistic simulation with time warp, however, a single straggler message can cause an avalanche of anti-messages as well as multiple rollbacks of each LP. It is further possible, and known in the field, that the system may livelock (LPs keep rolling each other back, and the anti-messages chase the messages in a never-ending race).
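The rollback step for a single LP can be sketched as follows. This is a deliberately simplified illustration under stated assumptions: the class `TimeWarpLP`, its additive state, and the rule that each executed event sends one output message are all hypothetical; events are executed as soon as they are received; and the broadcast of rollback messages and the handling of anti-messages by the receiving LPs are omitted.

```python
class TimeWarpLP:
    """Hypothetical optimistic LP with state saving and an output queue."""

    def __init__(self):
        self.state = 0          # illustrative state: running sum of payloads
        self.local_time = 0.0
        self.processed = []     # (timestamp, payload, state saved before execution)
        self.sent = []          # output queue: (send time, message that was sent)

    def _execute(self, timestamp, payload):
        self.processed.append((timestamp, payload, self.state))
        self.state += payload
        self.local_time = timestamp
        # Hypothetical output: report the new state one time unit later.
        self.sent.append((timestamp, (timestamp + 1.0, self.state)))

    def receive(self, timestamp, payload):
        """Execute optimistically; on a straggler, roll back and return anti-messages."""
        anti_messages = []
        redo = []
        if timestamp < self.local_time:  # straggler detected
            # Undo every event executed after the straggler's time stamp.
            while self.processed and self.processed[-1][0] > timestamp:
                ts, old_payload, saved_state = self.processed.pop()
                self.state = saved_state
                redo.append((ts, old_payload))
            # Cancel every message sent from the rolled-back portion.
            while self.sent and self.sent[-1][0] > timestamp:
                anti_messages.append(self.sent.pop()[1])
            self.local_time = self.processed[-1][0] if self.processed else 0.0
        self._execute(timestamp, payload)
        for ts, old_payload in sorted(redo):  # re-execute in time-stamp order
            self._execute(ts, old_payload)
        return anti_messages


lp = TimeWarpLP()
lp.receive(1.0, 10)
lp.receive(3.0, 5)
lp.receive(5.0, 2)
anti = lp.receive(2.0, 100)  # straggler: local time is already 5.0
```

Even in this small case, one straggler cancels two previously sent messages. Each anti-message may in turn force a rollback at the LP that received the original message, which is how the avalanche described above arises.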
The techniques known in the art that allow controlled rollback in optimistic simulation to accommodate stragglers have high message overhead.
It is therefore advantageous and desirable to modify the time warp technique to quickly stop the spread of erroneous computation without requiring output queues and anti-messages. This results in less memory overhead and simpler memory management algorithms, with significantly reduced message traffic.