Recent advances in computing technologies have allowed an increasing number of computing devices to communicate, work collectively, and manage processes over a network in order to provide services that might otherwise be unavailable. However, in such a coordinated environment, a service provided by a group of processes located on different devices and working together can be brought down by a single point of failure. The likelihood of failure increases with the number of computing devices hosting processes, making it increasingly important to provide a means for fault tolerance and continued group communication.
Fault tolerance can be supported by migrating a process from a failing device to a stable device so that service continues. In addition to migrating the process, the group membership information must be updated to account for the migrated process. Typically, migrating a process requires finishing all pending communications and then stopping all group communication while the migration occurs. The time a migration takes varies, and for that entire time group communication is interrupted, which may cause problems for a user, such as service interruptions. The cost of such an interruption may be high, particularly in a distributed mobile environment.
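The conventional sequence described above (finish pending communications, stop group communication, move the process, update the membership, resume) can be sketched as follows. This is only an illustrative model under stated assumptions: the `Group` class, its fields, and `migrate_blocking` are hypothetical names that do not come from the source, and the actual transfer of process state between devices is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    """Hypothetical group state: which device hosts each member process."""
    members: dict = field(default_factory=dict)    # process name -> device name
    pending: list = field(default_factory=list)    # queued (sender, receiver, payload)
    delivered: list = field(default_factory=list)  # messages already delivered
    paused: bool = False                           # True while a migration is running

    def send(self, sender, receiver, payload):
        # While the group is stopped, no member can communicate.
        if self.paused:
            raise RuntimeError("group communication is stopped during migration")
        self.pending.append((sender, receiver, payload))

    def drain(self):
        """Finish all pending communications in the order they were queued."""
        while self.pending:
            self.delivered.append(self.pending.pop(0))


def migrate_blocking(group, process, target_device):
    """Assumed model of a conventional, blocking migration."""
    group.drain()             # 1. finish all pending communications
    group.paused = True       # 2. stop all group communication
    try:
        # 3. move the process (state transfer omitted) and
        # 4. update the group membership information
        group.members[process] = target_device
    finally:
        group.paused = False  # 5. resume group communication
```

In this model, every call to `send` made between steps 2 and 5 fails, which corresponds to the interruption in group communication that the paragraph above identifies as the cost of migration.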
Modern computing devices are often mobile devices, which communicate and receive service through local service stations. When a mobile device moves, its local service station may change. Changing the service station requires process migration and an update of the group membership information. Mobile computing is therefore currently limited in applications where multiple processes on multiple computers take part in group communication, because each process migration interrupts that communication.
Mobility is an integral part of modern information services. For instance, when a user participates in a conversation using a mobile phone on a train, the user will probably pass several mobile service stations, each of which provides the mobile phone with a communication signal. Ideally, the transfer of the communication signal from each service station to the next is transparent to the user. Computing works similarly: a piece of software can be downloaded automatically when needed, or can move from one computing device or service station to another in order to provide continued service. Another example of group communication that requires mobility arises when multiple users participate in an online game concurrently and interactively through multiple computers. If one of the users moves from one service station to another, all communications with that user may be lost. The user cannot continue to participate in the game without a service interruption unless all of the user's processes are transparently migrated from the first service station to the second and the group membership information is updated.