This invention pertains to methods and apparatus for collecting and restoring the data contents of a process in the memory space of a computer, more particularly for data collection and restoration between computers which may or may not have the same computing platform.
Creation of a national communication infrastructure, the so-called "information superhighway," marked the beginning of a new era in computer communication. Network computing has emerged as an essential component of the infrastructure; however, current network environments do not meet rapidly increasing computational demands. Efficient process migration, i.e., transfer of a process between different computers, is one of the critical issues of a national effort to solve this problem.
In distributed network computing, adaptability of process assignment is desirable for high throughput and resource utilization, especially for a long-running application. A "process" is a piece of a program in execution. It represents a job assigned to a computer during the execution of an application. An application can comprise one or more processes running on single or multiple computers.
The software and hardware on a computer create a distinct computing platform. In the development of a user application, users provide an executable file via a compiler for that particular computer. The executable file contains a sequence of machine instructions in the form of platform-specific binary code. One or more processes are created on a computer before an executable file can be executed on that computer, so that the operating system of the computer can load those instructions into the computer's main memory and assign the instructions to the central processing unit, or CPU.
In building an executable file, a user can write a program (or source code) in the form of a high-level computer language such as C, C++, or FORTRAN, and pass it to a compiler for that language. A user program comprises a global data definition area and a description of functions. Each function description comprises parameter variable declarations, local variable declarations, and programming language statements. The compiler translates the program source code into platform-specific binary code, and stores it in an executable file. During compilation, the compiler can also optimize the machine instructions according to specific features of the computing platform. At runtime, the operating system loads the executable file into the computer's memory. The loaded executable file is then ready to be executed by the CPU and is recognized as one or more processes.
Efficient process migration, where the execution of a process is suspended on one computer and then resumed on another computer, is a mechanism to adapt process and resource assignment. "Heterogeneous" process migration occurs when a process is transferred between two machines that differ in hardware or software environments such as CPU, memory, compiler, operating system, or software tools. The process can be transferred via direct network-to-network communication (network migration) as opposed to file migration. Applications of process migration include load distribution, migrating processes from overloaded machines to underloaded machines to exploit otherwise unused computing cycles; fault resilience, migrating processes from machines that may experience partial failure; resource sharing, migrating processes to machines with special hardware or other unique resources such as databases or peripherals required for computations; data access locality, migrating processes towards the source of the data; and mobile computing, migrating processes from a host to a mobile computer.
In terms of resource sharing, processes can be migrated to computers that have resources such as databases or peripherals required for computations. In addition to clustered network computing, mobile computing and ubiquitous computing (two emerging computing disciplines) also demand efficient process migration. The advantage of process migration can be significantly scaled up when the underlying computers are heterogeneous, but the complexity of heterogeneous process migration becomes significantly scaled up as well. While software environments have been developed for homogeneous process migration, currently no solution exists for efficient heterogeneous process migration.
Fundamentally, there are three steps to make source code migratable in a heterogeneous environment:
(1) Identify the subset of language features that is migration-safe, i.e. features that theoretically can be carried across a network;
(2) Implement a methodology to transform migration-safe code into a "migratable" format so that it can be migrated at run-time; and
(3) Develop mechanisms to migrate the "migratable" processes reliably and efficiently.
Three different strategies have been used for process migration: Operating System (OS) support, checkpointing, and mobile agents. The traditional OS support approach for process migration is very difficult to implement, and nearly impossible to extend to a heterogeneous environment, because the operating system approach is based on the run-time image of platform-specific binary code. Checkpointing has been developed primarily for fault tolerance, by transferring and restarting the checkpointed processes on working machines. Checkpointing requires access to file systems and rollbacks to a consistent global state in parallel processes. Checkpointing a process requires the additional step of periodically saving the data contents of the process to a file. Later, when recovery is needed, data from the checkpointed file will be read and restored in a new process to resume execution of the application. Checkpointing, although successful in homogeneous environments, is still difficult in heterogeneous environments because it is too slow to meet the needs of high performance network computing, especially for distributed parallel processing.
The mobile agent approach is an alternative to "true" process migration. Mobile agents are implemented on top of safe languages, such as Java. Interpreted, or safe, languages are more secure and promising for certain applications. The interpreter acts as a virtual machine to create an artificial homogeneous environment. However, these languages are less powerful, slow, and require rewrites of existing software.
In any approach to process migration, there is a need for efficient methods to recognize, collect, and restore data contents of a process. To migrate a process, all data necessary for future execution of the process has to be collected and then restored in the data segment of the new process on another machine.
There are two basic types of data objects that can be contained in the memory space of a process: the storage object and the memory reference object. The storage object (or a memory block) is a piece of memory space that is used to store a value such as a character, an integer, a floating-point number, or a memory address. The memory reference object (or a pointer, an indirect memory reference) is the memory address of a storage object. Accessing the content of a storage object via its pointer is called "dereferencing".
FIG. 1 shows examples of storage objects and reference objects. A storage object comprises a memory address and a memory storage space. In FIG. 1, the storage object at memory address 3010 contains an integer value, 10. Likewise, the storage objects at addresses 2001 and 1120 contain a floating-point value, 9.99999, and a character, 'A', respectively.
A memory reference object is a memory address of a storage object. A memory reference object is also called a pointer. A pointer can be used to access a storage object and its contents. For instance, the pointer value 2001 is a reference to the storage object containing the value 9.99999. Dereferencing the pointer 2001 gives a floating-point value, 9.99999, as a result. At the address 0109, the storage object contains a memory address, 1120, as its value. The pointer 1120 is a reference to the storage object at memory address 1120.
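The distinction between the two kinds of data object can be illustrated with a short C sketch that mirrors the values of FIG. 1. The variable and function names are hypothetical, and the addresses chosen by a real compiler and loader will not match those in the figure.

```c
#include <assert.h>
#include <stddef.h>

/* Storage objects: each occupies memory and holds a value. */
int    int_obj   = 10;        /* an integer value            */
double float_obj = 9.99999;   /* a floating-point value      */
char   char_obj  = 'A';       /* a character value           */

/* A storage object whose value is itself a memory address.
   Its content is a memory reference object (a pointer). */
char *ref_obj = &char_obj;

/* Dereferencing: follow a pointer to the content of the
   storage object that it references. */
char read_through_pointer(const char *p) {
    assert(p != NULL);
    return *p;
}
```

Calling `read_through_pointer(ref_obj)` yields the character stored in `char_obj`, just as dereferencing the pointer 1120 in FIG. 1 yields the character stored at that address.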
Due to differences in hardware and software in a heterogeneous environment, the detailed representation of a data object within a process on different platforms could be different in two aspects:
[1] Data Representation: Each platform has a machine-specific data representation. For example, machines with processors such as System/370 and SPARC use "big-endian" byte ordering, while others with processors such as VAX and Intel 8086 use "little-endian" byte ordering. Floating-point representations also differ on various platforms.
[2] Memory Address: Due to different OS memory management and loading operations, a memory address used in a process on one computer may be meaningless to a process on a different computer. When the executable file is loaded into the computer's main memory, its contents are placed at particular memory locations, depending on the memory management scheme used by the operating system of that computer. Therefore, while a memory address in one process could refer to a particular data object, the same memory address could be undefined or refer to something else when used by a process on another machine.
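The byte-ordering difference can be made concrete with a minimal C sketch. It is an illustration only, with hypothetical function names: a 32-bit value is written to a byte sequence in a fixed (big-endian) order using shifts, so the encoded bytes are the same whichever ordering the host uses in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Report the host's byte ordering by inspecting the first byte
   of a known multi-byte value. */
int host_is_little_endian(void) {
    uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Encode a 32-bit value as four bytes, most significant first.
   Shifts operate on the value, not on its memory layout, so the
   result does not depend on the host's byte ordering. */
void encode_u32_be(uint32_t value, unsigned char out[4]) {
    assert(out != NULL);
    out[0] = (unsigned char)(value >> 24);
    out[1] = (unsigned char)(value >> 16);
    out[2] = (unsigned char)(value >> 8);
    out[3] = (unsigned char)value;
}

/* Decode four big-endian bytes back into a native 32-bit value. */
uint32_t decode_u32_be(const unsigned char in[4]) {
    assert(in != NULL);
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}
```

A sender on a little-endian machine and a receiver on a big-endian machine that both use these two routines exchange the same four bytes for the same integer. Floating-point values need a comparable agreed format, which this sketch does not cover.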
Since the data content of a process can contain complex data structures that are combinations of both storage objects and pointers, a mechanism to collect the data content of a process must recognize different data objects as well as their structures. Further, the data content must be transferred into a specific machine-independent format. To restore the data content to the memory space on a different machine, the restoration mechanism must be able to extract the collected information and reconstruct the data structure into the memory space.
Until now, there has been no satisfactory solution to the problem of data collection and restoration because: 1) processes have to restart under different hardware and software environments, defined as "heterogeneity"; 2) only necessary data should be transferred over the network to reduce migration cost to a tolerable level, defined as "efficiency"; and 3) complex data structures such as pointers and recursive calls have to be analyzed and handled appropriately to support general high-level languages, such as C and Fortran, defined as "complexity."
As a national effort, several prototype next generation distributed environments are under development. See S. Grimshaw, W. A. Wulf, and the Legion team, "The Legion vision of a worldwide virtual computer," Communications ACM, vol. 40, no. 1, pp. 39-45, 1997; R. Stevens, P. Woodward, T. DeFanti, and C. Catlett, "From the I-WAY to the national technology grid," Communications ACM, vol. 40, no. 11, pp. 51-60, 1997. These systems are designed to build a world wide virtual machine environment on top of the huge number of computers available on a network. Network process migration is one of the critical issues to the success of these systems.
Due to their complexity, early works on heterogeneous network process migration concentrated on theoretical foundations. See M. H. Theimer and B. Hayes, "Heterogeneous process migration by recompilation," Proceeding of the 11th IEEE International Conference on Distributed Computing Systems, pp. 18-25, June 1991; and von Bank, Shub, and Sebesta, "A unified model of pointwise equivalence of procedural computations," ACM Transactions on Programming Languages and Systems, vol. 16, November 1994. No prototype design or experimental implementation was provided.
Theimer et al. approached heterogeneous process migration by 1) dynamically constructing a machine-independent program of the state of the computation at migration time, 2) compiling the program on the destination machine, and 3) running the program on the destination machine to recreate the migrated process. The transformation uses "native code," an intermediate language such as that used by compiler front end and back end code generators, to simplify process state specification. The migration overhead costs are increased by the recompilation requirement. The report specifically does not provide a solution for high level languages where, for example, code explosion may occur. The principles of source-level debugging are used to reverse-compile the machine-dependent binary program state to a machine-independent source program description, with the data traced in a "garbage collector" fashion using a source-level debugger procedural interface. The theory proposes modified versions of the procedure on the stack to recreate the stack data and then resume execution of the original procedure.
Von Bank et al. proposed a modified compiler to generate code for all expected machines along with sets of native code, initialized data, and correspondence information embedded in the executable file at program translation time, rather than generated dynamically during migration, to minimize migration time. A procedural computation model defines the points where transformation is practical, by defining equivalence parameters for two well-defined compatible states where the process can be transformed without loss of information, but does not propose how this transformation is accomplished. The compatible well-defined state of computation in the address space of the process is composed of 1) the function call graph, 2) the values of variables in the static data, 3) a dynamic data pool, and 4) the activation data pool, which provide points of equivalence for migration. The report proposes only a theoretical migration model that includes a function call graph in which the vertices are functions and the edges of the graph are function calls, defined as any subroutine. A function call graph vertex contains a flow graph for the function and a local data template to activate the function. The function call graph edge contains a parameter data template and a dynamic data template describing the type of dynamic data to be allocated. The reference does not propose how to collect the data from the memory space.
Casas, et al., "MPVM: A Migration Transparent Version of PVM," Department of Science and Engineering, Oregon Graduate Institute of Science and Technology, February 1995, reports a technique for migrating processes in a parallel distributed environment. However, the method is not heterogeneous, requiring the same machines (hardware) to support the migration. The data, stack and heap segments of the execution code are transferred to the other machine, requiring binary compatibility.
K. Chanchio and X.-H. Sun, "MpPVM: A software system for non-dedicated heterogeneous computing," Proceeding of 1996 International Conference on Parallel Processing, August 1996, and K. Chanchio and X.-H. Sun, "Efficient process migration for parallel processing on non-dedicated network of workstations," Tech. Rep. 96-74, NASA Langley Research Center, ICASE, 1996, describe certain procedures and data structures for transforming a high-level program into a migratable format via a precompiler to avoid compiler modification. A set of migration points is inserted at various locations in the source programs based on migration point analysis. A migration point is a point of execution that allows a process to migrate to a new host. The migration-point concept is based on the checkpointing approach for efficient process migration. Unlike checkpointing, in the migration-point approach the program state does not need to be stored periodically. Upon migration, the migrating process continues execution until the first migration point is met. The process is migrated via direct network-to-network communication. Special variables and macros are inserted at migration points and related locations in the program to control transfer of the program execution state as well as data from a migrating process to a new process on the destination machine. The cooperation of the macros that manage data collection, transmission, and restoration is called the data transfer mechanism. The data stack stores all necessary local and global data at the point where migration occurs. In the stack data transfer (SDT) mechanism, live data of the executing function is collected first, and that of the caller function is collected later. Upon restoration, the SDT restores live data of those functions in reverse order. The SDT mechanism precludes overlapping during data collection and restoration.
"Necessary data analysis" is proposed as a methodology to reduce the size of data that must be transferred during the migration. "Pre-initialization" is also proposed, in which modified source code is sent to all machines anticipated to be destination machines and compiled on those target machines, preferably before migration is needed, to reduce migration overhead. No general and efficient method for data collection was given.
The effort to transmit data among processes in a heterogeneous environment is not new. Well-known software packages such as Sun's XDR have been used to support data transmission among computers with different data representations. See J. Corbin, The Art of Distributed Applications, Springer-Verlag, 1990. The XDR software package consists of XDR data representation, standard definitions of data in machine-independent format, and an XDR library that provides routines to encode data stored in the native format of a machine to the XDR format, and to decode data in the XDR format to the native one. The XDR software package does not provide a mechanism for data collection and restoration in process migration.
Recent works in data collection and restoration mechanisms for process migration have addressed two major directions: the employment of specially modified debugging utilities and the annotation of special operations to the source code. In the first direction, Smith and Hutchinson have investigated the migration features of high-level languages such as C, and have developed a prototype process migration system called TUI. See P. Smith and N. C. Hutchinson, "Heterogeneous Process Migration: The TUI system," Tech. Rep. 96-04, University of British Columbia, Department of Computer Science, February 1996, Revised on March 1997. Smith and Hutchinson identified the migration-unsafe features of the C language and used a compiler to detect and avoid most of the migration-unsafe features. In their design, process migration is controlled by the external agents, migrout and migin, for data collection and restoration, respectively. The TUI system features a compiler-generated state mapping information in a symbol table similar to those typically used in symbolic debuggers. The external agents require special programs to capture and restore the state of the running process. The steps to migrate include 1) compiling the program once for each architecture using a modified compiler; 2) checkpointing the process using migrout to fetch memory and data values; and 3) creating an intermediate form. The data type is known by compiling with the modified compiler. Debugging information is used to scan and locate data to copy into TUI's address space using a "garbage collector" technique. A value table is maintained to collect the data only once, with the memory scanned in linear fashion. Each entry in the value table is assigned a unique number with pointers recorded in the table until data values are set on the destination machine.
Subsequent prototypes of TUI imposed increasing/decreasing order to scan addresses, a restriction that may cause problems for new architectures. Smith and Hutchinson's work has several design aspects consistent with von Bank's foundation for viewing data elements in program memory space from the perspective of available debugging technology. The compiler must be modified to provide debugging information and to insert preemption points and call points into the executable code for capturing and restoring process states. The need to modify the front-end and back-end of the compiler may limit portability to various computer platforms, since the compiler must be modified for each architecture in the environment. Also, a modified compiler may not be able to fully exploit the machine-specific optimization of a native compiler.
The second direction uses the "program annotation" technique to support process migration. The Process Introspection (PI) approach proposed by Ferrari, Chapin, and Grimshaw uses this technique. See J. Ferrari, S. J. Chapin, and A. S. Grimshaw, "Process Introspection: A Heterogeneous Checkpoint/Restart Mechanism Based on Automatic Code Modification," Tech. Rep. CS-97-05, University of Virginia, Department of Computer Science, March 1997. The PI approach has some similarities to the MpPVM migration point analysis approach. Process introspection implements a prototype of library routines (PIL library) to support the design. The process state is captured in a data-only format that must be used in conjunction with a separate executable file. Points representing checkpointing locations are inserted in the source code before compilation. Experiments were conducted on a number of array-based numerical kernels. To collect data, the PI approach uses native subroutines to save stack data to checkpoint the stack. The active subroutine saves its own data (which only it can access), then returns to its caller, which in turn saves its own stack, and so on, until the stack capture is complete, and reverses the procedure to restore the data. An added subroutine outputs dynamically allocated stack elements to a table in the PIL library. The "Type" table in the PIL library contains descriptions of basic data types stored in the memory block, and maps type identifiers to logical type descriptions as a linear vector of some number of elements of a type described by an entry in the type table. However, the PI approach does not define the logical type descriptions or how they are used to collect and restore data. The "Data Format Conversion" module in the PIL masks differences in byte ordering and floating point representation and contains routines to translate basic data types to and from available formats.
The "Pointer Analysis" module generates logical descriptions of memory locations with a unique identification number and an offset into the memory block. The methodology does not specify how the number is assigned or the ordering used to assign the numbers, how the offset is defined in a heterogeneous environment, or how to handle a situation in which multiple pointers reference a single object.
U.S. Pat. No. 5,666,553 discloses a method for translation of dissimilarly-formatted data between disparate computer systems and provides for dynamic reconciliation of conflicts in the data based on content and by the user. The disclosure is directed toward creating a common data format between desktop and handheld computer database applications to identify and resolve conflicts between the applications and update them.
U.S. Pat. No. 5,126,932 discloses a method to execute a program consisting of data and multiple successive operations on the data in a heterogeneous multiple computer system with autonomous nodes that each have a processor and associated memory. A control arrangement initiates execution of the program on the first autonomous node while a coupling arrangement transfers execution of the program to a second autonomous node in the multiple computer system in response to successive operations in the program.
We have discovered a technique for collecting memory contents of a process on one computer into a machine-independent information stream, and for restoring the data content from the information stream to the memory space of a new process on a different computer. The mechanisms and associated algorithms of data collection and restoration enable sophisticated data structures such as pointers to be migrated appropriately in a heterogeneous environment. These mechanisms analyze the pre-stored or current program state for heterogeneous process migration and can be used in both checkpointing and migration-point process migration, as well as in sequential and parallel distributed computing. These mechanisms may be used in any general solution to network process migration to carry out the following tasks automatically and effectively:
(1) Recognize the complex data structures of a migrating process for heterogeneous process migration;
(2) Encode the data structures into a machine-independent format;
(3) Transmit the encoded information stream to a new process on the destination machine; and
(4) Decode the transmitted information stream and rebuild the data structures in the memory space of the new process on the destination machine.
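The four tasks above can be illustrated with a toy C sketch. It is not the inventors' algorithm, only one simple way to show the idea under stated assumptions: pointers are replaced by logical identifiers during collection, so that the stream carries no machine addresses, and a block referenced by several pointers is collected once. The `Tree`, `Record`, and `Stream` types and the function names are hypothetical, and the records here still hold native integers; a real stream would additionally encode each field in a machine-independent byte format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_NODES 16
#define NIL (-1)                 /* logical id standing for a NULL pointer */

typedef struct Tree {
    int value;
    struct Tree *left;
    struct Tree *right;
} Tree;

/* One collected block: a value plus logical ids in place of addresses. */
typedef struct {
    int value;
    int left_id;
    int right_id;
} Record;

typedef struct {
    const Tree *seen[MAX_NODES]; /* seen[id] is the source address of block id */
    Record out[MAX_NODES];       /* the address-free information stream */
    int count;
} Stream;

/* Return the logical id of block t, collecting it on first visit only. */
int collect(Stream *s, const Tree *t) {
    if (t == NULL) return NIL;
    for (int i = 0; i < s->count; i++)
        if (s->seen[i] == t) return i;   /* shared block: reuse its id */
    assert(s->count < MAX_NODES);
    int id = s->count++;
    s->seen[id] = t;                     /* register before recursing */
    s->out[id].value = t->value;
    s->out[id].left_id = collect(s, t->left);
    s->out[id].right_id = collect(s, t->right);
    return id;
}

/* Rebuild the blocks in new memory and re-link them by logical id.
   Block 0 is the root. The caller releases the result with free(). */
Tree *restore(const Record *rec, int count) {
    Tree *nodes = malloc((size_t)count * sizeof *nodes);
    if (nodes == NULL) return NULL;
    for (int i = 0; i < count; i++) {
        nodes[i].value = rec[i].value;
        nodes[i].left  = rec[i].left_id  == NIL ? NULL : &nodes[rec[i].left_id];
        nodes[i].right = rec[i].right_id == NIL ? NULL : &nodes[rec[i].right_id];
    }
    return nodes;
}
```

After `restore`, the rebuilt blocks sit at whatever addresses the destination allocator chose, yet two pointers that referenced the same block on the source still reference a single block on the destination.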
While the prototype algorithms and software to date have been written in C code, the mechanisms are general and may be used to support applications written in any stack-based programming languages with pointers and dynamic data structures, for example, C++, Pascal, Fortran, and Ada. A prototype run-time library has been developed to support process migration of migration-safe C code in a heterogeneous environment.
The run-time library developed by the inventors has been successfully tested under the buffer data transfer ("BDT") mechanism for heterogeneous network process migration. The BDT mechanism implements data collection and restoration mechanisms differently from stack data transfer ("SDT") by allowing data collection and restoration operations to be overlapped to increase efficiency. The BDT mechanism manages data collection and restoration through the following steps:
(1) When migration is initiated, the BDT sends information about the execution state of the migrating process to a new process on the destination machine. The new process creates a sequence of function calls identical to those in the migrating process to jump to the point where migration initiated.
(2) After sending the information, the migrating process collects and saves the necessary data of the innermost function in the calling sequence to a buffer and returns to its caller function. The same operation continues until the main function is reached. Before terminating the migrating process, the BDT sends the stream of information stored in the buffer to the new process.
(3) At the destination machine, the new process reads and restores the live data from the information stream. The BDT mechanism restores the live data of the function called until the end of its execution. After the function returns to its caller, the BDT again reads the content of the information stream to restore live data of the caller function, and continues to control the order of the data restoration through this process until the main function is reached.
The BDT mechanism improves performance by allowing simultaneous saving and restoring operations: the new process restores its memory space while the migrating process saves the next portion of data from its memory space. The algorithms and the run-time library of the present invention are, however, independent of the Buffer Data Transfer mechanism. They can be used under any process migration environment which provides an appropriate interface.
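The ordering described in steps (2) and (3) can be sketched as a first-in, first-out buffer. This is an illustration under assumptions, not the BDT implementation: `Frame`, `Buffer`, and the two functions are hypothetical names, and a single integer stands in for the live data of a function.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FRAMES 8

typedef struct {
    int function_id;  /* hypothetical id of a function in the calling sequence */
    int live_value;   /* stand-in for that function's live data */
} Frame;

typedef struct {
    Frame frames[MAX_FRAMES];
    size_t written;   /* frames saved so far by the migrating process */
    size_t read;      /* frames restored so far by the new process */
} Buffer;

/* Source side: called as each function saves its live data and
   returns to its caller, innermost function first. Returns 0 if full. */
int buffer_save(Buffer *b, Frame f) {
    assert(b != NULL);
    if (b->written == MAX_FRAMES) return 0;
    b->frames[b->written++] = f;
    return 1;
}

/* Destination side: yields frames in the order they were saved,
   so the innermost function is restored first. Returns 0 if empty. */
int buffer_restore(Buffer *b, Frame *out) {
    assert(b != NULL && out != NULL);
    if (b->read == b->written) return 0;
    *out = b->frames[b->read++];
    return 1;
}
```

Because frames are consumed in the same order in which they are produced, a frame can in principle be restored as soon as it has been saved, without waiting for the remaining frames. The sketch keeps both cursors in one structure for brevity; in a real migration the two sides run in different processes on different machines.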
We have used an annotation technique in a novel data collection and restoration technique that views data elements of a program from the viewpoint of abstracted programming. Codes annotated to the source program systematically track program data structures in the form of a graph. Mapping graph notations to represent data structures of the process gives a high-level representation of the relationship among data elements. The graph model is very useful because it can be analyzed and manipulated in many ways. A rich literature of graph theory and algorithms is available. The graph representation makes the novel approach error-free and efficient. The data collection and restoration methods may be used for both migration-point and checkpoint-based heterogeneous process migration. The novel technique is designed specifically for heterogeneous process migration, and is considerably faster and more effective than the debugger-based approach.
A preferred embodiment comprises four modules: a Type Information (TI) table, saving and restoring functions, the Memory Space Representation Look Up Table (MSRLT) data structures and programming interfaces, and the data collection and restoration algorithms and their programming interfaces. These modules, illustrated schematically in FIG. 2, are attached to the source process and destination process for data transmission. On the source computer, these modules work together to collect data from the memory space of the source process, and to put data into a machine-independent information stream output. On the destination computer, software modules attached to the destination process extract data from the machine-independent information stream, and place data in appropriate locations in the memory space of the destination process.
A key feature of the invention is a model that represents the process's memory space. The Memory Space Representation (MSR) model gives a high-level, platform-independent viewpoint of data contents in the memory space of a process.
The data contents of a process are a snapshot of the process memory space. In the MSR model, the memory snapshot may be viewed logically as a graph in which each node of the graph represents a memory block allocated for storing data during program execution (or a storage object), and each edge of the graph represents a pointer that references an allocated memory block (an MSR node) to any other memory blocks, including itself.
For example, FIG. 3 shows the contents of a memory snapshot of a process. The memory blocks at addresses 0007 and 0020 contain integer values 10 and 20, respectively. The other memory blocks in FIG. 3 are combinations of an integer value and two pointers. For instance, the memory block at address 0101 contains an integer 30, and pointers 0207 and 2026. The first pointer, 0207, refers to another memory block containing the integer value 40, and pointers 0899 and 2026. The memory blocks at addresses 0899 and 2026 are the same type of memory blocks as those at addresses 0101 and 0207, but their pointer components are NULL.
In the MSR model, the memory snapshot in FIG. 3 can be represented by the MSR graph in FIG. 4. In the MSR graph, a node is the representation of a memory block in the memory snapshot of a process. For example, nodes V1 and V3 represent the memory blocks at addresses 0007 and 0101, respectively. At node V3, the pointer components create two MSR edges, E1 and E2.
In an MSR graph, an edge is a direct link between two nodes. The source of the link is a component of a memory block that stores a pointer, and the destination of the link is the component of a memory block to which the pointer refers. An edge can also be defined as the pair of addresses of its source and destination components.
The head address of a memory block is the starting address of the memory block. The head address can be used to designate the memory block since it is a unique property. Further, a memory block can comprise multiple objects. Each object is considered a component of the memory block. A component address is the memory address at the starting address of a component. To say that a memory address x is the address of an MSR node y means that the address x is one of the component addresses of the memory block represented by node y.
Further, the length of a memory block is the capacity (in bytes) of the contiguous memory space allocated for storing data. Suppose that it takes 4 bytes to store an integer and 4 bytes to store a pointer. The lengths of nodes V1 and V3 are then 4 and 12 bytes, respectively. The number of elements of an MSR node is the number of data objects contained in the memory block. For example, the number of elements in each of nodes V1 and V3 is one, since each contains a single data object of type integer and TREE, respectively.
For example, if the size of an integer value and the size of a pointer are each 4 bytes, the component addresses of the node V3 are 0101, 0105, and 0109. The component address 0101 is the memory address of the first component of V3, which stores the integer value 30. The second component address 0105 (0101+4) is the memory address of the second component of V3, which stores a pointer to the memory block at address 0207. Finally, the address 0109 (0105+4) is the component address of the last component of V3, which contains a pointer to address 2026.
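The arithmetic above can be reproduced with a short Python sketch. The names `SIZES`, `component_addresses`, and `block_length` are hypothetical, and the 4-byte sizes are the assumed values from the example rather than properties of any particular platform.

```python
# Assumed component sizes in bytes, as in the example above.
SIZES = {"int": 4, "ptr": 4}

# Layout of a TREE block: one integer followed by two pointers.
TREE = ["int", "ptr", "ptr"]

def component_addresses(head, kinds):
    """Return the starting address of each component of a memory block."""
    addresses = []
    address = head
    for kind in kinds:
        addresses.append(address)
        address += SIZES[kind]
    return addresses

def block_length(kinds):
    """Return the length of a memory block in bytes."""
    return sum(SIZES[kind] for kind in kinds)

v3_components = component_addresses(101, TREE)   # [101, 105, 109]
v3_length = block_length(TREE)                   # 12
v1_length = block_length(["int"])                # 4
```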
Table 1 lists information defining the MSR edges in FIG. 3. For example, edge E1 can be defined as (0105, 0207), where address 0105 is the address of the second component of the source node V3, and address 0207 is the address of the first component of the destination node V4.
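As a sketch of how such address pairs could be derived, the following Python fragment walks a model of the snapshot and emits one (source component address, destination address) pair per non-NULL pointer. The data layout and the function name `msr_edges` are assumptions for illustration. The integer values at addresses 0899 and 2026 are placeholders, and the output is computed from the snapshot described above rather than copied from Table 1.

```python
NULL = None
SIZES = {"int": 4, "ptr": 4}   # assumed sizes, as in the example above

# Hypothetical model of the snapshot: head address -> list of components.
snapshot = {
    7:    [("int", 10)],
    20:   [("int", 20)],
    101:  [("int", 30), ("ptr", 207), ("ptr", 2026)],
    207:  [("int", 40), ("ptr", 899), ("ptr", 2026)],
    899:  [("int", 60), ("ptr", NULL), ("ptr", NULL)],   # 60 is a placeholder
    2026: [("int", 50), ("ptr", NULL), ("ptr", NULL)],   # 50 is a placeholder
}

def msr_edges(snapshot):
    """List each non-NULL pointer as (source component address, destination)."""
    edges = []
    for head in sorted(snapshot):
        address = head
        for kind, value in snapshot[head]:
            if kind == "ptr" and value is not NULL:
                edges.append((address, value))
            address += SIZES[kind]
    return edges

edges = msr_edges(snapshot)
# edges[0] is (105, 207), which matches the definition of E1 given above.
```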
The Type Information (TI) table is created to track properties of each data type used in the process. The data type is the basic property that describes the semantic meaning of the object stored in a memory block. The type of a memory block is determined by the type of its contents (or data). A memory block contains an object or an array of objects of a particular type. The data type could be a "primitive" data type such as integer, character, or floating-point. It could also be a complex data type such as an array, structure, record, or pointer, or a combination of these different types.
The array type is an abstraction of data in which a memory block contains multiple data items of the same type. The structure type describes the data when a memory block contains data of different types. The pointer type is the type of a memory block that stores indirect memory references to memory blocks, perhaps including itself. In most high-level programming languages, such as C, a name can be given to a data type. This name can be used to reference the type of a user-defined object.
For example, suppose that TREE is defined by a user as a type of a memory block containing an integer and two pointers to memory blocks of type TREE. In FIG. 4, the type of nodes V1 and V2 is integer; while those of nodes V3, V4, V5, and V6 are TREE.
The TI table is used to store information about every data type, as well as basic functions for data collection and restoration. Other parts of the TI table are the type-specific saving and restoring functions. The saving and restoring functions collect the contents of a memory block node in the MSR graph, and restore the collected information back to a memory block node, respectively. To collect the content of a graph node, which is a memory block, the basic saving function associated with the type of the memory block is invoked. The saving function then generates the machine-independent information of a memory block; the restoring function takes the machine-independent representation of the information and restores it in a machine-specific form to the appropriate memory location. To collect a graph edge, which is a pointer, the algorithm translates the machine-specific pointer address to machine-independent pointer information.
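A minimal Python sketch of a table of type-specific saving and restoring functions follows. The machine-independent format is not specified above; this sketch assumes big-endian, fixed-width encodings produced with the standard `struct` module. The names `TI_TABLE`, `save`, and `restore` are hypothetical.

```python
import struct

# Assumed machine-independent encodings: big-endian, fixed width.
def save_int(value):
    return struct.pack(">i", value)       # 4-byte signed integer

def restore_int(data):
    return struct.unpack(">i", data)[0]

def save_double(value):
    return struct.pack(">d", value)       # 8-byte IEEE 754 double

def restore_double(data):
    return struct.unpack(">d", data)[0]

# Hypothetical TI table: one entry per data type, holding its functions.
TI_TABLE = {
    "int":    {"save": save_int,    "restore": restore_int},
    "double": {"save": save_double, "restore": restore_double},
}

def save(type_name, value):
    """Invoke the saving function associated with a data type."""
    return TI_TABLE[type_name]["save"](value)

def restore(type_name, data):
    """Invoke the restoring function associated with a data type."""
    return TI_TABLE[type_name]["restore"](data)

encoded = save("int", 30)          # same bytes on any host byte order
decoded = restore("int", encoded)  # back to a native Python integer
```

Dispatching through the table keeps the collection and restoration algorithms independent of any particular type; supporting a further type in this sketch would only require adding a table entry.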
The MSR Look up Table (MSRLT) data structures are constructed to provide machine-independent locations of memory blocks and a mechanism to access them. The machine-independent locations of memory blocks are the indices of the MSRLT data structure to access those memory blocks. Since there could be a large number of memory blocks allocated in the process memory space, a method is provided to select the memory blocks that need to be registered into the MSRLT data structures. This selection keeps the size of the MSRLT data structures small, and reduces the searching time when the MSRLT is used.
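The role of the table as an index can be sketched in Python as follows. This is an illustrative assumption rather than the actual MSRLT layout: each registered block receives a table index, and a pointer target is expressed as an (index, component ordinal) pair so that it does not depend on machine-specific addresses or sizes. The class name, method names, and the receiver-side addresses 5000 and 6000 are hypothetical.

```python
class MSRLT:
    """Toy lookup table mapping addresses to (index, component ordinal)."""

    def __init__(self, sizes):
        self.sizes = sizes      # machine-specific component sizes in bytes
        self.entries = []       # index -> (head address, component kinds)

    def register(self, head, kinds):
        """Register a memory block and return its table index."""
        self.entries.append((head, list(kinds)))
        return len(self.entries) - 1

    def _component_addresses(self, index):
        head, kinds = self.entries[index]
        addresses, address = [], head
        for kind in kinds:
            addresses.append(address)
            address += self.sizes[kind]
        return addresses

    def to_logical(self, address):
        """Translate a machine-specific address to (index, ordinal)."""
        for index in range(len(self.entries)):
            addresses = self._component_addresses(index)
            if address in addresses:
                return (index, addresses.index(address))
        raise KeyError(address)

    def to_address(self, index, ordinal):
        """Translate (index, ordinal) back to a machine-specific address."""
        return self._component_addresses(index)[ordinal]

TREE = ["int", "ptr", "ptr"]

# Sender: 4-byte pointers; blocks at the addresses used in the example.
sender = MSRLT({"int": 4, "ptr": 4})
sender.register(101, TREE)
sender.register(207, TREE)

# Receiver: 8-byte pointers and different (hypothetical) head addresses.
receiver = MSRLT({"int": 4, "ptr": 8})
receiver.register(5000, TREE)
receiver.register(6000, TREE)

logical = sender.to_logical(105)            # second component of block 0
remote = receiver.to_address(*logical)      # same component on the receiver
```

The linear search in `to_logical` is kept for brevity; it also suggests why registering fewer blocks reduces search time.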
According to yet another feature of the invention, algorithms to collect and restore the live data content of the process are provided. Since the data content of the process may be represented in the form of an MSR graph, the data collection algorithm may traverse and collect graph components (nodes and edges) in a depth-first manner, that is, it travels from the "leaves" back up to the "root" of the graph.
When a depth-first search algorithm is used to collect graph components, it is assumed that an initial set of nodes to be collected is known, based on live variable analysis or other criteria. For example, in FIG. 5, the nodes V3 and V4 are the initial set. This set of nodes should be known by both the sender and receiver processes before the data collection and restoration mechanisms begin. Since the receiver process does not initially know anything about the MSR nodes of the sender process, the information about the MSR nodes of the sender process must be received by the receiver process before data restoration begins.
Assuming that the receiver process knows the information of the MSR graph and the initial set of nodes V3 and V4, FIG. 5 shows an example of the output of data collection on nodes V3 and V4. In collecting V3, recall that V3 has a structure type which is a combination of an integer value and two pointers. First the integer value of V3, which is 30, is collected and saved into the information stream in a machine-independent format. V3 is also "marked" as having been "visited" by the data collection algorithms. Then when the algorithm encounters the first pointer of V3, the information of the edge E1 is collected. The depth-first algorithm follows the pointer E1 to collect the content of V4, and marks V4 as having been visited. The integer part of V4 is saved to the machine-independent stream. Then its first pointer, E4, is collected. By following E4, the data collection algorithm next visits the node V6. The information of node V6, including an integer value and two pointers, is saved and marked as visited.
Since there is no other link from V6, the algorithm backtracks to V4, and saves the second pointer of V4, which is E3. By following the link E3, V5 is visited and saved. Since there are no pointers contained in V5, the data collection algorithm backtracks to V4. Since there is no more information to be saved on V4, the algorithm backtracks to V3.
At V3, the second pointer, E2, is saved. By following E2, the node V5 would be saved. But V5 has already been visited. Thus, its data is not collected again. The algorithm instead records information saying that V5, which is the destination node of E2, has been saved (or marked). The data collection algorithm backtracks to the node V3. Since there is nothing left to be saved at V3, the data collection operation on V3 is finished. In collecting data of node V4, since its data was already collected in the previous data collection operation, node V4 will not be saved again. The algorithm instead saves information (V4(MARKED)), indicating that V4 was already saved to the information stream.
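The walk described above can be sketched in Python as a recursive depth-first collection. The token format of the stream, the function names, and the use of node labels in place of MSRLT indices are assumptions made for readability. The mapping of labels V5 and V6 to addresses 2026 and 0899 is inferred from the walk above, and the integer values 50 and 60 are placeholders.

```python
NULL = None

# Hypothetical model of the snapshot: head address -> list of components.
snapshot = {
    101:  [("int", 30), ("ptr", 207), ("ptr", 2026)],
    207:  [("int", 40), ("ptr", 899), ("ptr", 2026)],
    899:  [("int", 60), ("ptr", NULL), ("ptr", NULL)],   # 60 is a placeholder
    2026: [("int", 50), ("ptr", NULL), ("ptr", NULL)],   # 50 is a placeholder
}

# Machine-independent labels standing in for MSRLT indices.
LABELS = {101: "V3", 207: "V4", 2026: "V5", 899: "V6"}

def collect(head, visited, stream):
    """Append one node, and everything reachable from it, to the stream."""
    if head in visited:
        stream.append(("MARKED", LABELS[head]))   # already saved; do not repeat
        return
    visited.add(head)
    stream.append(("NODE", LABELS[head]))
    for kind, value in snapshot[head]:
        if kind == "int":
            stream.append(("INT", value))
        elif value is NULL:
            stream.append(("NULL",))
        else:
            stream.append(("EDGE", LABELS[value]))
            collect(value, visited, stream)        # follow the pointer

def collect_all(roots):
    """Collect every node in the initial set, sharing one visited set."""
    visited, stream = set(), []
    for head in roots:
        collect(head, visited, stream)
    return stream

stream = collect_all([101, 207])   # initial set: V3, then V4
```

With these inputs the stream records V5 in full only once; the second reference from V3 and the separate request for V4 each produce a MARKED token instead of a second copy.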
The information to reconstruct the MSRLT data structure and the collected machine-independent information of the memory space contents are placed into the output information stream and sent to another process on the destination computer. During data transmission between two processes on different machines, the information about the structure of the MSRLT data structures is sent to the destination process first, before any other machine-independent information. An identical MSRLT data structure is then created from the transmitted information in the memory space of the destination process to provide an index for restoring the data.
At the receiving end, the destination process reads the transmitted data and reconstructs its MSRLT data structure. Then the data restoration algorithm is invoked to extract data from the information stream, and to restore the data to the appropriate memory location in the destination process's memory space. The data will also be transformed from the machine-independent format to the machine-specific one according to their data types.
For example, in the receiver process, the data restoration operation restores the information of V3 and V4 to the receiver's memory space. The data of V3 is extracted from the information stream into a memory space allocated for V3. Then the information of E1 and V4 is extracted from the information stream. The information of V4 is placed in an allocated memory space. E1 is also recreated as a pointer from the second component of V3 to the first component of V4.
Next the restoration algorithm extracts data of E4 and V6 from the information stream. The data of the node V6 is put in an allocated memory space. The edge E4, which is a pointer from the second component of V4 to the first component of V6, is also reestablished. Since V6 does not have any pointers, the algorithm backtracks to reconstruct the second pointer of V4, which is E3.
Then the algorithm extracts E3 and V5 from the transmitted information stream. The data content of V5 is placed in an allocated memory space. The pointer represented by the edge E3 is reestablished from V4 to V5. Since the pointer contents of V5 are NULL, the restoration algorithm backtracks to restore the rest of the content of the node V4. But since all components of V4 have been restored, the algorithm backtracks to restore the second component of node V3.
At this point, the data restoration algorithm extracts information of E2, and the marking flag indicating that V5 has already been visited (V5(MARKED)) from the information stream. By reading the marking flag, the algorithm knows that data content of V5 is already restored. So the algorithm finds the node V5 in the memory space of the receiver process and reconstructs the edge E2, the pointer from the third component of V3 to the first component of V5. Now the data content of the node V3 is completely restored.
Next the algorithm restores V4. Since the information of V4, V4(MARKED), in the transmitted information stream indicates that V4 is already restored, the algorithm finds the node V4 already in the memory space of the receiver process.
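The restoration walk above can be sketched in Python as the mirror image of a depth-first collection. The token format of the stream is an assumption of this sketch: NODE starts a newly transmitted node, INT carries its integer, EDGE or NULL describes each pointer, and MARKED refers to a node already restored. The sketch also assumes every transmitted node is of type TREE (one integer, two pointers), which in practice would be determined from the type information. The integer values 50 and 60 are placeholders.

```python
# Hypothetical information stream for the initial set V3, V4.
stream = [
    ("NODE", "V3"), ("INT", 30),
    ("EDGE", "V4"), ("NODE", "V4"), ("INT", 40),
    ("EDGE", "V6"), ("NODE", "V6"), ("INT", 60), ("NULL",), ("NULL",),
    ("EDGE", "V5"), ("NODE", "V5"), ("INT", 50), ("NULL",), ("NULL",),
    ("EDGE", "V5"), ("MARKED", "V5"),
    ("MARKED", "V4"),
]

class Block:
    """Stand-in for a memory block allocated in the receiver process."""
    def __init__(self, label):
        self.label = label
        self.value = None
        self.pointers = []

def restore(tokens, restored):
    """Consume one node record from the stream and return its block."""
    tag, label = next(tokens)
    if tag == "MARKED":
        return restored[label]          # node was restored earlier; reuse it
    block = Block(label)                # tag is "NODE": allocate a new block
    restored[label] = block
    _, block.value = next(tokens)       # the INT token
    for _ in range(2):                  # TREE has two pointer components
        token = next(tokens)
        if token[0] == "NULL":
            block.pointers.append(None)
        else:                           # "EDGE": rebuild the pointer target
            block.pointers.append(restore(tokens, restored))
    return block

def restore_all(stream, root_count):
    """Restore the initial set of nodes in the order they were collected."""
    tokens, restored = iter(stream), {}
    return [restore(tokens, restored) for _ in range(root_count)]

v3, v4 = restore_all(stream, 2)
```

After restoration, the second pointer of `v3` and the second pointer of `v4` refer to the same `Block` object, so the sharing of V5 in the original graph is preserved rather than duplicated.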
The invention has been successfully demonstrated with three C programs containing different data structures and execution behaviors including numerical-intensive and pointer-intensive behaviors with recursion.