The present invention relates generally to a computer system for generating data and, more particularly, to regenerating data on-demand.
Computer systems are often used to generate vast amounts of data. Computer programs often input data and generate output data corresponding to that input data. In complex computer systems, many computer programs may be used to generate data based on data generated from other computer programs. These computer programs for generating the data are referred to as “tools” or “services.” A description of such complex computer systems will help illustrate the computational inefficiencies that are often encountered.
One such complex computer system may be a management information system (“MIS”) for a large organization. The MIS may collect raw data that is generated at various locations throughout the organization. The MIS may have a variety of report generating tools that input subsets of the raw data and generate reports. A report itself may be stored as a data set and used as input into another report generating tool. Thus, the reports and the raw data combine to form a hierarchy of data sets. The various reports may be accessible to managers at different levels within the organization. For example, a low-level manager may need access to a detailed report relating to a specific location within the organization, whereas a high-level manager may need a high-level report that summarizes the detailed reports of many locations. When a manager requests that the MIS generate a report, it may be important that the report is up-to-date. However, it may be very computationally expensive to regenerate all intermediate reports that are used to generate the requested report. It would be desirable to have a technique that would ensure that a requested report is up-to-date, but that would avoid the high computational expense of regenerating all the intermediate reports.
Another such complex computer system is a development environment for computer programs. The development environment allows programmers to write, compile, debug, and maintain computer programs. The development environment may use a word processor to generate the source code for the computer program, a parser to generate an intermediate representation of the computer program from the source code, a translator to generate object code from the intermediate representation, an optimizer to generate optimized object code from the object code, and a linker to link optimized object code from different functions into executable code. Large computer programs, such as operating systems, can have thousands of functions which need to be compiled and linked into executable code. The process of compiling and linking such large computer programs can be very computationally intensive. As a result, the compiling and linking of a large computer program can take a very long time and have a significant negative impact on the development of the computer program. For example, if new executable code is needed because of changes to the source code of the computer program, the source code for each of the functions may need to be parsed, translated, optimized, and linked to generate executable code that is up-to-date. Such generation of executable code may take many hours of computer time, which can significantly slow the development of the computer program.
Some tools for the development system may be able to check the time when an input (e.g., source code) for the tool (e.g., parser) was last written. If none of the inputs used to generate an output (e.g., intermediate representation) has been written since the output was last written, then the tool does not need to regenerate the output because it is already up-to-date. If, however, one of the inputs was written after the output was last written, then the output may be out-of-date and the tool needs to regenerate the output.
Although such tools help to reduce the time needed to generate the executable code, it would be desirable to further reduce that time.
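The time-based check described above can be sketched as follows. This is a minimal illustration in Python, not code from any particular development tool; the names `output_is_current` and `needs_regeneration` are hypothetical, and the sketch assumes that file modification times stand in for the time at which an input or output was last written.

```python
import os


def output_is_current(output_written, inputs_written):
    """Return True if no input was written after the output was last written.

    Both arguments are timestamps (seconds); the names are illustrative only.
    """
    return all(t <= output_written for t in inputs_written)


def needs_regeneration(output_path, input_paths):
    """Decide from file modification times whether a tool must rerun.

    Assumption: the modification time of a file is the time it was last written.
    """
    if not os.path.exists(output_path):
        # No output yet, so the tool has to generate it.
        return True
    output_written = os.path.getmtime(output_path)
    inputs_written = [os.path.getmtime(p) for p in input_paths]
    return not output_is_current(output_written, inputs_written)
```

Note that such a check looks only at write times: an input that is rewritten with identical content still causes the output to be treated as possibly out-of-date.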
Embodiments of the present invention provide a replay method and system for monitoring the generating of a data set from input data sets and, when the data set is subsequently accessed, automatically regenerating the data set if the data set is out-of-date. The replay system only regenerates those input data sets that are determined to be out-of-date and only regenerates the output data set if it is determined to be out-of-date. A data set is determined to be out-of-date only when an input data set has actually changed since the data set was last generated.
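The on-demand behavior summarized above might be illustrated with the following sketch. It is written in Python under stated assumptions and is not the implementation described in this specification: the `DataSet` class, its `access` and `write` methods, and the use of a content digest to decide whether an input "has actually changed" are all hypothetical choices made for the example.

```python
import hashlib


class DataSet:
    """Hypothetical data set that remembers how it was generated."""

    def __init__(self, name, inputs=(), generate=None, value=None):
        self.name = name
        self.inputs = list(inputs)   # input data sets
        self.generate = generate     # callable over input values; None for raw data
        self.value = value
        self.seen = None             # input digests recorded at last generation
        self.runs = 0                # number of times generate() has run

    def digest(self):
        # Assumption: a hash of the value's repr identifies its content.
        return hashlib.sha256(repr(self.value).encode()).hexdigest()

    def write(self, value):
        """Overwrite a raw data set."""
        self.value = value

    def access(self):
        """Return an up-to-date value, regenerating only when needed."""
        if self.generate is None:
            return self.value
        # Bring every input up to date first.
        for ds in self.inputs:
            ds.access()
        current = [ds.digest() for ds in self.inputs]
        if current != self.seen:
            # At least one input actually changed since the last generation.
            self.value = self.generate(*[ds.value for ds in self.inputs])
            self.seen = current
            self.runs += 1
        return self.value


raw = DataSet("raw", value=[3, 1, 2])
ordered = DataSet("ordered", [raw], generate=sorted)
total = DataSet("total", [ordered], generate=sum)

total.access()          # generates "ordered" and then "total"
raw.write([2, 3, 1])    # raw data rewritten in a different order
total.access()          # "ordered" reruns; its content is unchanged, so "total" does not
```

In this sketch, rewriting the raw data causes the intermediate data set to be regenerated, but because the regenerated content is identical, the data set that depends on it is not regenerated.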
FIGS. 1A-1C illustrate the replaying of a session.
FIG. 2 illustrates a computer system on which the replay system may be executed.
FIG. 3 is a flow diagram illustrating an example implementation of a service.
FIG. 4 is a flow diagram illustrating an example implementation of the needs_replay routine of the replay system.
FIG. 5 is a flow diagram of an example implementation of the replay routine.
FIG. 6 illustrates a program library that the development environment uses to store information about a computer program.
FIG. 7 illustrates the data set object and replay object organization for the program library of FIG. 6.
FIG. 8 is a flow diagram of an example implementation of a routine of a service to translate functions within a module of a program library.
FIG. 9 is a flow diagram of an example implementation of a routine to create a translation session.