1. Field of the Invention
The present invention relates, generally, to a software system for tracing data and, more specifically, to such a system wherein the separation of trace control data from the trace procedure allows the data to be traced by the trace procedure to be stipulated very flexibly and without retranslating the trace procedure.
2. Description of the Prior Art
In a contemporary device for processing information, a multiplicity of information processing operations, also called "processes," are normally executed on the process-handling devices provided in the device; e.g., processors. In this context, a process is understood to be the operation of information processing taking place in algorithm form; i.e., in a series of individual information processing steps. Examples of processes are the editing of files or the sending of e-mails. Examples of information processing steps are opening, printing and/or storing a file or changing a memory area in the main memory.
The information processing steps are performed by a processor under the control of programs having a succession of instructions, also called "machine commands." In this context, different machine commands characterize different fundamental processing steps of the processor. Examples of fundamental processing steps are arithmetic operations; e.g., integer addition or subtraction. The program thus indicates to the processor which of its fundamental processing steps needs to be carried out in which order. To carry out the information processing steps, the programs use different components of a contemporary information processing device, depending on the type of information processing steps. These components are also called "operating parts." Examples of operating parts are printers, processors, memory space or files. To ensure that the operating parts have the best possible utilization level, the operating parts are normally used simultaneously; i.e., reciprocally or jointly. This operating mode of the device is called "multiprogram operation" or "multitasking." That is, the device appears to process a number of programs simultaneously. In this context, the programs can be processed sequentially, i.e., one after the other, or in parallel, i.e., with their timing intermeshed. The intermeshed processing of a number of programs is also called "interleaving."
Since, by way of example, the simultaneous use of the operating parts can give rise to conflicts (for example, when two programs wish to output data to the same printer simultaneously or wish to have write access to the same file simultaneously), an operating system is normally provided for managing the operating parts. In this context, an operating system includes those programs which, among other things, control and monitor access to the operating parts, the order of timing for processing the programs running in the device, and maintenance of the operating mode. For the purposes of distinction, these control programs are called "operating system programs" or else "system software," and the programs using the system software to access the operating parts are called "user programs" or else "application software."
To support this division of the programs into two classes, contemporary processors normally provide two operating modes, a privileged operating mode (also called "system mode"), in which all the machine commands can be executed without limitation of their mode of action, and an unprivileged operating mode (also called "user mode"), in which a few machine commands are limited in terms of their mode of action or are prohibited. By way of example, in the unprivileged operating mode, those machine commands which are used for direct access to the device's operating parts are prohibited. Normally, the application software is executed in the user mode, and the system software in the system mode. The application software can thus access the operating parts of the device only using the system software. The user programs thus contain no program parts which, by way of example, stipulate the memory cells of a main memory in which the user programs are stored, how and where the data of the user programs are stored on background stores, e.g., hard disks, the order in which the user programs are executed or how user programs and data are protected against unauthorized access. These and other tasks, e.g., management, control or monitoring tasks, are performed by the operating system. By way of example, the following fundamental information processing steps are performed by the operating system:
reading in a character from a keyboard;
outputting a character to a screen; and
reading a program into a memory.
For carrying out the fundamental information processing steps, different "operating system procedures" are provided in the operating system. In this context, procedures are machine command sequences combined to form units which can be called. These procedures are generally called by the application software. In contemporary processors, an operating system procedure is normally called using a machine command which the processor uses to interrupt execution of the calling program and to branch it to the system software. The interruption is also called an "interrupt," the machine command is also called an "SVC" (Supervisor Call), and the called operating system procedure is also called an "SVC procedure." Once the called procedure has been processed completely, execution normally branches to the next machine command of the calling program; i.e., to the machine command which immediately succeeds the machine command SVC in the machine command sequence of the calling program.
In terms of opportunities for use which have the greatest possible degree of flexibility, procedures normally contain variable components, also called "formal parameters," which are usually stipulated by the calling program when the procedure is called, by indicating "call parameters." In this context, the formal parameters are replaced by the call parameters on the basis of specific rules, which are also called "parameter transfer." When the procedure has been processed, the called procedure normally transfers to the calling program "result parameters" which contain an information item determined by the task of the procedure, e.g., an order confirmation, an error message or a value calculated or ascertained by the procedure. In this context, the value of the calculated result parameters frequently depends on the values of the call parameters.
However, procedures having formal parameters are more susceptible to error than procedures without formal parameters, on account of the greater degree of flexibility. They therefore need to be tested for freedom from error with relatively great care when programs are tested. In this context, two potential sources of error are of particular significance:
(1) The values of the result parameters of the procedures are generally not defined for all possible value combinations of the call parameters. It is therefore necessary to test whether the procedures are always called with permissible value combinations of the call parameters; and
(2) The procedures may have logical coding errors. It is therefore necessary to test whether the procedures deliver the desired result for each permissible combination of input values.
Thus, for the program test, i.e., searching for errors in the programs, logging of the call and result parameters of the procedures is of central importance for checking the procedures.
However, since the complexity of contemporary programs means that a full test for freedom from error is often not possible, even tested programs frequently still have errors which can arise during execution of the programs. In this context, the described central role of the operating system means that errors in the procedures of the operating system programs generally have greater consequences than errors in the user programs; e.g., failure of the device and of all the user programs running on it.
The normal method for finding errors is to log the execution of the procedures of a program using an execution tracking program, also called a "debugger." In this context, execution of the program is tracked step by step under the control of the debugger. In this case, program execution is stopped by the debugger after each step, for example, and data stored in the memory cells of the main memory, e.g., the call and result parameters, can be displayed. However, a program is generally executed without the control of a debugger, since a debugger slows down the execution of a program significantly; e.g., by a factor of ten. In addition, errors which arise sporadically, for example, frequently no longer appear when a program is executed under the control of a debugger on account of the program being executed more slowly over time. In addition, searching for errors using a debugger requires that the erroneous program be terminated and then started together with a debugger, which results in the program being interrupted. This method is therefore unsuitable for finding errors in operating system programs, since the operating system programs normally must not be terminated and restarted on account of operation of the device usually being interrupted in conjunction with this. This is particularly undesirable in devices for connecting telephone calls.
It is therefore necessary to track the execution of a program during correct execution. This is also called "tracing" in the expert field. Normally, special machine commands are inserted into the procedures of the programs for this purpose. The machine commands specifically log particular data, normally the call and result parameters of the procedures, e.g., in a particular memory area in the main memory, or a file on the background store. The drawback of this, however, is that other data which may be additionally required for finding an error are not logged if the special machine commands contain no provision for these data to be logged. In addition, logging slows down execution of the program even when no logging is necessary. This is particularly disadvantageous in operating system procedures since slowing down the operating system slows down all the user programs managed by the operating system, even those which are executed without errors.
The technical background described above is disclosed in Engesser, Hermann [publishers]; Claus, Volker [editors]; Duden "Informatik" [Information Technology]; 2nd edition; Mannheim; Leipzig; Vienna; Zurich: Dudenverlag; 1993; ISBN 3-411-05232-5. See, in particular, page 8 f (Execution log, Trace), page 83 ff (Operating mode), page 86 (Operating means), page 86 ff (Operating system), page 188 (Debugger), page 457 ff (Parallel operation), page 557 ff (Procedure), page 559 ff (Processes), page 720 ff (Testing) and page 756 f (Interruption).
The present invention is thus directed toward developing a method of the type mentioned in the introduction such that the tracing of data is improved.
The present invention can, in essence, be regarded as being that a first procedure, also called trace procedure, takes a first information item, also called trace control data, from a data structure stored in the memory of a device, uses the trace control data to ascertain at least one memory area, and traces at least the data stored in the memory area. A fundamental advantage of the method according to the present invention arises from the fact that the separation of the trace control data from the trace procedure allows for the data to be traced by the trace procedure to be stipulated very flexibly and without retranslating the trace procedure. In particular, during tracing, other data additionally can be logged by extending the trace control data if, by way of example, the error search detects that the other data need to be traced for the purpose of diagnosing the error. In addition, tracing also can be stopped without interrupting operation, by virtue of the trace control data stipulating no memory areas to be traced. This is particularly advantageous in operating system procedures since, in this case, all the user programs are executed more quickly when tracing is not carried out. In addition, gradually reducing the extent of the data to be traced or selectively tracing very little data may make it possible, in the case of sporadically arising errors, to ensure that these errors continue to arise provided that the specific tracing slows down execution of the operating system procedures over time only minimally.
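By way of illustration only, the separation of trace control data from the trace procedure may be sketched as follows. This is a minimal model under stated assumptions, not the claimed implementation: the device memory is simulated by a Python bytearray, and the names memory, trace_control_data, trace_log and trace_procedure are hypothetical.

```python
# Hypothetical model: device memory is a bytearray, and the trace control
# data is a plain list of (address, length) pairs kept apart from the code.
memory = bytearray(64)
memory[8:12] = b"\x01\x02\x03\x04"
memory[20:22] = b"\xaa\xbb"

trace_control_data = [(8, 4)]   # data structure read by the trace procedure
trace_log = []                  # where traced data ends up


def trace_procedure():
    """Log every memory area currently named in the trace control data."""
    for address, length in trace_control_data:
        trace_log.append((address, bytes(memory[address:address + length])))


trace_procedure()                    # traces only the area at address 8
trace_control_data.append((20, 2))   # extend tracing without touching the code
trace_procedure()                    # now traces both areas
trace_control_data.clear()           # stop tracing without stopping the program
trace_procedure()                    # logs nothing
```

In this sketch, changing what is traced only changes the data structure; the body of trace_procedure is never edited or retranslated.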
In accordance with one embodiment of the method according to the present invention, the memory area is defined by a statement, contained in the first information item, indicating the memory address of the first memory cell and the number of memory cells. In accordance with another embodiment of the method according to the present invention, the number of memory cells is defined by indicating the type of data stored in the memory area. Thus, the call and result parameters can be indicated in a particularly simple manner during the error search, since both the memory address of these parameters and their data type are normally known.
In accordance with another embodiment of the method according to the present invention, the memory area is defined by a statement, contained in the first information item, indicating the memory addresses of the first memory cell and the last memory cell. This advantageously allows any memory areas to be indicated. This is of particular advantage, for example, when data having different data types are stored in the memory area, and consequently one length statement per data type is not possible.
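The three ways of defining a memory area described above (first cell plus number of cells, first cell plus data type, and first cell plus last cell) may be illustrated by the following sketch. The entry layout, the field names, the type table TYPE_SIZES, and the choice to treat the last cell as inclusive are all assumptions made for illustration.

```python
import struct

# Hypothetical type table: number of memory cells (bytes) per data type.
TYPE_SIZES = {"int16": struct.calcsize("<h"), "int32": struct.calcsize("<i")}


def resolve_area(entry):
    """Return (first_address, cell_count) for one trace control entry."""
    kind = entry["kind"]
    if kind == "address_count":      # first cell plus number of cells
        return entry["address"], entry["count"]
    if kind == "address_type":       # number of cells implied by the data type
        return entry["address"], TYPE_SIZES[entry["type"]]
    if kind == "address_range":      # first and last cell, both inclusive (assumed)
        return entry["first"], entry["last"] - entry["first"] + 1
    raise ValueError(f"unknown entry kind: {kind!r}")


def read_area(memory, entry):
    """Copy the cells of the area described by the entry out of memory."""
    address, count = resolve_area(entry)
    return bytes(memory[address:address + count])


memory = bytearray(range(32))
by_count = read_area(memory, {"kind": "address_count", "address": 4, "count": 3})
by_type = read_area(memory, {"kind": "address_type", "address": 8, "type": "int32"})
by_range = read_area(memory, {"kind": "address_range", "first": 16, "last": 21})
```

The range form needs no knowledge of what is stored in the area, which is why it suits areas holding data of mixed types.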
In accordance with a further development of the method according to the present invention, an operating system which operates the device and contains at least the trace procedure and a second procedure (also called SVC procedure below) calls the trace procedure instead of the SVC procedure when the SVC procedure is called. Thus, the trace procedure is advantageously executed even though the caller has called the SVC procedure. As such, the calling program need not be altered, which advantageously eliminates retranslation of the calling program, which is normally necessary for this purpose. This is particularly advantageous when the calling program cannot be translated; for example, on account of program source files being unavailable.
In accordance with another embodiment of the method according to the present invention, the SVC procedure is called by the trace procedure. As such, the service prompted by the SVC procedure is advantageously provided even when there is a branch to the trace procedure.
In accordance with yet another embodiment of the method according to the present invention, at least the data stored in the memory area are traced on the basis of a second information item, likewise called trace control data below, which is contained in the data structure and defines the instant of tracing of the memory area, before and/or after the second procedure is called. The extent of the data which are to be traced can be controlled as required. Thus, by way of example, the call parameters transferred by the program which actually calls the SVC procedure are of interest only before the call, and the result parameters formed by the SVC procedure are of interest only after the SVC procedure is called.
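The interplay described above, in which the trace procedure is entered instead of the SVC procedure, calls the SVC procedure itself, and traces each memory area before and/or after that call, may be sketched as follows. The sketch is illustrative only; svc_add is a stand-in SVC procedure, and the "when" field with the values "before", "after" and "both" is an assumed encoding of the second information item.

```python
trace_log = []

# Hypothetical trace control data: which areas to trace, and when.
trace_control_data = [
    {"area": "call_params", "when": "before"},
    {"area": "result_params", "when": "after"},
]


def svc_add(areas):
    """Stand-in SVC procedure: adds the two call parameters."""
    a, b = areas["call_params"]
    areas["result_params"] = [a + b]


def trace_areas(areas, moment):
    """Trace every area whose control entry selects this moment."""
    for entry in trace_control_data:
        if entry["when"] in (moment, "both"):
            trace_log.append((moment, entry["area"], list(areas[entry["area"]])))


def trace_procedure(svc_procedure, areas):
    """Called in place of the SVC procedure; still provides its service."""
    trace_areas(areas, "before")
    svc_procedure(areas)
    trace_areas(areas, "after")


areas = {"call_params": [2, 3], "result_params": [0]}
trace_procedure(svc_add, areas)
```

Here the call parameters are logged only before the call and the result parameters only after it, so no stale or not-yet-computed values enter the trace.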
In accordance with one embodiment of the method according to the present invention, the device is controlled by the operating system such that the data structure can be accessed only by the operating system. Accordingly, the user software can alter the trace control data neither unintentionally, e.g., as a result of a program error in the user software, nor intentionally, e.g., in order to modify the operating system via a computer virus.
In accordance with another embodiment of the method according to the present invention, the trace control data are modified by the operating system on the basis of a control information item. This is advantageous because it is frequently only during the error analysis that it becomes obvious which data are to be traced at what time.
In accordance with a further embodiment of the method according to the present invention, the operating system makes a setting, on the basis of a further control information item, such that the trace procedure is called instead of the SVC procedure when the SVC procedure is called. This allows selective setting of whether or not the SVC procedure is to be traced. If tracing is deactivated, the processing speed of the operating system is increased on account of the fact that tracing does not take place.
In accordance with yet another embodiment of the method according to the present invention, the modifications made by the operating system are made during operation of the device. As such, at any instant during operation of the device, the error search can be set, flexibly according to requirements, to specify which data are traced by which procedure of the operating system.
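One conceivable way of switching tracing on and off during operation is a dispatch table whose entries are exchanged at run time. The sketch below assumes such a table; the names dispatch_table, set_tracing and supervisor_call are hypothetical, and the source does not state how the setting is actually realized.

```python
calls_traced = []


def svc_read_char():
    """Stand-in SVC procedure."""
    return "x"


def make_trace_procedure(svc_number, svc_procedure):
    def trace_procedure():
        calls_traced.append(svc_number)   # tracing step, kept trivial here
        return svc_procedure()            # the original service is still provided
    return trace_procedure


# Hypothetical SVC dispatch table held by the operating system.
original_table = {1: svc_read_char}
dispatch_table = dict(original_table)


def set_tracing(svc_number, enabled):
    """Apply a control information item: route the SVC via the trace procedure or not."""
    target = original_table[svc_number]
    dispatch_table[svc_number] = (
        make_trace_procedure(svc_number, target) if enabled else target
    )


def supervisor_call(svc_number):
    """What the unchanged calling program invokes."""
    return dispatch_table[svc_number]()


first = supervisor_call(1)      # tracing off: direct call
set_tracing(1, True)
second = supervisor_call(1)     # tracing on: goes through the trace procedure
set_tracing(1, False)
third = supervisor_call(1)      # tracing off again, no restart needed
```

With tracing switched off, the call reaches the SVC procedure directly, so the untraced path carries no logging overhead in this model.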
Additional features and advantages of the present invention are described in, and will become apparent from, the following detailed description of the preferred embodiments and the drawings.