The present invention relates to a method for deleting unnecessary data in which the process of a main program alternates with the scanning of a data recording region having recorded therein data used in carrying out the main program, to a data deleting apparatus suitable for this method, and to a recording medium having recorded thereon programs for realizing this apparatus. More particularly, the present invention relates to a data deleting method, a data deleting apparatus, and a recording medium for deleting unnecessary data generated in application software programs which are created using programming languages that create data dynamically and which are required to be carried out in real time.
In application software programs (referred to as the main program hereinafter) created using programming languages which create data dynamically, such as Java, C++, Lisp and Prolog, data which has been created and used once during processing of the main program sometimes becomes unnecessary as the main program proceeds.
In view of this, in order to use effectively the data recording region in which data is recorded, this unnecessary data must be deleted from the data recording region, and after the data is deleted the region must be reused for recording other data. This process, in which unnecessary data is deleted and the recording region is then reused, is called “garbage collection” (referred to as GC hereinafter), and it is indispensable to the main program.
When a main program for which real time property is of great importance is processed, the GC must be carried out substantially in parallel with the main program. Methods which have been proposed or actually used for this purpose include On-the-fly GC, duplicating type incremental GC, and snapshot GC.
On-the-fly GC is a method in which a processor which processes the main program is provided separately from a processor dedicated to the GC program, and both processors are operated in parallel so that the GC program can be executed in real time. This method applies algorithms such as that conceived by Dijkstra, in which, while the pointers by which the main program indicates data recording regions are observed, the data recording regions indicated by the pointers are marked and the data in the unmarked data recording regions is deleted as unnecessary, as well as the algorithm conceived by Steele, in which compression processes are used.
The duplicating type incremental GC is a method in which an algorithm designed by Baker is applied. In this method two data recording regions (from space and to space) are secured from the usable recording regions, and only one data recording region (from space) is used to carry out the process of the main program. The data still in use, which is indicated by the pointers, is copied to the other data recording region (to space), and the data remaining in the first data recording region (from space) is deleted.
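The copying idea described above can be sketched roughly as follows. This is a simplified, non-incremental illustration and not Baker's actual algorithm; the names `Cell` and `copy_collect` and the dictionary-based heap are assumptions introduced only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    """Hypothetical heap cell: a value plus the addresses of cells it points to."""
    value: str
    refs: list = field(default_factory=list)


def copy_collect(from_space, roots):
    """Copy every cell reachable from `roots` into a fresh to space.

    `from_space` maps address -> Cell. Returns (to_space, new_roots).
    """
    to_space = {}
    forward = {}  # old address -> new address, so each cell is copied once

    def copy(addr):
        if addr in forward:
            return forward[addr]
        new_addr = len(to_space)
        forward[addr] = new_addr
        old = from_space[addr]
        # The copy still holds old addresses; they are fixed up when scanned.
        to_space[new_addr] = Cell(old.value, list(old.refs))
        return new_addr

    new_roots = [copy(r) for r in roots]
    scan = 0
    while scan < len(to_space):
        cell = to_space[scan]
        cell.refs = [copy(r) for r in cell.refs]
        scan += 1

    # Whatever was not copied is unnecessary data: discard the whole from space.
    from_space.clear()
    return to_space, new_roots
```

For instance, with two cells that point to each other and one unreferenced cell, only the first two would survive in the to space, while the from space ends up empty.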
However, in both On-the-fly GC and duplicating type incremental GC, a large overhead is incurred in carrying out the program, and it is further necessary to add or secure hardware resources such as processors and recording regions. Practical application in a general use device has therefore been difficult.
The snapshot GC, which was conceived by Yuasa, who is the inventor of the present invention, is therefore based, like Dijkstra's algorithm, on an algorithm in which data recording regions are marked and the data in the unmarked regions is then deleted. Its object was to allow simple realization on a general use device, and it is widely used at present because of this advantage.
In the snapshot GC, the main program is stopped, and a function recording region for the functions necessary for the process of the main program and a static recording region, whose location is determined in advance and which is separate from the function recording region, are scanned. Advance marking (marking in advance) is carried out to identify, as data to be protected, the data in the data recording regions which are directly or indirectly indicated by the pointers recorded in the function recording region which has been scanned. Further, marking is carried out in order to protect the data in the data recording regions which have been marked in advance. The method thus applies an algorithm in which data recorded in data recording regions which have not been marked is deleted.
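A minimal sketch of the scan, advance marking, marking, and deletion steps is given below, assuming the main program stays stopped for the whole collection. The function name `snapshot_collect`, the dictionary representation of the data recording regions, and the way the work is split between advance marking and marking are all assumptions made for illustration, not the method as actually implemented.

```python
def snapshot_collect(heap, frames, static_roots):
    """Sketch of scanning, advance marking, marking, and deletion.

    heap: dict mapping a data recording region name to the names it points to.
    frames: list of function recording regions, each a list of pointers.
    static_roots: pointers held in the static recording region.
    Returns the set of marked (protected) region names.
    """
    # Scanning: regions indicated by a scanned pointer are marked in advance.
    advance_marked = list(static_roots)
    for frame in frames:
        advance_marked.extend(frame)

    # Marking: protect everything reachable from the advance-marked regions.
    marked = set()
    pending = advance_marked
    while pending:
        region = pending.pop()
        if region in marked or region not in heap:
            continue
        marked.add(region)
        pending.extend(heap[region])

    # Deletion: regions which were never marked hold unnecessary data.
    for region in list(heap):
        if region not in marked:
            del heap[region]
    return marked
```

In this sketch a region pointed to only from an unreachable region is deleted, while a region reached through a chain of pointers starting in any function recording region is kept.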
This type of snapshot GC process will be described using the drawings. FIGS. 10A to 10C conceptually illustrate the function recording region for the functions which are necessary for executing the main program. The main program is generally carried out with functions in a called state. As shown in FIGS. 10A, 10B and 10C, a function being executed calls the function to be executed next; a new function recording region is secured in the stack region in the memory, on top of the function recording region of the calling function, and the function which has been called is executed. After the execution of the function is complete, the stacked function recording region which has become unnecessary is destroyed.
The function region for the function being executed (function frame) is called the current frame, and in FIG. 10A the function F is being executed and the function recording region for the function F is the current frame. When the function G which is different from the function F is called, as shown in FIG. 10B, the function recording region for executing the function G is stacked in the stack region and becomes the current frame, and when the execution of the function G is completed, the corresponding function recording region is destroyed and it returns to the state prior to the calling shown in FIG. 10A.
It is to be noted that in FIG. 10B, when yet another function H is called, as shown in FIG. 10C, the function recording region for executing function H is stacked in the stack region and becomes the current frame. When the execution of the function H ends, the corresponding function recording region is eliminated and it returns to the state prior to the calling shown in FIG. 10B.
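The stacking and destruction of function recording regions described above can be modelled with a simple list. The class `FrameStack` and its method names are hypothetical and serve only to mirror the states of FIGS. 10A to 10C.

```python
class FrameStack:
    """Hypothetical model of the stack region holding function recording regions."""

    def __init__(self):
        self.frames = []

    def call(self, function_name):
        # Calling a function stacks a new function recording region on top.
        self.frames.append({"function": function_name, "pointers": []})

    def finish(self):
        # When execution completes, the top region is destroyed.
        return self.frames.pop()

    def current_frame(self):
        # The current frame is the region of the function being executed.
        return self.frames[-1]["function"]


stack = FrameStack()
stack.call("F")   # FIG. 10A: F is the current frame
stack.call("G")   # FIG. 10B: G is the current frame
stack.call("H")   # FIG. 10C: H is the current frame
stack.finish()    # H ends: back to the state of FIG. 10B
```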
FIG. 11 is an explanatory chart which conceptually illustrates the scanning and marking process of a conventional method for data deletion (snapshot GC). It is to be noted that in FIG. 11, lines which slant downward to the left illustrate the function recording regions which have been scanned, and the lines which slant downward to the right illustrate the function recording regions which have not been scanned. Scanning of the pointer recorded in the function recording region is carried out from the function recording region at the highest position, toward the function recording region at the lowest position which is the origin function, and advance marking is carried out for the data recording region which is indicated by the pointer which was scanned. Then marking is executed in the data recording region in which advance marking was carried out, and the data recorded in the data recording region which has not been marked is considered unnecessary data and thus deleted.
During the snapshot GC process, because only a small number of pointers are recorded in the static recording region, the scanning and advance marking of the static recording region is completed within a short time. The scanning and advance marking of the function recording regions, however, is not necessarily completed in a short time, because the number of function recording regions to be scanned fluctuates.
Because the snapshot GC interrupts the main program in order to execute its processes, the processes related to the GC program must be divided and carried out piecewise in order to maintain real time property.
Because the scanning and advance marking of the static recording region is completed in a short time, it may be executed while the main program is interrupted. Further, even when the process of marking and deleting data for the data recording regions which have been marked in advance is not completed in a short time, this process can be divided. Neither of these processes therefore causes any significant problem.
However, because the scanning and advance marking of function recording region may take a long time, there is the problem that real time property may be lost.
Thus, it may be thought that if the process of scanning and advance marking of the function recording region is divided, and the region is scanned little by little each time the program is interrupted, the time of each interruption becomes shorter and the real time property of the main program is maintained. However, in this case, process abnormalities such as those exemplified below may be generated.
FIG. 12 is an explanatory chart which conceptually illustrates the scanning and marking process of a conventional method for data deletion (snapshot GC). As in FIG. 11, the lines which slant downward to the left illustrate the function recording region which has been scanned, and the lines which slant downward to the right illustrate the function recording region which has not been scanned. It is assumed that, in the state shown in FIG. 12A, the GC process is interrupted and the main program is executed, after which the state changes to that shown in FIG. 12B. In the state shown in FIG. 12A, the data recording region a has been marked in advance, but the data recording regions b and c have not been marked in advance.
When the main program is executed from the state shown in FIG. 12A and the state changes to that shown in FIG. 12B, the data recording region b is still indicated by a pointer, but the function recording region in which that pointer is recorded has changed from an unscanned function recording region to a scanned function recording region. When the GC process resumes, advance marking is therefore not carried out for the data recording region b (that is, a failure of advance marking is generated), and its data is not protected and is deleted. As a result, when the main program subsequently executes a process which requires the data in the data recording region b, a processing abnormality occurs because that data has been deleted.
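The failure described above can be reproduced in a small simulation. The sketch below assumes that the boundary between scanned and unscanned regions is tracked by stack position (the variable `scan_limit`), which is an assumption made for illustration; the names a, b and c follow FIG. 12, while everything else is hypothetical.

```python
def incremental_scan_demo():
    heap = {"a": "data a", "b": "data b", "c": "data c"}
    # Stack of function recording regions; index 0 is the origin function.
    stack = [["c"], ["b"], ["a"]]
    advance_marked = set()

    # GC step 1: scan only the highest region, then yield to the main program.
    scan_limit = len(stack) - 1   # positions >= scan_limit count as scanned
    advance_marked.update(stack[scan_limit])

    # Main program runs: the top function returns, and the function below
    # hands its pointer to b to a newly called function.
    stack.pop()
    pointer = stack[1].pop()
    stack.append([pointer])       # new region sits at already-scanned position 2

    # GC step 2: resume below the scan limit; position 2 is never revisited.
    for index in range(scan_limit - 1, -1, -1):
        advance_marked.update(stack[index])

    # Deletion of everything which was not marked in advance.
    for region in list(heap):
        if region not in advance_marked:
            del heap[region]

    live_pointers = {p for frame in stack for p in frame}
    return heap, live_pointers
```

In this run the pointer to b is still held by the newly called function, yet b has been removed from the heap, which corresponds to the processing abnormality described for FIG. 12B.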
The following describes the conditions in which this type of failure is generated. FIGS. 13A and 13B are explanatory charts which conceptually illustrate the scanning and marking process of a conventional method for data deletion (snapshot GC). In general, during execution of a function, not all of the function recording regions are referred to, but only the function recording region inside the current frame.
FIG. 13A shows the state in which the current frame includes the scanned function recording region. In this state, even if the state of the stack is changed due to the process of the main program, only the scanned function recording region is affected, and therefore no failure is generated.
FIG. 13B shows the state in which the current frame includes the unscanned function recording region, and when the stack is changed due to the process of the main program, the pointer which indicates a specific data recording region moves to the scanned function recording region, and thus there is the possibility that failure is generated.
In addition to the processing abnormalities generated in executing the basic processes described above, in the case of a language system which supports local functions, there is the problem that the conditions exemplified below must also be considered.
FIG. 14 is an explanatory chart which conceptually illustrates the scanning of a conventional method for data deletion (snapshot GC). FIG. 14 assumes the following simple Common Lisp code, which includes a local function.
(defun F (x)
  (labels ((G (y z) ... (G z y) ...))
    (G x x)))
That is to say, the local function G is defined in the function F, and function G calls itself recursively. Further, the variable x which appears in this recursive calling structure is a parameter which is defined by the function F, and it exists in the function recording region for function F. For this reason, while function G is executed, not only the function recording region for function G but also the function recording region for function F, which is at a lower position, is referred to. Generally this is realized by storing, in the function recording region for function G, a static link which makes it possible to refer to the function recording region for function F. In this case, as shown in FIG. 14, a situation is generated in which the function recording region for function G, which has already been scanned, is referred to, and then the function recording region for function F, which has not been scanned, is referred to, and this may cause various inconveniences.
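As a loose analogue of the Common Lisp fragment, the following Python sketch defines a local function G inside F which calls itself and refers to F's parameter x. Python closures stand in here for the static link; the `depth` parameter and the returned values are assumptions added only so that the recursion terminates and the access to x can be observed.

```python
def F(x):
    def G(y, z, depth):
        # x is not a parameter of G: it belongs to F's activation record and
        # is reached through the enclosing scope (the analogue of a static link).
        if depth == 0:
            return (x, y, z)
        return G(z, y, depth - 1)   # G calls itself with swapped arguments

    # G's code object lists x as a free variable resolved in F's scope.
    refers_to_outer_x = "x" in G.__code__.co_freevars
    return G(x, x, 1), refers_to_outer_x
```

While G runs, reading x requires access to the record of the enclosing call to F, which is the situation the text describes for the function recording regions of G and F.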
Further, there is the problem that consideration must be given to the condition in which, due to unusual processing such as an interruption or another intercepting command while the main program is being executed, or due to general escape, execution passes from the function being executed not to the return function, that is, the function which called it, but to a function at a position lower than the return function.
FIGS. 15A and 15B are explanatory charts which conceptually illustrate the scanning of a conventional method for data deletion (snapshot GC). As illustrated in FIG. 15A, the condition to be considered is that in which execution of function H ends due to unusual processing or due to general escape during the execution of function H. In this case execution does not return to function G, which called function H; rather, as shown in FIG. 15B, the function called throw is executed, which transfers execution to function F at a position lower than the return function.
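The transfer of control described above can be imitated with an exception, used here as a stand-in for the Common Lisp throw; this substitution, the class name `Throw`, and the `trace` list are assumptions made for illustration.

```python
class Throw(Exception):
    """Hypothetical stand-in for a general escape to a lower function."""


def run():
    trace = []

    def H():
        trace.append("enter H")
        raise Throw("escape to F")   # leaves H without returning to G

    def G():
        trace.append("enter G")
        H()
        trace.append("resume G")     # skipped: G is not the destination

    def F():
        trace.append("enter F")
        try:
            G()
        except Throw:
            trace.append("resume F")  # execution continues in F directly

    F()
    return trace
```

The resulting trace contains no "resume G" entry: the regions for H and G are both abandoned at once, regardless of which of them a collector had already scanned.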
When throw is executed, the function recording region for the function H has already been scanned, but in some situations the function recording region for the function G and the function recording region for the function F have not been scanned. The problem therefore remains of how processing is to be carried out at that time, for example when a language system which supports general escape is implemented.