1. Field of the Invention
The present invention pertains to the field of software correction. More particularly, the present invention pertains to locating bugs using a list of files (a “batch” file) which makes up an input stream for a processor.
2. Description of the Related Art
A common problem encountered in the field of computer engineering is the occurrence of bugs in a part of a computer system, whether hardware or software. For purposes of discussion here, the emphasis will be on software, although the discussion may be applicable to bugs in hardware as well. As used herein, software means information of any kind that can be processed by a processor, and may include data, instructions, or both; and a bug means a portion of software, or an interaction between portions of software, which causes a processor which processes the software to yield erroneous results or to malfunction.
One type of bug which is particularly difficult to locate occurs while processing batch files. When a batch file is processed in sequence, a bug may be witnessed while a particular file (witnessing file) is being processed, but be caused by some processing of predecessor files. Processing of a predecessor file may, for example, cause the witnessing file to crash by allowing data to be overwritten at memory addresses allocated to the witnessing file. Thus, the witnessing file may crash even if it would be properly processed by itself.
Identifying such bugs by conventional means is very difficult. Exhaustive testing is highly impractical because bug frequency typically increases at about an exponential rate relative to the amount of software under scrutiny. Further, conventional tools for locating bugs typically test an individual program, or a set of linked programs, and usually are not well suited to identifying bugs resulting from processing of batch files. To locate a bug, the computer user must search through a lengthy sequence of files, many of which are not involved in causing the bug. This is cumbersome and time consuming even for skilled programmers, and requires significant programming expertise.
There is thus a continuing need in the area of computer engineering for an improved tool which locates and identifies bugs resulting from processing of batch files. Applicability of such tools to identifying such bugs resulting from a batch of print files is highly desired. Such tools preferably should yield a minimal boundary range or minimal list of files for each bug.
A bug location system and method are presented according to the present invention for identifying one or more bugs resulting from a plurality of files. In a preferred mode of operation, the files comprise an input stream for a printer, and the bugs comprise printing bugs which cause the printer to malfunction when the files are printed. However, the bug location system and method are generally applicable to various types of bugs which may be encountered on various computer or processor platforms.
According to one aspect of the present invention, a printer emulator is implemented as a software program which is run on a processor. Several data structures are used with the printer emulator which allow the processor to implement the bug location method in accordance with the present invention. The processor on which the printer emulator and data structures operate preferably is the central processing unit (CPU) of a general purpose computer or the processing unit of a printer.
According to another aspect of the present invention, a given sequence of files results in a bug, and the bug location system and method determine a smaller sequence of these files which results in the same bug. Bugs are identified as being the same in these sequences according to various indicators. One such indicator is that both sequences crash on the same line of code during processing of a particular witnessing file when run with the same options. A further or alternative indicator comprises both sequences having substantially identical core dump traces upon crash or other failure. In accordance with the present invention, the bug is considered to be the same even though some or all of the files not in the smaller sequence of files might contribute to anomalies when the larger sequence of files is processed. This beneficially aids in identifying bugs to a sufficient extent that they can be quickly corrected.
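By way of illustration only, the two indicators described above might be compared as in the following Python sketch. The `CrashReport` record, its field names, and the `same_bug` function are hypothetical and do not appear in the specification; the sketch also simplifies "substantially identical" core dump traces to exact equality.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CrashReport:
    """Hypothetical summary of one failing run of a file sequence."""
    witnessing_file: str          # file being processed when the failure occurred
    crash_line: str               # source location of the crash, e.g. "render.c:212"
    options: Tuple[str, ...]      # options the run was started with
    trace: Tuple[str, ...]        # frames of the core dump trace


def same_bug(a: CrashReport, b: CrashReport) -> bool:
    """Treat two failures as the same bug if either indicator matches."""
    # Indicator 1: same line of code, same witnessing file, same options.
    same_site = (
        a.witnessing_file == b.witnessing_file
        and a.crash_line == b.crash_line
        and a.options == b.options
    )
    # Indicator 2 (simplified assumption): identical core dump traces.
    same_trace = a.trace == b.trace
    return same_site or same_trace
```

A practical comparison of traces would likely tolerate small differences such as addresses; exact equality is used here only to keep the sketch short.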
According to another aspect of the present invention, a given sequence of files results in a bug, and the bug location system and method determine a smaller sequence of these files that results in the bug and that includes all files of the given sequence between the first and last file of the smaller sequence. Preferably if files are excluded from the beginning or end of this smaller sequence, then the resulting sequence will not result in the bug. Such a sequence is referred to as a minimal boundary range for the bug. When the given sequence is already a minimal boundary range for the bug, then the determined sequence typically will be this minimal boundary range.
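By way of illustration only, the minimal boundary range property can be expressed as a short Python check. The function name `is_minimal_boundary_range` and the callback `triggers_bug` (standing in for a run of the printer emulator on a sequence) are hypothetical; the sketch tests only the removal of a single file from each end, which is a simplification of the property stated above.

```python
from typing import Callable, List


def is_minimal_boundary_range(
    window: List[str],
    triggers_bug: Callable[[List[str]], bool],
) -> bool:
    """Check that a contiguous range triggers the bug but its trimmed forms do not."""
    if not triggers_bug(window):
        return False
    # Dropping the first file, or the last file, must make the bug disappear.
    return not triggers_bug(window[1:]) and not triggers_bug(window[:-1])
```

For example, if the bug requires file "c" to be processed before the witnessing file "h", then the range "c" through "h" passes this check, while the range "b" through "h" does not.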
According to another aspect of the present invention, a given sequence of files results in a bug, and the bug location system and method determine a smaller sequence of the files that results in the bug and that cannot be any smaller. That is, if any files are excluded from this sequence, then the resulting sequence will not result in the bug. Such a sequence is referred to as a minimal list for the bug. When the given sequence is already a minimal list for the bug, then the determined sequence typically will be this minimal list. To speed up convergence on a minimal list for the bug, the given sequence is expected to be a minimal boundary range for the bug, and may be obtained using the bug location method as described above.
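By way of illustration only, the minimal list property can be checked as in the following Python sketch. The names `is_minimal_list` and `triggers_bug` are hypothetical. The sketch tests the removal of one file at a time, which is a weaker condition than excluding arbitrary groups of files, and is offered only as an approximation of the definition above.

```python
from typing import Callable, List


def is_minimal_list(
    files: List[str],
    triggers_bug: Callable[[List[str]], bool],
) -> bool:
    """Check that the list triggers the bug and that no single file can be dropped."""
    if not triggers_bug(files):
        return False
    for i in range(len(files)):
        without_i = files[:i] + files[i + 1:]   # the list with file i excluded
        if triggers_bug(without_i):
            return False                        # file i was not needed
    return True
```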
According to another aspect of the present invention, the bug location system and method locate a bug resulting from a sequence of files according to the following steps. The bug location method selects and excludes a portion of the files from the sequence of files. The bug location method then determines whether the sequence results in the bug. If not, the bug location method returns a portion of the excluded files to the sequence, and the sequence of files is then tested to determine if it results in the bug. This is repeated until the sequence results in the bug.
According to another aspect of the present invention, if a portion of the files in such a sequence are selected and excluded, and the sequence still results in the bug, additional portions of files in the sequence are iteratively excluded until the sequence no longer results in the bug or no more files can be excluded. At that point, the excluded files are returned to the sequence until the sequence again results in the bug. These steps are repeated as desired to further reduce the length of the sequence. Where the last file of the sequence witnesses the bug, files preferably are excluded from the beginning of the sequence. About half of the files preferably are excluded from the sequence during the first excluding step. About half as many files preferably are returned during each returning step as were either excluded or returned in the most recent step which is either an excluding step or a returning step. Except for a first excluding step, preferably about half as many files are excluded during each excluding step as were either excluded or returned in the most recent step which is either an excluding step or a returning step. Rapid binary convergence on a minimal boundary range for the bug is thus provided.
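By way of illustration only, the halving steps described above can be sketched in Python as a bisection over the number of leading files excluded. The function name `shrink_from_front` and the callback `triggers_bug` (standing in for a run of the printer emulator) are hypothetical and not taken from the specification. The sketch assumes that the last file witnesses the bug, so files are excluded only from the beginning, and that returning files to the beginning never makes the bug disappear.

```python
from typing import Callable, List


def shrink_from_front(
    files: List[str],
    triggers_bug: Callable[[List[str]], bool],
) -> List[str]:
    """Exclude leading files by successive halving while the bug still occurs."""
    if not triggers_bug(files):
        raise ValueError("the full sequence must result in the bug")
    lo = 0                    # files[lo:] is known to result in the bug
    hi = len(files) - 1       # the witnessing (last) file is never excluded
    while lo < hi:
        mid = (lo + hi + 1) // 2           # exclude about half of what remains in doubt
        if triggers_bug(files[mid:]):
            lo = mid                       # still fails: keep these files excluded
        else:
            hi = mid - 1                   # bug vanished: return about half the files
    return files[lo:]
```

With eight files "a" through "h", where the bug requires "c" to precede the witnessing file "h", the sketch tests the sequences starting at "e", "c", and "d" in turn and returns the files "c" through "h".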
According to another aspect of the present invention, the bug location method locates a bug caused by processing a sequence of files according to the following steps. The bug location method selects and excludes a file from the sequence, and then determines whether the bug results from the sequence. If not, the file is returned to the sequence. These steps preferably are repeated until a minimal list for the bug is obtained. Very rapid convergence on a minimal list for the bug is achieved by first performing the bug location method on an original sequence in the binary convergence sense described above to determine a minimal boundary range for the bug, and then performing the bug location method on the minimal boundary range, excluding a single file during each excluding step, to determine a minimal list for the bug. Where the bug location method is applied to a minimal boundary range for the bug, preferably each file other than the witnessing file and the first file is selected in turn.
According to another aspect of the present invention, the step of selecting a file to exclude from the sequence comprises selecting a next to last file of the sequence a first time that this step is performed, and then selecting a file immediately preceding a current selected file each succeeding time that this step is performed. This preferably is repeated until all but the first file of the sequence have been selected.
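By way of illustration only, the single-file exclusion order described above can be sketched in Python as follows. The function name `minimal_list` and the callback `triggers_bug` are hypothetical. The sketch assumes its input is already a minimal boundary range, so that the first file and the witnessing (last) file are never selected, and it makes a single backward pass, which may not reach a minimal list for every possible bug.

```python
from typing import Callable, List


def minimal_list(
    files: List[str],
    triggers_bug: Callable[[List[str]], bool],
) -> List[str]:
    """Exclude one file at a time, from next to last back to the second file."""
    current = list(files)
    i = len(current) - 2          # start at the next to last file
    while i >= 1:                 # stop before the first file
        candidate = current[:i] + current[i + 1:]
        if triggers_bug(candidate):
            current = candidate   # bug persists: the file stays excluded
        # otherwise the file is returned, i.e. current is left unchanged
        i -= 1                    # select the immediately preceding file
    return current
```

For the range "c" through "h", where the bug requires both "c" and "f" before the witnessing file "h", the sketch excludes "g", returns "f", excludes "e" and "d", and yields the list "c", "f", "h".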
According to another aspect of the present invention, the bug location system is implemented on a computer. A data structure capable of directing the computer to perform the bug location method is stored in a computer readable medium, such as a hard disk drive, which the computer reads and processes, and in response thereto, performs the bug location method in accordance with the present invention.
The bug location system and method according to the present invention beneficially determine a minimal boundary range and minimal list for a bug. This simplifies the generally difficult task of identifying a bug resulting from processing of a large sequence of files. The bug location system and method exclude from consideration files not required for the bug to occur. This helps to distinguish what causes the bug, which can prove valuable in determining how the printer or printer emulator might be modified to correct or avoid the bug when processing the files. The second bug location method (i.e., the one determining a minimal list) operates in second-order polynomial time with respect to the number of files under test. That is, as the number of files is increased, the time expended by the present invention to identify bugs resulting from processing of such files increases at most as a second-order polynomial function of the number of files under test. The present invention thus is not as limited by the slowdown encountered in conventional bug location systems.
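By way of illustration only, one way to see where a second-order bound could come from is to count the files processed across all trials of a single-file exclusion pass. The specification does not state this derivation; the following Python sketch rests on the assumption that the cost of a trial is proportional to the number of files in the tested sequence. The function name `files_processed_worst_case` is hypothetical.

```python
def files_processed_worst_case(n: int) -> int:
    """Count files processed when no file of an n-file range can be excluded."""
    processed = 0

    def triggers_bug(seq):
        nonlocal processed
        processed += len(seq)      # assumed cost: one unit per file processed
        return len(seq) == n       # worst case: every file is required

    current = list(range(n))
    # Select each file other than the first and the last, from back to front.
    for i in range(len(current) - 2, 0, -1):
        candidate = current[:i] + current[i + 1:]
        if triggers_bug(candidate):
            current = candidate
    return processed
```

Under that assumption there are at most n - 2 trials of at most n - 1 files each, so the total is bounded by (n - 2)(n - 1), which grows as the square of n.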
The features and advantages described in the specification are not all-inclusive, and particularly, many additional features and advantages will be apparent to one of ordinary skill in the art in view of the drawings, specification, and claims thereof. Moreover, it should be noted that the language used in the specification has been selected principally for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter, resort to the claims being necessary to determine such inventive subject matter.