Although software-development tools are continually being improved, it is still virtually impossible to write completely error-free code. However, a number of sophisticated debugging, protection, repair, and testing techniques exist to help keep errors to a minimum.
A type error is defined as an error in a program resulting from a mismatch between the form or classification of a value encountered during program execution and that anticipated by the program. Type errors tend to manifest themselves as data value type errors, resulting from, for example, adding two values of different types such as salary and age, or from format differences, e.g., two-digit versus four-digit dates.
One kind of type error problem occurs when arithmetic performed on data values produces an answer that is too big to be held in the amount of space allocated for the result. For instance, a large number of programs written in the 1970s and 1980s allocated two digits to hold the year of a date variable, so that the amount of storage space consumed by date variables would be minimized. The year "1979" would be represented as "79", with the "19" implicitly assumed. Unfortunately, this method results in serious difficulties as the year 2000 approaches. The problem is that such programs cannot distinguish between years that differ only in their first two digits: for example, "1979" and "2079". A program might further perform arithmetic on dates, for example, adding 30 years to the year 1979 represented by "79". When "79" and "30" are added, the result expected by the user is "2009". However, the computer program will produce "09" as its answer, since its internal representation of dates uses only two digits. The answer is clearly ambiguous and can result in catastrophic problems if incorrectly interpreted as "1909".
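The arithmetic in this example can be reproduced with a minimal sketch. The function name add_years_two_digit is hypothetical, and the modulo-100 truncation is an assumed stand-in for a storage field that holds only two decimal digits; actual legacy programs may lose the carry in other ways.

```python
def add_years_two_digit(yy: int, delta: int) -> int:
    """Add years, keeping only two decimal digits as a two-digit field would."""
    return (yy + delta) % 100  # the carry into the hundreds place is lost


result = add_years_two_digit(79, 30)  # 79 + 30 = 109, stored as 9
print(f"{result:02d}")

# The stored value alone cannot say which century was meant.
candidates = [century + result for century in (1900, 2000)]
print(candidates)
```

Both 1909 and 2009 remain consistent with the stored value, which is the ambiguity described above.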
There are many other real-life examples of type error problems. For example, many computer programs represent telephone area codes as three-digit numbers. If, in the future, increased demand necessitates four-digit area codes, type error problems will occur: a user might wish to specify a four-digit number, while the computer program accepts only three-digit numbers.
As another example of type error, suppose a program customarily accepts financial data in terms of some currency such as the German mark, which is then replaced by a new European currency. The user would like to specify the new currency, but the program accepts only the older mark.
As yet another type error example, suppose a program accesses a set of data in some environment. When the program is executed in a different environment, a problem occurs if it is able to access data from files beyond its permissible limit.
Testing for, protecting against, and repairing type errors in computer programs is a difficult task. For an idea of the magnitude of the problem, consider the date problem. There are approximately 500 billion lines of COBOL code in the world, and some fraction of this code is contaminated with the date type error problem. Fixing this problem alone has spawned a large industry in the United States and elsewhere.
While this discussion is focused on the date problem and its solutions, the present invention addresses a much broader class of problems.
By far the predominant method of addressing the type error problem is to manually fix the source programs. This method involves a team of programmers laboriously perusing the source code, finding all locations where, for example, a date variable might be operated upon, and then modifying the code so that the problem is fixed. Many methods exist for fixing source code in this manner.
One method, called "expansion", involves expanding all date variable fields from two digits to four, and modifying all pertinent instructions to use four-digit arithmetic instead of two-digit arithmetic. This method also requires that all input and output routines correctly handle four digits.
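As a rough illustration of expansion, the sketch below widens a stored date field and then performs arithmetic on the full four-digit year. The fixed-width YYMMDD and YYYYMMDD layouts, the function names, and the assumption that every existing record belongs to the 1900s are all hypothetical choices made for the example; they are not taken from any particular program.

```python
def expand_date_field(yymmdd: str, century: int = 19) -> str:
    """Widen a six-character YYMMDD field to eight-character YYYYMMDD.

    Assumes every stored record belongs to the given century.
    """
    if len(yymmdd) != 6 or not yymmdd.isdigit():
        raise ValueError(f"expected six digits, got {yymmdd!r}")
    return f"{century:02d}{yymmdd}"


def add_years(yyyymmdd: str, delta: int) -> str:
    """Add years using the full four-digit year, so no carry is lost."""
    year = int(yyyymmdd[:4]) + delta
    return f"{year:04d}{yyyymmdd[4:]}"


expanded = expand_date_field("790115")
print(expanded)
print(add_years(expanded, 30))
```

Every routine that reads, writes, or computes with the field must be changed to the wider layout together, which is why expansion touches input and output code as well as arithmetic.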
A second method, called "windowing", modifies the logic of the program to operate correctly without resorting to longer date fields. This modification might change the program so that all two-digit numbers smaller than 50 are interpreted as years in or after 2000, and all two-digit numbers greater than or equal to 50 as years before 2000. For example, the two-digit number "39" would be interpreted as 2039, while the number "79" would be interpreted as 1979. As an example of program logic modification to accomplish this, consider two dates, date1 and date2, assigned values of "04" and "96" respectively and intended to be interpreted as 2004 and 1996. Suppose the original program subtracted the variable date2 from date1. The original faulty program might naively subtract 96 from 04, producing an incorrect result of -92 years. (Note that a subtraction such as "99" minus "96" would have produced a correct result of 3 years.)
A program modified according to the windowing technique would produce the correct result of 8 years if it saw the pair of inputs 04 and 96. Similarly, the modified program would still produce the correct result of 3 years if it saw the pair of inputs 99 and 96.
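The windowing rule described above can be sketched as follows. The pivot value of 50 comes from the example in the text; the names PIVOT, window_year, and years_between are hypothetical and chosen only for illustration.

```python
PIVOT = 50  # two-digit years below the pivot are read as 20xx, the rest as 19xx


def window_year(yy: int) -> int:
    """Interpret a two-digit year using a fixed window around the pivot."""
    if not 0 <= yy <= 99:
        raise ValueError(f"expected a two-digit year, got {yy}")
    return 2000 + yy if yy < PIVOT else 1900 + yy


def years_between(yy_later: int, yy_earlier: int) -> int:
    """Subtract two windowed years instead of the raw two-digit values."""
    return window_year(yy_later) - window_year(yy_earlier)


print(4 - 96)                 # naive subtraction on the raw values
print(years_between(4, 96))   # windowed: 2004 - 1996
print(years_between(99, 96))  # windowed: 1999 - 1996
```

The stored fields stay two digits wide; only the logic that interprets them changes. A fixed window of this kind still spans only 100 years, so dates outside 1950 through 2049 cannot be represented under this choice of pivot.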
In either the windowing or expansion technique, the simplest methods require searching through all of the source code, or using some dynamic method to track corrupted values. One approach to reducing the search space uses program coloring and works as follows. A user might be required to submit the names of all variables that might contain a date. A program flow analysis at the source program level then identifies all regions in the program where data from the named variables might flow and thereby have an effect. The regions of the program where the named variables might have an effect are designated as "colored" regions. The programmer need only look at the colored regions to implement the fixes.
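The coloring idea can be illustrated with a deliberately simplified sketch. It assumes the program has already been reduced to a list of assignments, each recording a target variable and the variables it reads; the Assignment type, the color_program function, and the sample program are hypothetical. The analysis here is flow-insensitive and ignores control flow, so it is more conservative than a real program flow analysis would need to be.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple


class Assignment(NamedTuple):
    line: int                 # source line number of the statement
    target: str               # variable being written
    sources: Tuple[str, ...]  # variables being read


def color_program(
    program: List[Assignment], seed_vars: Iterable[str]
) -> Tuple[Set[str], List[int]]:
    """Propagate color from user-named variables to everything they reach."""
    colored_vars = set(seed_vars)
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for stmt in program:
            reads_colored = colored_vars.intersection(stmt.sources)
            if reads_colored and stmt.target not in colored_vars:
                colored_vars.add(stmt.target)
                changed = True
    # A statement is colored if it reads or writes any colored variable.
    colored_lines = sorted(
        stmt.line
        for stmt in program
        if stmt.target in colored_vars or colored_vars.intersection(stmt.sources)
    )
    return colored_vars, colored_lines


program = [
    Assignment(1, "hire_year", ()),                 # hire_year = <input>
    Assignment(2, "retire_year", ("hire_year",)),   # retire_year = hire_year + 30
    Assignment(3, "salary", ("base", "bonus")),     # salary = base + bonus
    Assignment(4, "report_year", ("retire_year",)), # report_year = retire_year
    Assignment(5, "tax", ("salary",)),              # tax = salary * rate
]
colored_vars, colored_lines = color_program(program, {"hire_year"})
print(sorted(colored_vars))
print(colored_lines)
```

In this sample, naming only hire_year colors lines 1, 2, and 4, so the programmer can skip the salary and tax statements when applying a fix.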
Some methods automatically transform source code so that the resulting source code is correct. Such an automatic method, working at the source code level, might automatically transform all affected code sequences to use the correct type of windowing logic.