As the year 2000 approaches, there has been a significant increase in concern over whether computer programs are "Year 2000 compliant". Many computer programs have been written using only the last two digits of the year, in ways such that the affected code fragments will fail or produce incorrect results when the year is entered as "00" or "0". Such would be the case when calculating a person's age by subtracting a two-digit birth year from the two-digit current year: 0 - 75 = -75. However, there is a nearly unlimited number of ways that a two-digit date could be used in a computer program such that entering only the last two digits of the year at the turn of the century would produce incorrect results. The computer industry has reacted to this problem by allocating significant personnel and financial resources, because the problem is extremely diverse: it varies with factors such as the number of operating systems, computer languages, and types of applications in use. However, all of the factors associated with correcting the date-related problem at the turn of the century can be divided into four categories: inventory, analysis, remediation, and verification.
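The age calculation mentioned above can be reproduced in a few lines. The following is only an illustrative sketch; the function names `age_two_digit` and `age_four_digit` are hypothetical and do not come from any particular program.

```python
def age_two_digit(current_yy: int, birth_yy: int) -> int:
    """Age computed from two-digit years only (the flawed form described above)."""
    return current_yy - birth_yy


def age_four_digit(current_year: int, birth_year: int) -> int:
    """The same subtraction when both years carry all four digits."""
    return current_year - birth_year


print(age_two_digit(99, 75))        # 1999: prints 24
print(age_two_digit(0, 75))         # 2000 entered as "00": prints -75
print(age_four_digit(2000, 1975))   # prints 25
```

The subtraction itself is unchanged in both functions; only the representation of the year differs, which is why such fragments are hard to spot by reading the arithmetic alone.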
The inventory is the list of all source code files of a particular system or sub-system that are required to rebuild the executing program(s). In some cases the original source code is missing or incomplete. The process of obtaining a complete set of files is subject to human error, and existing source code remediation solutions do not require a complete set before analysis begins.
The goal of the analysis is to detect only those code fragments that are of interest because a date is manipulated within the computer program. This is particularly difficult since there is no way to be certain that all of the date manipulations can be located using only date-related character strings. Attempts to automate the process of finding code fragments involve using a "seed list" of character sequences relating to date fields and sequentially scanning source code files. Usually, additional character sequences are discovered in the process of scanning all of the source code files. These newly discovered character sequences are then added to the seed list and re-applied against all of the source code in an iterative process of unpredictable duration. One of the major problems with this approach is that computer programs are typically written to assign a code fragment of interest to another variable, location or function. This could result in a failure to detect the redefined code fragments. Another problem is that this method does not verify that all of the source files are present before they are scanned using the seed list. Hence, it is often late in the project when it is discovered that source code is missing or out of date.
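The seed-list approach and its weakness with reassigned values can be illustrated with a small sketch. This is a simplified model written for this discussion, not the algorithm of any existing tool: the helper names (`is_hit`, `scan`, `iterative_scan`), the sample source lines, and the rule "an assignment whose right-hand side matches makes its target a new search name" are all assumptions.

```python
import re

IDENT = re.compile(r"[A-Za-z_]\w*")
ASSIGN = re.compile(r"^\s*([A-Za-z_]\w*)\s*=\s*(.+)$")


def is_hit(text, patterns, names):
    """True if any identifier in text contains a seed pattern or equals a discovered name."""
    idents = [t.lower() for t in IDENT.findall(text)]
    return any(t in names or any(p in t for p in patterns) for t in idents)


def scan(lines, patterns, names=frozenset()):
    """Single sequential pass: indices of the lines that match."""
    return [i for i, line in enumerate(lines) if is_hit(line, patterns, names)]


def iterative_scan(lines, patterns):
    """Re-scan until no new names are discovered; returns (hit indices, number of passes)."""
    names = set()
    passes = 0
    while True:
        passes += 1
        found = set()
        for line in lines:
            m = ASSIGN.match(line)
            # Assumed rule: a matching right-hand side taints the assignment target.
            if m and is_hit(m.group(2), patterns, names):
                found.add(m.group(1).lower())
        if found <= names:
            return scan(lines, patterns, names), passes
        names |= found


source = [
    "birth_yy = read_field(record, 3)",
    "tmp = birth_yy",
    "n = tmp",
    "age = current - n",
    "name = read_field(record, 1)",
]

print(scan(source, ["yy"]))            # one pass with the seed list only
print(iterative_scan(source, ["yy"]))  # repeated passes with a growing list
```

With the single seed `"yy"`, one pass finds only lines 0 and 1, while the iterative version reaches lines 2 and 3 after four passes. The number of passes depends on how long the chain of reassignments is, which is one way to see why the duration of the iteration is hard to predict.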
Remediation is defined as the modification of a code fragment so that it produces the desired result and the identified problem is corrected. Once identified, a code fragment is subject to multiple forms of correction. This may be accomplished simply by presenting the code fragments visually or in report form and allowing each to be corrected manually, one after another. Another method is to automatically apply a set of rules particular to a specific code fragment, or to provide one or more alternatives for user selection. However, none of the existing remediation methodologies has the capability to locate the optimal remediation points within the source code fragments. Thus, a variable named "date" may appear in the comments, as part of an assignment, or in any one of dozens of computer constructs, and all of these could be traversed before locating its definition, which is most likely the best location to correct the problem. Worse, correcting a problem at the wrong location within the processing could compound the work effort required to actually remediate or correct the identified code fragment.
Verification of the remediated code fragment changes, whether manually or automatically generated, requires that said code fragments be executed in direct testing. Verification is generally overlooked in existing remediation methodologies. Currently, this requires comprehensive testing of all of the program functions, since it is unknown at the user level whether the testing actually traversed all of the newly modified code fragments. However, in practice, the data entry fields associated with dates are all that are tested. Additionally, a modified code fragment could be traversed when no date was actually entered, as in the case of program initialization. Computer programmers know this and generally do some level of additional testing to expose these hidden potential problems. The problem then becomes determining when to stop testing, since there are no metrics regarding the percentage of completion other than the total of all possible tests that could be run. It is also possible that a code fragment modification could have been made without a specific test case to traverse the newly modified code fragment. Inadequate or incomplete testing and verification have plagued a number of year 2000 conversions after the systems were actually installed.
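One way to picture the missing completion metric is to record which modified fragments a test run actually traverses. The sketch below is an assumption-laden illustration, not a description of any existing methodology: the `mark` instrumentation call, the fragment identifiers, and the three sample functions are all hypothetical.

```python
traversed = set()


def mark(fragment_id):
    """Hypothetical instrumentation: note that a modified fragment was executed."""
    traversed.add(fragment_id)


# Identifiers of the fragments changed during remediation (assumed for this example).
MODIFIED = {"init_default_year", "parse_entry_year", "compute_age"}


def init_default_year():
    mark("init_default_year")
    return 1900  # runs at program initialization, with no date entered


def parse_entry_year(text):
    mark("parse_entry_year")
    return int(text)  # remediated to expect a four-digit year


def compute_age(current_year, birth_year):
    mark("compute_age")
    return current_year - birth_year


def coverage(modified, seen):
    """Fraction of the modified fragments that the tests have traversed so far."""
    return len(modified & seen) / len(modified)


# A test that exercises only the date entry field, as is common in practice.
assert parse_entry_year("1975") == 1975

after_field_test = coverage(MODIFIED, traversed)
untested = sorted(MODIFIED - traversed)
print(f"{after_field_test:.0%} of modified fragments traversed")
print("not yet traversed:", untested)
```

After the date entry test alone, one of the three modified fragments has been traversed, and the initialization and age fragments remain unverified. A count of this kind gives a stopping criterion tied to the modifications themselves rather than to the total of all possible tests.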
The technical challenge is to find an automated method that identifies computer code fragments from only partially defined search criteria, and then remediates and verifies said computer code fragments, while minimizing the introduction of human errors and reducing the total effort required throughout the entire process.