1. Statement of the Technical Field
The present invention relates to the field of software testing and more particularly to testing the handling of tokens of various lengths in software code.
2. Description of the Related Art
Software modules which process Internet protocols including the hypertext transfer protocol (HTTP), the lightweight directory access protocol (LDAP), the simple mail transfer protocol (SMTP), the post office protocol version three (POP3) and so forth, typically perform a significant amount of parsing of character and numeric strings, and accordingly, devote a proportionate share of their source code to such parsing. Most Internet protocols, however, do not constrain the length of the tokens which appear in input data formatted according to the respective protocols. Therefore, code which has been designed to process Internet protocol data must be prepared to handle input tokens of arbitrary lengths.
Yet, software engineers charged with the development of software modules intended to process Internet protocols often code to a “mental model” of what is to be considered a reasonable length of an input token. Consequently, software engineers may neglect to design such code to properly handle unusually long, unusually short, or missing input tokens. This problem can be exacerbated by the fact that a given token can be parsed or otherwise processed in many different layers of code within a software module. In this regard, each layer or combination of layers could have been developed by different software engineers.
Preferably, the testing process for such a software module ought to locate all code paths within the module which cannot gracefully handle unexpected input token lengths. In that regard, generally, the software module ought to be tested by varying the length of each individual input token that can be included in input data from zero to a desired upper bound. Yet, as any one given token can be processed by multiple code paths in different layers of a software module, it can be extremely difficult for the tester of the module to identify all particular token lengths which will cause the software module to return different responses to the input data. In particular, where the developer of one layer holds a view of what constitutes a “reasonably sized token” that conflicts with the view held by the developer of another layer in the software module, unexpected results can occur.
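By way of illustration only, the exhaustive approach described above, in which a token length is varied from zero to an upper bound, can be sketched as follows. The sketch is a minimal one under stated assumptions: the names `classify`, `sweep_token_lengths` and `toy_handler` are hypothetical, and `toy_handler` merely stands in for a module under test which copies a token into a fixed buffer of eight slots.

```python
from typing import Callable, List, Tuple


def classify(handler: Callable[[str], str], token: str) -> str:
    """Return the handler's response, or the exception name if it fails."""
    try:
        return handler(token)
    except Exception as exc:
        return type(exc).__name__


def sweep_token_lengths(
    handler: Callable[[str], str], upper_bound: int
) -> List[Tuple[int, int, str]]:
    """Try every length from 0 to upper_bound and merge runs of equal outcomes."""
    runs: List[Tuple[int, int, str]] = []
    for length in range(upper_bound + 1):
        outcome = classify(handler, "A" * length)
        if runs and runs[-1][2] == outcome:
            # Same response as the previous length: extend the current run.
            start, _, _ = runs[-1]
            runs[-1] = (start, length, outcome)
        else:
            # The response changed at this length: start a new run.
            runs.append((length, length, outcome))
    return runs


def toy_handler(token: str) -> str:
    """Hypothetical module under test with an assumed eight-slot buffer."""
    buffer = [None] * 8
    for i, ch in enumerate(token):
        buffer[i] = ch  # raises IndexError once the token exceeds 8 characters
    return "OK" if token else "EMPTY"
```

Each tuple returned by `sweep_token_lengths` gives the first length, the last length and the shared outcome of one run, so the points at which the response changes are visible directly. The cost of the sweep grows linearly with the upper bound, since one test is executed per length.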
As an example, one code path in a software module can correctly handle a token having a length which ranges from one (1) to five hundred twelve (512). Yet, the code path can crash for any other length, including zero (0). In another code path within the same module, however, a minimum length of ten (10) and a maximum length of 65,535, or just under sixty-four kilobytes, may be acceptable. Consequently, it would not be appropriate to test the software module using only a few, randomly selected lengths for input tokens. To do so would compromise the ability of the test to detect points of failure in particular code paths. Nevertheless, testing every possible input token length from zero (0) to 65,535 can prove too unwieldy or costly to be practical.
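The foregoing example can be made concrete with a short sketch which counts how many token lengths fall into each distinct combination of behaviors of the two code paths. The function names are hypothetical, and the probability computed at the end assumes that test lengths are drawn uniformly and independently from zero to 65,535, which is an assumption of this sketch rather than a statement about any particular test tool.

```python
from collections import Counter
from typing import Tuple

UPPER_BOUND = 65_535  # largest length accepted by the second code path


def outcome(length: int) -> Tuple[bool, bool]:
    """Report whether each of the two example code paths handles the length."""
    path_one_ok = 1 <= length <= 512
    path_two_ok = 10 <= length <= 65_535
    return (path_one_ok, path_two_ok)


def region_sizes(upper_bound: int = UPPER_BOUND) -> Counter:
    """Count how many lengths in 0..upper_bound produce each distinct outcome."""
    return Counter(outcome(n) for n in range(upper_bound + 1))


def chance_of_missing_other_regions(
    samples: int, upper_bound: int = UPPER_BOUND
) -> float:
    """Probability that every sampled length lands in the largest region."""
    sizes = region_sizes(upper_bound)
    largest = max(sizes.values())
    return (largest / (upper_bound + 1)) ** samples
```

Under these assumptions, one length (zero) is rejected by both code paths, nine lengths (1 to 9) are handled only by the first, 503 lengths (10 to 512) are handled by both, and 65,023 lengths (513 to 65,535) are handled only by the second. Five randomly selected lengths therefore have roughly a 96 percent chance of all falling within the last range, leaving the other three behaviors untested.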
Improper handling of particular token lengths in a code path within a software module can cause the software module to crash or otherwise to fail in an ungraceful manner. In addition, faulty code paths can be vulnerable to buffer-overflow attacks and other forms of abuse by malicious agents. Buffer-overflow attacks have become a daily occurrence in the corporate information technology arena and have become a significant source of embarrassment and financial liability for software companies. Accordingly, there remains a long-felt, unsolved need for a software testing system and method in which faulty token-processing code paths can be efficiently identified within the time and resource constraints common to the software industry.