As software pervades more aspects of the everyday environment, it becomes less visible to those who benefit from it. It therefore becomes more critical that the software operate correctly and reliably, since the consequences of failure can be far-reaching and will involve more individuals who are ill-equipped to deal with software. As the complexity of software increases, it becomes more difficult to establish the correctness of the software. Therefore, tools and systems for analyzing the correctness and robustness of software programs may play an important role in helping software writers manage the quality of their software in the context of its complexity and its interaction with the wide variety of environments in which it may perform.
During the development of a sophisticated software program, analysis tools may be used on an ongoing basis to identify opportunities to make changes. These analysis tools may operate statically, by analyzing a software program in isolation, or dynamically, by analyzing a software program as it executes. Opportunities for program changes may represent outright errors, operational weaknesses, or areas that may prove difficult for others to understand when trying to maintain the software in the future, among other things. Such issues will hereinafter be referred to as individual defects. An analysis program may identify thousands of individual defects within a software program.
Defects represent an example of what may be identified by an analysis tool, but it may be appreciated that certain analysis tools may report on items that are not defects, and may more generally identify specific instances of patterns in the code; the discussion may apply equally to such analysis cases. The term “pattern” will be used hereinafter to refer to such instances generally when discussing concepts, although specific examples may involve analysis of defects. A specific instance of a pattern identified in a program will hereinafter be referred to as an individual pattern; an individual defect is one possible embodiment of such an individual pattern.
It may occur that a single issue or problem in the program has more than one apparent consequence in the program, yielding multiple individual patterns. In addition, a given individual pattern as identified by multiple runs of the analysis tool should be considered a single issue even though each run of the analysis tool will have identified a separate instance. For the purposes of managing the number of patterns, it may be useful to merge equivalent individual patterns according to some context-appropriate criteria, providing a single point of reference while still maintaining access to the individual patterns. These will hereinafter be referred to as merged patterns. The number of merged patterns, while still potentially very large, will by definition be no larger than the number of individual patterns, reducing the scope of the management problem. It may be appreciated, however, that the correct balance must be struck between eliminating multiple manifestations of a single problem within and across analysis runs and inadvertently merging different issues, which may result in the obscuring of one individual pattern by subsuming it under another. This latter tendency may be referred to as over-merging.
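By way of illustration only, the merging described above may be sketched as a grouping of individual patterns under a shared key. The record fields, the checker names, and the choice of key (checker plus enclosing function) are assumptions made for this sketch and do not represent any particular analysis tool's merging criteria.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class IndividualPattern:
    """One instance reported by one run of an analysis tool (hypothetical shape)."""
    run_id: int
    checker: str    # kind of pattern, e.g. "NULL_DEREF" (illustrative name)
    function: str   # enclosing function name
    line: int       # line reported in that run


def merge_key(pattern):
    # Assumed merging criterion: same checker in the same function.
    # A coarser key (e.g. checker alone) would risk over-merging.
    return (pattern.checker, pattern.function)


def merge_patterns(patterns):
    """Group individual patterns into merged patterns, keeping every member."""
    merged = defaultdict(list)
    for pattern in patterns:
        merged[merge_key(pattern)].append(pattern)
    return dict(merged)


reports = [
    IndividualPattern(1, "NULL_DEREF", "parse_header", 40),
    IndividualPattern(2, "NULL_DEREF", "parse_header", 44),  # same issue, later run
    IndividualPattern(2, "NULL_DEREF", "write_log", 90),     # different issue
]
merged = merge_patterns(reports)
```

In this sketch each merged pattern serves as the single point of reference, while the list stored under it retains access to the individual patterns from each run.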
Merged patterns may be approximately divided into three categories: those that will be addressed by the time the program is complete; those that will not be addressed; and those that are actually mis-reported, so-called false-positive reports. These categories may vary, and may be further divided up into more precise descriptions. Upon running the analysis tool, the programmer will need to inspect each pattern and decide how to disposition the pattern. This process will hereinafter be referred to as triage. The pattern disposition will generally change throughout the project, as, for example, a given pattern is identified as one that must be addressed, and then eventually is addressed and closed out.
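As a minimal sketch of how a changing disposition might be recorded, the following assumes a small set of disposition values derived from the categories above, plus a hypothetical `TriageRecord` type that retains prior states. The names and the exact set of states are illustrative assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum


class Disposition(Enum):
    UNTRIAGED = "untriaged"
    TO_BE_ADDRESSED = "to be addressed"
    WILL_NOT_ADDRESS = "will not be addressed"
    FALSE_POSITIVE = "false positive"
    RESOLVED = "addressed and closed"


@dataclass
class TriageRecord:
    """Triage state of one merged pattern (hypothetical shape)."""
    merged_pattern_id: str
    disposition: Disposition = Disposition.UNTRIAGED
    history: list = field(default_factory=list)

    def set_disposition(self, new_disposition):
        # Keep the previous state so changes over the project stay traceable.
        self.history.append(self.disposition)
        self.disposition = new_disposition


record = TriageRecord("MP-1")
record.set_disposition(Disposition.TO_BE_ADDRESSED)  # must be addressed
record.set_disposition(Disposition.RESOLVED)         # later addressed and closed out
```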
A development project may span many months or even years, involving hundreds of files distributed over a potentially complex network of computers, servers, and storage units. Some of those files may be renamed or moved between directories. Many or all of those files will undergo numerous revisions, and any such revisions may or may not resolve patterns discovered by an analysis tool, and any given revision may in fact create new patterns on a subsequent run of a given analysis tool. In addition, over the span of the project, the analysis tools themselves may undergo revisions, changing the manner in which they analyze the software program and merge individual patterns. Given the scope of pattern triage, it may be appreciated that it is critical that patterns be identified, merged, and managed in a manner that is relatively insensitive to changes in the program files and how and where they are stored, and that accommodates the upgrading of analysis tools that may involve analysis algorithm revisions and different merging techniques. Were such changes to affect the analysis results sufficiently, then the triage performed on prior runs would be nullified by a subsequent run, and would have to be redone, potentially for each run of the analysis tools. The impact of this would be a severe productivity reduction, or possibly reluctance by a user to upgrade analysis tools that might otherwise provide greater utility than the older version.
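One way such insensitivity might be approached, offered only as a sketch under stated assumptions, is to identify a pattern by a fingerprint that omits the file path and line number and normalizes whitespace, so that prior triage can be carried forward to a later run. The functions `stable_key` and `carry_forward`, the fingerprint inputs, and the sample values are hypothetical and are not drawn from any existing tool.

```python
import hashlib


def stable_key(checker, function, statement):
    """Fingerprint that ignores file path, line number, and whitespace."""
    normalized = " ".join(statement.split())
    payload = "\x1f".join((checker, function, normalized))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def carry_forward(previous_triage, current_keys):
    """Reuse earlier triage where the key still exists; mark the rest untriaged."""
    return {key: previous_triage.get(key, "untriaged") for key in current_keys}


# Run 1: the pattern is reported and triaged. File path and line are not in the key.
key_run1 = stable_key("NULL_DEREF", "parse_header", "len = hdr->size;")
triage = {key_run1: "false positive"}

# Run 2: the file was moved, the line shifted, and the indentation changed.
key_run2 = stable_key("NULL_DEREF", "parse_header", "    len  =  hdr->size;")
key_new = stable_key("LEAK", "open_log", 'fp = fopen(path, "a");')
triage_run2 = carry_forward(triage, [key_run2, key_new])
```

A key of this kind trades sensitivity for stability: it survives renames and reformatting, but an edit to the flagged statement itself would produce a new key and require the pattern to be triaged again.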
Analysis tools may be embedded in an overall environment that may include one or more databases for use in managing the history, status, and contents of the project. Within the database, it may be beneficial that all merged patterns be manageable as if in a single table. The details of whether the patterns are indeed in a single table or are in multiple tables that are merged through a query or some other mechanism are not material; the ability to view and/or manage all patterns as if collocated may improve the manageability of a project. Such databases and tables must be stable for the life of the project, so it may be appreciated that any changes to the analysis tools or environment that affect the structure of the database and/or table must be managed in a way that preserves existing information in the database and/or table.
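The notion of managing patterns as if in a single table, regardless of physical storage, may be illustrated with a database view. The following sketch uses Python's standard `sqlite3` module; the table names, columns, and sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE static_patterns  (id TEXT PRIMARY KEY, checker TEXT, disposition TEXT);
    CREATE TABLE dynamic_patterns (id TEXT PRIMARY KEY, checker TEXT, disposition TEXT);

    INSERT INTO static_patterns  VALUES ('MP-1', 'NULL_DEREF', 'to be addressed');
    INSERT INTO static_patterns  VALUES ('MP-2', 'DEAD_CODE',  'will not be addressed');
    INSERT INTO dynamic_patterns VALUES ('MP-3', 'LEAK',       'untriaged');

    -- The view presents both tables as if they were one.
    CREATE VIEW all_merged_patterns AS
        SELECT id, checker, disposition, 'static'  AS source FROM static_patterns
        UNION ALL
        SELECT id, checker, disposition, 'dynamic' AS source FROM dynamic_patterns;
    """
)

# Callers query the view and need not know how the rows are stored.
rows = conn.execute(
    "SELECT id, source FROM all_merged_patterns ORDER BY id"
).fetchall()
```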
Conventional methods for merging and managing patterns lack stable mechanisms for ensuring consistent pattern merging through the life of a project. In addition, the merging rules are specific enough that subtle changes in a new revision of the analysis tools may undo the merging and hence the triage from prior runs. It may be appreciated, therefore, that there remains a need for new, more stable methods of merging that are durable in light of changes in file and directory naming, source code, and analysis tools, and that resist over-merging. In addition, a need remains for methods that, when merging rules change, allow the merged pattern contents of a database to be upgraded while preserving merged pattern triage results as much as possible, and that manage in a predictable and understandable manner those merged patterns whose triage status needs to be changed.