As software programs become larger and more powerful, their robustness becomes increasingly important in the overall performance of the whole computer system. Consequently, it is essential that the quality of the software be measurable to provide the programmer with an indication of the robustness of the software. One indication of the quality is the structure of the program. A program is generally broken into a set of software modules. A program is deemed to be well-structured if there is good cohesion within a software module and there is loose coupling between the different software modules.
Cohesion describes the interrelationship between functions and variables. Functions and variables are interrelated in the program in the sense that functions can invoke other functions, and functions can access variables. As large programs are segregated into discrete sections known as modules, a problem arises when functions and variables are spread throughout the program and are called by different functions in different modules. Where a function accesses many variables, or where many different functions access any one variable, it is difficult for a programmer reading the program to understand how the consistency of any one variable is maintained. Consequently, good cohesion is desirable, wherein functions in module A invoke other functions or access variables contained within module A, and do not make many references to functions or variables in other modules.
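The distinction between references that stay within a module and references that cross module boundaries can be illustrated with a minimal sketch. The function name cohesion_ratio, the input layout, and the example modules A and B are assumptions made for illustration only; the ratio computed here is a crude indicator and is not a metric defined in this document.

```python
from collections import defaultdict


def cohesion_ratio(module_of, references):
    """Return, per module, the fraction of its outgoing references that stay internal.

    module_of:  dict mapping each function or variable name to its module (hypothetical layout)
    references: iterable of (caller, target) pairs, one per invocation or variable access
    """
    internal = defaultdict(int)  # references whose target is in the caller's own module
    external = defaultdict(int)  # references that cross a module boundary (coupling)
    for caller, target in references:
        source_module = module_of[caller]
        if module_of[target] == source_module:
            internal[source_module] += 1
        else:
            external[source_module] += 1

    ratios = {}
    for module in set(module_of.values()):
        total = internal[module] + external[module]
        # A module that makes no references has no defined ratio.
        ratios[module] = internal[module] / total if total else None
    return ratios


# Illustrative program: module A refers only to itself; module B reaches into A once.
module_of = {"f1": "A", "f2": "A", "x": "A", "g1": "B", "y": "B"}
references = [("f1", "f2"), ("f1", "x"), ("f2", "x"), ("g1", "y"), ("g1", "x")]
ratios = cohesion_ratio(module_of, references)
```

In this illustrative input, every reference made from module A stays inside module A, while one of the two references made from module B targets a variable in module A, so module B is the more loosely cohesive and more tightly coupled of the two.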
However, this leads to a problem of measuring the degree of conformance of a large software system to the principles of maximal cohesion and minimal coupling. The size of large software systems makes manual evaluation impractical, and subjective evaluations are vulnerable to bias. Typically, software metrics are used to provide an indirect measure of selected "quality" attributes of a software product or process. Software metrics can estimate the presence or value of a desired attribute, which may be difficult to measure directly in the program, from real-world inputs that are relatively easy to measure. Software metrics can also be used to predict the future value of the desired attribute based on inputs and parameters that are specified in the present. The expectation is that metrics can provide useful feedback to software designers or programmers as to the impact of decisions made during coding, design, architecture, or specification. Without such feedback, many decisions must be ad hoc.
As shown in FIG. 1, a software metric 100 evaluates a program 101. The software metric 100 requires a specification of the actual inputs 104 that are supplied to the program 101, for example, lines of source code and/or historical defect rates. The metric also includes a metric model 102 that maps inputs 104 and parameters 103 to the metric, for example, defect rate=function (lines of source, historical defect rate). The specification of parameters 103 allows for the adjustment of the model 102. The metric model outputs 105 a predicted value of the desired attribute, for example, the predicted defect rate of a software system. The comparator 106 allows the comparison of the predicted metric output 105 with the actual output 107, for example, the actual defect rate. This allows the metric model to be empirically validated. If such a comparison cannot be made, the metric cannot be tested against observation and is metaphysical in nature. The results 108 of the comparison allow a programmer to change the parameters 103 to fine tune the model 102.
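The arrangement of FIG. 1 can be sketched as a small feedback loop. The linear form of the model, the single parameter named scale, the function names, and the numeric values below are all assumptions made for illustration; the figure itself specifies only that the model maps inputs 104 and parameters 103 to an output 105 that is compared with the actual output 107.

```python
def metric_model(inputs, parameters):
    """Metric model 102: map inputs 104 and parameters 103 to a predicted output 105.

    Assumed form (hypothetical): predicted defects grow linearly with the
    product of code size and the historical defect rate per line.
    """
    return parameters["scale"] * inputs["lines_of_source"] * inputs["historical_defect_rate"]


def comparator(predicted, actual):
    """Comparator 106: results 108 as the signed error between outputs 105 and 107."""
    return predicted - actual


def tune(parameters, inputs, actual):
    """Adjust parameters 103 so the model reproduces the observed output 107."""
    predicted = metric_model(inputs, parameters)
    if predicted == 0:
        return dict(parameters)  # nothing to rescale against
    return {**parameters, "scale": parameters["scale"] * actual / predicted}


# Illustrative values only; none of these figures come from a real system.
inputs = {"lines_of_source": 2000, "historical_defect_rate": 0.01}
parameters = {"scale": 1.0}
actual_defects = 30.0

predicted_defects = metric_model(inputs, parameters)
results = comparator(predicted_defects, actual_defects)
tuned_parameters = tune(parameters, inputs, actual_defects)
```

A single rescaling is sufficient here only because the assumed model has one multiplicative parameter. A model with more parameters would generally need a fitting procedure over many observed programs.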
The standard metric that is generally used as a baseline for comparing the results of other metric tools is the lines of code metric. The lines of code metric states that the more lines of code a particular program contains, the more complex the program is. Other prior art metrics generally perform poorly, as shown when their results are correlated with some other property of the code which is actually measurable or when compared against the lines of code metric. One reason for this is that prior art metrics tend to lack clear definitions of terms. For example, "complexity" and "quality" are too ill-defined to be useful. Moreover, prior art metrics tend to have theoretical bases that are overly ambitious, inconsistent and/or unconvincing. Furthermore, prior art metrics lack empirical validation, have conflicting empirical results, or make predictions that are no more powerful than those produced by the lines of code metric. Research has shown that the prior art metrics do not provide any more information than the lines of code metric. The prior art metrics are largely ad hoc in the sense that they are based on conjectures about where complexity arises in software systems.
Therefore, there is a need in the art for a metric tool that provides an accurate measurement of the complexity of a software program and that can also be empirically validated. Such a tool would have several applications in building and maintaining large software systems. The tool would allow for comparisons between different programs. Thus, if a user must choose between two implementations of a software system with essentially the same functionality, choosing the better structured one would provide lower costs of maintenance and extension. The tool would provide guidance in restructuring the software and thus minimize ad hoc decision making. The tool would also detect the structural degradation of programs as they evolve and grow from the addition of features and modules.