Large software systems are generally developed systematically using a structured development methodology. One common practice is to organize the software into logical modules. Modules, ideally, are logical groupings of functions and related data structures that together offer a well-defined service. Correct module use promotes encapsulation. That is, implementation details are hidden, and interactions with other modules (and the rest of the world) are consistently implemented through a set of published application programming interface (API) functions.
Modules themselves may be organized in layers. Layers, typically, are organized hierarchically such that modules in a given layer are only allowed to call modules in lower layers. Systems which are well modularized and which use layers properly are easier to understand as each change to the system only affects a small portion of the overall system. Thus, changes have less overall effect. For example, it takes less time to add new functionality, change existing functionality, and test and release the resulting modified modules even by those who are not conversant with the software.
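The layering rule described above can be checked mechanically. The following is a minimal sketch, not a prescribed method: the function name `layering_violations`, the integer layer numbering (lower number means lower layer), and the `allow_same_layer` switch are all assumptions made for illustration. By default it takes the strict reading that a module may call only modules in lower layers.

```python
from typing import Dict, Iterable, List, Tuple


def layering_violations(
    layer_of: Dict[str, int],
    calls: Iterable[Tuple[str, str]],
    allow_same_layer: bool = False,
) -> List[Tuple[str, str]]:
    """Return the (caller, callee) module pairs that break the layering rule.

    layer_of maps a module name to its layer index (hypothetical convention:
    0 is the lowest layer). A call is legal when the callee sits in a lower
    layer than the caller, or in the same layer if allow_same_layer is True.
    """
    violations = []
    for caller, callee in calls:
        src, dst = layer_of[caller], layer_of[callee]
        legal = dst < src or (allow_same_layer and dst == src)
        if not legal:
            violations.append((caller, callee))
    return violations


# Hypothetical three-layer system: ui (2) -> logic (1) -> db (0).
layers = {"ui": 2, "logic": 1, "db": 0}
module_calls = [("ui", "logic"), ("logic", "db"), ("db", "ui")]
print(layering_violations(layers, module_calls))
```

In this example only the upward call from `db` to `ui` is reported, since the other two calls go from a higher layer to a lower one.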
Various measures of modularization have been proposed to determine modularization quality, such as cohesion metrics, coupling metrics, and combined cohesion/coupling metrics. Cohesion measures how self-contained code is, that is, how well the code in a module works together to provide a specific piece of functionality without the need for assistance from other modules. For example, cohesion can be measured as the ratio of the number of internal function-call dependencies that actually exist to the maximum possible internal dependencies.
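As an illustration of the ratio-based cohesion measure just described, the sketch below counts the internal function-call dependencies of a module and divides by the maximum possible number. It assumes directed dependencies without self-calls, so a module of n functions has at most n(n-1) internal dependencies. That convention, the name `cohesion`, and the choice to return 0.0 for modules with fewer than two functions are assumptions, not details taken from any particular published metric.

```python
from typing import Iterable, Set, Tuple


def cohesion(functions: Set[str], calls: Iterable[Tuple[str, str]]) -> float:
    """Ratio of existing internal call dependencies to the maximum possible.

    functions: names of the functions that belong to the module.
    calls: (caller, callee) pairs observed anywhere in the system.
    """
    n = len(functions)
    if n < 2:
        return 0.0  # assumption: no internal dependency is possible
    # Keep each distinct caller/callee pair that stays inside the module.
    internal = {
        (a, b) for a, b in calls
        if a in functions and b in functions and a != b
    }
    return len(internal) / (n * (n - 1))


# Two of the six possible internal dependencies exist; the call to "x"
# leaves the module and is ignored.
print(cohesion({"f", "g", "h"}, [("f", "g"), ("g", "h"), ("f", "x")]))
```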
Coupling measures the dependency of different code groupings; that is, within a module, to what degree changes in one function may affect others. Low coupling usually implies high cohesion, and vice versa. Ideally, modules should have high cohesion (be self-contained) and low coupling (be relatively independent of other modules).
One common metric, Modularization Quality (MQ), is a combined cohesion/coupling metric, and is calculated as the difference between the average cohesion and the average coupling.
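The MQ calculation described above reduces to a difference of two averages. The sketch below assumes the per-module cohesion values and the coupling values have already been computed by some other means. The function name `modularization_quality` and the sample numbers are hypothetical.

```python
from statistics import mean
from typing import Sequence


def modularization_quality(
    cohesions: Sequence[float], couplings: Sequence[float]
) -> float:
    """Average cohesion minus average coupling."""
    return mean(cohesions) - mean(couplings)


# Hypothetical values: average cohesion 0.7, average coupling 0.2.
print(modularization_quality([0.8, 0.6], [0.1, 0.3]))
```

Under this definition a higher value indicates a system whose modules are, on average, more cohesive than they are coupled.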
Variations on the MQ which have been developed include a modified MQ which is the sum of modularization factors (MF) for clusters. For a given cluster, MF can be calculated as

$$\mathrm{MF} = \frac{i}{i + \frac{1}{2}\,j},$$

where i is the sum of internal function-call dependency weights and j is the sum of external function-call dependency weights. The cluster factor for a module is thus expressed as the ratio of the weighted sum of the function-call dependencies internal to the module to the sum of the internal and (half-weighted) external dependency weights.
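A short worked sketch of the modified MQ may help. It computes each cluster's modularization factor as the internal weight i divided by the sum of i and half the external weight j, then adds the factors together. The function names and the convention that a cluster with no internal dependencies scores 0 are assumptions made for this example.

```python
from typing import Iterable, Tuple


def modularization_factor(i: float, j: float) -> float:
    """MF for one cluster: i / (i + j / 2).

    i: sum of internal function-call dependency weights.
    j: sum of external function-call dependency weights.
    """
    if i == 0:
        return 0.0  # assumption: also avoids 0/0 when j is 0
    return i / (i + 0.5 * j)


def modified_mq(clusters: Iterable[Tuple[float, float]]) -> float:
    """Sum of the modularization factors over all (i, j) clusters."""
    return sum(modularization_factor(i, j) for i, j in clusters)


# Hypothetical weights: the first cluster gives 6 / (6 + 2) = 0.75,
# the second has no internal dependencies and contributes 0.
print(modified_mq([(6, 4), (0, 3)]))
```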
Metrics have also been derived from a relational graph representation of software artifacts such as functions, data types, variables, etc. Relations such as function-call dependencies, type declarations of variables, macro definitions, and so on are mapped between modules. This class of metrics measures the degree of association between a module and other modules.
Other metrics include measures of how many methods are packed into classes, the depth of inheritance trees, inheritance fan-out, and measures of the degree of couplings between objects, where couplings are created by one object invoking a method on another object.
Tools which have been developed for software clustering have also, albeit indirectly, proposed modularization metrics. Software clustering tools may be used to determine or, more usually, to recover the underlying architecture of a software system after some catastrophic event. These tools attempt to partition a software system into logical, cohesive subsystems. For example, automated software-partitioning tools have been developed which quantitatively characterize modules by the degree to which the functions packaged within the same module contain shared information, such as by determining the degree of commonality of the names of data objects within the functions. Modules can also be characterized on the basis of function-call dependencies, such that if a function A calls a function B, then functions A and B are presumed to belong to the same module.
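One simple way to quantify the commonality of data-object names between two functions is a set-overlap score. The sketch below uses the Jaccard index, which is a stand-in chosen for illustration; the partitioning tools referred to above may use a different measure. The name `name_commonality` and the sample identifiers are hypothetical.

```python
from typing import Set


def name_commonality(names_a: Set[str], names_b: Set[str]) -> float:
    """Jaccard index of the data-object names used by two functions."""
    union = names_a | names_b
    if not union:
        return 0.0  # assumption: two functions with no names share nothing
    return len(names_a & names_b) / len(union)


# Two hypothetical functions that share two of four distinct names.
print(name_commonality({"buf", "len", "idx"}, {"len", "idx", "fd"}))
```

A clustering tool could then place functions with a high score in the same candidate module.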
Structural information, such as shared name substrings for functions and variables, function-call dependencies, and so on, can also be used to determine modularization. Another approach is to try to isolate “omnipresent objects,” which are heavily used objects and functions. Analyses relating to cohesion and coupling, such as those described earlier, are then performed. This is done in the belief that omnipresent objects, unless specially categorized, distort the coupling/cohesion analysis.
However, these complexity metrics do not analyze the software from a coherent set of perspectives that together give an overall impression of how well the system is modularized. That is, they do not analyze a system at large in terms of its modules, the cohesiveness of the modules, the interactions between modules, and how well the system has been divided into layers (super-modules). Furthermore, even the best-designed system can have its modularity degrade as the system ages and is maintained.
Even though a legacy application can begin as a well-modularized system with a layered architecture, it gradually degrades over time as the system is maintained. The degradation in modularity can result from ad hoc bug fixes and enhancements that disregard modularity design principles such as those discussed above. Such degradation can cause a formerly robust system to become rigid and fragile.
In such a scenario, a manager responsible for the maintenance and enhancement of a legacy application may realize that the maintainability of the system is gradually degrading, yet be unable to take appropriate corrective measures, because he or she has no way to accurately measure the type and location of the degradation or who, specifically, is causing it. Because there is currently no good way to measure this degradation, there is similarly no good way to determine when the degradation occurs, or who or what is its likely cause.
Because the deterioration of modularity goes unrecognized, programmers, unaware of the problem, have no reason to concentrate on fixing it. Furthermore, there is no way to measure how well an individual programmer may be preserving or degrading the modularity of the system.
Thus, there is a need for systems and methods to measure system modularity, that is, modularization quality.