Software code bases are growing steadily in size and complexity. Tools and techniques based on static program analysis are increasingly used for purposes such as defect detection, code review, code re-engineering, reverse engineering, quality assurance, and program understanding.
Today, static code analysis tools are widely applied to detect defects much earlier in the Software Development Life Cycle (SDLC). However, scalability has always been the bottleneck for such tools.
Considerable effort has gone into developing static program analysis tools, but the tools available to date can analyze only a limited amount of code. In practice, a software system may consist of an extraordinarily large code base to which such tools do not scale. Many real-world systems exceed 5-6 million Lines of Code (LOC). No existing static program analysis tool can be scaled up to this size to analyze the whole code base as a single cluster. Analyzing such a large code base as a whole under real-world resource constraints, such as memory and time, remains a challenge.
To analyze a large code base with improved precision and scalability, it is desirable to have a single analyzable cluster irrespective of the code length. Since real-world software systems are many times larger than what existing technologies can handle, there is a need to address the inadequacy of traditional code analysis tools in analyzing a large code base as a single cluster.
However, the existing approach to analyzing a large code base is to scale up the analysis system, which results in poor precision and places an additional burden on computing resources. Thus, existing methods and systems are not capable of analyzing a large code base, both because of its length and because they cannot scale up to analyze the whole code base as a single cluster.
It is observed that the prior art fails to disclose an efficient method and system for analyzing a large code base as a single analyzable cluster, with improved precision and scalability, irrespective of the code length. Existing solutions are generally unable to handle a large code base because of its length and because they cannot scale up to analyze the whole code base as a single cluster.