Modern software systems and applications make persistent use of massive amounts of data in their source code. This persistent use leads to a complex flow of data. The data may flow from a database, from screens, or through a plurality of software processes in the source code. The data may also undergo transformation at multiple points in the source code, giving rise to additional, derived data. After undergoing the transformation, the data may be further stored in a database or displayed to the user.
In order to analyze and understand the provenance of the data, it is necessary to understand the flow of data in the source code. The flow of data may be analyzed by examining each transformation or modification of the data at every step of the source code. Conventional methods fail to provide sufficiently detailed information for analyzing the flow of data in the source code. Further, conventional methods may be prone to substantial errors and may become unmanageable because of the numerous control flow paths and data flow paths in the source code.
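By way of illustration only, the idea of examining the data at every step of the source code may be sketched as follows. The sketch is not a method described in this document. It assumes straight-line Python code made up of simple assignments, and the function name `trace_provenance` and the sample variables (`price`, `quantity`, `tax_rate`) are hypothetical. Each assigned variable is mapped to the set of original inputs from which it is derived.

```python
import ast


def trace_provenance(source: str) -> dict[str, set[str]]:
    """Map each assigned variable to the original inputs it derives from.

    Illustrative assumption: the source is straight-line code consisting of
    simple `name = expression` statements. Branches, loops, and function
    calls are not handled, which is exactly where real analyses become hard.
    """
    provenance: dict[str, set[str]] = {}
    for stmt in ast.parse(source).body:
        if not isinstance(stmt, ast.Assign):
            continue
        # Collect every variable name read on the right-hand side.
        reads = {n.id for n in ast.walk(stmt.value) if isinstance(n, ast.Name)}
        # A name that was never assigned is treated as an external input,
        # so it is its own origin; otherwise inherit the recorded origins.
        origins: set[str] = set()
        for name in reads:
            origins |= provenance.get(name, {name})
        for target in stmt.targets:
            if isinstance(target, ast.Name):
                provenance[target.id] = origins
    return provenance


# Hypothetical sample: three transformation steps over three inputs.
SAMPLE = """
gross = price * quantity
tax = gross * tax_rate
total = gross + tax
"""

result = trace_provenance(SAMPLE)
for variable, origins in result.items():
    print(variable, "<-", sorted(origins))
```

In this sample, `total` is traced back to `price`, `quantity`, and `tax_rate`, even though it reads only `gross` and `tax` directly. The sketch follows a single path through the code; once conditional branches and loops are present, the number of control flow paths and data flow paths grows quickly, which is the difficulty noted above.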