The need for efficient data management is growing rapidly, particularly with the proliferation of data and increasing data dependency in enterprise setups. Efficient data management is a prerequisite for storing and retrieving accurate data. Various database management systems, such as relational database management systems (RDBMS), are typically employed in enterprises for storing, organizing and accessing data in databases. In addition, various testing tools are currently employed as part of data management to check the accuracy of stored and retrieved data associated with various applications.
For instance, web applications (also referred to as application(s)) are extensively used as a means of communication as well as for performing various activities in an enterprise. These web applications rely on databases that store data associated with the applications, and the databases are in turn managed by one or more database management systems. Typically, an application has rules defined at different layers of the application architecture, namely, the data layer, the application layer and the user interface layer, for storage and access of information stored in the database. The data layer provides information on the manner in which data is organized in the databases. If the databases are managed using an RDBMS, information related to the database tables in which data is organized, the logic used for navigating the database, etc. is defined in the data layer. As mentioned above, the data layer also provides rules associated with storage and access of data in the database tables.
However, most applications do not have all of these rules defined at the data layer. In other words, relationships between data stored in the tables may not be defined at the data layer. Conventional test data management tools rely on underlying database relationships to perform various activities as part of application testing, such as data sub-setting, data masking, data archiving and data generation. In the absence of rules defined in the data layer, test data management tools may not be able to identify the exact relationships between the data/tables that the application uses for various business transactions. In such a scenario, test data management experts need to obtain this information from an application subject matter expert or a data architect, who may, however, lack knowledge of such information. Further, in most cases these applications are developed and maintained over a period of many years, with multiple developers making or maintaining changes. Much of the information about those changes is usually tacit and is lost as personnel change over the years. Even though some conventional tools may rely on data or meta-data patterns to identify logical relationships between data/tables, limitations have been observed in identifying the exact relationships that the application uses as part of a business transaction. This results in incomplete information about the data related to business transactions, which in turn reduces accuracy while performing test data management activities. Moreover, manually identifying rules at the data layer to analyze relationships between tables using the tools, and seeking support from subject matter experts in the absence of such rules, is a cumbersome process.
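By way of illustration only, the following minimal sketch shows how a relationship that is declared at the data layer can be read from database metadata, while a relationship that the application maintains only in its own logic cannot. The sketch assumes SQLite as a stand-in for an RDBMS; the tables `customers`, `orders` and `payments`, the column `order_ref`, and the helper `declared_relationships` are hypothetical names introduced for this example and do not come from any particular application or tool.

```python
import sqlite3


def declared_relationships(conn, table):
    """Return (column, referenced_table, referenced_column) for each
    foreign key that is explicitly declared on `table`."""
    rows = conn.execute(f"PRAGMA foreign_key_list({table})").fetchall()
    # Row layout: id, seq, table, from, to, on_update, on_delete, match
    return [(row[3], row[2], row[4]) for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);

    -- Relationship declared at the data layer.
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id)
    );

    -- Hypothetical case: the application treats order_ref as a link
    -- to orders.id, but no rule is declared at the data layer.
    CREATE TABLE payments (
        id INTEGER PRIMARY KEY,
        order_ref INTEGER
    );
    """
)

declared = {t: declared_relationships(conn, t) for t in ("orders", "payments")}
for table, relations in declared.items():
    print(table, relations)
```

In this sketch the metadata query reports the link from `orders` to `customers`, but reports nothing for `payments`, even though the application may use `order_ref` to join the two tables during a business transaction. A tool that relies solely on such metadata would therefore miss the second relationship.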
Furthermore, there are tools that are used for reverse engineering system and process flows, business rules and data lineage in order to identify relationships between the various data used in business transactions. These tools work on the application code or log files and determine relationships between various business entities. If the application code is not available or is encrypted, as is generally the case with third-party vendor products, such a tool may not be able to provide accurate information. Also, these tools work at the application level, and as such their output can sometimes be complex and incomprehensible.
In light of the above drawbacks, there is a need for a method and system for identifying the correct relationships between data/tables in a database for accurate data storage or retrieval activities during a business transaction. There is also a need for a method and system that provides an automated manner of identifying and analyzing relationships between data/tables stored in databases that are not traceable by accessing the data layer. Also, there is a need for a method and system for automating identification of hidden relationships in application databases. Further, there is a need for a method and system that works at the transaction level instead of the application level, such that interpretation of the output is easy and efficient.