In general, software applications are tested or assessed during their development stages before they are deployed. The testing validates that the applications work in an expected or planned manner. Data-driven assessment of such applications requires test data. For the data-driven assessment to be effective, the test data should have certain desired characteristics, such as syntax, semantics, and statistics, similar to those of the actual data, such as production data, which the application would eventually handle or operate on after deployment.
Possible candidates for test data include production data. As indicated, production data is the actual data on which the application would operate, and hence may be considered suitable for the purpose of testing. However, production data may include sensitive information or information private to the individuals associated with it. For example, in the case of banking applications, it would not be appropriate to use production data, i.e., client-specific information, for testing purposes. In such cases, the production data can be modified using data masking or data obfuscation techniques, which either hide or delete user-specific information and subsequently replace it with relevant but false data. Other approaches include using synthetic data, or dummy data, as test data for testing the applications. The synthetic data can be generated using various synthetic data generation tools. Using synthetic data for testing eliminates the risk of a privacy breach, as the generated data is fictitious.
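By way of illustration only, the data masking described above can be sketched as follows. This is a minimal sketch under stated assumptions, not a description of any particular masking tool: the field names (`name`, `account_number`, `balance`), the set `SENSITIVE_FIELDS`, and the helper functions `mask_value` and `mask_record` are all hypothetical. The sketch replaces each letter or digit in a sensitive field with a random character of the same kind, so that the masked value keeps the syntax of the original (length, letter case, digit positions, punctuation) while no longer carrying the client-specific information.

```python
import random
import string

# Hypothetical set of sensitive field names; a real schema would differ.
SENSITIVE_FIELDS = {"name", "account_number"}


def mask_value(value, rng):
    """Replace every letter and digit with a random one of the same kind.

    Punctuation and spacing are kept, so the format of the value survives.
    """
    masked = []
    for ch in value:
        if ch.isdigit():
            masked.append(rng.choice(string.digits))
        elif ch.isalpha():
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            masked.append(rng.choice(pool))
        else:
            masked.append(ch)
    return "".join(masked)


def mask_record(record, rng):
    """Return a copy of the record with only the sensitive fields masked."""
    return {
        key: mask_value(val, rng) if key in SENSITIVE_FIELDS else val
        for key, val in record.items()
    }


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed only to make the sketch repeatable
    production_row = {"name": "Jane Doe", "account_number": "1234-5678", "balance": 100.5}
    print(mask_record(production_row, rng))
```

Character-level substitution of this kind preserves syntax only. It does not by itself preserve the semantics or the statistics of the original data, which are the other characteristics noted above as desirable in test data.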