Many people use electronic document preparation systems to prepare important documents. For example, each year millions of people use tax return preparation systems to help prepare and file their tax returns. Typically, tax return preparation systems receive tax related information from a user and then automatically populate the various fields in electronic versions of government tax forms. Tax return preparation systems represent a potentially flexible, highly accessible, and affordable source of tax return preparation assistance for customers.
The processes that enable electronic tax return preparation systems to prepare tax returns for users are highly complex and often utilize large amounts of human and computing resources. To reduce the usage of computing and human resources, new tax return preparation processes are continually being developed. Before the new tax return preparation processes can be implemented, they must be thoroughly tested to ensure that they properly calculate data values for tax returns. However, testing the new processes against the data of a very large number of previous tax filers consumes large amounts of computing and human resources. On the other hand, testing the new processes with a smaller random sample of previous tax filers is often inadequate, as less common tax filer attributes will likely not appear in the sample set. If the new processes are not tested to ensure that they can accurately handle tax filers with uncommon attributes, then flaws in the new processes will likely go undetected. The result is that the tax return preparation system fails to properly prepare the tax returns of many users.
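The limitation of small random samples can be illustrated with a short calculation. The following is a minimal sketch for illustration only and is not part of any described system. It assumes that filers are drawn independently (sampling with replacement) and that a given attribute occurs in a fixed fraction of previous tax filers. The function name `prob_attribute_missed` and the example figures are hypothetical.

```python
def prob_attribute_missed(prevalence: float, sample_size: int) -> float:
    """Probability that a random sample contains no filer with the attribute.

    Assumes independent draws (sampling with replacement), so each draw
    misses the attribute with probability (1 - prevalence).
    """
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be between 0 and 1")
    if sample_size < 0:
        raise ValueError("sample_size must be non-negative")
    return (1.0 - prevalence) ** sample_size


if __name__ == "__main__":
    # Hypothetical figures: an attribute held by 1 in 1,000 filers.
    for n in (100, 1_000, 10_000):
        print(f"sample size {n}: miss probability "
              f"{prob_attribute_missed(0.001, n):.4f}")
```

Under these assumptions, an attribute held by one in a thousand filers is absent from a random sample of one thousand filers roughly 37% of the time, which is why a small random sample can leave uncommon attributes untested.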
In addition, lengthy and resource intensive testing processes can lead to considerable expense and to delays in releasing an updated version of the electronic tax return preparation system. This expense is then passed on to customers of the electronic tax return preparation system. These expenses, delays, and possible inaccuracies often have an adverse impact on traditional electronic tax return preparation systems.
These issues and drawbacks are not limited to electronic document preparation systems. Any data management system that needs to update processes or calculations for data management services can suffer from these drawbacks during testing and development of new data management calculations and processes.
What is needed is a method and system that provides a technical solution to the technical problem of generating sample data sets that are likely to cover many use cases while efficiently using resources.