The present disclosure relates to computing and data processing, and in particular, to computer-implemented systems and methods for automatic generation of data transformations.
Historically, business users analyzed their businesses using individual spreadsheets of data. As organizational complexity increased, hundreds or even thousands of such spreadsheets came to be generated across an organization, containing data for a wide range of activities such as finance and accounting, manufacturing, and sales.
Eventually, transactional databases and other forms of data stores were used to capture high-speed transactional and operational data, such as ticket sales, parts inventories, and the like. Typically, a transactional database operates on logical units of work (“transactions”), each of which contains, for example, one or more SQL statements that may read, write, or update data.
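By way of illustration only, a transaction of the kind described above may be sketched as follows using Python's standard sqlite3 module. The table names (inventory, sales), column names, and the functions make_demo_db and sell_ticket are hypothetical and are not part of the present disclosure; the sketch merely shows two SQL statements that succeed or fail together as one logical unit of work.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with one event that has one seat left.

    The schema is a hypothetical example, not taken from the disclosure.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE inventory (event_id INTEGER PRIMARY KEY, seats_left INTEGER)"
    )
    conn.execute("CREATE TABLE sales (event_id INTEGER, buyer TEXT)")
    conn.execute("INSERT INTO inventory VALUES (1, 1)")
    conn.commit()
    return conn


def sell_ticket(conn, event_id, buyer):
    """Record a sale and decrement inventory as a single transaction."""
    # Used as a context manager, the connection commits if the block
    # completes and rolls back if the block raises an exception.
    with conn:
        conn.execute(
            "INSERT INTO sales (event_id, buyer) VALUES (?, ?)",
            (event_id, buyer),
        )
        cur = conn.execute(
            "UPDATE inventory SET seats_left = seats_left - 1 "
            "WHERE event_id = ? AND seats_left > 0",
            (event_id,),
        )
        if cur.rowcount == 0:
            # No seat was available: raising here undoes the INSERT above.
            raise ValueError("no seats left for this event")
```

In this sketch, a second call to sell_ticket for the same single-seat event raises an error, and the sale row written earlier in that same transaction is rolled back, so the sales and inventory tables remain consistent with each other.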
To gain access to the above spreadsheets and transactional data for analysis purposes, the data had to be moved from the spreadsheets and transactional databases to an analytic database such as a data warehouse or data mart, where business users could generate user specific queries to derive meaning and business intelligence from the data.
However, data warehousing is problematic because several different users must be involved in moving the data from the transactional databases to analytic databases, such as data warehouses, which may store the data in a significantly different structure. For example, if a business analyst desires particular data that is not available in an analytic database, the business analyst typically submits a request to the IT department to move the desired data into the data warehouse. IT users with special software training may then use complex extraction, transformation, and loading tools to obtain data that meets the business analyst's needs, a process that can be burdensome and time consuming and may require multiple iterations. Additionally, important data stored in spreadsheets may be spread across an enterprise, making it difficult for an IT organization to track, access, and load that data to meet the needs of different business users.
Accordingly, existing techniques for making transactional data available to business users are often inadequate and inefficient.