The present application relates to report generation systems and more particularly to techniques for creating reports using cached data.
Report generation systems are commonly used to create reports from data. For example, such systems create reports from data stored in data stores such as data warehouses, which typically store historical enterprise data for archival and reporting purposes.
Reports may include a large amount of data that is derived from one or more data sources. For example, creation of a report may require gathering data from extremely large data sets and/or deriving data from those data sets using one or more complex calculations. Vast amounts of data often have to be analyzed and amalgamated for reporting purposes. As a result, the processing required for creating a report may consume substantial system resources.
For example, in an on-demand reporting system where reports are created upon receiving a report generation request, a great deal of processing may be required each time a report is to be created. In a typical conventional on-demand report generation system, upon receiving a request to create a report, processing is performed to determine the data to be used for creating the report. This processing may involve identifying data from the data stores to be used for the report and also deriving data to be used for the report by performing calculations using the stored data. This processing is repeated each time that a report generation request is received, irrespective of whether the underlying data used for the report has changed since the previous report generation. Accordingly, on-demand reporting typically involves unnecessary processing that wastes computing resources.
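The repeated processing described above may be illustrated by the following minimal sketch. The class name OnDemandReporter, the computations counter, and the sample sales data are hypothetical and are introduced only for illustration; they do not correspond to any particular conventional system.

```python
from typing import Callable, Dict, List


class OnDemandReporter:
    """Hypothetical on-demand report generator that recomputes on every request."""

    def __init__(self, sources: Dict[str, Callable[[], List[float]]]):
        self.sources = sources
        self.computations = 0  # counts how many times the full processing ran

    def create_report(self) -> Dict[str, float]:
        # The full gather-and-derive pass runs for every request.
        self.computations += 1
        report = {}
        for name, fetch in self.sources.items():
            values = fetch()            # identify and gather the stored data
            report[name] = sum(values)  # derive report data by calculation
        return report


sales = [100.0, 250.0, 75.0]  # illustrative stored data
reporter = OnDemandReporter({"sales_total": lambda: sales})

first = reporter.create_report()
second = reporter.create_report()  # underlying data unchanged, yet recomputed
```

In this sketch both requests yield the same report, but the gathering and calculation are performed twice, which corresponds to the unnecessary processing noted above.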
Some conventional report generation systems have attempted to address the problems of on-demand report generation systems by providing scheduled generation of the data used to create a report. At predetermined intervals, the data for the report is gathered from one or more data sources and stored in a data repository, such as a database. Users requesting a report are provided with a copy of the report created from the data that is stored in the data repository. However, because the data used to create the report is not generated at the time that the request to create the report is received, the data upon which the report is based may become stale. For example, if the data for a report is gathered from multiple data sources and the information provided by one of the data sources has changed since the data for the report was last generated, reports created from the data will include stale data. One way to overcome this problem is to schedule the generation of the data for the report at frequent intervals, so that the data generated for the report is less likely to become stale. However, if the data is scheduled to be gathered from the set of data sources too often, substantial processing overhead may be spent regenerating data used to create the report regardless of whether the underlying data from the data sources has changed since the data for the report was last generated.
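The trade-off described above may be illustrated by the following minimal sketch. The class name ScheduledReporter, the refresh method, and the sample inventory data are hypothetical and are introduced only for illustration; the scheduler that would invoke refresh at each predetermined interval is assumed and not shown.

```python
from typing import Callable, Dict, List


class ScheduledReporter:
    """Hypothetical reporter that serves reports from a periodically refreshed repository."""

    def __init__(self, sources: Dict[str, Callable[[], List[float]]]):
        self.sources = sources
        self.repository: Dict[str, float] = {}  # stands in for a database
        self.refreshes = 0  # counts how many times the data was regenerated

    def refresh(self) -> None:
        # Assumed to be called by a scheduler at each predetermined interval.
        self.refreshes += 1
        self.repository = {
            name: sum(fetch()) for name, fetch in self.sources.items()
        }

    def create_report(self) -> Dict[str, float]:
        # Requests are served from the repository without consulting the sources.
        return dict(self.repository)


inventory = [10.0, 20.0]  # illustrative data source
scheduled = ScheduledReporter({"inventory_total": lambda: inventory})

scheduled.refresh()                # scheduled generation of report data
inventory.append(5.0)              # the data source changes afterwards
stale = scheduled.create_report()  # still reflects the earlier data

scheduled.refresh()                # the next interval picks up the change
fresh = scheduled.create_report()

scheduled.refresh()                # nothing changed, yet the work is repeated
```

In this sketch the report requested between the first and second refresh omits the newly added value, while the third refresh regenerates data that has not changed. Shortening the interval reduces the first effect at the cost of increasing the second.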
Accordingly, techniques for efficiently creating reports using cached data are desired.