As the cost of disk storage decreases and disk capacity increases, it is now possible to create data repositories that store hundreds, thousands, or even millions of individual files. In addition, because the number of programs and file formats continues to grow, a data repository may often store files in several different formats. Thus, data repositories are increasingly used to archive data comprising many files in many formats.
Organizations often desire to analyze information stored in data repositories, such as through data mining, in order to utilize the data efficiently. For example, it is often desirable to generate a report summarizing data stored in a data repository for use in various aspects of business planning. Unfortunately, accessing and analyzing many files of several different formats stored in a data repository requires inefficient methods whose complexity increases greatly with the amount of data analyzed.
Moreover, even if data stored in a repository can be accessed and analyzed, such use often strains the source of the repository, such as a computing device including a database, because the repository must be repeatedly accessed and queried to analyze the data stored therein. Thus, powerful computing devices must be utilized in order to limit strain on the underlying datasource, thereby increasing the cost and complexity of the data repository.
Due to the desire to limit the cost and complexity of data repositories, static reports are often generated weekly, monthly, quarterly, etc., to limit repository access and avoid consumption of additional computing resources. However, static reports are often of little use, because each static report is specific to a predefined data subset, such as the number of sales by a particular vendor on a particular date, and may not be easily regenerated or dynamically modified for other data. Thus, a separate static report must be generated for each desired analysis, thereby increasing repository access and datasource strain and decreasing the utility of each static report.