Large computer systems can gather and analyze data generated by a large number of different sources. Extremely large data sets may be analyzed computationally to reveal patterns, trends, and associations. Such large data sets are often referred to as “big data.” Big data tools can analyze high-volume, high-velocity, and high-variety information assets far better than conventional tools and relational databases, which struggle to capture, manage, and process big data within a tolerable elapsed time and at an acceptable total cost of ownership.
Oftentimes it is beneficial to generate reports to summarize relevant data from the database(s). These reports may be defined and then executed to fill the fields of the report with summaries of data. For example, a human resources (HR) director may create a report to see compensation, goals, performance, etc., of employees in an organization. The reports are created through the manipulation of reporting metadata, which represents categories and relationships that define the data and how the data can be structured in reports by the end user.
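By way of a non-limiting illustration, the relationship between a report definition and its execution can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of any particular reporting system: the names `ReportField`, `ReportDefinition`, and `execute_report`, the sample employee records, and the two supported summaries are all hypothetical and chosen only to mirror the HR example above.

```python
from dataclasses import dataclass
from statistics import mean


# Hypothetical reporting metadata: each field names a category of data
# and states how that category is summarized in the report.
@dataclass(frozen=True)
class ReportField:
    name: str       # label shown in the report
    source: str     # key of the underlying record to read
    summarize: str  # "mean" or "count" (assumed, illustrative choices)


@dataclass(frozen=True)
class ReportDefinition:
    title: str
    group_by: str   # relationship used to structure the report
    fields: tuple   # tuple of ReportField


def execute_report(definition, records):
    """Fill the fields of the report with summaries of the records."""
    groups = {}
    for record in records:
        groups.setdefault(record[definition.group_by], []).append(record)

    rows = {}
    for key, members in groups.items():
        row = {}
        for field in definition.fields:
            values = [member[field.source] for member in members]
            row[field.name] = mean(values) if field.summarize == "mean" else len(values)
        rows[key] = row
    return rows


# Made-up sample data, for illustration only.
EMPLOYEES = [
    {"department": "Sales", "compensation": 50000, "performance": 3},
    {"department": "Sales", "compensation": 70000, "performance": 5},
    {"department": "Engineering", "compensation": 90000, "performance": 4},
]

HR_REPORT = ReportDefinition(
    title="Compensation and performance by department",
    group_by="department",
    fields=(
        ReportField("Average compensation", "compensation", "mean"),
        ReportField("Average performance", "performance", "mean"),
        ReportField("Headcount", "compensation", "count"),
    ),
)

REPORT_ROWS = execute_report(HR_REPORT, EMPLOYEES)
```

In this sketch the report is defined once, as data, and executed separately against whatever records are supplied, so the same definition can be reused as the underlying data changes.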
As the size of the underlying data increases, however, a number of technical challenges arise in the manipulation and usage of reporting metadata. When reporting metadata is persisted in a database, the metadata has to be regenerated whenever any parameter related to the reporting data is updated, so that the metadata reflects the change. If reporting metadata is not persisted, the complexity of the relationships between the data cannot be reflected in the metadata once the data sets grow too large, which limits the usefulness of the solution.
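The regeneration burden associated with persisted metadata can be illustrated with a small sketch. It assumes a hypothetical in-memory store standing in for the database and a hypothetical `generate_metadata` function that derives metadata from report parameters; neither is drawn from any particular product, and the shape of the metadata is invented for illustration.

```python
import json


def generate_metadata(parameters):
    """Derive reporting metadata (hypothetical shape) from report parameters."""
    fields = sorted(parameters["fields"])
    return {
        "categories": fields,
        "relationships": [[parameters["group_by"], field] for field in fields],
    }


class PersistedMetadataStore:
    """Stands in for a database table holding serialized reporting metadata."""

    def __init__(self):
        self._rows = {}
        self.regenerations = 0  # counts how often metadata had to be rebuilt

    def save(self, report_id, parameters):
        # Persisting a snapshot means the stored copy is fixed at save time.
        self._rows[report_id] = json.dumps(generate_metadata(parameters))
        self.regenerations += 1

    def load(self, report_id):
        return json.loads(self._rows[report_id])


store = PersistedMetadataStore()
params = {"group_by": "department", "fields": ["compensation", "performance"]}
store.save("hr_report", params)

# A parameter related to the reporting data is updated...
params["fields"].append("goals")

# ...but the persisted metadata does not change until it is regenerated.
STALE = store.load("hr_report")
store.save("hr_report", params)
FRESH = store.load("hr_report")
```

Here the persisted snapshot omits the new `goals` category until a second `save` regenerates it, so every parameter update implies another regeneration pass over the metadata.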