The present invention relates to computer-implemented systems and methods for generating a report. More specifically, the invention relates to systems and methods for mapping a hierarchical data source such as an XML file to a virtual flat data source such as a virtual relational database, thereby enabling a user to generate a report via the virtual flat data source.
Business professionals often deal with and require large amounts of data in the form of reports. Such reports may be generated from much larger collections of data stored in business databases. A typical report accesses dozens to thousands of records (or more) and requires a few seconds to many hours to generate. Typically, the records appearing in a report are organized by one or more level breaks after which totals or subtotals of numerical data are provided. In addition, most reports are highly formatted to provide relevant background information and facilitate understanding. A single report may be related to other reports, and a whole group of reports may be used by many people associated with an enterprise, all of whom need to see the same consistent set of information. Examples of such reports include reports containing records of open orders, sales forecasts, customer statements, and balance sheets.
One type of report that is commonly used is the spreadsheet. A spreadsheet is a grid including a plurality of cells in which formulas may be applied to contents of one or more of the cells within the spreadsheet. Specifically, formulas within a spreadsheet typically refer to cells in the spreadsheet by row and column (e.g., A4). Typically, the contents of the cells of a spreadsheet include values obtained from a data source such as a database. Thus, once the values are obtained from the database, the formulas may be applied. Typically, when a spreadsheet is used, the information in the spreadsheet is primarily numeric with additional textual information such as headers and footers.
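By way of illustration only, the row-and-column cell addressing described above may be sketched as follows. The helper names `parse_cell_ref` and `sum_cells` and the cell values are hypothetical and are chosen solely for this example; they do not represent any particular spreadsheet product.

```python
import re


def parse_cell_ref(ref):
    """Split a reference such as 'A4' into its column letters and row number."""
    match = re.fullmatch(r"([A-Z]+)([1-9][0-9]*)", ref)
    if match is None:
        raise ValueError(f"not a cell reference: {ref!r}")
    return match.group(1), int(match.group(2))


def sum_cells(cells, refs):
    """Apply a simple SUM-style formula to the cells named by refs."""
    for ref in refs:
        parse_cell_ref(ref)  # reject malformed references before summing
    return sum(cells[ref] for ref in refs)


# Hypothetical values, standing in for data obtained from a database.
cells = {"A1": 120.0, "A2": 80.0, "A3": 50.0}

# Cell A4 holds the result of a formula applied to A1 through A3.
cells["A4"] = sum_cells(cells, ["A1", "A2", "A3"])
```

In this sketch the formula is applied only after the values have been placed in the grid, mirroring the order described above.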
Many types of data structures and file types are available for representing and storing data for use in generating reports. Generally, files may be categorized as having either a flat or a hierarchical file format. A flat file format is a format in which all data is represented on the same level. In other words, a flat file format does not explicitly include hierarchical, parent-child relationships within the data. Moreover, in a flat file format, spatial relationships between data elements are constant throughout the file. In contrast, a hierarchical file format is one in which hierarchical relationships (e.g., parent-child relationships) between the data elements are represented spatially, by the location of the data elements within the file. In other words, relationships between data elements are represented hierarchically through the locations of, and relative distances between, the data elements. For instance, multiple data elements within a file having a hierarchical file format are commonly nested to indicate hierarchical relationships between the data elements.
One example of a flat file format is a relational database. Generally, in a relational database, each file or table is associated with a particular data element. For instance, a customer file or table is associated with the data element “customer.” Each file or table includes a plurality of columns that correspond to a plurality of fields, and each row corresponds to one instance of the associated data element. For instance, exemplary columns in a customer file may include name, address, and phone number, and each row in the customer table corresponds to a particular customer. In this manner, information for multiple customers may be stored as multiple rows in a single customer file or table. This relational database format is considered flat since the location of the data fields or elements with respect to one another within the file is irrelevant, and does not denote any additional information with respect to the relationship between the data elements.
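By way of illustration only, a customer table of the kind described above may be sketched using Python's built-in `sqlite3` module. The table name, column names, and sample rows are assumptions made for this example and are not taken from any particular business database.

```python
import sqlite3


def build_customer_table():
    """Create an in-memory table in which each row is one customer."""
    conn = sqlite3.connect(":memory:")
    # Columns correspond to fields of the "customer" data element.
    conn.execute("CREATE TABLE customer (name TEXT, address TEXT, phone TEXT)")
    # Hypothetical sample rows.
    conn.executemany(
        "INSERT INTO customer (name, address, phone) VALUES (?, ?, ?)",
        [
            ("Acme Supply", "12 Main St", "555-0100"),
            ("Birch Tools", "48 Oak Ave", "555-0101"),
        ],
    )
    return conn


conn = build_customer_table()

# The query names columns explicitly; the position of a column within the
# table carries no additional meaning.
rows = conn.execute("SELECT name, phone FROM customer ORDER BY name").fetchall()
```

Because the query selects columns by name, reordering the columns in the table definition would not change the result, which is consistent with the flat character of the format.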
One example of a hierarchical file format is an Extensible Markup Language (XML) file. FIG. 1 is a diagram illustrating an exemplary XML file. In this example, the XML file 102 is a customers list 104 in which data for each customer 106 includes a customer identifier 108, last name of the customer contact 110, first name of the customer contact 112, customer name 114, phone number 116, address 118, city 120, state 122, postal code 124, credit rank 126, purchase frequency 128, purchase volume 130, and representative identifier 132. As shown, the hierarchical relationships between data elements are represented by indentations of the data elements within the file. In other words, the data elements are explicitly nested to indicate hierarchical relationships between the data elements.
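By way of illustration only, the nesting of data elements in such a file may be sketched with Python's standard `xml.etree.ElementTree` module. The element names and values below are hypothetical; they are not a reproduction of FIG. 1, and only a few of the fields listed above are included.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment of a customers list with two customer records.
XML_TEXT = """
<customers>
  <customer>
    <custid>101</custid>
    <customname>Acme Supply</customname>
    <city>Springfield</city>
    <state>CA</state>
  </customer>
  <customer>
    <custid>102</custid>
    <customname>Birch Tools</customname>
    <city>Riverton</city>
    <state>NY</state>
  </customer>
</customers>
"""


def child_tags(element):
    """Return the tag names of an element's direct children, in file order."""
    return [child.tag for child in element]


root = ET.fromstring(XML_TEXT)

# The parent-child relationship is carried by the nesting itself:
# each customer is a child of customers, and each field is a child of customer.
top_level = child_tags(root)
first_customer_fields = child_tags(root[0])
```

Here the relationship between a customer and its fields is recoverable only from where each element sits within the file, in contrast to the flat format discussed earlier.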
Such hierarchical file formats provide various advantages. For instance, XML is a well-known standard recommended by the World Wide Web Consortium for sharing information formats and data on the World Wide Web, intranets and elsewhere. Unfortunately, it is generally difficult to query a hierarchical file such as an XML file. In addition, many users prefer to use a flat file format such as a relational database. Moreover, many off-the-shelf tools for querying a flat format such as a relational database are available. Accordingly, it would be beneficial if such tools could be leveraged to process complex queries against data stored in a hierarchical file.
In view of the above, it would be beneficial if a user could access data stored in a hierarchical file format via a simpler query to a flat file or database.
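By way of illustration only, the kind of access contemplated above may be sketched by copying the records of a hierarchical file into a flat table and then querying that table with ordinary SQL. This sketch is a simplification and is not the mapping of the present invention: it materializes the rows in an in-memory table rather than providing a virtual data source, and the function names, element names, and sample data are all assumptions made for this example.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical hierarchical input.
XML_TEXT = """
<customers>
  <customer><custid>101</custid><customname>Acme Supply</customname><state>CA</state></customer>
  <customer><custid>102</custid><customname>Birch Tools</customname><state>NY</state></customer>
  <customer><custid>103</custid><customname>Cedar Works</customname><state>CA</state></customer>
</customers>
"""

COLUMNS = ["custid", "customname", "state"]


def flatten_customers(xml_text, columns):
    """Turn each customer element into one row, one value per column."""
    root = ET.fromstring(xml_text)
    return [
        tuple(customer.findtext(column) for column in columns)
        for customer in root.iter("customer")
    ]


def load_into_table(rows, columns):
    """Store the flattened rows in an in-memory relational table."""
    conn = sqlite3.connect(":memory:")
    column_defs = ", ".join(f"{column} TEXT" for column in columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(f"CREATE TABLE customer ({column_defs})")
    conn.executemany(f"INSERT INTO customer VALUES ({placeholders})", rows)
    return conn


conn = load_into_table(flatten_customers(XML_TEXT, COLUMNS), COLUMNS)

# A simple relational query now reaches data that originated in the XML file.
california = conn.execute(
    "SELECT customname FROM customer WHERE state = ? ORDER BY customname",
    ("CA",),
).fetchall()
```

Once the data is exposed in flat form, any off-the-shelf relational query tool could in principle be applied to it; the sketch uses `sqlite3` only because it ships with Python.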