Field of the Invention
The invention relates to the field of database management systems, and particularly to the field of real-time database systems supporting persistent queries using virtual table structures.
Discussion of the State of the Art
Business reporting, or enterprise reporting, is a fundamental part of identifying capabilities and performance metrics within an organization and converting them into knowledge that improves the efficiency and overall performance of the people, systems, and processes within the organization. To support better business decision-making, businesses rely on large amounts of information (for example, transactional log files, interaction log files, system configuration information, human resource information, customer transaction data, path analytics, etc.) produced by management systems that provide managers with information about sales, inventories, and other data that help in managing and improving the enterprise.
With the dramatic expansion of information technology, and a desire for increased competitiveness in corporations, there has been an enormous increase in the capture of large datasets representing every facet of business processing, customer transactions, and other data used to understand and improve how the business functions (often referred to as “Big Data”). As such, the computing power required to produce unified reports (for example, those that join different views of the enterprise in one place) has increased exponentially. This reporting process involves querying data sources with different logical models to produce a human-readable report. For example, in a customer service communication center environment, a manager may query a human resources database, an employee performance database, a set of transactional logs, and real-time metrics to identify where resources may require improvement and further training. Furthermore, many large organizations still hold data in legacy systems for which moving to a more robust, open-systems architecture is cost-prohibitive. In other organizations, systems and data warehouses are developed as functional silos in which every new system requires its own database; as a result, data follows different schemas and is often copied from system to system. In other situations, businesses have fundamentally different data sources that must remain separate (for example, a communication server system and a customer experience database). As a result, businesses need to consolidate their disparate data while moving it from place to place, from one or more sources, and in different forms or formats. For example, a financial institution might have information on a customer in several departments, and each department might list that customer's information in a different format. The customer service department might list the customer by name, whereas the accounting department might list the customer by number.
In order to use the data to create a report from one or more data sources, the data may need to be bundled and consolidated into a uniform arrangement and stored in a database or data warehouse that may be used as the data source for report creation. The function of consolidating the data is typically handled by an extract, transform, and load (ETL) procedure. Extracting is the process of reading data from one or more source databases. Transforming is the process of converting the extracted data from its previous form into the form required by the target database; transformation occurs by applying rules or lookup tables, or by combining the data with other data. Loading is the process of writing the data into the target data warehouse. A user would then use a special-purpose query language designed for managing data (for example, structured query language (SQL), known in the art) to identify what data elements are required for a business report.
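By way of illustration only, the extract, transform, and load steps described above might be sketched as follows. This is a minimal sketch under stated assumptions, not an implementation drawn from any particular system: the record layouts, the lookup table, the table name customer_report, and every function name are hypothetical, and an in-memory SQLite database stands in for the target data warehouse. The sketch reuses the earlier example in which one department lists a customer by name and another by number.

```python
import sqlite3

# Hypothetical source records: two departments describe the same customers
# in different formats (by name versus by number).
SERVICE_ROWS = [
    {"customer_name": "Ada Smith", "open_tickets": 2},
    {"customer_name": "Raj Patel", "open_tickets": 0},
]
ACCOUNTING_ROWS = [
    {"customer_number": 1001, "balance": 250.0},
    {"customer_number": 1002, "balance": 75.5},
]
# Hypothetical lookup table used by the transform step.
NAME_TO_NUMBER = {"Ada Smith": 1001, "Raj Patel": 1002}


def extract():
    """Extract: read rows from each source (lists stand in for source databases)."""
    return list(SERVICE_ROWS), list(ACCOUNTING_ROWS)


def transform(service_rows, accounting_rows):
    """Transform: convert both sources into one uniform row shape."""
    balances = {r["customer_number"]: r["balance"] for r in accounting_rows}
    unified = []
    for row in service_rows:
        number = NAME_TO_NUMBER[row["customer_name"]]  # lookup-table rule
        unified.append(
            (number, row["customer_name"], row["open_tickets"],
             balances.get(number, 0.0))  # combine with data from the other source
        )
    return unified


def load(conn, rows):
    """Load: write the uniform rows into the target database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer_report ("
        "customer_number INTEGER PRIMARY KEY, customer_name TEXT, "
        "open_tickets INTEGER, balance REAL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO customer_report VALUES (?, ?, ?, ?)", rows
    )
    conn.commit()


def run_etl():
    """Run the three steps in order and return the loaded target database."""
    conn = sqlite3.connect(":memory:")
    load(conn, transform(*extract()))
    return conn


def report(conn):
    """Query the consolidated data with SQL, as a reporting user would."""
    return conn.execute(
        "SELECT customer_name, balance FROM customer_report "
        "WHERE open_tickets > 0 ORDER BY customer_number"
    ).fetchall()


if __name__ == "__main__":
    print(report(run_etl()))
```

In this sketch the report can only be produced after all three steps have completed over the full dataset, which is the source of the latency discussed in connection with ETL-based reporting.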
The problem with systems known in the art is that the ETL process takes time to extract, transform, and load the required data from the one or more data sources. The larger the dataset, the longer the process may take. In some installations where large datasets are involved, ETL processing may be extremely slow, often taking hours or days. In these cases, costs increase, and providing reports in real time, or even in a timely manner, is often not possible. Furthermore, reports may be inaccurate because data from the system has not yet been written to the data warehouse. For example, in a contact center environment, it is desirable to measure performance in 15-minute increments in order to react to sudden increases or decreases in interaction traffic. Since a contact center report typically draws on many kinds of data (for example, agent information, customer profile information, transaction information, historical contact information, etc.) from multiple data sources (for example, private branch exchange (PBX) transaction information, routing information, configuration service information, etc.), the reports are typically not available at the end of the 15-minute interval, and perhaps not for many hours afterwards, thus rendering the real-time report inaccurate or unusable.
To remedy this situation, various techniques have been tried in the art, for example, a fully in-memory database. For a large application in a high data throughput environment (for example, a large contact center or financial institution), however, the amount of memory required is so large that even modern systems cannot feasibly accommodate it, and such approaches become cost-ineffective.
What is needed is a highly responsive system, and associated methods, for providing a real-time database that is capable of handling persistent queries that respond quickly to data updates and that supports persistent, readily updated aggregations of data, so that analysis and reporting systems may report in smaller time increments (for example, 15-minute intervals) and reports become available very soon after, if not immediately when, they are requested, without requiring a huge infrastructure.