Various embodiments of this disclosure relate to data analytics and, more particularly, to brokering data retrieval from a large set of data sources according to client requirements.
The analysis of Big Data, i.e., large and complex data sets, can provide insights that impact business, stock investments, national security, and many other areas. In some cases, Big Data analysis can affect a business's bottom line and determine that business's fate within its industry. Because Big Data can be unwieldy, deciding which analytics engines to use can be a challenging task. The analytics engines used can affect the cost of a query, the accuracy of the query result, and the responsiveness with which the query is answered.
The International Data Corporation (IDC) estimated that 1.8 zettabytes of data would be created in 2011, and this annual amount of data grows exponentially. Examining every piece of data and using every analytics engine is impossible in some cases and inefficient in others. Thus, generally only a small subset of available data is used to make decisions. In some cases, certain algorithms or data sources for answering specialized queries are available only from certain analytics engines, and some data sources may be better for some queries than for others. It is therefore important to select appropriate analytics engines, depending on the queries at hand. These considerations, along with the amount of data, present a significant barrier to providing effective data analytics.