Business Intelligence (“BI”) generally refers to a category of software systems and applications used to improve business enterprise decision-making and governance. These software tools provide techniques for analyzing and leveraging enterprise applications and data, ranging from historical data to current real-time data and business events. They are commonly applied to financial, human resource, marketing, sales, service provision, customer, and supplier analyses. More specifically, Business Intelligence tools can include: reporting and analysis tools to analyze, forecast and present information; content delivery infrastructure systems to deliver, store and manage reports and analytics; data warehousing systems to cleanse and consolidate information from disparate sources; integration tools to analyze and generate workflows based on enterprise systems; database management systems to organize, store, retrieve and manage data in databases, such as relational, Online Transaction Processing (“OLTP”) and Online Analytic Processing (“OLAP”) databases; and performance management applications to provide business metrics, dashboards, and scorecards, as well as best-practice analysis techniques for gaining business insights.
Traditional BI tools have supported long-term decision planning by transforming transactional data into summaries about the organization's operations over a period of time. While this information is valuable to decision makers, it remains an after-the-fact analysis with latencies from data arrival to report production. The information needs of operational decision-making cannot be addressed entirely by traditional BI technologies. Effective operational decision-making requires little delay between the occurrence of a business event and its detection or reporting. Just-in-time, finer-grained information is necessary to enable decision makers to detect opportunities or problems as they occur. Traditional BI technologies are not designed to provide just-in-time analysis.
Business Activity Monitoring (“BAM”) systems and applications are the set of technologies that fill this gap. BAM applications provide right-time or just-in-time reporting, analysis, and alerting of significant business events, accomplished by gathering data from multiple applications. Right-time analysis differs from real-time analysis. In right-time analysis, the main goal is to signal opportunities or problems within a time frame in which decision making has a significant value. Real-time analysis requires that opportunities or problems be signaled in a pre-specified, very short time frame, even if the alert has the same decision-making value a day after the occurrence of the events that triggered it. Real-time operation, although preferred, is not essential. The goal is to analyze and signal opportunities or problems as early as possible to allow decision making to occur while the data is fresh and of significance. BAM applications therefore encourage proactive decision making.
Business events, transactional data or messages are modeled in BAM applications as “data streams”. Data streams are time-ordered sequences of data that represent business events detected from multiple applications and sources (both internal such as messaging services and external such as news feeds). In contrast to BI applications, BAM applications are designed to support continuous queries that process data streams without necessarily archiving data.
Continuous queries, also referred to as standing or long-running queries, are queries that evaluate continuously as new data arrive on data streams. Generally, continuous queries define time-based processing windows on data and provide approximations based on data collected in a most recent time window.
From a business perspective, BAM users may need to write continuous queries to monitor current events whenever unexpected opportunities or problems warrant such monitoring. They may also need to modify the continuous queries dynamically and on-the-fly. For example, a continuous query may monitor total sales in the last five minutes. Every five minutes, a BAM user, such as a sales manager, receives the total sales made in the last five minutes. Such a query enables the sales manager to detect trends or patterns in the data (e.g., an increase or decrease in total sales) and make effective decisions. If sales are dropping, for instance, the sales manager may need to investigate the productivity of the sales employees.
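By way of illustration only, the five-minute total sales query described above may be sketched in Python as follows. The sketch assumes that sale events arrive in time order and groups them into consecutive, non-overlapping five-minute windows. The names SaleEvent, timestamp, amount and total_sales_per_window are hypothetical and do not correspond to any particular BAM system or continuous query language.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple

WINDOW_SECONDS = 5 * 60  # five-minute reporting window, as in the example above


@dataclass(frozen=True)
class SaleEvent:
    """One business event on the data stream (hypothetical schema)."""
    timestamp: float  # seconds since an arbitrary epoch
    amount: float     # sale value in dollars


def total_sales_per_window(
    events: Iterable[SaleEvent], window: float = WINDOW_SECONDS
) -> Iterator[Tuple[float, float]]:
    """Yield (window_start, total_sales) as time-ordered events arrive.

    Assumes events are ordered by timestamp. Nothing is archived: only the
    running total of the current window is kept in memory.
    """
    window_start = None
    total = 0.0
    for event in events:
        if window_start is None:
            # Align the first window to a multiple of the window length.
            window_start = event.timestamp - (event.timestamp % window)
        # Report every window that ended before this event, including empty ones.
        while event.timestamp >= window_start + window:
            yield (window_start, total)
            window_start += window
            total = 0.0
        total += event.amount
    if window_start is not None:
        # Flush the last, possibly partial, window when the stream ends.
        yield (window_start, total)
```

In this sketch a result is produced only when an event from a later window arrives; a deployed system would more likely emit results on a timer so that the user is notified even when no further events occur.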
Previous work on formulating and processing continuous queries has focused on the development of continuous query languages and continuous query systems, such as the Continuous Query Language (“CQL”) supported by the STREAM system developed at Stanford University, Stanford, Calif. (http://infolab.stanford.edu/stream), and the StreaQuel language, supported by the TelegraphCQ system developed at the University of California at Berkeley, Berkeley, Calif. (http://telegraph.berkeley.edu), among others.
These continuous query languages, although enabling users to monitor continuous data streams, are SQL-like languages with additional windowing constructs. As such, they are not suitable for use in BAM systems in which a low query formulation effort is desired. Query formulation effort describes the overall effort required of users to create and execute a query (e.g., a continuous query). The total effort required may be defined as the sum of the initial training effort to learn the query language and the repeated efforts to perform productive tasks with it. BAM users, in particular, may not be able to write or learn to write continuous queries directly in a BAM system using the currently-available continuous query languages.
A BAM user is typically any employee within the business organization (at any management level) who is required to make decisions based on the analysis of current business events or data. BAM users therefore range from technical to non-technical. Since multi-dimensional data streams are created from events signaled from different applications and data sources, it is unlikely that such users are familiar enough with the data sources and applications to determine how to properly set up continuous queries to monitor the data. If an end-user needs to spend time to correctly formulate a continuous query or ask for assistance from support staff to do so, the benefit of just-in-time data processing provided by a BAM system is lost.
With BAM users ranging from the non-technical to the technical, their ability to formulate queries typically depends on four factors, namely: (1) their familiarity with programming (or GUI) concepts; (2) their frequency of system usage; (3) their application knowledge (e.g., the precision of the users' conceptual model about the structure and contents of the data stream sources, their data schema, and queries); and (4) their range of operations (the different kinds of queries a user requires such as aggregation, summarization, or monitoring queries).
FIG. 1 illustrates different user types according to these four factors. Typical BAM users are novice/skilled, casual/managerial users. Such users can benefit from a BAM system with a low query formulation effort. Casual users 105, for example, have simple, straightforward information requests and do not typically require the full power of a query language. Managerial users 110, however, have complicated information requests such as summarizing data or trend analysis but cannot afford a high training effort or a high query formulation effort.
Previous work on assisting users with query formulation and reducing their training and repeated efforts has included the development of query interfaces, such as query-by-example and query-by-forms. Query-by-example is a graphical query interface for use with relational databases. It was designed to require users to know very little in order to get started and to reduce the knowledge that the user subsequently has to learn in order to understand and use the whole language. Users are exposed to the fields and structures of relational tables within the databases and formulate queries by filling in a table with examples of the data they would like to retrieve. An example of a query-by-example is illustrated in FIG. 2. Query-by-example 200 retrieves all sales with a value greater than a thousand dollars. Table 205 illustrates the fields in the sales schema the user is interested in and table 210 illustrates a manipulation of the fields in table 205.
Form-based interfaces, also known as query-by-forms, are a natural extension of query-by-example. Form interfaces have inherent advantages over query-by-example because (1) users are familiar with forms (data is collected, stored, retrieved and updated in terms of common business forms, which are widely used and understood, e.g., invoices, receipts, checks, etc.), and (2) forms capture tasks—they represent the data from the user's perspective (rather than from the database perspective). An exemplary form interface for finding all sales with a value greater than a thousand dollars is illustrated in FIG. 3. With query-by-forms, unlike query-by-example, users are not required to define how price and quantity need to be manipulated to retrieve the value of a sale.
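By way of illustration only, the distinction drawn above may be sketched in Python as follows. The sketch assumes a hypothetical sales schema with price and quantity fields and a hypothetical form field named "value"; the mapping from the form field to the underlying fields is defined once by the form designer, so that the user supplies only a threshold. None of the names below are taken from FIG. 2 or FIG. 3.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Sale:
    """One row of a hypothetical sales table."""
    item: str
    price: float
    quantity: int


# Form fields map user-facing names to computations over the schema.
# The form designer writes this mapping once; the form user never sees it.
FORM_FIELDS: Dict[str, Callable[[Sale], float]] = {
    "value": lambda sale: sale.price * sale.quantity,
}


def query_by_form(sales: List[Sale], field: str, greater_than: float) -> List[Sale]:
    """Return the sales whose form field exceeds the given threshold."""
    compute = FORM_FIELDS[field]
    return [sale for sale in sales if compute(sale) > greater_than]
```

A user of such a form would enter only the field name and the threshold, for example query_by_form(sales, "value", 1000), whereas a query-by-example user would have to express the product of price and quantity explicitly.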
Since query-by-example and query-by-forms provide query interfaces for databases, they are designed only for precise querying of static data. They are therefore not suitable for use in BAM systems, where the queries are continuous. Currently-available continuous query languages, however, have no such interfaces. Users are required to spend significant resources in learning the particular languages before they can formulate continuous queries with them.
Currently-available continuous query languages are also not suitable for dynamic modification on-the-fly, as any change to a continuous query written in such a language may require the query to be converted into a sequence of operators before it can be executed. In a just-in-time operation environment, delays in formulating a continuous query due to, for example, reliance on IT staff to formulate or modify the query as needed, result in BAM users missing right-time information about business opportunities or problems collected by continuous querying of business events.
Accordingly, it would be desirable to provide techniques for improving the usability of a BAM system. In particular, it would be highly desirable to provide techniques to facilitate the formulation of continuous queries in a just-in-time BAM system.