This invention relates to systems and techniques for monitoring information events from various data sources, and further relates to systems and techniques for specifying shared event sources, constructing distributed event processing systems, and techniques and systems for managing event processing rules and specifications.
Business and government analysts are experiencing an information onslaught. The volume of information is increasing in internal systems, in publicly accessible information sources, and across the Internet. While this information flow is valuable, it is becoming increasingly difficult to sort and filter, to determine patterns and trends in the information flow, and to present the resulting materials in a form usable by one or more end users. Operational imperatives such as situational awareness, cost containment, risk mitigation, and growth require timely detection of, and response to, key business events. Organizations must be able to respond effectively to the opportunities and threats represented by information continually arriving from disparate sources that relate to their missions.
Information flows can be considered as a sequence of individual events, each of which must be collected, processed, and evaluated as part of an information processing system. Individual events, such as a new order, a 911 call, a customer service request, a geographic position update, or an updated investigative report, represent the foundation of new and changing data throughout an organization that must be continually monitored to identify opportunities and threats. Previous systems have employed news feeds, databases, and search engines to receive, process, and index data as it becomes available. However, cataloging information into databases and search engines takes time and computing resources. Meanwhile, the user must periodically query the data system for new results that match his or her search criteria. One manager estimated that his analysts spend the first 1-2 hours of each workday checking existing databases and search engines for new information and manually processing that information into actionable events. Complex events, which comprise multiple related individual events, often go undetected due to information overload or improper filtering and assessment. Automated systems that correlate events, particularly events separated by a time interval of days or weeks, often fall back on storing information in a database, manually or automatically querying the database for known event patterns, and then re-analyzing the retrieved information. These processes are highly inefficient and do not scale.
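The store-then-poll pattern critiqued above can be sketched as follows. This is an illustrative example only, not part of the original disclosure; the table layout, field names, and sample events are all assumptions made for the sketch.

```python
# Sketch of the store-then-poll pattern: events are catalogued into a
# database, and a consumer must repeatedly re-query for a known pattern,
# re-scanning stored rows on every poll.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (ts INTEGER, source TEXT, payload TEXT)")

# New events arrive and are catalogued before anyone can act on them.
events = [
    (1, "orders", "new order #1001"),
    (2, "calls",  "911 call, sector 4"),
    (9, "orders", "new order #1002"),
]
conn.executemany("INSERT INTO events VALUES (?, ?, ?)", events)

def poll_for_pattern(conn, source, since_ts):
    """Each poll re-executes the query over the stored rows -- the
    repeated re-analysis step identified above as unscalable."""
    cur = conn.execute(
        "SELECT ts, payload FROM events WHERE source = ? AND ts > ? ORDER BY ts",
        (source, since_ts),
    )
    return cur.fetchall()

# The analyst's 'first 1-2 hours of the workday': manually polling.
matches = poll_for_pattern(conn, "orders", 0)
print(matches)  # [(1, 'new order #1001'), (9, 'new order #1002')]
```

Every poll repeats the same scan regardless of whether new events have arrived, which is why the cost grows with both the polling frequency and the stored volume.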
Raw information becomes actionable events through a defined series of information processes and/or processing steps. Existing systems provide static processing systems using IT-defined, process-driven workflows that often add time to the processing. Mechanisms for access to raw and processed information, to the processing steps and workflow definitions, and to the resulting assessments and actions have historically been neglected. Extant information monitoring systems often rely on monolithic architectures over continuously connected networks in order to process information. These systems suffer from significant drawbacks. First, such systems are often statically configured, requiring users with appropriate permissions to pre-compute the amount of processing power and resources that must be made available for each step of the system's processing. Thus, these types of systems do not scale well as information loads, resources, staffing, or processing requirements change. Changes in information and data sources, information flow rates, available processing power, and the number and locations of staffing seats (including workstation configurations, access, and provisioning with data sources, processing recipes, and the like) are all aspects of scaling. Some information resources exhibit “bursty” volume characteristics, in which the information volume increases by orders of magnitude during times of interest. Similarly, the staffing and resources applied to the management of events often increase dramatically during times of interest. Often, time is not available to make IT-centric workflow and system provisioning changes, much less to test and roll out new IT-based workflows and processing recipes. Thus, the management of information, resources, and staff during times of substantive change is a significant challenge in today's environment.
Additionally, the organization and provisioning of resources is often performed statically by external (e.g., IT) staff, and significant effort is required to shift resources and services to alternate processing resources, workflows, and processing recipes. For mission-critical applications, disaster and outage management paradigms further require that large monolithic systems be replicated at great cost.
Furthermore, present system architectures lack effective ways to share and control data about information sources, data management processes and recipes, and information filtering techniques/processing recipes. For example, Microsoft Windows products provide an interface for ODBC drivers in which an information database connection can be loaded onto a user's machine, but which provides no controls other than the connection string. Furthermore, ODBC technologies support only local and network-connected databases, and are not useful for newer distribution mechanisms such as web-based information sources and RSS feeds. For example, RSS feed readers store URLs and authorization information for accessing data feeds, but do not provide capabilities for continued processing. Other products, such as web search engines or bookmark collections, provide stored URLs to web-based information services, but offer no easy mechanism to manage the authorization credentials associated with each URL so that several users can access the information without extensive or repetitive configuration efforts, nor do they permit the definition of additional processing steps required to obtain and pre-process the content referenced by the URLs. Other types of information sources have similar issues, and do not provide a robust, sharable solution for handling these aspects of information processing.
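The ODBC limitation described above can be illustrated with a short sketch. The connection string and its key names are typical examples, not drawn from the original text: the point is that such a string carries only connection parameters, with no field describing shared credentials management or downstream processing steps.

```python
# An ODBC-style connection string carries connection parameters only.
# There is nowhere to express pre-processing steps, processing recipes,
# or credential-sharing policy -- the gap described in the passage above.
def parse_connection_string(conn_str):
    """Split 'Key=Value;Key=Value' pairs into a dict."""
    parts = (p for p in conn_str.split(";") if p)
    return dict(p.split("=", 1) for p in parts)

conn_str = "Driver={SQL Server};Server=db01;Database=intel;Uid=analyst;Pwd=secret;"
cfg = parse_connection_string(conn_str)

print(sorted(cfg))  # ['Database', 'Driver', 'Pwd', 'Server', 'Uid']
# No key describes what to DO with the data once connected.
assert "ProcessingRecipe" not in cfg
```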
Additionally, extant systems and methods do not provide mechanisms by which successful information processing and filtering recipes and techniques can be captured in a transportable manner and subsequently used within a distributed environment. Raw information is obtained, pre-processed, and assessed as part of a process that transforms the information into an actionable event. The mechanisms for obtaining, pre-processing, and assessing the information are traditionally hard coded into information processing systems, and are thus not easily distributable or sharable between users, sites in a distributed system, or different systems or instances of a system. Current systems generally rely on individuals to obtain, process, and assess information feeds independently, which results in significant duplication of processing effort at the system, machine, and personnel levels.
Specifically, the workflows, rules, and processing recipes for processing information are generally hard coded into present-day information systems, or are configured by specialized IT staff. These information systems often impose restrictions on how, where, and by whom the rules can be changed, and are not structured to support the reuse of rules and processing recipes, nor the sharing of user-developed rules, recipes, and workflows.
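The contrast between a hard-coded rule and a transportable one can be sketched as follows. This is a hypothetical illustration; the rule fields and spec format are assumptions, not part of the original disclosure.

```python
# A rule compiled into the application is trapped there; the same rule
# expressed as plain data can be serialized and shared between users,
# sites, or system instances.
import json

def hard_coded_filter(event):
    # Logic baked into the program: changing it requires IT staff
    # to modify, test, and redeploy the system.
    return event["source"] == "orders" and event["priority"] >= 3

# The same rule captured as a transportable specification.
rule_spec = {
    "field_equals": {"source": "orders"},
    "field_at_least": {"priority": 3},
}

def apply_rule(spec, event):
    """Interpret a declarative rule spec against an event dict."""
    eq = all(event.get(f) == v for f, v in spec.get("field_equals", {}).items())
    ge = all(event.get(f, 0) >= v for f, v in spec.get("field_at_least", {}).items())
    return eq and ge

event = {"source": "orders", "priority": 4}
assert hard_coded_filter(event) == apply_rule(rule_spec, event)

wire = json.dumps(rule_spec)            # shareable between users and sites
assert apply_rule(json.loads(wire), event)
```

Because the declarative form survives serialization, it can be distributed, versioned, and reused without redeploying code, which is precisely what the hard-coded form prevents.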
Finally, current information processing systems do not provide for the isolation of multiple instances of workflows, rules, and recipes from one another. Extant systems provide large, monolithic processing mechanisms that focus on processing speed rather than information separation. Information separation is important in many business and government contexts. Furthermore, systems that do not provide robust information separation can exhibit unexpected information processing behaviors arising from interactions between two or more processing rules. These information “side effects” cause system instability, unreliable processing, and sometimes work stoppage.
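The side-effect hazard described above can be shown with a minimal sketch, assuming two rules that operate on a shared mutable workspace (the rule names and workspace layout are illustrative inventions, not from the original text).

```python
# Two rules sharing one mutable workspace: the second rule silently
# destroys the first rule's result -- an interaction 'side effect'.
shared = {"tags": []}

def annotate(event, ws):
    ws["tags"].append("urgent")   # rule A marks the event as urgent
    return ws

def reset(event, ws):
    ws["tags"] = []               # rule B clears the same workspace
    return ws

event = {"id": 1}

# Without isolation, the rules interact through shared state.
annotate(event, shared)
reset(event, shared)
print(shared["tags"])   # [] -- rule A's annotation is gone

# With isolated per-rule workspaces, each result is preserved.
ws_a = annotate(event, {"tags": []})
ws_b = reset(event, {"tags": []})
print(ws_a["tags"])     # ['urgent']
```

The outcome of the shared-state run depends on rule execution order, which is exactly the kind of unexpected behavior that undermines reliability in non-isolated systems.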
What is needed is a system that permits the management of user-defined resources, processing recipes, and subsequent event management workflows within a scalable, distributed framework for connecting to and receiving information from a plurality of resources, processing that information, and producing actionable events, so as to provide real-time event detection, analysis, and response in highly complex and secure environments.