1. Technical Field
The description generally relates to stream processing and, more particularly, to systems and methods for optimal component composition in a stream processing system.
2. Description of the Related Art
Emerging applications such as trade surveillance for securities fraud, network traffic monitoring for intrusion detection, sensor data analysis, audio/video surveillance, and value-added voice-over-IP services call for sophisticated real-time processing on data streams. In these applications, data streams are continuously pushed to stream processing servers, where they are processed by self-contained stream processing elements called “components”. Each component provides an atomic stream processing function such as filtering, aggregation, or correlation. Since stream applications are inherently distributed, stream processing should operate in a distributed fashion. Moreover, distributed stream processing systems provide better scalability and availability for resource-intensive and quality-sensitive stream processing applications. Thus, a challenging problem is to optimally compose distributed stream processing components into dynamically required stream processing applications.
Component composition has been studied in different contexts, such as service composition and systems software composition. The work on service composition is described, e.g., in the following articles, all of which are incorporated by reference herein in their entireties: Raman et al., “Load Balancing and Stability Issues in Algorithms for Service Composition”, Proc. of IEEE INFOCOM 2003, San Francisco, Calif., pp. 1477-1487, April 2003; Gu et al., “QoS-Assured Service Composition in Managed Service Overlay Networks”, Proc. of IEEE 23rd International Conference on Distributed Computing Systems (ICDCS 2003), Providence, R.I., pp. 194-201, May 2003; and Gu et al., “SpiderNet: An Integrated Peer-to-Peer Service Composition Framework”, Proc. of IEEE International Symposium on High-Performance Distributed Computing (HPDC 2004), Honolulu, Hi., pp. 110-119, June 2004. The work on systems software composition is described, e.g., in the following article, which is incorporated by reference herein in its entirety: Kohler et al., “The Click Modular Router”, ACM Transactions on Computer Systems, 18(3), pp. 263-297, August 2000. Disadvantageously, this previous work falls short in addressing the optimization requirements in component composition, which are especially important for stream processing systems.
Previous work on stream processing has addressed problems such as load shedding and load migration. Load shedding is described, e.g., in the following article, which is incorporated by reference herein in its entirety: Tatbul et al., “Load Shedding in a Data Stream Manager”, Proc. of the 29th International Conference on Very Large Data Bases (VLDB'03), Berlin, Germany, pp. 309-320, September 2003. Further, load migration is described, e.g., in the following article, which is incorporated by reference herein in its entirety: Balazinska et al., “Contract-based Load Management in Federated Distributed Systems”, Proc. of 1st Symposium on Networked Systems Design and Implementation (NSDI), San Francisco, Calif., pp. 197-210, March 2004. Disadvantageously, this previous work on stream processing does not address the optimal component composition problem.
Given the current state of the prior art, it would be highly advantageous to have a system and method for optimal component composition in distributed stream processing environments.