Stream processing systems analyze multiple incoming streams to determine dependencies among them. For example, analytic modules may process multiple streams to detect common patterns, interdependent events, content generated by common sources or related users, and the like. One way of testing these systems is to transmit test streams with known parameters to the stream processing system. Stream generation is therefore employed for performance characterization, testing, and benchmarking of stream processing systems that process, forward, store, and/or analyze stream traffic. Stream generation typically aims to simulate or emulate streams generated by different types of applications, protocols, and activities. For example, the activities might include email, chat, web browsing, message boards, newsgroups, cellular activity, and the like. Different approaches have been used for generating the streams, such as model-driven simulations and client-server architectures.
Examples of currently available stream generation tools include commercial products such as LoadRunner, Netpressure, Http-Load, and MegaSIP; and academic prototypes such as SURGE, Wagon, Httperf, Harpoon, NetProbe, D-ITG, MGEN, and LARIAT.
Existing stream generation approaches focus primarily on matching predetermined volumetric and timing properties, and ignore statistical properties at the content level, such as content and contextual semantics. Most existing approaches are application-specific or lack scalability and/or modularity. Another problem is that current stream generating systems are domain- or protocol-specific. For example, current stream generating systems generate a single type of stream, e.g., web requests. Multiple streams can be generated, but they are uncorrelated and have little or no content richness. Current stream generating systems are therefore not suitable for testing and benchmarking stream processing systems that make intelligent decisions based on analysis of content in correlated streams.
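The distinction between uncorrelated and correlated streams can be illustrated with a minimal sketch. It is not the method of any tool named above; the user names, topics, record fields, and function names are all hypothetical and chosen only for illustration. The first generator draws an email stream and a chat stream independently, while the second derives both from a shared underlying event, so that the same user and topic appear in both streams.

```python
import random

# Hypothetical vocabularies, used only for illustration.
USERS = ["alice", "bob", "carol"]
TOPICS = ["budget", "travel", "outage"]


def uncorrelated_streams(n, seed=0):
    """Generate an email stream and a chat stream drawn independently."""
    rng = random.Random(seed)
    email = [{"type": "email", "user": rng.choice(USERS), "topic": rng.choice(TOPICS)}
             for _ in range(n)]
    chat = [{"type": "chat", "user": rng.choice(USERS), "topic": rng.choice(TOPICS)}
            for _ in range(n)]
    return email, chat


def correlated_streams(n, seed=0):
    """Generate both streams from one shared event per step."""
    rng = random.Random(seed)
    email, chat = [], []
    for _ in range(n):
        # One underlying event drives a record in each stream.
        user, topic = rng.choice(USERS), rng.choice(TOPICS)
        email.append({"type": "email", "user": user, "topic": topic})
        chat.append({"type": "chat", "user": user, "topic": topic})
    return email, chat


def match_rate(a, b):
    """Fraction of positions where both streams share user and topic."""
    hits = sum(1 for x, y in zip(a, b)
               if x["user"] == y["user"] and x["topic"] == y["topic"])
    return hits / len(a)


if __name__ == "__main__":
    print("uncorrelated:", match_rate(*uncorrelated_streams(1000)))
    print("correlated:  ", match_rate(*correlated_streams(1000)))
```

In this sketch, a stream processing system under test would find cross-stream dependencies only in the output of the second generator; the independently drawn streams agree on user and topic only by chance.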
Therefore, a need exists to overcome the problems with the prior art as discussed above.