“Streaming” involves transmitting data over a network as a steady, continuous flow, allowing playback or processing to proceed as new data is received. A wide variety of data may be streamed, e.g., audio and video files, downloadable e-commerce purchases, in-game player activities, information from social networks, and telemetry from connected devices or instrumentation in data centers. Streaming is beneficial in most scenarios where new, dynamic data is continually generated. For example, event stream processing in applications such as network monitoring, e-business, health care, financial analysis and security supervision allows enterprises to react to changing business conditions in real time; streaming audio and video, on the other hand, provide real-time entertainment to a worldwide audience. Accordingly, streaming data has been a principal driving force in the continued development and exploitation of the Internet.
Streaming data may be generated continuously by multiple data sources, which typically transmit small units of data (records, e.g., on the order of kilobytes) simultaneously. Upon arrival, these data records need to be processed sequentially and incrementally, record by record, based on their inherent order. However, because each data record may experience a different transmission delay (due to network traffic or possible node failure), the data records may arrive out-of-order at the destination; that is, a data record transmitted earlier from a data source arrives at the destination after a data record transmitted later from the same source. Out-of-order arrival of data records is handled by network protocols such as TCP (Transmission Control Protocol), which work well when all the data originates from a single source. Multi-threaded environments, in which related records originate from different sources or records originating from a single source are destined for different applications, pose a much greater challenge, particularly as the number of data sources or applications increases. Furthermore, protocols such as TCP operate at the packet level. If the transmission is organized at a different level (e.g., complete files or subpacket units such as pixels), such protocols cannot be used.
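By way of illustration only, the single-source case described above can be sketched as a small reordering buffer. The sketch assumes that each record carries a consecutive integer sequence number assigned by its source. The function name `reorder` and the `(sequence number, payload)` record format are hypothetical and are not drawn from any particular protocol or from the approach disclosed herein.

```python
def reorder(records, first_seq=0):
    """Yield (seq, payload) pairs in sequence order.

    Assumes a single source that numbers its records consecutively,
    starting at first_seq (an illustrative assumption).
    """
    pending = {}          # records that arrived ahead of their turn
    next_seq = first_seq  # sequence number that must be released next
    for seq, payload in records:
        if seq < next_seq:
            continue      # already released: treat as a duplicate and drop
        pending[seq] = payload
        # Release every buffered record that is now contiguous with the output.
        while next_seq in pending:
            yield next_seq, pending.pop(next_seq)
            next_seq += 1


# Records 0..3 arriving out of order at the destination.
arrivals = [(1, "b"), (0, "a"), (3, "d"), (2, "c")]
in_order = list(reorder(arrivals))
```

The buffer relies on one sequence counter per source. When related records originate from several sources, or are destined for several applications, no single counter defines their order, which is why the multi-source case is harder.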
Accordingly, there is a need for an approach that can handle out-of-order data arrival and provide high scalability to accommodate large numbers of data sources in multi-threaded environments.