Data received from one or more data sources can be communicated as a stream. In some applications, it may be desirable to process the data stream in real time. Real-time processing can involve processing the data on the fly, as it arrives, without first storing it in a data repository.
In scenarios where relatively large amounts of data are to be processed, a distributed system having multiple processing nodes can be provided to process different portions of a data stream in parallel.
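
As an illustration only, the following minimal Python sketch shows one way that portions of a stream might be routed to several processing nodes and handled on the fly. It is a sketch under stated assumptions, not a description of any particular system: threads within a single process stand in for separate processing nodes, and the names `data_stream`, `node_worker`, `process_in_parallel`, and `NUM_NODES`, as well as the modulo routing rule and the running-sum computation, are hypothetical choices made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from queue import Queue
from typing import Iterable, Iterator, List

NUM_NODES = 3          # hypothetical number of processing nodes
_SENTINEL = object()   # marks the end of a node's portion of the stream


def data_stream() -> Iterator[int]:
    """Hypothetical source: yields records one at a time, never as a stored batch."""
    yield from range(1, 11)


def node_worker(inbox: Queue) -> int:
    """One processing node: consumes its portion on the fly, keeping a running sum."""
    total = 0
    while True:
        record = inbox.get()
        if record is _SENTINEL:
            return total
        total += record  # each record is processed as it arrives and then discarded


def process_in_parallel(stream: Iterable[int], num_nodes: int = NUM_NODES) -> List[int]:
    """Route each record to a node (assumed rule: record modulo node count)."""
    inboxes = [Queue() for _ in range(num_nodes)]
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        futures = [pool.submit(node_worker, inbox) for inbox in inboxes]
        for record in stream:
            inboxes[record % num_nodes].put(record)
        for inbox in inboxes:
            inbox.put(_SENTINEL)  # tell each node that its portion has ended
        return [future.result() for future in futures]


if __name__ == "__main__":
    print(process_in_parallel(data_stream()))
```

In this sketch each node holds only its running total, so no record is written to a repository before being processed. In an actual distributed system the in-memory queues would typically be replaced by network transport between machines.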