A distributed stream data processing system may be used to process “big data.” In this context, “big data” refers to a relatively large volume of data. As examples, big data may be network data, which may be analyzed to identify failures and network intrusions; sensor data, which may be analyzed as part of quality control measures for a manufacturing process; and so forth. The distributed stream data processing system may gather its data from disparate data sources, and as such, the system may perform various extract, transform and load (ETL) operations to transform the data from these disparate sources into a common format that may be stored and analyzed.
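As a minimal illustrative sketch of such an ETL operation, the following Python example normalizes records from two disparate sources into one common format. The input layouts (a comma-separated network event line and a JSON sensor message), the field names of the common record, and the function names are all hypothetical assumptions made for illustration; they are not taken from any particular system.

```python
import json
from datetime import datetime, timezone


def from_network_csv(line):
    """Transform a hypothetical network event line: 'epoch_seconds,host,event'."""
    ts, host, event = line.strip().split(",")
    return {
        "source": "network",
        "timestamp": datetime.fromtimestamp(int(ts), tz=timezone.utc).isoformat(),
        "entity": host,
        "value": event,
    }


def from_sensor_json(text):
    """Transform a hypothetical sensor message with 'time', 'sensor_id', 'reading'."""
    raw = json.loads(text)
    when = datetime.fromisoformat(raw["time"]).astimezone(timezone.utc)
    return {
        "source": "sensor",
        "timestamp": when.isoformat(),
        "entity": raw["sensor_id"],
        "value": raw["reading"],
    }


def etl(network_lines, sensor_messages):
    """Extract from both sources, transform to the common format, load into a list."""
    store = []  # stands in for whatever storage the system may use
    for line in network_lines:
        store.append(from_network_csv(line))
    for message in sensor_messages:
        store.append(from_sensor_json(message))
    return store


if __name__ == "__main__":
    records = etl(
        ["0,router-1,link_down"],
        ['{"time": "2024-01-01T00:00:00+00:00", "sensor_id": "s1", "reading": 21.5}'],
    )
    for record in records:
        print(record)
```

Because every record leaves the transform step with the same keys and a UTC timestamp, downstream storage and analysis may treat network data and sensor data uniformly.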