Streaming platforms provide data to continuous, real-time applications that are configured to react to, process, or transform that data. A streaming platform receives streams of events or data changes from a variety of data systems, i.e., producers, and feeds those event/data streams to other data systems, i.e., consumers, such as relational databases, key-value stores, data clusters, or data warehouses. A streaming platform accordingly centralizes communication between producers of data and consumers of that data. One example of a streaming platform is Apache Kafka™.
Kafka stores data in partitions. Partitioning allows a Kafka user to spread data across multiple servers or disks, e.g., for scalability and/or redundancy purposes. Streaming platform architectures such as Kafka typically guarantee that messages written to a single partition by a producer in a specific order or sequence will be read from that partition by a consumer in the same order or sequence. However, such architectures cannot guarantee ordering across partitions: when a consumer reads from multiple partitions, the messages are not necessarily received in the order or sequence in which the producers transmitted them to the streaming platform.
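The ordering behavior described above can be illustrated with a minimal in-memory sketch. This is not Kafka code and uses no Kafka API; the functions `produce` and `consume_partition_by_partition`, the explicit partition numbers, and the sample messages are all hypothetical and chosen only for illustration. Each message is stamped with a global sequence number at production time so that the order seen by the consumer can be compared with the order of transmission.

```python
from collections import defaultdict

def produce(messages):
    """Append each (partition, value) message to its partition's log.

    A global sequence number records the order in which the producer
    transmitted the messages (an illustrative device, not a Kafka feature).
    """
    partitions = defaultdict(list)
    for seq, (partition, value) in enumerate(messages):
        partitions[partition].append((seq, value))
    return partitions

def consume_partition_by_partition(partitions):
    """Read every partition in full, one after another.

    This is only one possible interleaving; a real consumer may see others.
    """
    consumed = []
    for partition in sorted(partitions):
        consumed.extend(partitions[partition])
    return consumed

# Hypothetical messages, alternating between two partitions.
messages = [(0, "a1"), (1, "b1"), (0, "a2"), (1, "b2")]

partitions = produce(messages)
consumed = consume_partition_by_partition(partitions)

produced_order = list(range(len(messages)))
consumed_order = [seq for seq, _ in consumed]

# Within each partition, the write order is preserved.
per_partition_ordered = all(
    [seq for seq, _ in log] == sorted(seq for seq, _ in log)
    for log in partitions.values()
)

print("produced order:", produced_order)   # [0, 1, 2, 3]
print("consumed order:", consumed_order)   # [0, 2, 1, 3]
print("per-partition order preserved:", per_partition_ordered)  # True
```

In this sketch the consumer sees "a1", "a2", "b1", "b2": each partition's messages arrive in the order written, but the sequence across both partitions differs from the order in which the messages were transmitted.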