Embodiments of the present invention relate to data production and consumption and, more specifically, to synchronizing a cursor based on consumer and producer throughputs.
In a data streaming environment, each producer acts as a source of data to be processed, while each consumer processes the data provided by producers. In some cases, there may be multiple producers and multiple consumers. Each producer or consumer can have its own data input or output rate, and differences among these rates can cause a consumer to become overrun with data, for example when producers supply data faster than the consumer can process it.
When a consumer is overrun, a decision must be made as to where the consumer restarts processing: either with the oldest available data or with the newest available data. With the first option, the system places a premium on the oldest data, at the risk that the consumer may again become overrun with input. With the second option, the system places a premium on the newest data, dropping all historical information in the window between when the consumer was overrun and the current time.
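By way of illustration only, the two restart options can be sketched as follows. This is a minimal sketch under stated assumptions, not a description of any particular embodiment: the fixed-capacity buffer, the sequence numbers, and the names `BoundedStream`, `reset_cursor`, `"oldest"`, and `"newest"` are all hypothetical and are introduced here solely for the example.

```python
from collections import deque


class BoundedStream:
    """Hypothetical fixed-capacity buffer: when full, the oldest record is evicted."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.records = deque(maxlen=capacity)  # (sequence number, value) pairs
        self.next_seq = 0                      # sequence number of the next record

    def produce(self, value):
        # Appending to a full deque silently drops the oldest retained record.
        self.records.append((self.next_seq, value))
        self.next_seq += 1

    def oldest_seq(self):
        # Sequence number of the oldest record still available.
        return self.records[0][0] if self.records else self.next_seq

    def newest_seq(self):
        # Sequence number of the newest record still available.
        return self.records[-1][0] if self.records else self.next_seq


def reset_cursor(stream, cursor, policy):
    """Return where a consumer should resume reading.

    The consumer is treated as overrun when the record at its cursor
    has already been evicted from the buffer.
    """
    if cursor >= stream.oldest_seq():
        return cursor  # not overrun; keep reading from the current position
    if policy == "oldest":
        # Premium on old data: resume at the oldest record still retained.
        return stream.oldest_seq()
    if policy == "newest":
        # Premium on new data: skip everything up to the newest record.
        return stream.newest_seq()
    raise ValueError(f"unknown policy: {policy!r}")


# Assumed scenario: capacity 4, ten records produced, consumer stalled at 2.
stream = BoundedStream(capacity=4)
for value in range(10):
    stream.produce(value)

resume_oldest = reset_cursor(stream, cursor=2, policy="oldest")
resume_newest = reset_cursor(stream, cursor=2, policy="newest")
```

In this assumed scenario, the buffer retains only records 6 through 9. Restarting with the oldest available data leaves the consumer with a full buffer of backlog to work through, so it may be overrun again if production continues at the same rate, whereas restarting with the newest available data skips the intervening records entirely.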