This invention generally relates to spreadsheet programs, and more specifically, to spreadsheet programs for stream processing. Embodiments of the invention enable a spreadsheet to use stream partitioning and time- or count-based windows for stream computing.
Continuous data streams arise in many different domains: finance, health care, telecommunications, and transportation, among others. Stream processing is a programming paradigm that allows these data streams to be analyzed and aggregated as they are being produced. This is useful because such streams often carry so high a volume of data that persisting all of it on disk is prohibitively expensive.
In organizations that require stream processing, domain experts may lack the programming experience needed to implement their desired solutions directly. As a result, the domain experts rely on developers for the actual implementation.
Spreadsheets are familiar end-user programming tools. Spreadsheets can be used for programming streaming computations that consume continuous input streams and produce continuous output streams of data.
Partitions and time windows are two well-known stream processing abstractions, but they are not suitable for spreadsheet-based programming. A spreadsheet is usually used for editing a fixed number of columns and rows, so it is not obvious how to represent a variable or dynamically changing number of partitions (e.g., columns) or a varying window size (e.g., rows). This variability and dynamism can make working with a spreadsheet intractable: the populated columns and rows change rapidly with real-time data, making it difficult for a person to do meaningful computation in the spreadsheet.
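The variability described above can be illustrated with a minimal sketch, offered only as an example and not as the method of any embodiment. It assumes a hypothetical stream of (timestamp, key, value) tuples, partitions the stream by key, and keeps a time-based window per partition. The function name `windowed_partitions`, the sample trade data, and the five-second window are all assumptions introduced for illustration.

```python
from collections import defaultdict, deque

def windowed_partitions(events, window_seconds):
    """After each event, yield (timestamp, {key: values still in the window})."""
    windows = defaultdict(deque)  # partition key -> deque of (timestamp, value)
    for ts, key, value in events:
        windows[key].append((ts, value))
        # Evict tuples that have fallen out of the time window in every partition.
        for k in list(windows):
            q = windows[k]
            while q and q[0][0] <= ts - window_seconds:
                q.popleft()
            if not q:
                del windows[k]  # a partition with no live tuples disappears
        yield ts, {k: [v for _, v in q] for k, q in windows.items()}

# Hypothetical input: (timestamp in seconds, stock symbol, price).
events = [(0, "IBM", 100.0), (1, "AAPL", 50.0), (2, "IBM", 101.0), (7, "MSFT", 30.0)]
snapshots = list(windowed_partitions(events, window_seconds=5))

for ts, snapshot in snapshots:
    columns = len(snapshot)                                   # number of partitions
    rows = max(len(values) for values in snapshot.values())   # longest window
    print(ts, columns, rows, snapshot)
```

In this sketch, if each partition were laid out as a spreadsheet column and each windowed tuple as a row, the populated region would grow from one column to two, deepen to two rows, and then collapse to a single, different column when the older tuples expire. This is the kind of shifting layout that a fixed grid of columns and rows does not represent naturally.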