Sequence functions are functions that operate on ordered sets of rows and that require knowledge of or access to past values. An example of a sequence function is the running maximum function, which returns the maximum value up to the current point in an ordered sequence. Sequence functions are similar to scalar functions in that a single output value is produced for each input value. Sequence functions are different from scalar functions in that the result of a sequence function may depend on some or all of the previous values processed and the processing order.
Standard SQL (structured query language) does not provide for the direct expression or efficient computation of sequence functions. Computing sequence functions is very distinct from grouping rows or joining rows from different tables to produce a result. In the case of grouping rows, the source rows are combined with an aggregate function into a set of result rows, where each source row participates in at most a single result row. This amounts to a partitioning of the input rows into groups and then applying an aggregate to reduce each group to a single result row. In the case of joining tables, data for a result row depends on data from multiple source tables. Each source row may appear zero or more times in the result, depending on the join.
Standard SQL requires the user to construct an N-way self-join in order to compute a sequence function depending on N different rows. For example, consider a table with one row per day that records the day number and the low and high temperatures for the day. The change in low temperatures from day to day could be computed using a join query like the following:
SELECT (T1.LOWTEMP-T2.LOWTEMP) PA1 FROM TEMP T1, TEMP T2 PA1 WHERE T1.DAY=T2.DAY-1 PA1 ORDER BY DAY; PA1 SELECT (T1.LOWTEMP-(SELECT T2.LOWTEMP FROM TEMP T2 PA1 WHERE T1.DAY=T2.DAY-1)) PA1 FROM TEMP T1 ORDER BY DAY;
or
This approach has several drawbacks. First, the number of joined tables increases with the number of rows referred to in the scalar expression, making the syntax neither manageable nor intuitive. Furthermore, there is an assumption that the value of DAY always increments by one in the sequence; that is, there is no generic notion of "previous." Moreover, the execution performance of such queries is likely to be very poor because of the multiple joins.
In addition to the basic sequence functions mentioned above, various running and moving sequence functions would be useful to users of database systems. A running function is one that is applied to all rows starting with a defined beginning row and continuing through the current row. For each successive row, the running function is updated to include data from all prior rows and the current row. A moving function is one that is applied to a moving range or "window" of rows. Generally, each time a moving function is evaluated for a new row, one row is dropped from the back end of the window and the current row is added to the front end of the window. For instance, such functions would include functions for determining the running minimum, maximum, sum, average or variance of field in a table, or the moving sum, average or variance of a moving window of rows.