Sequence functions are functions that operate on ordered sets of rows and that require knowledge of or access to past values. An example of a sequence function is the running maximum function, which returns the maximum value up to the current point in an ordered sequence. Sequence functions are similar to scalar functions in that a single output value is produced for each input value. Sequence functions are different from scalar functions in that the result of a sequence function may depend on some or all of the previous values processed and the processing order.
Standard SQL (structured query language) does not provide for the direct expression or efficient computation of sequence functions. Computing sequence functions is very distinct from grouping rows or joining rows from different tables to produce a result. In the case of grouping rows, the source rows are combined with an aggregate function into a set of result rows, where each source row participates in at most a single result row. This amounts to a partitioning of the input rows into groups and then applying an aggregate to reduce each group to a single result row. In the case of joining tables, data for a result row depends on data from multiple source tables. Each source row may appear zero or more times in the result, depending on the join.
Standard SQL requires the user to construct an N-way self-join in order to compute a sequence function depending on N different rows. For example, consider a table with one row per day that records the day number and the low and high temperatures for the day. The change in low temperatures from day to day could be computed using a join query like the following:
SELECT (T1.LOWTEMP-2.LOWTEMP) PA1 FROM TEMP T1, TEMP T2 PA1 WHERE T1.DAY=T2.DAY-1 PA1 ORDER BY DAY; PA1 SELECT (T1.LOWTEMP-(SELECT T2.LOWTEMP FROM TEMP T2 PA1 WHERE T1.DAY=T2.DAY-1)) PA1 FROM TEMP T1 ORDER BY DAY;
or
This approach has several drawbacks. First, the number of joined tables increases with the number of rows referred to in the scalar expression, making the syntax neither manageable nor intuitive. Furthermore, there is an assumption that the value of DAY always increments by one in the sequence; that is, there is no generic notion of "previous." Moreover, the execution performance of such queries is likely to be very poor because of the multiple joins.
In addition to the basic sequence functions mentioned above, it would be useful to have a sequence function that automatically determines the distance, in rows, between the current row and the most recent row in which a search condition was true. In this sense, this would be a sort of inverse operation to the sequence functions discussed above, returning the effective offset of the nearest previous row that satisfies a specified search condition.