Relational database systems store data in tables organized by columns and rows. The tables are typically linked together by “relationships” that simplify the storage of data and make complex queries against the database more efficient. Structured Query Language (SQL) is a standardized language for creating and operating on relational databases.
In some cases, databases are arranged such that data in each row of a table is associated with a particular time period. For example, a retail store may wish to keep track of employees and their sales performance over a period of time. To do so, they may set up a database table, where each row of the table contains an element “x” (where “x” may hold the name of the employee), and an element “y” (where “y” represents the total number of sales for a particular product over a defined period of time, such as a day, week, month, etc.) and the elements x, y may be associated with a “start date” and an “end date”. This allows a user to know that employee “x”, had sales figures “y” over a defined period of time.
This type of database is generally termed a “temporal database”, and the term “temporal grouping” refers to a process where like elements that share a common or overlapping time-line (or part of a common time-line) are grouped. For example, a user may wish to perform a query on the database to determine all employees working in the store over a particular month, and their accumulated sales figures for the month. This would require the database to collate all employees who worked during the specified time period, as well as aggregating the total sales for each employee over the specified time period. That is, in addition to grouping, aggregate values may also need to be computed over given time intervals. To extend the simple example given above, the “grouping element” or “grouping value” is “x”, the employee name, since all instances of the employee sales must be collated. The value to be aggregated is “y”, namely the total number sales by the employee.
In a temporal table each row is associated with a valid time (VT) period indicating when the row is valid. In a temporal table each row is also associated with a transaction time (TT) period indicating when the row is recorded in the database. The values VT and TT may be more generally referred to as “time period values”. The values may also be referred to by a skilled addressee as an Effective Time period or an Observation Time period. A sequenced temporal join (join between two temporal tables) requires that two rows can be joined only when the time period values of these two rows overlap. The time period values can be any time period value with a semantic meaning. Typically ValidTime and TransactionTime are used as time period value with a semantic meaning. There may be other such time period values which may also be used, depending on the specific database and the information/data values contained therein.
In an outer join, the rows of the outer table that do not have a matching row in the inner table are also produced in the result with NULL values for the inner table column.
In a temporal outer join an outer table row may only be partially covered by the inner table rows. In other words, there may be one or more portions of valid time of the outer table row that have no matching row for the portions in the inner table. The result must contain rows with the projected columns of the outer table followed by NULL values for the projected columns of the inner table for such remainder or “left over” portions of valid time. Temporal outer join requires additional processing to identify the remainder portions of each outer table row. This needs to be done efficiently in order to optimize the join.
In many prior art solutions, a temporal outer join is performed by first expanding the outer and inner table rows based on their valid time periods. The outer join is then performed on the expanded tables at each time granule as an intermediate result. That is, each outer and inner table row is expanded such that there is one row for each time granule. The intermediate result is then collapsed to give the final result. This is a computationally expensive process.