1. Technical Field
This disclosure generally relates to computer systems, and more specifically relates to database systems.
2. Background Art
Database systems have been developed that allow a computer to store a large amount of information in a way that allows a user to search for and retrieve specific information in the database. For example, an insurance company may have a database that includes all of its policy holders and their current account information, including payment history, premium amount, policy number, policy type, exclusions to coverage, etc. A database system allows the insurance company to retrieve the account information for a single policy holder among the thousands and perhaps millions of policy holders in its database. Retrieval of information from a database is typically done using queries. A database query typically includes one or more predicate expressions interconnected with logical operators.
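By way of illustration only, a query whose predicate expressions are interconnected with logical operators might look like the following minimal sketch. It uses Python's built-in sqlite3 module; the table name, column names, and sample rows are hypothetical and are not taken from any particular insurance database.

```python
import sqlite3

# In-memory database with a hypothetical policy table (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE policies ("
    "policy_number INTEGER PRIMARY KEY, holder TEXT, "
    "policy_type TEXT, premium REAL)"
)
conn.executemany(
    "INSERT INTO policies VALUES (?, ?, ?, ?)",
    [
        (1001, "A. Rivera", "auto", 120.0),
        (1002, "B. Chen", "home", 95.5),
        (1003, "C. Okafor", "auto", 80.0),
    ],
)

# Three predicate expressions joined by the logical operators AND and OR.
rows = conn.execute(
    "SELECT policy_number, holder FROM policies "
    "WHERE policy_type = 'auto' "
    "AND (premium > 100 OR holder = 'C. Okafor') "
    "ORDER BY policy_number"
).fetchall()

print(rows)
```

Only the rows that satisfy the combined predicate are returned, which is how a single policy holder's record can be retrieved from a much larger table.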
Structured Query Language (SQL) provides a way to write queries to a database. SQL supports user-defined functions. A user-defined function (UDF) in SQL lets the programmer package logic as an encapsulated, reusable unit that can be invoked from SQL. User-defined functions can perform any service the programmer wants to implement. The services in a UDF may be very simple or may be quite complex.
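As a rough sketch of the idea, the following registers a very simple function and calls it from SQL. It assumes Python's built-in sqlite3 module, where a function is registered through the connection's create_function method; other database systems define UDFs differently (for example with a CREATE FUNCTION statement). The function name annual_premium and its logic are hypothetical.

```python
import sqlite3

def annual_premium(monthly_premium: float) -> float:
    """Hypothetical service: convert a monthly premium to a yearly amount."""
    return round(monthly_premium * 12, 2)

conn = sqlite3.connect(":memory:")

# Register the Python callable under a SQL-visible name taking one argument.
conn.create_function("annual_premium", 1, annual_premium)

# The function can now be used in a query like any built-in SQL function.
result = conn.execute("SELECT annual_premium(95.5)").fetchone()[0]
print(result)
```

Once registered, the same function can be reused in any number of queries on that connection.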
A UDF in SQL can be deterministic or non-deterministic. A deterministic UDF is a UDF that is predictable, meaning a given set of inputs to the UDF will always produce the same result. A non-deterministic UDF is one that does not always produce the same result for a given set of inputs. The disclosure and claims herein deal with deterministic UDFs. Unless otherwise noted, the term “UDF” as used herein means a deterministic UDF.
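The distinction can be sketched with two small functions, written here in Python rather than SQL for brevity. Both functions and their names are hypothetical; the second depends on hidden state (a counter), which is one of several ways a function can be non-deterministic.

```python
import itertools

def premium_with_tax(premium: float, rate: float) -> float:
    """Deterministic: the result depends only on the inputs."""
    return round(premium * (1 + rate), 2)

# Hidden state that changes on every call.
_ticket_counter = itertools.count(1)

def next_ticket(prefix: str) -> str:
    """Non-deterministic: the same input yields a different result each call."""
    return f"{prefix}-{next(_ticket_counter)}"

# Same inputs, same result: safe to compute once and reuse.
same = premium_with_tax(100.0, 0.05) == premium_with_tax(100.0, 0.05)

# Same input, different results: a reused result would be wrong.
differs = next_ticket("CLM") != next_ticket("CLM")

print(same, differs)
```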
Because a deterministic UDF always returns the same result for a particular set of inputs, a database optimizer may cache the results of executing one or more portions of a UDF. The cached results may then be retrieved and reused later in the same query when the same inputs recur. One way that has been used in the art to cache UDF results uses a hash table. In the known implementations of a deterministic UDF, one hash table is assigned for use with the UDF, and that hash table persists for the life of the query. Assume, for the sake of illustration, that a UDF invokes function XYZ three times, once on each of three individual columns. If the sets of values in the three columns are disjoint, meaning the columns have no values in common, a result cached for one column can never be reused for another, and using one hash table for this UDF will provide very poor performance. Without a way to process deterministic UDFs in a more optimized way, processing deterministic UDFs will continue to be inefficient in known database systems.
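The single-hash-table arrangement described above might be sketched as follows. This is an illustration under stated assumptions and not the implementation of any particular database system: the class UdfResultCache, the body of xyz, and the column data are all hypothetical, and a Python dict stands in for the hash table.

```python
from typing import Callable, Dict, Tuple

class UdfResultCache:
    """One hash table for a UDF, kept for the life of one query (hypothetical)."""

    def __init__(self, func: Callable[..., int]) -> None:
        self._func = func
        self._table: Dict[Tuple, int] = {}  # the single shared hash table
        self.hits = 0
        self.misses = 0

    def call(self, *args) -> int:
        # Reuse a cached result when the same inputs were seen before.
        if args in self._table:
            self.hits += 1
            return self._table[args]
        self.misses += 1
        result = self._func(*args)
        self._table[args] = result
        return result

    @property
    def size(self) -> int:
        return len(self._table)

def xyz(value: int) -> int:
    """Stand-in for deterministic function XYZ; the body is made up."""
    return value * value + 1

# Three columns whose value sets are disjoint (illustrative data).
col_a = [1, 2, 1, 2]
col_b = [10, 20, 10, 20]
col_c = [100, 200, 100, 200]

cache = UdfResultCache(xyz)
for a, b, c in zip(col_a, col_b, col_c):
    cache.call(a)  # first invocation of XYZ
    cache.call(b)  # second invocation of XYZ
    cache.call(c)  # third invocation of XYZ

print(cache.hits, cache.misses, cache.size)
```

In this sketch every hit comes from a value repeating within its own column. The one table ends up holding the entries for all three columns, yet each entry can only ever serve one of the three invocations.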