A query is a computer program for retrieving particular items of electronically stored data. Like any other programming task, writing queries is error-prone, and it is helpful if the programming language in which queries are expressed gives assistance in identifying errors while queries are being written, and before they are executed on a relational data source. Programming languages often provide “types” for that purpose, indicating for each variable what kind of value it may hold. A programming language for expressing queries is usually called a “query language”. The most popular example of a query language is the Structured Query Language (SQL).
In SQL, and in most other conventional query languages, types are assigned to each variable separately. As a consequence, only some categories of errors are caught before the query is executed on a relational data source: the only kind of error found is an operation that does not make sense, for instance subtracting a string from an integer. In particular, one cannot accurately predict, without running the query, whether it will return any results, and yet a query that returns no results is the most common symptom of a programming error.
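As a minimal illustration of the point above, the following Python sketch runs a query whose individual comparisons are each well typed but whose conditions contradict one another, so it can never return a row. The table `person`, its columns, and the sample rows are hypothetical and chosen only for this example. SQLite is used only because it ships with Python; it does not itself perform the static checks discussed here.

```python
import sqlite3

# Hypothetical in-memory table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO person (name, age) VALUES (?, ?)",
    [("Ada", 36), ("Bob", 12), ("Cleo", 70)],
)

# Each use of `age` compares an integer with an integer, so a checker that
# types every variable separately has nothing to object to. Taken together,
# the two conditions can never hold for the same row, whatever the data.
contradictory = "SELECT name FROM person WHERE age < 18 AND age > 65"
rows = conn.execute(contradictory).fetchall()
print(rows)  # []

# The query the author probably meant uses OR instead of AND.
intended = "SELECT name FROM person WHERE age < 18 OR age > 65 ORDER BY name"
intended_rows = conn.execute(intended).fetchall()
print(intended_rows)  # [('Bob',), ('Cleo',)]
```

The empty result of the first query does not depend on the rows inserted, which is the kind of error that can only be found ahead of execution by reasoning about the conditions on a variable jointly.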
In the logic programming community, type checkers have been constructed that detect queries that return no results regardless of the contents of the relational data source being queried. However, most conventional type checkers do not precisely track the dependencies between variables. Also, in the theoretical database community, there has been some work on proving containment between queries, but this is typically restricted to small fragments of the query language that are of theoretical interest only. Furthermore, this work does not take advantage of the type hierarchies that typically exist on data stored in a database.