Computer systems generally include one or more processors interfaced to a temporary data storage device such as a memory device and one or more persistent data storage devices such as disk drives. Each disk drive generally has an associated disk controller. Data is transferred from the disk drives to the disk controller and is then transferred to the memory device over a communications bus or similar interconnect. Once data has been transferred from the disk drives to a memory device accessible by a processor, application software is then able to examine the data.
The application software used will depend on the application to which the data relates. If the data is required primarily for statistical analysis, it is common to use the SAS programming language. If, on the other hand, the data is stored in a relational database, queries are often made of the data in SQL, the standard query language for relational databases.
SAS is an imperative language that manipulates data sets as tables. SAS includes constructs to specify arithmetic expressions, flow control and procedure calls. SQL, on the other hand, is a set-oriented language that also manipulates data sets as tables, but which allows the specification of relationships among tables with primary and/or foreign keys. Both SQL and SAS identify data set columns by name rather than by subscript, and both automatically scan all rows in a data set without the need for an explicit loop construct.
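The shared traits described above, columns addressed by name and rows scanned without an explicit loop, can be illustrated with a minimal sketch. The table `sales`, its columns, the sample rows and the function name `high_sales` are hypothetical and chosen only for illustration. The SAS DATA step appears as a comment for comparison and is not executed; the SQL is run through Python's standard `sqlite3` module.

```python
import sqlite3

# Hypothetical SAS DATA step, shown for comparison only (not executed here):
#   data high_sales;
#     set sales;
#     total = price * qty;
#     if total > 100;
#     keep item total;
#   run;
# Both the SAS step and the SQL below refer to columns by name and
# visit every row of the input without a loop written by the user.

def high_sales(rows):
    """Return (item, total) pairs whose total exceeds 100, computed in SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (item TEXT, price REAL, qty INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    result = conn.execute(
        "SELECT item, price * qty AS total "
        "FROM sales "
        "WHERE price * qty > 100 "
        "ORDER BY item"
    ).fetchall()
    conn.close()
    return result

# Illustrative sample data (assumed, not from any real data set).
sample = [("bolt", 2.5, 10), ("gear", 40.0, 3), ("motor", 150.0, 1)]
print(high_sales(sample))
```

The subsetting `if` in the SAS step plays the role of the SQL `WHERE` clause, and the assignment to `total` corresponds to the computed column in the `SELECT` list.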
It would be particularly desirable to create a set of SQL statements that produce the same output as a given SAS program. It would be further desirable to at least partially automate the translation of such SAS programs to SQL programs.