In typical database systems, users submit commands to a database server using a database application. The commands submitted to the database server allow the user to store, update, and retrieve information. To be correctly processed, the commands must comply with the database language that is supported by the database server. One popular database language is known as the Structured Query Language (SQL). As commands are executed by the database server, logical units of work are created. A logical unit of work that comprises one or more database language statements is referred to as a transaction.
In networked environments, a database server often performs query processing for queries submitted by remotely located client stations. The database server processes each query and generates a query result that satisfies the criteria defined by a particular query. The query result must subsequently be transferred to the client station from which the query originated.
In certain instances, the query submitted by a client station involves the joining of two or more relatively large tables. At other times, the query is designed to retrieve a relatively redundant table, that is, a table that contains a significant amount of data repeated across multiple rows. During the transmission of the results of such queries, much of the network transmission time is spent transferring redundant data. This problem becomes increasingly costly as the size of the query result increases and/or the amount of redundant data increases.
FIG. 6 illustrates an example of a join operation between two relatively wide tables and the resulting table, according to conventional methods. A wide table is a table that contains a large number of columns per row. In FIG. 6, reference numerals 610 and 612 represent a first and second wide table, namely table A and table B, respectively. As shown in FIG. 6, table A 610 contains k rows and n columns, where n and k are potentially large integers. Table B 612 contains m columns, where m is a potentially large integer. For simplicity, table B 612 is selected such that it contains the same number of rows as table A 610. An unrestricted join of table A 610 with table B 612 results in a third table, namely table C, indicated by reference numeral 614. Table C 614 contains k*k rows, which corresponds to the number of rows contained in table A 610 multiplied by the number of rows contained in table B 612. Table C 614 contains n+m columns, which corresponds to the number of columns contained in table A 610 plus the number of columns contained in table B 612.
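The row and column counts described for table C 614 can be checked with a minimal sketch. The helper names `make_table` and `unrestricted_join`, the placeholder cell values, and the small sizes chosen for k, n, and m are assumptions made purely for illustration; they do not appear in FIG. 6 and do not represent how a database server implements a join.

```python
from itertools import product


def make_table(name, rows, cols):
    """Build a table as a list of row tuples filled with placeholder values."""
    return [
        tuple(f"{name}{r}.{c}" for c in range(1, cols + 1))
        for r in range(1, rows + 1)
    ]


def unrestricted_join(left, right):
    """Pair every row of `left` with every row of `right` (Cartesian product)."""
    return [l_row + r_row for l_row, r_row in product(left, right)]


# Hypothetical small sizes; in the scenario described, k, n, and m are large.
k, n, m = 3, 4, 2
table_a = make_table("a", k, n)  # k rows, n columns
table_b = make_table("b", k, m)  # k rows, m columns
table_c = unrestricted_join(table_a, table_b)

print(len(table_c), len(table_c[0]))  # prints "9 6", i.e. k*k rows and n+m columns
```

Because `product` iterates over the second table fastest, the rows of `table_c` come out with row 1 of table A paired with each row of table B in turn, then row 2 of table A, and so on.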
A fetch of multiple rows from the result of the join operation, i.e., table C 614, would return information in the form of a matrix ordered as follows: row 1 of table A 610 plus row 1 of table B 612; row 1 of table A 610 plus row 2 of table B 612; . . . ; row 1 of table A 610 plus row k of table B 612, and so on, depending on the number of rows required. As shown in FIG. 6, if table C 614 is retrieved in its entirety, the first k rows fetched would have row 1 of table A 610 in common. Since row 1 of table A 610 is common to the first k rows, it should be readily apparent that retransmission of the same data will increase traffic over the network. In particular, when row 1 of table A 610 is sufficiently large (e.g., contains a large number of columns), the amount of time required to transmit redundant information can have a serious effect on system performance.
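The scale of the retransmission can be estimated with simple arithmetic. The sketch below rests on several assumptions that the text does not state: every cell is counted as one equally sized unit, no compression is applied, table C 614 is fetched in its entirety, and the values of k and n are hypothetical. It counts only the table A 610 portion of each transmitted row.

```python
def table_a_cells_sent(k, n):
    """Table A cells transmitted when all k*k rows of table C are fetched.

    Each of the k*k result rows carries one full n-column row of table A.
    """
    return k * k * n


def table_a_cells_distinct(k, n):
    """Distinct table A cells: k rows of n columns each."""
    return k * n


# Hypothetical sizes, chosen only to illustrate the trend.
k, n = 1000, 50
sent = table_a_cells_sent(k, n)
distinct = table_a_cells_distinct(k, n)

# Share of transmitted table A cells that repeat data already sent.
redundant_fraction = (sent - distinct) / sent
print(sent, distinct, redundant_fraction)
```

Under these assumptions each row of table A 610 is sent k times, so the redundant share of the table A data is 1 - 1/k, which approaches one as k grows.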
Based on the foregoing, a disadvantage associated with current methods of executing queries on a remote server is the increased amount of time required to transmit information that results from operations in which the tables contain a substantial amount of data that is repeated across multiple rows.