The present invention relates to data servers or data base managers which retrieve data from a mass storage device at a high rate of speed and provide the data to a host computer.
Data base management refers to the storage and retrieval of data in a manner transparent to a host computer user. Such management systems are typically implemented in software. The host computer which uses a data base manager can be anything from a mainframe to a personal computer in a network.
Personal computers in a network are being used to perform functions previously done by a mainframe, and have access to large databases due to the decreasing costs of memory. At the same time, such personal computers typically have limited input/output (I/O) bandwidth, which means that large amounts of data require a significant amount of time to be loaded into the personal computer. The use of networks with personal computers increases the performance problem since all the data is funneled through the same link. This network bandwidth limitation causes contention problems among the users. Accordingly, it is desirable to have some sort of data base management which retrieves the data at high speed and delivers only relevant data to the host computer.
The major types of data base management systems are (1) hierarchical, (2) network, and (3) relational. The first two types of data base management systems require predefined indexing of the data base so that the data desired can be determined from the index without the need for examining all elements of data. Such indexing systems are required because of the limited speed with which data can be retrieved and examined. A relational data base, by contrast, uses no predefined indices, so all of the data must be examined.
The "relational data base" concept starts with a logical data model. This model depicts the data in a manner which is consistent with the way we want to view data. The model makes the logical view independent of the physical storage environment. The data structure is simply a table with rows of data (records) and columns which define the domains or fields within the rows, and no predefined indices are necessary. Since no predefined indices are used, the relationships among data items desired by a user do not have to be anticipated when the data base is assembled. Instead, a user can form the relationships desired at the time a query of the data base is made. The following example table (T1) of data shows this relational model:
______________________________________
Part         Inventory   Unit    Monthly
Number       Quantity    Cost    Usage
______________________________________
3405-0001    12          6.40    42
3406-0001    300         1.20    200
3406-0002    122         .43     200
3406-0004    6           9.22    14
______________________________________
The most widely accepted access language for the relational model data base is SQL (Structured Query Language), which is used here as an example. It has many optional clauses which give it power with the relational data model. The simplest form is:
______________________________________
SELECT <list of columns>
FROM <list of tables>
WHERE <information criteria (boolean expressions)>
______________________________________
With this statement, only the rows of the table which meet the "where" criteria are collected, and only the "select" columns of those rows are returned to the requestor. For example, using the table, T1, above:
______________________________________
SELECT <part numbers>
FROM <T1>
WHERE <inventory quantity is less than monthly use>
______________________________________
The data returned would be:
3405-0001
3406-0002
3406-0004
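The example query can be reproduced with a short Python sketch that uses the standard sqlite3 module. The column names (part_number, inventory_quantity, unit_cost, monthly_usage) and the function name low_stock_parts are assumptions chosen for illustration; the text itself gives only the column headings of T1.

```python
import sqlite3

# Table T1 from the text. The SQL column names below are illustrative assumptions.
ROWS = [
    ("3405-0001", 12, 6.40, 42),
    ("3406-0001", 300, 1.20, 200),
    ("3406-0002", 122, 0.43, 200),
    ("3406-0004", 6, 9.22, 14),
]


def low_stock_parts():
    """Return part numbers whose inventory quantity is below monthly usage."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE T1 (part_number TEXT, inventory_quantity INTEGER, "
        "unit_cost REAL, monthly_usage INTEGER)"
    )
    conn.executemany("INSERT INTO T1 VALUES (?, ?, ?, ?)", ROWS)
    cursor = conn.execute(
        "SELECT part_number FROM T1 "
        "WHERE inventory_quantity < monthly_usage "
        "ORDER BY part_number"
    )
    result = [row[0] for row in cursor]
    conn.close()
    return result


print(low_stock_parts())
```

The relationship between the two columns is stated only when the query is made; nothing about it had to be anticipated when the table was created.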
The inherent problem with relational data base access is speed. Since there are no linkage or index paths through the data, all records (rows) must be examined to see if the select criteria are met.
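The cost difference can be illustrated with a minimal Python sketch that counts how many rows are examined. The helper names (scan, build_index, indexed_lookup) are hypothetical, and a dictionary stands in for a predefined index; this is a simplification for illustration, not a description of any particular data base manager.

```python
# Table T1 from the text, as
# (part_number, inventory_quantity, unit_cost, monthly_usage).
T1 = [
    ("3405-0001", 12, 6.40, 42),
    ("3406-0001", 300, 1.20, 200),
    ("3406-0002", 122, 0.43, 200),
    ("3406-0004", 6, 9.22, 14),
]


def scan(table, predicate):
    """Relational-style access: examine every row.

    Returns (matching_rows, rows_examined).
    """
    examined = 0
    matches = []
    for row in table:
        examined += 1
        if predicate(row):
            matches.append(row)
    return matches, examined


def build_index(table, column):
    """Predefined index: map each value of one column to its row positions."""
    index = {}
    for position, row in enumerate(table):
        index.setdefault(row[column], []).append(position)
    return index


def indexed_lookup(table, index, key):
    """Indexed access: touch only the rows the index points to."""
    positions = index.get(key, [])
    return [table[p] for p in positions], len(positions)


# An ad hoc relationship between two columns needs a full scan.
low_stock, scanned = scan(T1, lambda row: row[1] < row[3])

# A lookup on the indexed column touches a single row.
part_index = build_index(T1, 0)
found, touched = indexed_lookup(T1, part_index, "3406-0002")
```

The index helps only for the relationship it was built for (here, lookup by part number); the ad hoc comparison of two columns still has to visit every row.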
The relational data base method, like other data base methods, is administered by a data base manager (or data server) which interfaces between a host computer user's program and the stored data. The host computer program requests specific data from the data base manager, i.e., the data items from the data base which correspond to the given relationship. The data base manager receives the high level request and prepares the data to satisfy the request.
A software data base manager which executes in the host processor retrieves data from a hard disk and puts the data into a random access memory for use by the user's program. The user program tells the data base program what data fields are needed, and the data base program retrieves these fields from the disk and loads them into the random access memory for use by the user program using the host processor at its standard, slower speed.
Some data base manager facilities utilize a separate independent processor. Early versions of this type of system were standard processors running standard software for data retrieval. In these systems the data base management function was split between the host processor and the data server. The data server sent all data to the host, and the host program provided the data base access algorithms to select the desired data.
Later versions of the data base managers utilizing separate processors have moved the data base access algorithms into the separate processor. In these systems the data base processor is a processor just like the host processor such that the same software data base management program can be executed in both. In this technique data is retrieved from a disk at high speed and stored in random access memory. The data in the random access memory is then examined by the processor of the data server to determine the appropriate portions to send on to the host computer, via the network link.
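The difference between the two arrangements can be sketched in Python by comparing what crosses the link in each case. The wire format (comma-separated text, one row per line) and the function names encode, early_server, and later_server are assumptions made for this illustration only.

```python
# Table T1 from the text, as
# (part_number, inventory_quantity, unit_cost, monthly_usage).
T1 = [
    ("3405-0001", 12, 6.40, 42),
    ("3406-0001", 300, 1.20, 200),
    ("3406-0002", 122, 0.43, 200),
    ("3406-0004", 6, 9.22, 14),
]


def encode(rows):
    """Serialize rows as comma-separated text, one row per line (assumed format)."""
    lines = [",".join(str(field) for field in row) for row in rows]
    return "\n".join(lines).encode("ascii")


def early_server(table):
    """Early arrangement: the server sends every row; the host does the selection."""
    return encode(table)


def later_server(table, predicate, columns):
    """Later arrangement: the server selects rows and columns before sending."""
    selected = [
        tuple(row[c] for c in columns) for row in table if predicate(row)
    ]
    return encode(selected)


everything = early_server(T1)
relevant = later_server(T1, lambda row: row[1] < row[3], columns=[0])
print(len(everything), "bytes sent without server-side selection")
print(len(relevant), "bytes sent with server-side selection")
```

In both arrangements the server still reads and examines every row; only the traffic over the link to the host shrinks in the later one.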
Because of the time required for storing the data in random access memory, doing the comparison, and storing the desired data again, indexing is beneficial to all of these data base management techniques.
Typically, a general purpose computer is used for a data server. Such a general purpose microprocessor will have a data and address bus coupled to an external memory for both storing data and storing a program which will run the microprocessor. The same bus is used for both fetching instructions from memory and for doing operations dictated by that instruction which require data or addresses to travel over a bus. Accordingly, some form of bus arbitration is needed.
The processing of data is slowed by the time required to fetch and decode instructions. In order to speed the instruction fetch time, a next instruction is typically fetched before the current instruction is decoded to give an instruction "pipeline", with the next instruction entering the pipe before the current instruction leaves the pipe. This will speed operation in the usual case where the next instruction fetched is determined by simply incrementing a program counter, but will be a wasted fetch where the decoded current instruction contains a jump to another location for the next instruction.
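The wasted fetch can be illustrated with a toy Python model of a one-instruction prefetch. The instruction encoding (tuples tagged "op", "jump", "halt") and the function name run_with_prefetch are hypothetical; the sketch counts discarded prefetches and does not model timing.

```python
def run_with_prefetch(program, max_steps=1000):
    """Execute a toy program with a one-instruction prefetch.

    Each instruction is ("op",), ("jump", target) or ("halt",).
    Returns (instructions_executed, wasted_prefetches).
    """
    pc = 0
    executed = 0
    wasted = 0
    while executed < max_steps:
        instruction = program[pc]
        # The instruction at pc + 1 is fetched before the current one is decoded.
        prefetched_pc = pc + 1
        executed += 1
        if instruction[0] == "halt":
            break
        if instruction[0] == "jump":
            next_pc = instruction[1]
        else:
            next_pc = pc + 1
        if next_pc != prefetched_pc:
            # The prefetched instruction is not the one needed and is discarded.
            wasted += 1
        pc = next_pc
    return executed, wasted


straight_line = [("op",), ("op",), ("halt",)]
with_jump = [("op",), ("jump", 3), ("op",), ("op",), ("halt",)]

print(run_with_prefetch(straight_line))
print(run_with_prefetch(with_jump))
```

In the straight-line program every prefetch is used; in the second program the jump at location 1 makes the prefetch of location 2 useless.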
A jump is typically done by loading a jump (or branch) address into a program counter in response to a jump instruction. The next instruction is stored in a push-down stack of registers so that it can be retrieved upon a return from the subroutine which the program is jumping to. In addition, the push-down stack can be used to store any constants which are in general purpose registers used by the processor, since these constants may be lost by the use of the registers by the subroutine.
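The role of the push-down stack can be sketched with a toy Python machine. The instruction set ("load", "call", "ret", "halt"), the single register r0, and the choice to push the location of the next instruction together with the register value are all assumptions made for illustration, not details taken from the text.

```python
def run_with_stack(program):
    """Run a toy program with subroutine calls.

    Instructions: ("load", value), ("call", target), ("ret",), ("halt",).
    Returns (trace_of_program_counter_values, final_value_of_r0).
    """
    pc = 0
    r0 = 0        # one general purpose register
    stack = []    # push-down stack: last in, first out
    trace = []
    while True:
        trace.append(pc)
        instruction = program[pc]
        kind = instruction[0]
        if kind == "halt":
            return trace, r0
        if kind == "load":
            r0 = instruction[1]
            pc += 1
        elif kind == "call":
            # Save where to resume, and the register the subroutine may overwrite.
            stack.append((pc + 1, r0))
            # Load the jump address into the program counter.
            pc = instruction[1]
        elif kind == "ret":
            # Restore the saved location and register value.
            pc, r0 = stack.pop()
        else:
            raise ValueError("unknown instruction: %r" % (kind,))


program = [
    ("load", 7),    # 0: main program puts a constant in r0
    ("call", 3),    # 1: jump to the subroutine at location 3
    ("halt",),      # 2: execution resumes here after the return
    ("load", 99),   # 3: subroutine overwrites r0
    ("ret",),       # 4: return to the caller
]

trace, r0 = run_with_stack(program)
print(trace, r0)
```

The trace shows control leaving location 1, running the subroutine at locations 3 and 4, and resuming at location 2 with the constant in r0 restored.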