This invention relates to an improvement in information management systems and more particularly to apparatus for optimizing the throughput of such systems that use magnetic disk memory as auxiliary storage for data.
Flexible and very capable computer-based data handling and control systems have been devised and put into use at sites all over the world. One such system consists of multiple AN/GYQ-21(V) units of Interactive Analysis Systems designed for automated data handling and control of multisource intelligence information and command/control data. The systems perform five basic functions:
Message control, storage and distribution.
Interactive analyst support and analytical aid.
Multi-channel, multi-level communications.
Special support through computer graphics and display.
Reference document and report preparation support.
Each unit is made up of a central processing unit (CPU), a mass data file, programmable communication controllers and real time operational software. Within a site, units are linked together to create a single operational entity, with each unit contributing its designated tasks to the overall system operation. This technique of "distributed processing" permits each unit to compute simultaneously with other units, accessing its own or other units' data on a priority basis, resulting in a net performance more powerful than most large-scale machines. As workload requirements change, units may be added or removed as required.
At some sites, the units are interconnected to large-scale, host digital computers. In that case, the system acts as a front-end processor, relieving the host computers of the considerable work associated with interfacing to external systems. In combination with keyboard/display terminals, large screen wall displays and hard copy printers, the system provides automated assistance to the users in the performance of their duties. This is accomplished by a direct interactive dialog between the terminal user and a worldwide network of similar systems and data bases. The user has the ability to rapidly receive, transmit, and display data from similar sites throughout the world. In addition, the system has a variety of methods for communicating with the user in the manipulation, fusion, and exploitation of the data. Through the keyboard and displays the system provides:
Status of current events and pending actions through automatic interrogation of data bases and surveying of message profiles.
Data in the form of interterminal messages, standard data formats, free text from historical files and reference documentation, and computer-generated graphics.
Procedural aids with display forms completion, computer directed prompting and editing.
Operational feedback of user actions with the system, in the form of displayed notices.
Alarms to call the user's attention to critical events or messages.
Special operational capabilities such as teleconferencing, automatic call-back, bulk message transmission to individual addresses or priority broadcast, remote job entry, timesharing, and automatic logging and report generation.
A typical application of such a system involves a heavy communications processing load, large-scale on-line data base operations, and sophisticated support of a large analyst terminal population. Many systems are part of distributed processor configurations involving 3 to 10 systems, each with billion-character on-line data bases. Initially, some of these systems encountered problems due to inefficient use of hardware and software resources and other design problems, which led to premature system saturation and lack of responsiveness. With more imaginative and enlightened use of system resources, however, it has been demonstrated that these problems can be overcome. These experiences did highlight the factors limiting system performance in these applications and indicated the next significant system enhancement. These performance-limiting factors are main (core) memory utilization and disk access time.
Several conventional remedies to this problem have been investigated and determined to be considerably less than optimum. First, the addition of a large central processing unit, such as the PDP 11/70, to the system processor family with its two-million word memory capacity is a solution if the performance limits are construed as due solely to lack of memory capacity. The system provides several mechanisms whereby tradeoffs between added core memory and reduced disk access delays can be made. While these measures can be effective in reducing delays involving system task loading, they are of only limited value in large data base, multi-user systems. In such systems, large numbers of different files are always active, and the price of maintaining in cache memory the resulting large numbers of core-resident user directories, indices, and file headers employed by these techniques is prohibitive.
A second technique traditionally employed is the use of fixed, as well as moving, head disks in a hierarchy of progressively more rapid access but smaller capacity media. This technique approximately duplicates the performance achieved by the large PDP 11/70 mentioned above and suffers from the same limitations. Because of their small capacity (500K words), only a fraction of the concurrently active files could be stored in cache memory. Furthermore, the frequency of use of every block in the file, except for system files (task images, etc.), would not justify the processing overhead required for the initial transfer of the data from the main data base to the intermediate storage media. Such a technique would only be effective for providing preferential treatment to a very small set of files on a semi-permanent basis. Furthermore, overall access delays of 10 to 20 milliseconds would still be incurred. A variant of this technique utilizes bulk core replacements for the fixed head disk units. This has the advantage of much smaller access delays, but suffers from capacity limitations.
The solution provided by the present invention is functionally similar to the conventional solutions mentioned above, but with at least one significant difference. Although a limited capacity random access memory is used, its use is highly selective and based upon predictable demands for disk resident data and dynamically imposed priority criteria. The invention exploits certain file management system attributes to minimize the amount of fast access memory required to perform its function. The net result is that for each user transaction, only those blocks of data actually to be processed next are cached in fast access memory, and disk transactions are carried out in larger, more efficient multi-block transfers.
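The general idea of holding only the blocks to be processed next, and fetching them in multi-block transfers, can be illustrated with a minimal Python sketch. It is not the apparatus of the invention: the class names, the fixed read-ahead count, and the oldest-first eviction rule are all assumptions made for illustration, and the dynamically imposed priority criteria mentioned above are not modeled.

```python
from collections import OrderedDict


class SimulatedDisk:
    """Hypothetical stand-in for a disk; it counts transfers, not real I/O."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.transfers = 0  # number of separate disk transactions issued

    def read_blocks(self, start, count):
        """Perform one multi-block transfer of up to `count` blocks."""
        self.transfers += 1
        end = min(start + count, self.num_blocks)
        return {n: f"data-{n}" for n in range(start, end)}


class SelectiveReadAheadCache:
    """Small fast-access store holding only blocks expected to be read next."""

    def __init__(self, disk, capacity, read_ahead):
        if read_ahead > capacity:
            raise ValueError("read_ahead must not exceed capacity")
        self.disk = disk
        self.capacity = capacity      # size of fast-access memory, in blocks
        self.read_ahead = read_ahead  # blocks fetched per disk transfer
        self.blocks = OrderedDict()   # block number -> data, oldest first

    def read(self, block):
        if block not in self.blocks:
            # Miss: fetch the block and its successors in a single transfer.
            fetched = self.disk.read_blocks(block, self.read_ahead)
            for n, data in fetched.items():
                self.blocks[n] = data
                self.blocks.move_to_end(n)
            # Assumed eviction rule: discard the oldest blocks first.
            while len(self.blocks) > self.capacity:
                self.blocks.popitem(last=False)
        # A block is released as soon as it is consumed, so the store
        # only ever holds data that has not yet been processed.
        return self.blocks.pop(block)


def count_transfers(read_ahead, num_reads=16):
    """Read `num_reads` consecutive blocks and report the disk transfers used."""
    disk = SimulatedDisk(num_blocks=100)
    cache = SelectiveReadAheadCache(disk, capacity=8, read_ahead=read_ahead)
    for n in range(num_reads):
        cache.read(n)
    return disk.transfers
```

Under these assumptions, reading 16 consecutive blocks costs 4 disk transfers with a read-ahead of 4 blocks, against 16 transfers when each block is fetched individually (`count_transfers(4)` versus `count_transfers(1)`), while the fast-access store never holds more than 8 blocks.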