The present invention relates to large system computer architecture and, more particularly, to interfacing elements, such as queues and crossbars, employed between devices in large computer systems for the transfer of data.
Basic digital computing operates in the cycles shown in FIG. 1. In what is referred to as a Von Neumann computer, a sequence of instructions is alternately fetched from its location in memory into an execution register and executed. The computational sequence begins by providing the computer with a first instruction address for execution. Thereafter, the address of the next instruction is automatically calculated by the hardware as a result of the previous execution.
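The fetch and execute cycle described above can be illustrated with a minimal Python sketch. The instruction names (`LOAD`, `ADD`, `JUMP`, `HALT`), the single accumulator, and the function `run` are assumptions introduced only for illustration; they are not taken from FIG. 1.

```python
def run(memory, start_address):
    """Alternate fetch and execute until a HALT instruction is reached."""
    pc = start_address            # first instruction address supplied to the machine
    acc = 0                       # single accumulator (an assumption of this sketch)
    while True:
        opcode, operand = memory[pc]   # fetch into the "execution register"
        next_pc = pc + 1               # default: next sequential address
        if opcode == "LOAD":
            acc = operand
        elif opcode == "ADD":
            acc += operand
        elif opcode == "JUMP":
            next_pc = operand          # next address results from this execution
        elif opcode == "HALT":
            return acc
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
        pc = next_pc


# Hypothetical program: the JUMP skips the instruction at address 3.
program = [("LOAD", 2), ("ADD", 3), ("JUMP", 4), ("ADD", 100), ("HALT", 0)]
result = run(program, 0)
```

Here `result` is 5, because the jump at address 2 bypasses the `ADD 100` instruction. The only point of the sketch is that each next address is determined by the instruction just executed.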
A simple Von Neumann system is shown in FIG. 2. The arithmetic and control unit 10 sequentially fetches and executes a series of instructions located in instructional memory 12. Typically, the instructions from memory 12 cause the arithmetic and control unit 10 to operate on data read in through input lines and/or maintained in data memory 14. The results of the calculations can be output to be seen by an operator on a device such as typewriter 16.
Certain computational projects of large magnitude have traditionally imposed severe time constraints on computational ability. Applications such as major conversion problems and military command and control systems requiring repetitive, rapid, and immediate results fall into this category. To solve such problems, the Von Neumann type computer has been made bigger and bigger and faster and faster. Recently, much thought has been given to deviating from this approach and having the data drive the problem rather than the problem drive the data. Much of the impetus for research in this area has come from recent developments in hardware wherein microprocessors and "smart" chips have been made available in smaller and smaller sizes and at lower and lower costs. Such an approach is shown in its simplest form in FIG. 3. In such a system, the driving data elements are referred to as tokens. The tokens cause the "firing" of an activity which generates results depending upon the value of the tokens. Assume that an activity is implemented by a separate chip, and that chip 18 has two inputs 20 and 22 and an output 24. Assume further that chip 18 in the presence of both inputs 20, 22 will produce their sum at the output 24. The activity of chip 18 is simply that of the summing of the inputs.
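The firing behavior of the summing activity can be sketched in Python as follows. The class name `SummingActivity` and the choice to consume the tokens once the activity fires are assumptions of this sketch; the text states only that the sum appears at output 24 when both inputs are present.

```python
class SummingActivity:
    """Hypothetical model of chip 18: fires once both input tokens are present."""

    def __init__(self):
        self.tokens = {}          # input line number -> token value

    def receive(self, input_line, value):
        """Accept a token on input 20 or 22; return the sum if the activity fires."""
        if input_line not in (20, 22):
            raise ValueError("chip 18 only has inputs 20 and 22")
        self.tokens[input_line] = value
        if len(self.tokens) < 2:
            return None           # still waiting for the other token
        result = self.tokens[20] + self.tokens[22]
        self.tokens.clear()       # assumption: tokens are consumed by the firing
        return result


chip_18 = SummingActivity()
first = chip_18.receive(22, 7)    # only one token present: no firing
second = chip_18.receive(20, 5)   # both present: output 24 carries the sum
```

In this run `first` is `None` and `second` is 12. The order in which the two tokens arrive does not matter; the arrival of the last one triggers the computation.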
Such a series of activities can be hard-implemented as "activity chips" or can be treated as logical entities emulated by more general devices, such as microprocessors 26 which are shown interconnected in FIG. 4 in what is referred to as a Petri net. Each of the activities 26 performs a function (indicated as f.sub.1 -f.sub.5 respectively) on the input(s). Such Petri nets are pre-established and do not change during the course of a computation. Each of the activities 26 is independent and, therefore, the sequence of computations is a function of the arrival of the tokens. When tokens A and B are present and available for function f.sub.1, its output is available as an input to functions f.sub.2 and f.sub.4. Token C in the presence of the output of function f.sub.1 will cause function f.sub.2 to be calculated, providing the second input for function f.sub.4 which, in turn, provides one of the two inputs required for function f.sub.5. The presence of tokens D and E causes the output of function f.sub.3 to be made available as the second input for function f.sub.5.
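The token-driven sequencing of the net of FIG. 4 can be sketched in Python as shown below. The wiring follows the description above, but the arithmetic assigned to each of f.sub.1 through f.sub.5 is an arbitrary assumption, since the text does not say what the functions compute; the names `NET` and `run_net` are likewise hypothetical.

```python
import operator

# Fixed net: activity -> (function, the two inputs it waits for).
# The wiring follows the text; the operations themselves are assumed.
NET = {
    "f1": (operator.add, ("A", "B")),
    "f2": (operator.mul, ("f1", "C")),
    "f3": (operator.sub, ("D", "E")),
    "f4": (operator.add, ("f1", "f2")),
    "f5": (operator.mul, ("f4", "f3")),
}


def run_net(token_arrivals):
    """Fire each activity as soon as both of its inputs are available."""
    available = {}                # token or activity name -> value
    fired = []                    # order in which activities fired
    for name, value in token_arrivals:
        available[name] = value
        progress = True
        while progress:           # keep going while any activity can still fire
            progress = False
            for activity, (func, needs) in NET.items():
                ready = all(n in available for n in needs)
                if activity not in available and ready:
                    available[activity] = func(*(available[n] for n in needs))
                    fired.append(activity)
                    progress = True
    return available.get("f5"), fired


values = {"A": 1, "B": 2, "C": 3, "D": 9, "E": 4}
out_1, order_1 = run_net([(k, values[k]) for k in "ABCDE"])
out_2, order_2 = run_net([(k, values[k]) for k in "DECAB"])
```

Both runs produce the same final value (60 with these assumed functions), while the firing order differs: `f1, f2, f4, f3, f5` for the first arrival order and `f3, f1, f2, f4, f5` for the second. The net itself never changes; only the arrival of tokens determines the sequence of computations.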
Turning now to FIG. 5, a simplified drawing of a computer system of the Von Neumann type as used in real-time on-line systems is shown. The main memory 30 as accessed by the arithmetic and control unit (not shown) contains resident system programs 32, resident sub-routines 34 (usually re-entrant, i.e., more than one program can be using them at a time), and a large section of available memory 36 wherein programs can be loaded for temporary execution. The majority of the programs and data are maintained on a mass storage device 38. When, as a result of an appropriate stimulus such as an interrupt, data arrival, or the like, the resident system program 32 requires the execution of a program on the mass storage device 38, that program is transferred from the mass storage device 38 into an unoccupied portion of available execution memory 36 as symbolized by the dotted area 40. When the program has been transferred into area 40, control is transferred to the first instruction and execution proceeds normally. During the execution procedure, the program within dotted area 40 can employ one or more of the sub-routines 34. When the program has completed its operation, dotted area 40 is merely returned to available status. That is, programs are not transferred back to mass storage 38. In the preferred mode of operation of such systems, the programs on mass storage device 38 are maintained in "dump" format and coded according to "run anywhere" techniques. Thus, a simple block transfer from mass storage 38 into execution memory 36 can be accomplished and the program will run wherever loaded. Moreover, the program can be located and operated simultaneously in more than one available area within memory 36.
A more exotic version of such a system is shown in simplified form in FIG. 6. Such a multi-processor system is typically found in military command and control systems wherein such a large quantity of functions must be accomplished that no single processor (e.g., computer) can accomplish the entire task. In such a system, one or more mass storage devices 38 containing programs and data are interfaced with a communication bus 42. The processors 44 are also interfaced with the communication bus 42. In such manner, the processors 44 can communicate with one another along the bus 42 as well as access the mass storage 38. When operating in a task oriented mode, a common task list or queue is maintained and tasks are assigned to the next available processor 44 in response to an initiating stimulus.
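The task-oriented mode can be illustrated with a small Python simulation in which each task on a common queue goes to whichever processor becomes available first. The function name `dispatch`, the use of fixed task durations, and the tie-breaking by processor number are assumptions of this sketch and are not drawn from FIG. 6.

```python
import heapq
from collections import deque


def dispatch(task_durations, processor_count):
    """Assign queued tasks, in order, to the next available processor.

    Returns a list of (task_id, processor_id, start_time) tuples.
    """
    tasks = deque(enumerate(task_durations))       # the common task queue
    # Min-heap of (time at which the processor becomes free, processor id).
    free_at = [(0, p) for p in range(processor_count)]
    heapq.heapify(free_at)
    assignments = []
    while tasks:
        task_id, duration = tasks.popleft()
        time_free, proc = heapq.heappop(free_at)   # next available processor
        assignments.append((task_id, proc, time_free))
        heapq.heappush(free_at, (time_free + duration, proc))
    return assignments


# Four tasks of assumed durations 5, 2, 4 and 1 shared by two processors.
schedule = dispatch([5, 2, 4, 1], 2)
```

With these inputs, processor 1 finishes its short first task at time 2 and therefore also receives the third task, while the fourth task waits until processor 0 is free at time 5.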
Communication crossbar circuits and fast propagating first in, first out (FIFO) queues have been postulated for use in such large systems in order to eliminate potential bottlenecks. That is, when one entity needs to communicate with another, a bottleneck can occur if a communication path between the two cannot be established immediately. In such case, a communication crossbar circuit is proposed. In like manner, when a large number of items are placed on a queue having many stages, a potential bottleneck can occur if the next piece of available data is many stages away from the requester and a considerable number of clock pulses must be spent in order to shift the data to the user.
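The queue bottleneck can be made concrete with a Python sketch of a stage-by-stage FIFO in which an item advances one stage per clock pulse. The function name `clocks_until_output` and the one-stage-per-pulse rule are assumptions chosen to illustrate the delay; they do not describe any particular proposed device.

```python
def clocks_until_output(stages):
    """Count clock pulses for one item to travel from the input stage to the output stage."""
    fifo = [None] * stages
    fifo[0] = "item"              # item enters at the input stage
    clocks = 0
    while fifo[-1] is None:
        # One clock pulse: scan from the output end so that an item
        # moves at most one stage per pulse.
        for i in range(stages - 1, 0, -1):
            if fifo[i] is None and fifo[i - 1] is not None:
                fifo[i], fifo[i - 1] = fifo[i - 1], None
        clocks += 1
    return clocks


delay_8 = clocks_until_output(8)
delay_64 = clocks_until_output(64)
```

Under this model an item entering an otherwise empty queue of 8 stages needs 7 pulses to reach the requester, and one entering a queue of 64 stages needs 63, which is the kind of delay a fast propagating queue is meant to avoid.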
The devices proposed to date either exist on paper only or, if built, operate only as stand-alone devices. This is because of the inability of the designers of such apparatus to recognize the limitations which must be included within such apparatus when working in the given environment. For example, a common technique is to make such devices operate asynchronously to the system. The patents to Faustini (U.S. Pat. No. 3,757,231), Mallerich, Jr. (U.S. Pat. No. 3,736,575), and Derickson, III, et al (U.S. Pat. No. 3,972,034) are examples of such apparatus. Faustini describes an asynchronous unclocked FIFO which shifts stage by stage. Mallerich describes an improvement over Faustini which operates in the same manner, but with fewer components. The Derickson patent includes the description of a FIFO as part thereof beginning with the description in column 3. That FIFO also is asynchronous.
In a single computer of the Von Neumann type, the problem cannot occur because any word in memory will always be completely accessed. That is, a word in memory will always be written into completely or read completely. In an interrupt-operated system having different priority levels, the potential problem being discussed herein is recognized in the technique of "re-entrant" coding, which must be employed whenever common data can be accessed by multiple programming levels. Even in that case, however, reading and writing are on a complete word basis.
By contrast, if a common memory location is being accessed asynchronously, it is possible for one device to be accessing a particular memory location as, for example, by reading from it simultaneously with another device writing into it. Such an occurrence can result in completely garbled and unintelligible data. The reason that this is not normally considered or accounted for is that, typically, it only happens on rare occasions. Thus, the designer running a short test case probably does not have the event occur or, if it does, it is not recognized. As a practical matter, however, it is something which does occur and must be accounted for.
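The garbling described above can be reproduced deterministically with a short Python sketch. It assumes, purely for illustration, that a 16-bit word is written one byte at a time and that a second device may read between the two byte writes; the names `write_steps` and `read_word` are hypothetical.

```python
def write_steps(memory, value):
    """Write a 16-bit value one byte at a time, pausing after each byte."""
    memory["hi"] = (value >> 8) & 0xFF
    yield                         # another device may access memory here
    memory["lo"] = value & 0xFF
    yield


def read_word(memory):
    """Read both bytes and assemble them into one 16-bit word."""
    return (memory["hi"] << 8) | memory["lo"]


memory = {"hi": 0x12, "lo": 0x34}     # location initially holds 0x1234
writer = write_steps(memory, 0xABCD)

next(writer)                          # writer has stored only the high byte
torn = read_word(memory)              # unsynchronized read lands mid-write
next(writer)                          # writer stores the low byte
settled = read_word(memory)
```

Here `torn` is 0xAB34, a value that was never written by anyone, while `settled` is the intended 0xABCD. A short test that happens to read only before or after the write would never reveal the problem.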
Thus, any crossbar or FIFO operating within a computer architecture as to be herein described must be synchronized to the clock driving the system so that appropriate measures can be incorporated to prevent such potentially catastrophic events from occurring.
It is the object of the present invention to provide a data driven computer architecture having the benefits of data driven initiation with the flexibility of more traditional Von Neumann computers.
It is a further object of the present invention to provide such an architecture with interfacing mechanisms which maximize the flow rate of data between elements and minimize the time delays inherent therein.