1. Field of the Invention
The present invention generally relates to high performance computer systems and, more particularly, to a machine design which combines superior system organization and packaging structure for maximizing performance.
2. Description of the Prior Art
Increasing demand for computer power has outstripped the capability of single processors (uniprocessors) to perform. High performance computers now require many processors operating in parallel and sharing the same main memory; i.e., so-called tightly coupled parallel processors. In addition, numerically intensive computing applications are growing, creating a requirement for vector processing capability at very high speeds.
An example of a tightly coupled multi-processor system is the IBM System/390 9000 series family of computers. The basic organization of a tightly coupled multi-processor (MP) system comprises a plurality of processors which may be selectively connected to a plurality of independently addressable memory modules known as basic storage modules (BSMs). In a typical MP system, there may be N processors and M BSMs, where M is typically greater than N. Since all processors require equal access to the BSMs, there is some form of N×M switch, such as a cross-bar switch, which selectively connects a processor to an addressed BSM for storage and retrieval of data.
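By way of illustration only, the selective connection performed by an N×M switch can be sketched in a few lines of Python. The sketch models one switching cycle in which each BSM can serve at most one processor. The low-order address interleaving in `bsm_index` and the fixed lowest-numbered-first priority in `arbitrate` are assumptions made for the example; the text above does not specify either, and both function names are hypothetical.

```python
from typing import Dict, List, Tuple


def bsm_index(address: int, num_bsms: int) -> int:
    """Select the BSM for an address (assumed: low-order interleaving)."""
    return address % num_bsms


def arbitrate(requests: Dict[int, int], num_bsms: int) -> Tuple[Dict[int, int], List[int]]:
    """Model one cycle of an N x M switch.

    `requests` maps a processor id to the address it wants to access.
    Returns (grants, stalled): `grants` maps each connected processor to
    its BSM, and `stalled` lists processors whose BSM was already taken.
    """
    grants: Dict[int, int] = {}
    stalled: List[int] = []
    busy = set()  # BSMs already connected during this cycle
    # Assumed policy: the lowest-numbered processor wins a conflict.
    for proc in sorted(requests):
        bsm = bsm_index(requests[proc], num_bsms)
        if bsm in busy:
            stalled.append(proc)
        else:
            busy.add(bsm)
            grants[proc] = bsm
    return grants, stalled


if __name__ == "__main__":
    # Three processors, four BSMs; processors 0 and 1 address the same BSM.
    print(arbitrate({0: 5, 1: 9, 2: 6}, num_bsms=4))
```

With M greater than N, as in the typical system described above, requests are spread over more modules than there are requesters, so conflicts of the kind shown for processors 0 and 1 become less frequent.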
The parameters of importance to the performance of the MP system are processor cycle time, bandwidth, electrical path length, round trip delay, and timing skew. The processor cycle time is minimized by placing the cycle determining path elements in the closest possible proximity to each other. The bandwidth between a processor and a BSM is maximized by using the fastest possible data rate over a large number of parallel connections between the processor and the switch and between the switch and the BSMs. The electrical path length is the length between data latching points on different, but interconnected, functional units, as measured in nanoseconds. The total round trip delay from a processor to an addressed BSM and back is known as the memory latency, and it comprises a number of electrical path lengths. The skew is the difference in electrical path lengths due to variations in routing from one point to another. The area of memory is determined by the surface area required to contain the storage chips and the logic support chips.
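The relationship among these parameters can be illustrated with a minimal Python sketch. Memory latency is taken as the sum of the electrical path lengths along the round trip, and skew as the spread between the longest and shortest of a set of alternative routes, all in nanoseconds. The function names and the numeric values are hypothetical and chosen only for the example.

```python
from typing import Sequence


def memory_latency_ns(path_lengths_ns: Sequence[float]) -> float:
    """Round trip delay: the sum of the electrical path lengths traversed
    from the processor to the addressed BSM and back."""
    return sum(path_lengths_ns)


def skew_ns(route_lengths_ns: Sequence[float]) -> float:
    """Skew: the difference between the longest and the shortest
    electrical path among routes that serve the same purpose."""
    return max(route_lengths_ns) - min(route_lengths_ns)


if __name__ == "__main__":
    # Hypothetical round trip: processor to switch, switch to BSM,
    # BSM access, BSM to switch, switch to processor.
    print(memory_latency_ns([1.5, 2.0, 4.0, 2.0, 1.5]))
    # Hypothetical routes from the switch to three different BSMs.
    print(skew_ns([2.0, 2.5, 3.25]))
```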
In a known construction, referred to as "card-on-board" (COB) memory, all of the external interconnections are placed on one edge of the card. When the memory is accessed for data, a signal must travel from the input edge of the card to the far side and return to the original edge. In so doing, it has traversed the width of the card twice, with attendant delay, and the required data appears at the same edge from which it was requested and is therefore no closer to its final destination. It is evident that in this conventional system there is significant skew, or difference in electrical path, due to accessing different parts of the memory or different memory chips in different sections of the memory, or due to access from different processors.
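The double traversal in the COB arrangement can be made concrete with a simplified Python model. The model assumes a uniform propagation delay per unit of distance across the card and ignores chip access time; the function names, the distances, and the delay figure are hypothetical and serve only to show why chips at different depths on the card produce different path lengths.

```python
from typing import Sequence


def cob_round_trip_ns(distance_from_edge: float, delay_ns_per_unit: float) -> float:
    """Wiring delay for one access on a card whose connections are all on
    one edge: the signal travels to the chip and back to the same edge."""
    return 2.0 * distance_from_edge * delay_ns_per_unit


def cob_skew_ns(chip_distances: Sequence[float], delay_ns_per_unit: float) -> float:
    """Spread in round trip wiring delay between the chip nearest the
    connector edge and the chip farthest from it."""
    delays = [cob_round_trip_ns(d, delay_ns_per_unit) for d in chip_distances]
    return max(delays) - min(delays)


if __name__ == "__main__":
    # Hypothetical card: chips 2 and 10 distance units from the edge,
    # with an assumed 0.5 ns of delay per unit of distance.
    print(cob_round_trip_ns(10.0, 0.5))
    print(cob_skew_ns([2.0, 10.0], 0.5))
```

Under these assumptions the skew grows as twice the difference in chip depth, because the one-edge layout makes every access pay for its distance in both directions.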