1. Field of the Invention
The present invention relates to massively parallel processors and in particular to a system and method for handling input and output in a massively parallel processor.
2. Background Information
Massively parallel processing (MPP) systems are computing systems composed of hundreds or thousands of processing elements (PEs) individually interconnected by a common high-speed communication network. MPPs can be classified as either multicomputers or multiprocessors. In a multicomputer MPP, each PE is considered a stand-alone computer with its own central processor, local memory, and associated control logic. Each PE can address only its own local memory. It cannot directly read or write the local memory associated with another PE. Instead, it must read data from another PE's memory by sending a message in an I/O-like packet to the target PE, requesting that some data from its memory be formatted and sent back to the requesting PE; writes are handled analogously. Thus, in a multicomputing system, each remote reference is essentially an I/O operation involving the target PE. This style of interprocessor communication is called "message passing." Message passing is a well-known and prevalent MPP programming model because multicomputers are relatively easy to build. The ease of construction of a multicomputer MPP arises from the use of commodity microprocessors in an environment that closely resembles their "natural habitat" (i.e., the hardware and software implementation envisioned by the microprocessor designers), namely a network of small autonomous computers.
In a multiprocessor MPP, on the other hand, every PE can directly address all of memory, including the memory of another (remote) PE, without involving the processor at that PE. Instead of treating PE-to-remote-memory communications as an I/O operation, reads from or writes to another PE's memory are accomplished in the same manner as reads from or writes to the local memory. Multiprocessors therefore have an ease-of-programming advantage over multicomputers.
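The difference between the two addressing models can be illustrated with a minimal sketch. The class names, message tuples, and the contiguous global address layout below are hypothetical and chosen only for illustration; they do not describe any particular MPP implementation. The multicomputer read requires the target PE's processor to service a request message, while the multiprocessor read is an ordinary load into a global address space.

```python
from collections import deque


class MessagePassingPE:
    """Hypothetical multicomputer PE: it can address only its own memory."""

    def __init__(self, pe_id, memory):
        self.pe_id = pe_id
        self.memory = list(memory)
        self.inbox = deque()
        self.requests_serviced = 0  # counts work done on behalf of other PEs

    def remote_read(self, network, target_id, addr):
        if target_id == self.pe_id:
            return self.memory[addr]  # local reference, no message needed
        target = network[target_id]
        # The remote reference is packaged as an I/O-like request message.
        target.inbox.append(("read_request", self.pe_id, addr))
        # The target PE's processor must run to service the request.
        target.service_inbox(network)
        kind, value = self.inbox.popleft()
        assert kind == "read_reply"
        return value

    def service_inbox(self, network):
        while self.inbox:
            kind, requester_id, addr = self.inbox.popleft()
            if kind == "read_request":
                self.requests_serviced += 1
                reply = ("read_reply", self.memory[addr])
                network[requester_id].inbox.append(reply)


class GlobalAddressSpace:
    """Hypothetical multiprocessor view: any PE can load any address."""

    def __init__(self, memories):
        self.memories = [list(m) for m in memories]
        self.words_per_pe = len(self.memories[0])  # assumes equal-sized memories

    def load(self, global_addr):
        # The global address selects a PE and an offset within its memory;
        # the processor at that PE is not involved in the access.
        pe_id, offset = divmod(global_addr, self.words_per_pe)
        return self.memories[pe_id][offset]


if __name__ == "__main__":
    network = {0: MessagePassingPE(0, [10, 11]), 1: MessagePassingPE(1, [20, 21])}
    print(network[0].remote_read(network, 1, 1))
    print(network[1].requests_serviced)
    print(GlobalAddressSpace([[10, 11], [20, 21]]).load(3))
```

In this sketch the message-passing read increments a counter on the target PE, which stands in for the processor time the target must spend on each remote reference; the global load touches no such counter.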
A massively parallel processing system having attributes of both a multiprocessing and a multicomputing MPP is described in MULTIDIMENSIONAL INTERCONNECTION AND ROUTING NETWORK FOR AN MPP COMPUTER, U.S. Pat. No. 5,583,990 issued Dec. 10, 1996 by Birrittella, et al. That MPP system relies on a block transfer engine to perform data transfers without interrupting the local processor of the memory being read or written. In addition, prefetch message queues are used to prefetch data from remote locations whenever possible.
Such an approach addresses the problem of efficient transfer of data within the MPP system but does not extend these same efficiencies to the problem of communication between the MPP system and outside devices. The rate at which data can be transferred into and out of an MPP system is critical to the efficient use of the system. If communication between the outside world and the MPP system is too slow, the MPP will be useful only in solving large-scale problems (where the cost of loading the problem is dwarfed by the efficiencies of running on the MPP). In the MPP system described by Birrittella, et al., input to and output from the MPP is handled through I/O gateways which transfer system data and control information between the host system and the MPP system. Like the regular processing nodes, each gateway can be used to transfer information to and from any processing element in the interconnect network. The I/O gateways are not, however, part of the toroidal mesh interconnect network. Instead, they are attached as an appendage to processing nodes in two of the three interconnect dimensions.
Such an MPP system therefore has limited pathways from the outside world to the processing nodes of the MPP system. These limited pathways serve as a communications bottleneck that can throttle performance of the MPP system. What is needed is a system and method of transferring information into and out of an MPP system that overcomes this potential bottleneck.
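The effect of limited I/O pathways on overall run time can be illustrated with a simple back-of-the-envelope model. The function name, its parameters, and the numbers in the usage example are hypothetical; the model also assumes that gateway bandwidths add linearly and that computation speeds up ideally with the number of PEs, neither of which is claimed for any real system.

```python
def run_time_breakdown(data_bytes, gateway_count, gateway_bytes_per_s,
                       serial_compute_s, pe_count):
    """Return (io_seconds, compute_seconds, io_fraction) for one job.

    Assumptions (illustrative only): all problem data passes through the
    I/O gateways, gateway bandwidths add linearly, and compute time
    divides evenly across the PEs.
    """
    io_s = data_bytes / (gateway_count * gateway_bytes_per_s)
    compute_s = serial_compute_s / pe_count
    io_fraction = io_s / (io_s + compute_s)
    return io_s, compute_s, io_fraction


if __name__ == "__main__":
    # Hypothetical job: 8 GB of data, 1 GB/s per gateway,
    # 4000 s of serial work spread over 1000 PEs.
    for gateways in (2, 8):
        io_s, compute_s, frac = run_time_breakdown(8e9, gateways, 1e9, 4000.0, 1000)
        print(f"{gateways} gateways: I/O {io_s:.1f} s, "
              f"compute {compute_s:.1f} s, I/O share {frac:.0%}")
```

Under these assumed numbers, adding PEs shrinks only the compute term, so with few gateways the I/O term comes to dominate; only a larger problem (more compute per byte loaded) or more pathways into the machine reduces the I/O share.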