1. Field of the Invention
The present invention relates to an interconnection network for a cluster-based parallel processing computer, and more particularly to a hierarchical crossbar interconnection network for a cluster-based parallel processing computer that efficiently provides data transmission paths and connections so that the processing nodes of the network can send and receive data at high speed.
2. Description of the Prior Art
The interconnection network of a parallel processing computer is one of the major elements determining its architecture and performance. Each conventional parallel processing system has a unique architecture suited to the application for which it is used. The most important element determining these architectural characteristics is the connection architecture, that is, the interconnection network used to connect the processors of the system.
Most parallel processing systems are configured hierarchically, generally with two or three hierarchical levels.
The lowest level consists of a uniprocessor node or an SMP (symmetric multiprocessing) node. An SMP node has its own connection structure and is accordingly regarded as a sub-system capable of independent execution.
These uniprocessor nodes or SMP nodes are connected to form a cluster. Generally, a cluster is also capable of independent execution.
According to conventional technology, interconnection networks are broadly classified into indirect interconnection networks and direct interconnection networks.
The indirect interconnection network normally has a multi-stage connection structure; it is an interconnection network in which paths are set up in the switches of the network, and it is basically deadlock-free. However, it is a blocking network in which not all paths can be set up at the same time. Accordingly, to overcome this weakness, research is being carried out vigorously on interconnection network topologies that achieve high performance at low cost, with an efficient routing scheme, the capability to set up paths for all combinations, and tolerance of a single fault.
However, indirect interconnection networks generally cannot be easily expanded because of their non-hierarchical structures, and accordingly they are not suitable for cluster-based parallel processing systems, which have hierarchical structures.
To overcome the blocking problem, Howard Thomas Olnowich developed a Multi-Function Network that provides redundant paths through the use of an 8×8 crossbar switch (Howard Thomas Olnowich, et al., Multi-Function Network, European Publication No. 0505782 A2, Sep. 30, 1992).
However, this Multi-Function Network was not suitable for a cluster-based parallel processing system having a hierarchical structure.
In addition, the Non-blocking Multicast Switching System (Jonathan S. Turner, Non-blocking Multicast Switching System, U.S. Pat. No. 5,179,551, Jan. 12, 1993) and the Packet-Switched Intercommunication Network developed by Napolitano L M (Napolitano L M, Packet-switched intercommunication network for distributed memory, parallel processor computers) are both multistage interconnection networks characterized by non-blocking capability, low data latency, and fault tolerance.
However, they too are not suited to cluster-based parallel processing systems having a hierarchical structure.
On the other hand, the direct interconnection network has a point-to-point connection between nodes, and routing control is carried out by the router of each node (processor).
This direct interconnection network is basically susceptible to deadlock, so a routing algorithm is used to ensure deadlock-free operation. Moreover, since it is expandable, unlike the above indirect interconnection network, it has been used widely in most commercial parallel processing systems. However, the direct interconnection network has a relatively longer data delay time between nodes and a relatively larger number of links per node than the indirect interconnection network.
Accordingly, for these types of direct interconnection networks, research has been carried out on topologies that can connect many nodes with fewer links per node, a short delay time between nodes, and a minimum number of links overall.
In this context, the High Performance Computer System developed by Stephen R. Colley uses a hypercube connection structure and offers a lower data delay time and better connection expandability than other direct interconnection networks (R. Colley, et al., High Performance Computer System, U.S. Pat. No. 5,113,523, May 12, 1992).
However, in the High Performance Computer System developed by Stephen R. Colley, the number of links per node increases as the number of system nodes increases, and accordingly the system cannot be easily expanded.
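The scaling problem described above can be illustrated with a minimal sketch. It assumes a binary hypercube, in which each of N nodes has log2(N) links; that assumption and the function name are illustrative only and are not taken from the cited patent.

```python
def hypercube_links_per_node(num_nodes: int) -> int:
    """Links per node in a binary hypercube (assumed model).

    A binary hypercube of dimension d connects 2**d nodes, and every
    node has exactly d links, so the link count grows with system size.
    """
    if num_nodes < 1 or num_nodes & (num_nodes - 1):
        raise ValueError("num_nodes must be a power of two")
    # For a power of two, bit_length() - 1 equals log2(num_nodes).
    return num_nodes.bit_length() - 1


# Each doubling of the system adds one more link to every node.
for n in (8, 64, 1024):
    print(f"{n} nodes: {hypercube_links_per_node(n)} links per node")
```

Under this model, enlarging an installed system means adding a port to every existing node, which is the expansion difficulty noted above.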
To solve the above problem, Birrell, A D developed the High-Speed Mesh Connected Network, which ensures easy expandability of the system.
The above invention is characterized in that every node has the same number of links, so the system can be easily expanded without extra cost (Birrell, A D, High-Speed Mesh Connected Local Area Network, U.S. Pat. No. 5,088,091).
Recently, parallel processing systems characterized by flexible expandability with mesh connection structures have been commercialized successfully.
However, although these mesh connection structures provide excellent connection expandability, their data transmission latency between nodes is too long in a large-sized system. Such data transmission delay degrades the performance of the entire system, and accordingly torus-mesh and 3-D mesh connection structures have been studied. Torus-mesh and 3-D mesh connection structures shorten the delay time considerably relative to the existing mesh connection structure. Nonetheless, the weakness remains because of the characteristics of the basic connection structure.
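As a rough illustration of why wraparound links or a third dimension shorten the delay, the sketch below compares worst-case hop counts (network diameter) for 64 nodes. It assumes arrays with the same side length in every dimension and treats hop count as a proxy for delay; the function names are hypothetical.

```python
def mesh_diameter(side: int, dims: int = 2) -> int:
    """Worst-case hops in a mesh with `side` nodes along each of `dims` axes."""
    # Corner to opposite corner: (side - 1) hops along every axis.
    return dims * (side - 1)


def torus_diameter(side: int, dims: int = 2) -> int:
    """Worst-case hops in a torus (mesh with wraparound links)."""
    # Wraparound means at most floor(side / 2) hops along every axis.
    return dims * (side // 2)


# Three ways of arranging 64 nodes.
print("8x8 mesh   :", mesh_diameter(8, dims=2))
print("8x8 torus  :", torus_diameter(8, dims=2))
print("4x4x4 mesh :", mesh_diameter(4, dims=3))
```

In every case the diameter still grows with the side length of the array, which is consistent with the remark that the weakness of the basic structure remains.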
To overcome the shortcomings of the direct and indirect interconnection networks described above, Robert C. Zak developed an interconnection network of an intermediate type (Robert C. Zak, et al., Parallel Computer System including arrangement for transferring messages from a source processor to selected ones of a plurality of destination processors and combining responses, U.S. Pat. No. 5,265,207, Nov. 23, 1993). The interconnection network of the Parallel Computer System developed by Robert C. Zak provides excellent expandability with a fat-tree connection structure. Each cluster is made up of four nodes and 8×8 crossbar switch devices. In addition, it can tolerate a single fault by providing dual links, and its fat-tree connection structure is characterized by the same transmission bandwidth on all links. However, this interconnection network requires many hierarchical levels when configuring a large-sized system, and accordingly the latency increases in proportion to the number of crossbar switches traversed during data transmission. In addition, because the basic data transmission width is narrow, a long time is required to send data in packet units. Here, delay time refers to the time from the instant the transmitting part sends data to the instant the receiving part receives it. Accordingly, data sent from the transmitting part experience little delay when they pass through a small number of crossbar switches, and system performance improves as the delay time decreases. The operating frequency, the protocol, and the network topology are the determining factors in reducing delay time. For example, the above hierarchical interconnection network incurs a long delay because data are sent via three crossbar switches when two lower clusters are formed.
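The relation between hierarchy depth and the number of crossbar switches traversed can be sketched as follows. This is a simplified tree model and not the routing of the cited system: it assumes that nodes are numbered consecutively, that four nodes share a lowest-level switch (the four-node cluster mentioned above), and that the same fan-out of four applies at every higher level. The function name and the fan-out above the cluster level are assumptions.

```python
def switches_traversed(src: int, dst: int, fanout: int = 4) -> int:
    """Crossbar switches on the path between two nodes in a tree hierarchy.

    Assumed model: a message climbs to the lowest switch shared by both
    nodes and then descends, passing 2 * level - 1 switches, where
    `level` is the level of that shared switch (1 = cluster switch).
    """
    if src < 0 or dst < 0:
        raise ValueError("node numbers must be non-negative")
    if src == dst:
        return 0  # no network traversal needed
    level = 1
    # Climb one level at a time until both nodes fall under one switch.
    while src // fanout != dst // fanout:
        src //= fanout
        dst //= fanout
        level += 1
    return 2 * level - 1


print(switches_traversed(0, 3))   # same cluster
print(switches_traversed(0, 5))   # two lower clusters under one upper switch
print(switches_traversed(0, 16))  # one more hierarchical level
```

Under these assumptions, a transfer between two lower clusters passes three switches, matching the example above, and each additional hierarchical level adds two more.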