The present invention relates to communications and computer systems, and more particularly to a system architecture which provides a broad spectrum of services simultaneously to a large number of devices and users in an efficient and rapid manner.
There are several well-known approaches to the interconnection of a plurality of multi-processors. A short explanation of each of these known approaches is given in the following paragraphs.
The most current and widely proposed system is of the "Bussed multi-processor" type. As the name implies, this system comprises a number of processors on one or more busses. The processors contend for bus communications, and as the number of processors is increased, the bus data bandwidth may become the limiting factor. A global memory is commonly employed in this architecture, and increasing the number of processors can also limit memory access. There is a lack of flexibility for system expansion. Quite often the bus data range and protocol complexity hinder performance. "Pooled processors" is terminology frequently applied to this type of system.
Another system is referred to as "Pipeline Processing". This is a means of increasing throughput at a processing site without adding more bus accesses. A processing task is broken into sequential units that operate concurrently to speed up performance. Each processor performs its task and passes the result to the next processor in the chain. This could also be implemented on a bus system; however, it could result in bus saturation. Pipeline processing is the basis for vector processing.
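The throughput gain of a pipeline can be illustrated with a minimal sketch. It assumes, purely for illustration, that every stage takes one time step per item and that an item can enter a stage as soon as the preceding item leaves it; the function names `pipeline_schedule` and `sequential_steps` are hypothetical and do not come from any particular system.

```python
def pipeline_schedule(num_items, num_stages):
    """Return, for each time step, the (stage, item) pairs that are busy.

    Assumption: each stage needs exactly one time step per item.
    """
    total_steps = num_items + num_stages - 1
    schedule = []
    for t in range(total_steps):
        # At time t, stage s works on item t - s, if that item exists.
        busy = [(s, t - s) for s in range(num_stages) if 0 <= t - s < num_items]
        schedule.append(busy)
    return schedule


def sequential_steps(num_items, num_stages):
    """Time steps needed when one processor runs every stage of every item."""
    return num_items * num_stages


if __name__ == "__main__":
    items, stages = 5, 3
    for t, busy in enumerate(pipeline_schedule(items, stages)):
        print(f"step {t}: " + ", ".join(f"stage {s} on item {i}" for s, i in busy))
    print("pipelined steps: ", len(pipeline_schedule(items, stages)))
    print("sequential steps:", sequential_steps(items, stages))
```

Under these assumptions, five items through three stages finish in seven steps rather than fifteen, because once the pipeline is full all three processors are busy at the same time.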
Another system is referred to as the "Transputer". A transputer is a processor node with its own local memory and four external serial communication links to other transputers or input/output links. With four external communication links, it is readily configured into a pipeline or an array system. Since the number of communication channels and the amount of memory increase with the number of processors, the system will not become communications or memory bound. For non-array or non-pipeline processing, one of the shortcomings of the transputer implementation is that, for data transfer between two separate nodes, the connecting nodes become involved in communications handling, thereby limiting the processing efficiency of the connecting nodes. A special programming language, OCCAM, has been developed for transputer concurrent processing.
Another system for processing is referred to as "Systolic Processing". Systolic processing is performed with a two-dimensional array of processors. The transputer implementation is a special case of the systolic processor. The systolic processor is well suited for array-type problems such as data filtering and image processing. A systolic processor node receives data from its upper and left neighbors and passes results on to its lower and right neighbors. For nonsymmetrical operations, the processing nodes become overly involved in communications.
Still another system for processing is referred to as the "Hypercube". The hypercube is an extension of systolic processing to more dimensions. Systolic processing is a two-dimensional special case of the hypercube. The number of communication channels per hypercube node depends on the hypercube dimension. Communication capability and the number of nodes increase for higher-dimension hypercubes. The hypercube gives more flexibility for various problems than the systolic processor. The hypercube has the same problem as the systolic processor, namely wasted pass-through nodes for processing solutions that do not match its geometry.
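The pass-through problem can be made concrete with a short sketch. It assumes the conventional labeling in which each hypercube node is a binary number and two nodes are linked when their labels differ in exactly one bit; the routing shown is plain dimension-order routing, chosen here only for illustration, and the function names are hypothetical.

```python
def hypercube_neighbors(node, dimension):
    """Nodes directly linked to `node`: flip one address bit at a time."""
    return [node ^ (1 << bit) for bit in range(dimension)]


def hop_count(src, dst):
    """Minimum number of links between two nodes (Hamming distance)."""
    return bin(src ^ dst).count("1")


def route(src, dst, dimension):
    """Dimension-order route from src to dst, including both endpoints."""
    path = [src]
    current = src
    for bit in range(dimension):
        if (current ^ dst) & (1 << bit):
            current ^= 1 << bit  # correct one differing bit per hop
            path.append(current)
    return path


if __name__ == "__main__":
    dim = 4
    path = route(0b0000, 0b1111, dim)
    print("neighbors of node 0:", hypercube_neighbors(0, dim))
    print("route:", [format(n, "04b") for n in path])
    # Every node on the path other than the endpoints only relays data.
    print("pass-through nodes:", len(path) - 2)
```

In this sketch a transfer between opposite corners of an order-four hypercube crosses four links and occupies three intermediate nodes that contribute nothing but communications handling.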
Still another system is referred to as "Stargraph Processing". A stargraph is like a hypercube network except that it uses fewer links per node. A stargraph of order four has three links per processor node and provides twenty-four node processors with 72 links, whereas a hypercube of order four has four links per processor node and has sixteen node processors with 64 links. A stargraph of order five provides 120 node processors with 480 links, whereas a hypercube of order five has thirty-two node processors with 160 links. A stargraph of order six provides 720 node processors with 3600 links, whereas a hypercube of order six has sixty-four node processors with 384 links.
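The figures quoted above can be reproduced with a small calculation. The sketch assumes the usual definitions, under which a stargraph of order n has n! nodes with n - 1 links each and a hypercube of order n has 2^n nodes with n links each. The link totals are taken as nodes multiplied by links per node, which matches the quoted numbers; this counts each link once at each of its two ends. The function names are hypothetical.

```python
from math import factorial


def stargraph_counts(order):
    """Return (nodes, links per node, nodes * links per node)."""
    nodes = factorial(order)
    links_per_node = order - 1
    return nodes, links_per_node, nodes * links_per_node


def hypercube_counts(order):
    """Return (nodes, links per node, nodes * links per node)."""
    nodes = 2 ** order
    links_per_node = order
    return nodes, links_per_node, nodes * links_per_node


if __name__ == "__main__":
    for order in (4, 5, 6):
        print(f"order {order}: stargraph {stargraph_counts(order)}, "
              f"hypercube {hypercube_counts(order)}")
```

The comparison shows why the stargraph is attractive on paper: for a given order, it reaches far more nodes while using one fewer link at every node.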
Yet another system is referred to as "Banyan Tree Processing". A Banyan tree network has two input links and two output links plus a processor link at each node. Various versions exist, such as the Butterfly network, which is formed like a cylinder with processors along the input seam and memories along the output seam, and the Torusnet, which has the network in the form of a doughnut with processors at every node.
The processing approach of the present invention overcomes the disadvantages of the other processing systems described above and provides additional desirable features.