The present invention relates generally to interconnection architecture, and particularly to interconnecting multiple processors with multiple shared memories.
Advances in the area of computer graphics algorithms have led to the ability to create realistic and complex images, scenes and films using sophisticated techniques such as ray tracing and rendering. However, many complex calculations must be executed when creating realistic or complex images. Some images may take days to compute even when using a computer with a fast processor and large memory banks. Multiple-processor systems have been developed in an effort to speed up the generation of complex and realistic images. Because graphics calculations tend to be memory-intensive, some multiple-processor graphics systems are outfitted with multiple shared memory banks. Ideally, a system with multiple processors and multiple memory banks would have full, fast interconnection between the memory banks and the processors. For systems with a limited number of processors and memory banks, a crossbar switch is an excellent choice for providing fast, full interconnection without introducing bottlenecks.
However, conventional crossbar-based architectures do not scale well for a graphics system with a large number of processors. Typically, the size of a crossbar switch is limited by processing and/or packaging technology constraints such as the maximum number of pins per chip.
In general, in one aspect, the invention features a method and apparatus. It includes a plurality of processor groups each having a plurality of processor switch chips each having a plurality of processors and a processor crossbar, each processor connected to the processor crossbar; a plurality of switch groups each having a plurality of switch crossbars each connected to a processor crossbar in each processor group, wherein no two switch crossbars in a switch group are connected to the same processor crossbar; a plurality of memory groups each having a plurality of memory switch chips each having a plurality of memory controllers and a memory crossbar, each memory controller connected to the memory crossbar, each memory crossbar in each memory group connected to all of the switch crossbars in a corresponding one of the switch groups, wherein no two memory groups are connected to the same switch group; and a plurality of memory chips each having a plurality of memory tracks each having a plurality of shared memory banks, each memory track connected to a different one of the memory controllers.
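The connectivity rules in this aspect can be sketched as a small graph model. The following is only an illustrative sketch under stated assumptions, not the claimed implementation: the function names, the tuple labels for crossbars, and the choice that switch crossbar j of every switch group serves processor crossbar j of every processor group are hypothetical. The sketch also assumes that memory group s corresponds to switch group s.

```python
from collections import defaultdict


def build_links(proc_groups, chips_per_group, switch_groups, mem_chips_per_group):
    """Return a set of (upstream, downstream) crossbar links.

    Hypothetical labels: ("P", group, chip) is a processor crossbar,
    ("S", group, index) a switch crossbar, ("M", group, chip) a memory crossbar.
    """
    links = set()
    for s in range(switch_groups):
        for j in range(chips_per_group):
            switch = ("S", s, j)
            # Assumption: switch crossbar j connects to processor crossbar j
            # in each processor group.
            for g in range(proc_groups):
                links.add((("P", g, j), switch))
            # Each memory crossbar of memory group s connects to all switch
            # crossbars of its corresponding switch group s.
            for k in range(mem_chips_per_group):
                links.add((switch, ("M", s, k)))
    return links


def check_constraints(links):
    """Check the two 'no two ... connected to the same ...' conditions."""
    switches_per_proc = defaultdict(set)   # (switch group, processor crossbar) -> switch crossbars
    switch_groups_per_mem = defaultdict(set)  # memory group -> switch groups
    for upstream, downstream in links:
        if upstream[0] == "P":
            switches_per_proc[(downstream[1], upstream)].add(downstream)
        else:
            switch_groups_per_mem[downstream[1]].add(upstream[1])
    # No two switch crossbars in one switch group share a processor crossbar.
    no_shared_proc = all(len(v) == 1 for v in switches_per_proc.values())
    # Each memory group uses exactly one switch group, and none is shared.
    one_each = all(len(v) == 1 for v in switch_groups_per_mem.values())
    used = [next(iter(v)) for v in switch_groups_per_mem.values()]
    return no_shared_proc and one_each and len(used) == len(set(used))


def two_hop_reachable(links, proc_crossbar, mem_crossbar):
    """True if some switch crossbar links the two given crossbars."""
    from_proc = {b for a, b in links if a == proc_crossbar}
    to_mem = {a for a, b in links if b == mem_crossbar}
    return bool(from_proc & to_mem)
```

For example, `build_links(2, 3, 2, 2)` models two processor groups of three processor switch chips, two switch groups, and two memory groups of two memory switch chips. Under these assumptions, every processor crossbar reaches every memory crossbar through exactly one intermediate switch crossbar.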
In general, in one aspect, the invention features a method and apparatus for use in a scalable graphics system. It includes a processor switch chip having a plurality of processors each connected to a processor crossbar; and a memory switch chip having a plurality of memory controllers each connected to a memory crossbar and controlling a shared memory bank; wherein the memory crossbar is connected to the processor crossbar.
Particular implementations can include one or more of the following features.
Implementations include an intermediate switch chip having a switch crossbar, the switch crossbar connected between the processor crossbar and the memory crossbar. Each memory controller is connected to a memory chip having a shared memory bank. The memory switch chip includes a memory bank connected to the memory controller. The apparatus is used for ray tracing.
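The path of a memory request through these chips can be illustrated with a toy model. This is a minimal sketch under assumptions, not the disclosed hardware: all class and method names are hypothetical, each shared memory bank is modeled as a dictionary, and the memory crossbar selects a controller by low-order address interleaving, which the description does not specify.

```python
class MemorySwitchChip:
    """Memory crossbar plus memory controllers, one shared bank per controller."""

    def __init__(self, n_controllers):
        self.banks = [dict() for _ in range(n_controllers)]

    def _locate(self, address):
        # Assumption: low-order interleaving picks the controller and bank.
        return self.banks[address % len(self.banks)], address // len(self.banks)

    def write(self, address, value):
        bank, offset = self._locate(address)
        bank[offset] = value

    def read(self, address):
        bank, offset = self._locate(address)
        return bank.get(offset, 0)  # unwritten locations read as 0 in this model


class SwitchChip:
    """Optional intermediate switch crossbar that forwards requests."""

    def __init__(self, memory_chip):
        self.memory_chip = memory_chip
        self.forwarded = 0  # count of requests passed through

    def write(self, address, value):
        self.forwarded += 1
        self.memory_chip.write(address, value)

    def read(self, address):
        self.forwarded += 1
        return self.memory_chip.read(address)


class ProcessorSwitchChip:
    """Processors sharing one processor crossbar toward a downstream chip."""

    def __init__(self, n_processors, downstream):
        self.n_processors = n_processors
        self.downstream = downstream  # a SwitchChip or a MemorySwitchChip

    def _check(self, processor):
        if not 0 <= processor < self.n_processors:
            raise ValueError("no such processor on this chip")

    def write(self, processor, address, value):
        self._check(processor)
        self.downstream.write(address, value)

    def read(self, processor, address):
        self._check(processor)
        return self.downstream.read(address)
```

Passing a `MemorySwitchChip` directly as `downstream` models the memory crossbar connected to the processor crossbar. Passing a `SwitchChip` models the implementation with an intermediate switch crossbar. In both cases, a value written by one processor is visible to the other processors on the chip because the banks are shared.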
Advantages that can be seen in implementations of the invention include one or more of the following. Implementations enable low-latency memory and processor scalability in graphics systems such as ray-tracing or rendering farms with currently available packaging and interconnect technology.
The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the invention will be apparent from the description and drawings, and from the claims.