The present invention relates to a stacked processor construction and a method for producing it. More particularly, the present invention relates to a stacked processor construction and a method for producing the stacked processor construction wherein the local PCB bus is optimized to maximize the speed of the local PCB bus.
In printed circuit board (PCB) technology, local bus timing challenges have generally been confronted in relation to two-dimensional topologies. The challenges associated with bus length issues have typically been addressed by employing flat, two-dimensional multi-chip modules or by utilizing complex signaling conventions to overcome the speed limitations associated with signal propagation constraints. In a system where processors are relatively far apart, the system is generally slower because of the time required for a signal to travel between the processors via the traces connecting the pins of the processors.
One known solution to this problem is to move the processors closer together on the printed circuit board, thereby shortening the length of the local bus traces interconnecting the processors. Increasing the speed of the local bus enables execution speed to be increased, which increases the speed of the overall system. In order to increase execution speed, many factors must be addressed, including, for example, the timing of the signals sent between the processors and memory. The speed of the local PCB bus is related to the amount of time that is required for a processor to obtain a predetermined amount of data (i.e., a transaction) from memory and load it into the processor. This speed, in turn, relates to the speed of the bus clock and to the data transfer protocol being used. All of these factors are limited by the speed of the local PCB bus, which greatly depends on the trace lengths of the local PCB bus. Therefore, one of the primary factors affecting the overall system speed is the trace lengths of the local PCB bus.
Although moving the processors closer together on the local PCB bus enables the trace lengths of the local PCB bus to be shortened, which can improve the speed of the local PCB bus, two-dimensional topologies are limited with respect to how close to each other the processors can be placed on the PCB. Therefore, a need exists for a method for spatially locating processors in a multi-processor environment with respect to one another such that the trace lengths of the local PCB bus can be minimized, thereby enabling the speed of the PCB bus to be increased.
FIG. 1 illustrates a known multi-processor arrangement in which the processors are arranged on a PCB in a two-dimensional topology. The arrangement shown in FIG. 1 comprises four central processing units (CPUs) 10, which are all connected to a local PCB bus 11. A core electronics components (CEC) 12 is connected to the local PCB bus 11 and provides an interface between the CPUs 10 and memory (not shown). The CEC 12 also provides an input/output (I/O) interface for the CPUs 10 and any components (not shown) communicating with the CPUs 10 over the local PCB bus 11 via the CEC 12. Each of the CPUs 10 has particular pins that are connected to each other via the local PCB bus 11. Each of the pins is connected via IC package traces and other conductive bonding elements to the die of the respective CPU 10. Therefore, each pin of each CPU 10 has a particular IC package trace length associated with it.
In addition, each signal pin of each CPU 10 has a local PCB bus trace length associated with it. The term signal pin, as that term is used herein, is intended to generally denote pins that correspond to data and address signals. Some pins of the CPUs 10 are not connected to the local bus. Some pins that are connected to the local bus are not used for data or addresses. The PCB bus trace length associated with a signal pin corresponds to the distance along the local bus between a signal pin of one CPU 10 and a signal pin of another CPU 10. This combined IC package trace length and local PCB bus trace length, which will be referred to hereinafter as the die-to-die trace length, is related to the overall speed of the system. Shortening the die-to-die trace lengths can reduce the PCB bus length and thus improve the overall speed of the system. Therefore, in a two-dimensional topology such as that shown in FIG. 1, attempts have been made to route the PCB bus traces in such a way that the die-to-die trace lengths are minimized for certain signals.
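As a rough illustration of the die-to-die trace length described above, the following Python sketch adds the IC package trace length at each end to the local PCB bus trace length between the two signal pins and converts the total into a one-way signal flight time. The function names, the example lengths, and the assumed delay of 7.0 ps per millimetre are hypothetical values chosen only for illustration; they are not taken from this description, and the actual delay depends on the board material and trace geometry.

```python
# Assumed propagation delay per millimetre of trace (illustrative only;
# the real value depends on the board material and trace geometry).
ASSUMED_DELAY_PS_PER_MM = 7.0


def die_to_die_length_mm(package_a_mm, pcb_bus_mm, package_b_mm):
    """Sum the package trace at each end and the PCB bus trace between the pins."""
    return package_a_mm + pcb_bus_mm + package_b_mm


def propagation_delay_ps(length_mm, delay_ps_per_mm=ASSUMED_DELAY_PS_PER_MM):
    """One-way signal flight time over a trace of the given length."""
    return length_mm * delay_ps_per_mm


# Hypothetical lengths: identical package traces, long versus short PCB bus trace.
long_route = die_to_die_length_mm(15.0, 120.0, 15.0)
short_route = die_to_die_length_mm(15.0, 30.0, 15.0)
print(long_route, propagation_delay_ps(long_route))
print(short_route, propagation_delay_ps(short_route))
```

Under these assumed numbers, only the PCB bus portion changes between the two routes, which mirrors the point that shortening the local PCB bus traces is the main lever for reducing the die-to-die trace length.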
A need exists for a method that can be utilized to further reduce the die-to-die trace lengths between the CPUs in order to improve the speed of the PCB bus and of the overall system. Accordingly, a need exists for a method that enables the processors in a multi-processor environment to be located with respect to one another in such a way that die-to-die trace lengths associated with certain signals can be reduced in order to increase the speed of the local PCB bus and of the overall system.
The present invention provides an apparatus comprising a stacked processor construction and a method for creating the stacked processor construction. The stacked processor construction comprises two or more printed circuit boards (PCBs), each of which has at least one processor mounted thereon, and each of which has a local PCB bus therein. Each processor is electrically coupled to its respective local PCB bus. The PCBs are stacked substantially parallel to each other in such a way that the processors are not placed into contact with each other. The local PCB buses are electrically coupled together to enable the processors to communicate with each other.
In accordance with a first embodiment, the stacked processor construction apparatus comprises a first printed circuit board having a first local bus comprised of conductive traces, a first processor mounted on the first printed circuit board and electrically coupled to the first local bus, a second printed circuit board having a second local bus comprised of conductive traces, a second processor mounted on the second printed circuit board and electrically coupled to the second local bus, and a first stacking device connected to the first and second printed circuit boards. The first stacking device separates the first and second printed circuit boards a predetermined distance apart from one another and maintains the first and second printed circuit boards substantially in first and second planes, which are substantially parallel to one another. The predetermined distance is at least large enough to prevent the first processor from being in contact with the second processor. A group of conductive elements electrically couples the first local bus to the second local bus to enable the first and second processors to communicate with each other.
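The requirement that the predetermined distance be large enough to keep the processors out of contact can be expressed as a simple clearance check. The Python sketch below is a minimal illustration under stated assumptions: the function names, the component heights, and the 1.0 mm safety margin are hypothetical and do not come from this description.

```python
def min_board_spacing_mm(facing_heights_mm, margin_mm=1.0):
    """Smallest board-to-board gap that keeps the parts in the gap apart.

    facing_heights_mm lists the heights of the parts that project into the
    same gap, for example a processor on one board and a processor on the
    facing side of the adjacent board. margin_mm is an assumed safety margin.
    """
    return sum(facing_heights_mm) + margin_mm


def spacing_is_sufficient(spacing_mm, facing_heights_mm, margin_mm=1.0):
    """True when the chosen spacing is at least the minimum required gap."""
    return spacing_mm >= min_board_spacing_mm(facing_heights_mm, margin_mm)


# Hypothetical example: two 4 mm tall processors facing each other.
print(min_board_spacing_mm([4.0, 4.0]))
print(spacing_is_sufficient(10.0, [4.0, 4.0]))
print(spacing_is_sufficient(8.0, [4.0, 4.0]))
```

When only one processor projects into the gap, the list would hold a single height; the check itself is unchanged.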
Preferably, the first stacking device is a high-speed, impedance-controlled connector that comprises the group of conductive elements that electrically couple the first and second local buses together. The stacking device may be, for example, a MICTOR(trademark) connector, developed by AMP, Inc. The first and second processors are oriented on the first and second printed circuit boards, respectively, in such a way that the die-to-die distance is optimized for certain signals being communicated between the first and second processors.
Preferably, the stacked processor architecture of the present invention further comprises third and fourth PCBs. The third PCB has a third local bus comprising conductive traces, a third processor mounted on the third printed circuit board and electrically coupled to the third local bus, a second stacking device connected to the second and third printed circuit boards and to the first stacking device, and a second group of conductive elements that electrically couple the first and second local buses to the third local bus to enable the first, second and third processors to communicate with each other.
The fourth PCB, which has a fourth local bus comprising conductive traces, is connected to the second stacking device. A fourth processor is mounted on the fourth printed circuit board and is electrically coupled to the fourth local bus. The second group of conductive elements electrically couples the first, second and third local buses to the fourth local bus to enable the first, second, third and fourth processors to communicate with each other. Preferably, the second stacking device is also a high-speed, impedance-controlled connector, such as, for example, a MICTOR(trademark) connector, and the second group of conductive elements is comprised by the connector.
In accordance with another embodiment of the present invention, the apparatus comprises two PCBs, each having a processor mounted on opposite sides thereof. Each processor is electrically coupled to the local bus of the PCB on which it is mounted. The PCBs are stacked on a stacking device such as the aforementioned stacking device and are maintained a predetermined distance apart so that the processors on different PCBs do not come into contact with each other.
Other features and advantages of the present invention will become apparent from the following description, drawings and claims.