Successive generations of computing systems typically require increasing degrees of performance and integration. A typical computing system includes a central processing unit (CPU), a graphics processing unit (GPU), a high-capacity memory subsystem, and a set of interface subsystems. The set of interface subsystems may be configured to communicate with other devices, including devices that provide user interaction, devices that provide physical measurement, and devices that provide connectivity to storage systems and other computing systems.
Conventional computing systems typically achieve higher degrees of performance and integration by implementing an increasing number of processing cores on a single die or “chip.” Cache memory may also be added to each processing core and as a resource shared by multiple processing cores. Die area for multi-core devices has increased over time, as more CPU cores, GPU cores, on-chip cache memory, and additional interface blocks are integrated into a single processor chip. One advantage of integrating multiple processing cores and other subsystems onto a single die is that high performance may be achieved by scaling conventional design techniques and leveraging advances in fabrication technology that enable greater circuit density.
However, one disadvantage of simply integrating more processing cores onto a single chip is that manufacturing cost for the chip typically increases disproportionately with respect to die area, increasing the marginal cost associated with each additional processor core. More specifically, manufacturing cost for a given chip is typically a strong function of die area for the chip. In many cases, die area associated with highly integrated multi-core processors is well above a characteristic cost knee, leading to disproportionate cost inefficiencies associated with multi-core processors. Alternatively, a computing system may be built from a plurality of independently packaged processing devices; however, conventional chip-to-chip signaling techniques do not efficiently support multiprocessing performance targets commonly associated with highly integrated multi-core devices.
Thus, there is a need for improving signaling and/or addressing other issues associated with the prior art.